Raspberry Pi | Local TTS | High Quality | Faster Realtime with Piper TTS

31,288 views

Comments: 105

@yeanangel6205 1 year ago

🎯 Key Takeaways for quick navigation:
00:51 Piper is a fast, locally running neural text-to-speech system optimized for the Raspberry Pi 4.
01:20 Piper supports integration with Home Assistant, a popular smart home software, allowing for voice control.
02:02 To set up Piper, download and unzip the Piper executable, then download a text-to-speech model in a language of your choice.
03:21 Using Piper is straightforward: text is read from standard input and piped to the Piper process along with the selected model, and the synthesized output is saved to a file.
04:10 Piper generates output audio faster than real time, with a real-time factor (RTF) below one, showcasing its efficiency on small compute devices like the Raspberry Pi.
Made with HARPA AI
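The real-time factor mentioned at 04:10 is simply the time spent synthesizing divided by the length of the audio produced, so a value below one means faster than real time. A minimal sketch of that arithmetic follows; the function names and the sample numbers are illustrative assumptions, not measurements from the video.

```python
import wave


def wav_duration_seconds(path):
    """Return the playback length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(synthesis_seconds, audio_seconds):
    """RTF = time spent synthesizing / length of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Made-up numbers for illustration: 1.2 s of compute for 4.0 s of speech.
rtf = real_time_factor(1.2, 4.0)
verdict = "faster than real time" if rtf < 1 else "not faster than real time"
print(f"RTF = {rtf:.2f} ({verdict})")
```

In practice you would time the Piper call yourself and pass the result together with `wav_duration_seconds("welcome.wav")`.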

@ThorstenMueller 1 year ago

Thank you 😊. I'm not sure, but did you do this "Harpa AI" magic on another of my videos, too? This is really helpful, so I've pinned the comment.

@andododo 9 months ago

Thank you man! I've tried a bunch of free TTS engines for my raspi project: espeak, flite, pyttsx3, and some others. They all sound robotic and unnatural to me. Piper TTS is just so good, and it's surprisingly fast on the raspi 4. One thing to note: the voice model download doesn't come with the JSON file now, so you have to grab it yourself.

@ThorstenMueller 9 months ago

Thanks for your nice comment 😊. Hasn't it always been two downloads - onnx model and json config file?
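Since a voice consists of two downloads (the ONNX model and its JSON config), a quick check before calling Piper can save some head-scratching. The sketch below assumes the config sits next to the model as `<model>.onnx.json`, which matches the file names quoted further down in this thread; the helper name `missing_voice_files` is made up for illustration.

```python
from pathlib import Path


def missing_voice_files(model_path):
    """Return the voice files that are not on disk.

    Assumes the config is stored next to the model as '<model>.json',
    e.g. 'en_GB-alba-medium.onnx' plus 'en_GB-alba-medium.onnx.json'.
    """
    model = Path(model_path)
    config = Path(str(model) + ".json")
    return [str(p) for p in (model, config) if not p.is_file()]


# Report anything that still has to be downloaded for this voice.
for path in missing_voice_files("en_GB-alba-medium.onnx"):
    print("missing:", path)
```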

@CarlinComm 20 days ago

Wow, that's great, thanks for showing this! Subscribed :)

@ThorstenMueller 17 days ago

Thanks for your nice feedback and welcome 😊.

@cyclicalobsessive 1 year ago

Thank you for posting the video. I am in the process of building a new robot and wanted a better quality TTS engine/voice for my WaLi: Wall follower Looking for Intelligence robot. I chose arctic-medium at 50% higher sample rate. (Current robots use espeak-ng voices.) Loving rhasspy/piper-tts!

@ThorstenMueller 1 year ago

You're welcome, I'm glad that you can use piper-tts for your robot 😊.

@wichawt3079 1 year ago

oh dear down another rabbit hole i go

@ThorstenMueller 1 year ago

I'm sorry 😆

@toddd.8496 6 months ago

This comment made me laugh so hard. Glad I'm not the only one! :)

@coding32111 2 months ago

I’ve been down a rabbit hole for the last 6 hours cuz nothing is downloading right

@kumarbhatia6566 1 year ago

Fantastic video. Learned a great deal about Linux and this tool. Thank you for posting it.

@ThorstenMueller 1 year ago

Thanks a lot for your nice feedback and glad you find it useful 😊.

@MyHeap 1 year ago

Would you consider a video on using Piper Recording Studio to create your own voice recordings? I tried it but got stuck. Fortunately, Mike took my recordings as a donation and they may be available in the next release. But I would still like to see the full process. I am running Ubuntu 20.04. Thanks for the videos. I find them very helpful. Joe

@ThorstenMueller 1 year ago

Thanks for your voice contribution 👏. I've added Piper-Recording-Studio to my to-do list. In the meantime, do you know Mimic-Recording-Studio? kzbin.info/www/bejne/aoq3aYqQicSrapo

@u_cuban 1 year ago

I second this 🙌🏻

@u_cuban 1 year ago

Also, how can I use other datasets with Piper? 1150 sentences is a big time commitment and I already have previously transcribed datasets I'd like to use if possible.

@Vito_0912 1 year ago

Would you say it comes close to the quality of Coqui TTS? Coqui is good, but it takes a long time to initialize, which is rather bad for short sentences that have to be regenerated every time. What about the German voice? Thanks ^^

@ThorstenMueller 1 year ago

That mostly depends on the models. Some Coqui TTS models are probably better than Piper models, and some the other way around. But quality is subjective. I think the longer synthesis time of most Coqui TTS models suggests higher quality. Piper supports multiple German models, including mine 😉.

@TheSolarString 1 year ago

Thanks for a good introduction video. Is there a way to make the speaker pause after saying a word? For instance, if I want the speaker to give me a list of items: You need: A rope, (pause), Scissors, (pause), Paper, (pause), and a flashlight.

@ThorstenMueller 1 year ago

Thanks for your nice feedback 😊. I'm not sure if this works in Piper, but this should work in Mimic 3 by Mycroft AI (same developer as Piper). Mimic 3 supports SSML syntax. I've created a tutorial about Mimic 3, maybe it's useful for you. kzbin.info/www/bejne/mHS9nYZsfp1nfdE
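Whether Piper itself can insert pauses is left open in the reply above. One workaround that sidesteps the question, sketched here as an assumption and not as a Piper feature, is to synthesize each list item to its own WAV file and then join the files with stretches of silence. The function name `join_with_pauses` and the 0.5 s default are made up for illustration; only Python's standard `wave` module is used.

```python
import wave


def join_with_pauses(wav_paths, output_path, pause_seconds=0.5):
    """Concatenate WAV files, inserting silence between consecutive clips.

    All inputs must share channel count, sample width and sample rate.
    Zero bytes are used as silence, which is correct for 16-bit PCM audio.
    """
    if not wav_paths:
        raise ValueError("need at least one input file")
    params = None
    with wave.open(output_path, "wb") as out:
        for index, path in enumerate(wav_paths):
            with wave.open(path, "rb") as clip:
                fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
                if params is None:
                    # The first clip decides the output format.
                    params = fmt
                    out.setnchannels(fmt[0])
                    out.setsampwidth(fmt[1])
                    out.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError(f"{path} has a different audio format")
                if index > 0:
                    # Write the pause before every clip except the first.
                    silent_frames = int(pause_seconds * fmt[2])
                    out.writeframes(b"\x00" * (silent_frames * fmt[0] * fmt[1]))
                out.writeframes(clip.readframes(clip.getnframes()))
    return output_path
```

For the shopping-list example you would synthesize "A rope", "Scissors", "Paper" and "and a flashlight" separately and pass the four files in order.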

@TheSolarString 1 year ago

@@ThorstenMueller Thanks, I'll have a look at it!

@hungph 1 year ago

Hello, would you be able to provide instructions on installing Piper TTS, and guiding through the process of training a voice using available data on Windows, please?

@ThorstenMueller 1 year ago

Thanks for your feedback and topic suggestion. Right now Piper is not officially supported on Windows. But maybe I can make a tutorial for Docker or WSL. What do you think?

@hungph 1 year ago

@@ThorstenMueller That’s great, I’m really looking forward to your step-by-step guide with Docker. Hope you will complete the video soon, thank you.

@fred1459 1 year ago

Hello Thorsten, thanks for the great video. I'm currently trying to install Piper in my Home Assistant installation via the add-on store. Unfortunately, I always get the error message: "This add-on is not compatible with your device's processor or operating system." I'm using a Raspberry Pi 4 with 2 GB of RAM, so it should actually be compatible. The only thing I've noticed is that I'm only running 32-bit. Can you maybe help me with this?

@ThorstenMueller 1 year ago

Glad you like the video 😊. I can't judge whether the problem is caused by the 32-bit version. I just searched the Piper issues for your problem and found a post about a special 32-bit version. I haven't looked at it closely, but maybe it already helps you a bit. github.com/rhasspy/piper/issues/67#issuecomment-1593594543

@fred1459 1 year ago

@@ThorstenMueller Hello Thorsten, thanks for your quick reply and your help. But I can't install the 32-bit version you sent me as an add-on for HA, can I? Sorry, I'm not that well versed in these things. Can I install that version "manually" and then link it to HA? Or do you know whether I can "simply" get the Raspberry Pi onto 64-bit?

@ThorstenMueller 1 year ago

@@fred1459 Good question, I haven't needed the 32-bit variant as an HA add-on yet. Maybe you can ask the question in the HA or Rhasspy/Piper community - they can surely help you better than I currently can 😊.

@DoubleBob 1 year ago

This looks good! Could you make a tutorial on how to do the voice cloning/learning for Piper?

@ThorstenMueller 1 year ago

Thanks 😊 and yes - this will be one of my next video tutorial topics.

@DoubleBob 1 year ago

@@ThorstenMueller Glad to hear! Subscribed.

@ThorstenMueller 1 year ago

@@DoubleBob Thank you and welcome 😊.

@ThorstenMueller 1 year ago

Hi DoubleBob, just to keep you updated. The Piper TTS voice clone tutorial is now online 😊: kzbin.info/www/bejne/mJDalpKgosZlaJI

@DoubleBob 1 year ago

@@ThorstenMueller Very cool! Thanks for notifying me.

@ThomasKaufmann-w9j 11 months ago

Hello, I'm currently trying to get Piper running on a Raspberry Pi Zero. When I run it, I get a machine error or a segmentation fault (memory access error). Does it not run on the Zero? I'm using the Piper armv7 build.

@ThorstenMueller 11 months ago

Hello, that's a good question. I'm not completely sure and I don't have a Pi Zero to test with. But I think I once read that it doesn't work on it.

@Trashpanda_404 1 year ago

Does this work for a chatbot application? If it’s so small I wonder if you built an app could you offload the processing onto the smart phone or tablet. 🤔

@ThorstenMueller 1 year ago

Piper TTS models perform well, so that could work for a chatbot, depending on the performance of the computer you're running Piper TTS on. As of now, I guess Piper cannot run on an Android or iOS device.

@gersonfer 8 months ago

Thanks a lot! I think it will be useful in my slot car project. Any advice on how to get the fastest possible real-time response (less than 1 second)?

@ThorstenMueller 8 months ago

Most Piper TTS models can synthesize faster than real time, and there's a streaming feature available or coming next 😊.

@Rayterni 11 months ago

Is it possible to run a Piper server locally, so I can make its voices available to other programs, such as TTSVoiceWizard? Sorry if this is a double comment, I think my other one got deleted because of the link. Thanks in advance for any help.

@ThorstenMueller 11 months ago

That's a good question. I'll ask your question (Piper TTS server process) during my interview with Mike, so stay tuned for this video to (hopefully) get an answer 😊.

@Rayterni 11 months ago

Awesome, thank you so very much! @@ThorstenMueller

@Bonk1971 1 year ago

Would cloning make this more expressive or just copy the voice characteristics?

@ThorstenMueller 1 year ago

I'm not sure if I understand your question right. If you clone your voice (e.g. with Piper TTS), it will try to copy your voice as you pre-recorded it in your audio voice dataset. It will not add any emotions or expressions. Is this what you meant? Btw., this is my tutorial on voice cloning with Piper TTS: kzbin.info/www/bejne/mJDalpKgosZlaJI

@Bonk1971 1 year ago

@@ThorstenMueller I have been trying to train Tortoise but to no avail. I thought maybe this would be a good second option. I am just looking for a better cadence or prosody, so it does not sound so robotic. I tried to implement the steps you provided in your new tutorial on training Piper but I keep getting errors. I will take another stab at it next week. Thank you for all your work on these tutorials they have helped me.

@kevinschilling6813 1 year ago

That was nice.. Thanks

@ThorstenMueller 1 year ago

You're welcome 😊.

@jimmys3575 1 year ago

Piper TTS is a very good TTS for the pipeline I am developing, but I want to be able to edit the source code so it doesn't print out messages every time it runs. Can you do a tutorial where you build it from source?

@ThorstenMueller 1 year ago

I've never tried building it manually myself, but as it's open source there should be a way to set up a build pipeline locally. But maybe it's worth opening an "issue" on the project to ask for a switch that turns that message on or off.

@Mortos592 1 year ago

Hey @ThorstenMueller nice video, the voice is indeed quite decent, do you by any chance know or can point to a tutorial on how to configure speech dispatcher with piper?

@ThorstenMueller 1 year ago

Thanks for your nice feedback. I'm not sure what you mean by "speech dispatcher"?

@Mortos592 1 year ago

Hey @@ThorstenMueller, to be honest, I'm not quite sure yet, I'm fairly new to the TTS world but from what I have found, it seems to be a Linux only thing, like a wrapper that allows to use any TTS engine in a uniform way for all the other programs, most of the programs use this to synthesize audio from text

@alexanderyang126 11 months ago

Hello Thorsten, I followed all the steps in the video. After I ran the last command, an unexpected message appeared: -bash: ./piper: Is a directory. I looked again: the downloaded folder is called "piper", and it contains the binary file "piper". That means the required file is located at ~/piper/piper/piper. The structure is a bit confusing. There is also a naming error with the voice "en_GB-alba-medium" (nothing to do with this video, but maybe you can tell Michael Hansen?): the .onnx file is called "en_GB-alba-medium.onnx", but the other two are called "en_en_GB_alba_medium_MODEL_CARD" and "en_en_GB_alba_medium_en_GB-alba-medium.onnx.json". That doesn't look quite right.

@alexanderyang126 11 months ago

By the way, in my case the test command should now look like this: echo 'Welcome to the world of speech synthesis!' | ./piper/piper --model en_GB-alba-medium.onnx --config en_GB-alba-medium.onnx.json --output_file welcome.wav
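The "Is a directory" message comes from the unpacked folder and the binary inside it sharing the name "piper". If you call Piper from a script, a small path check can absorb that quirk. This is a sketch under the layout described in the comment above (binary at `<folder>/piper`); the helper name `resolve_piper_executable` is hypothetical.

```python
import os
from pathlib import Path


def resolve_piper_executable(path):
    """Return the path to the Piper binary, looking one level into a directory.

    If `path` is a directory (e.g. the unpacked 'piper' folder), the binary
    is expected at '<path>/piper'.
    """
    candidate = Path(path)
    if candidate.is_dir():
        candidate = candidate / "piper"
    if not candidate.is_file():
        raise FileNotFoundError(f"no Piper binary found at {candidate}")
    if not os.access(candidate, os.X_OK):
        raise PermissionError(f"{candidate} is not executable (try chmod +x)")
    return str(candidate)
```

Calling `resolve_piper_executable("./piper")` then returns `piper/piper` when `./piper` is the folder, and the path itself when it already points at the binary.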

@ThorstenMueller 11 months ago

Yes, I've fallen for the nested folder structure and the name "piper" myself 😆. I just checked the Hugging Face download of the Piper voices and the file names look right to me there, or am I looking in the wrong place? huggingface.co/rhasspy/piper-voices/tree/v1.0.0/en/en_GB/alba/medium

@alexanderyang126 10 months ago

@@ThorstenMueller The names on the website are correct. My PC changed the names automatically. I don't know why that happens. :\

@ThorstenMueller 10 months ago

@@alexanderyang126 That's strange. With special characters or spaces in the file name I could have imagined your computer adjusting (escaping) the paths automatically, but that shouldn't really be the case with these file names.

@4reasons4 11 months ago

I followed your steps exactly and it did not run. I get "piper is a directory", which it is, so I moved everything into piper/piper and got a different error. I even made piper executable, but then got a further error. Very frustrating, but thank you for posting this.

@ThorstenMueller 11 months ago

Can you run the piper executable with a "--help" command line argument? Does this show the Piper help output?

@4reasons4 11 months ago

@@ThorstenMueller Yes, I can run ./piper --help from within the piper directory and the help output displays. The error indicates there is a problem with the JSON file.

@ThorstenMueller 11 months ago

@@4reasons4 A syntax error in JSON file or a wrong value for a specific key?

@4reasons4 11 months ago

@@ThorstenMueller I'm not sure, I'll have to check when I get a chance, but I'll raise the issue on the git pages, thanks.

@MrZongwei 1 year ago

How do I fix this Mac install problem? 【ERROR: Could not find a version that satisfies the requirement piper-phonemize (from versions: none)】

@ThorstenMueller 1 year ago

I guess I've seen your issue on their GitHub repo ;-). Have you already seen this issue, which has some possibly helpful ideas? github.com/rhasspy/piper/issues/27

@MrZongwei 1 year ago

@@ThorstenMueller Thanks, I got it running successfully on Ubuntu.

@anujpai 4 months ago

What are the minimum specifications required to run this? Is a Raspberry Pi 4B with 1 GB of RAM enough?

@ThorstenMueller 4 months ago

My tests were some time ago, but IMHO I got acceptable to good performance (depending on the use case) on my Raspberry Pi 3B.

@InterstellarLord 1 year ago

How can we make a TTS read a pdf book for us?

@ThorstenMueller 1 year ago

That's a frequently asked question and requested feature 😊. PDF input is not supported for now. Maybe you can save the PDF as text and then split it into smaller chunks that can be synthesized. But for now that's more of a workaround. It may be a good idea to discuss this in the Piper TTS GitHub community - maybe a feature for it will be implemented in the future.
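The "split it into smaller chunks" step of that workaround could look roughly like the sketch below. It assumes the PDF has already been saved as plain text by some other tool (that step is outside this example); the function name `split_into_chunks` and the 300-character limit are arbitrary choices for illustration, not Piper requirements.

```python
import re


def split_into_chunks(text, max_chars=300):
    """Group whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept intact in its own chunk.
    """
    # Collapse line breaks and repeated spaces, then split after . ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", " ".join(text.split()))
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to Piper on its own, which keeps individual synthesis calls short.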

@ErdemYldrmer 11 months ago

Do you think this will work on RPi Zero? What do you think the performance will be?

@ThorstenMueller 11 months ago

From what I've read, the RPi Zero is not supported because of its architecture, but as I don't have a Pi Zero I cannot try it myself.

@Jilianne99 11 months ago

How do I use this in python instead of local command line?

@ThorstenMueller 11 months ago

Piper TTS Python integration is not optimal yet - I've talked to Mike about this. But this question/topic will be part of my interview with him (the creator of Piper TTS).

@mattye12 10 months ago

Very useful video, thanks a lot! I wonder if there is something like this for Windows?

@ThorstenMueller 10 months ago

Yes, Piper TTS runs fine on Windows, too 😊. I made a video on how to set it up too. Do you know this? kzbin.info/www/bejne/fXjZlaRpnM6cirMsi=iZGkYcOY2FjqdVGn

@joeyfaris231 3 months ago

@@ThorstenMueller Does it work on MacOS?

@ThorstenMueller 2 months ago

@@joeyfaris231 Yes. You can download the packages here: github.com/rhasspy/piper/releases/tag/2023.11.14-2

@jhin6324 2 months ago

thanks a lot

@ThorstenMueller 2 months ago

You're welcome :)

@redrox7657 1 year ago

When is de_DE-thorsten-medium coming to Home Assistant? Can this version be downloaded anywhere? I can listen to it, but I couldn't find a download anywhere.

@ThorstenMueller 1 year ago

Hi, I'm in contact with Michael Hansen and want to train a Piper "Thorsten (high)" model soon. After that I can look into getting it included in Home Assistant.

@redrox7657 1 year ago

@@ThorstenMueller Ooh! That's perfect. I'm looking forward to it. Definitely the best German voice so far 👏🏻😍 I was very impressed when I first heard it. And that was already on "low".

@ThorstenMueller 1 year ago

@@redrox7657 I'm very glad to hear that, of course 😊. Just to keep you up to date: the Thorsten-Voice "high" model training is currently running 🙂.

@b1ll1on_ai 1 year ago

Hello! How are you. I need to train a model with a Spanish accent from Argentina (I have good GPUs) any clue on how to achieve it? Thank you genius.

@ThorstenMueller 1 year ago

Thanks - I'm doing fine, hope you are too 😊. Do you have access to a usable voice dataset in Spanish with an Argentinian accent?

@b1ll1on_ai 1 year ago

@@ThorstenMueller Yes, I have! I've trained many successful models with so-vits-svc, but that's a different approach. I work with between 15 and 45 minutes of samples to get good results. I'm looking for a way to make high-quality text-to-speech in Argentinian Spanish. What is the procedure? How long should the samples be? Thank you.

@ThorstenMueller 1 year ago

@@b1ll1on_ai I guess I can't answer that. If you get good results with 15-45 minutes of training data, that might be a workable amount for an Argentinian model too.

@LeviAckerman-gt6ll 7 months ago

how to use it in python code?

@ThorstenMueller 7 months ago

I've talked with Mike Hansen about that feature recently, and native Python usage is still a work in progress. But you can run Piper as a separate process and use the results that way. Does this help you?
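Running Piper as a separate process from Python can be done with the standard `subprocess` module. The sketch below mirrors the shell command quoted earlier in this thread (`--model`, `--config`, `--output_file`, text on standard input). The wrapper name `synthesize`, its default binary path, and the injectable `run` parameter are assumptions made for illustration and are not part of Piper.

```python
import subprocess


def synthesize(text, model, output_file, config=None,
               piper_bin="./piper/piper", run=subprocess.run):
    """Send text to a Piper process on standard input and write a WAV file.

    `run` defaults to subprocess.run; it can be swapped out so the function
    can be exercised on a machine without Piper installed.
    """
    command = [piper_bin, "--model", model, "--output_file", output_file]
    if config is not None:
        command += ["--config", config]
    # check=True raises CalledProcessError if Piper exits with an error.
    run(command, input=text.encode("utf-8"), check=True)
    return output_file


# Example call (needs the Piper binary and a downloaded voice):
# synthesize("Welcome to the world of speech synthesis!",
#            "en_GB-alba-medium.onnx", "welcome.wav",
#            config="en_GB-alba-medium.onnx.json")
```

Note that every call starts a new process and loads the model again, so this is simple but not the fastest option for many short phrases.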

@baolitoumu 10 months ago

great !

@ThorstenMueller 10 months ago

Thanks 😊.

@andreasrobert4085 1 year ago

Absolutely terrific! Not only your video, but especially Piper. It's super easy to include in Python code. (For me it took a while to find out how.) To my ear, the thorsten-medium voice runs a little too fast (word to word), and I'd still like to know how to slow it down.

@ThorstenMueller 1 year ago

Thanks for your feedback :-). Maybe you can post-process the output with ffmpeg or pydub to reduce the speed. Might this be an option? Btw., I'm training a thorsten-high model at the moment.
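For the ffmpeg route, one option is ffmpeg's `atempo` audio filter, which changes playback speed without shifting the pitch. The sketch below only builds the command line, assuming ffmpeg is installed and on the PATH; the helper name and the 0.9 default are illustrative guesses and not a tested recommendation for the Thorsten voice.

```python
def ffmpeg_slowdown_command(input_wav, output_wav, tempo=0.9):
    """Build an ffmpeg command that changes speed without shifting pitch.

    tempo < 1.0 slows speech down, tempo > 1.0 speeds it up.
    The value is restricted to 0.5-2.0 here to stay in a conservative range.
    """
    if not 0.5 <= tempo <= 2.0:
        raise ValueError("tempo must be between 0.5 and 2.0")
    return ["ffmpeg", "-y", "-i", input_wav,
            "-filter:a", f"atempo={tempo}", output_wav]


# The resulting list can be passed to subprocess.run(..., check=True).
print(" ".join(ffmpeg_slowdown_command("welcome.wav", "welcome_slow.wav")))
```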

@urishmueli284 1 year ago

Hey andreas! I am actually struggling quite a bit with including Piper in Python. I can run the executable no problem, but I would like to incorporate it into my Python program so that the model loads only once (at startup) and stays in memory. From your experience, is this indeed possible? If so, any chance of a short explanation? Thanks a ton :) Uri

@cyclicalobsessive 1 year ago

"it took a while to find out how" Please post a link to an example. My robot would like this. If the delay is too long, I'm thinking of searching a folder of past TTS output and only calling Piper if the phrase has not already been generated.

@andreasrobert4085 1 year ago

@@urishmueli284 kzbin.info/www/bejne/aKKmhatshKumb5I

@andreasrobert4085 1 year ago

@@cyclicalobsessive kzbin.info/www/bejne/aKKmhatshKumb5I
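The caching idea raised by @cyclicalobsessive above (look in a folder of past output and only call Piper for phrases that have not been generated yet) can be sketched with a hash of the text as the file name. Everything here is an illustrative assumption: the function names, the folder layout, and the `synthesize(text, path)` callback, which stands in for whatever actually invokes Piper.

```python
import hashlib
from pathlib import Path


def cached_wav_path(text, cache_dir):
    """Map a phrase to a stable file name inside cache_dir."""
    digest = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{digest}.wav"


def speak_cached(text, cache_dir, synthesize):
    """Return a WAV path for text, calling synthesize only on a cache miss.

    `synthesize(text, path)` must write the audio for `text` to `path`.
    """
    path = cached_wav_path(text, cache_dir)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        synthesize(text, str(path))
    return path
```

A robot that repeats a small set of phrases would pay the synthesis cost once per phrase and simply play the cached file afterwards.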