Hello, I'd like to train the model on a very large dataset (LibriSpeech, which is over 100 GB). How could I do that in the cloud? I think streaming is complicated; I didn't understand the original training code at all...
@RaspiAudioКүн бұрын
If your audio files are already transcribed to text, you just need to put them in the right format; otherwise, run Whisper on them. I was thinking of making a video on doing this in the cloud, but training on 100 GB will be very expensive!
@lullu34678 сағат бұрын
@@RaspiAudio I'd be happy to fund this. Would you be willing to train a multilingual, multispeaker model (with a language token; I noticed the model struggles with cross-lingual synthesis...)? Do you have a contact?
Good video, thanks for the contribution. One quick question: this is used when you want to add a new language, but suppose you want to use it for voice cloning; how will it work?
@SyamsQbattar9 күн бұрын
Unfortunately, it does not support Indonesian language.
@RaspiAudio8 күн бұрын
Find large audiobooks or audio files totalling at least 10 hours in your language and train on them.
@circuitdotlt9 күн бұрын
how do you add a speaker to the server? Mine did not appear.
@jonathanoostenbrink678311 күн бұрын
I get in my info: transcribe complete samples : 0 path : C:\F5-TTS\F5-TTS\src\f5_tts\..\..\data\my_speak_char\wavs error files : 5
@RaspiAudio10 күн бұрын
your path is wrong
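For readers hitting the same "samples : 0" report, here is a minimal sketch for checking which folder a path like the one above actually points to and whether it contains any audio. It assumes the transcription step expects `.wav` files directly inside the `wavs` folder; the helper names `normalize_windows_path` and `count_wavs` are hypothetical and not part of F5-TTS.

```python
import ntpath
from pathlib import Path


def normalize_windows_path(raw: str) -> str:
    """Collapse '..' segments in a Windows-style path without touching the disk."""
    # ntpath applies Windows path rules on any operating system.
    return ntpath.normpath(raw)


def count_wavs(folder: str) -> int:
    """Count .wav files directly inside folder; return 0 if the folder is missing."""
    root = Path(folder)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.iterdir() if p.is_file() and p.suffix.lower() == ".wav")


if __name__ == "__main__":
    # Path copied from the comment above.
    reported = r"C:\F5-TTS\F5-TTS\src\f5_tts\..\..\data\my_speak_char\wavs"
    print(normalize_windows_path(reported))
```

Running `count_wavs` on the normalized folder on the machine in question shows whether the audio files are where the tool is looking; a result of 0 suggests the files sit in a different folder.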
@mulagraphics15 күн бұрын
Don't waste your time. F5-TTS is horrible, I'm sorry.
@RaspiAudio15 күн бұрын
@@mulagraphics it's not, what else do you recommend?
@bomar92014 күн бұрын
Actually, I trained a new language with under 2 hours of data. It's very good 👍. I don't know which script could do that.
@christopherandrew172014 күн бұрын
@@bomar920 Which language did you use? Is it single-speaker or multi-speaker?
@atomeprod18 күн бұрын
Hello Mr. Raspiaudio, and many thanks for this great explanatory video. I believe you are the only one offering a French model (for now, so you are the FIRST). I'm writing to you in French because I think it's your native language, like mine ;) I'd like to test your model locally on my PC with Pinokio (and its e2-F5-TTS module), but I don't understand where to put the file Bernard_Reduced.py or how to configure things so that the model is usable. If you have time to answer, it would be extremely helpful...
@windowszone159318 күн бұрын
Actually, it's not a Mac, it's just a dummy: if it were a Mac, we would see the main board under the keyboard, but the gallium dissolves the aluminum and we saw it's hollow. Use logic!! 😇
@Pacifier122218 күн бұрын
Hi! I'm currently running a French training on the Mozilla corpus of 800k files. I've completed 20 of 40 epochs. I'll keep you posted. However, F5-TTS has some bugs. I had to create folders such as "french" when I had already created french_char.
@Pacifier122218 күн бұрын
J'ai aussi un sample de 8k de fichiers en quebecois pour être plus régional!
@RaspiAudio18 күн бұрын
@@Pacifier1222 it would be really cool if you could train starting from my checkpoint; that way we could combine our efforts rather than starting from scratch every time.
@Pacifier122218 күн бұрын
@@RaspiAudio Actually, I had already completed 20 epochs in the end. I decided to run another 20, because I found the intonation incorrect on certain words. I've already spent a week on it, so I'd really rather not start over.
@RaspiAudio18 күн бұрын
What hardware are you using?
@Pacifier122218 күн бұрын
@@RaspiAudio Nvidia 3090, AMD 5950X and 64 GB of RAM
@NineSevenPictures18 күн бұрын
Hello. Thank you for this very instructive video, not to mention that accent from back home. ;-)
@RaspiAudio18 күн бұрын
Link updated. In the latest version of F5-TTS, select "custom" in the web interface and enter these paths: MODEL_CKPT: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/model_last_reduced.pt VOCAB_FILE: hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/vocab.txt
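As an illustration of how the two `hf://` paths above break down, here is a minimal sketch that splits such a path into a Hugging Face repository id and a file name. It assumes the layout `hf://<owner>/<repo>/<file>`; the helper name `split_hf_uri` is hypothetical, and this is not necessarily how the F5-TTS web interface resolves these paths internally.

```python
from typing import Tuple


def split_hf_uri(uri: str) -> Tuple[str, str]:
    """Split 'hf://<owner>/<repo>/<file>' into ('<owner>/<repo>', '<file>')."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// path: {uri!r}")
    # Split into owner, repo and the remaining file path (which may contain slashes).
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://<owner>/<repo>/<file>, got {uri!r}")
    owner, repo, filename = parts
    return f"{owner}/{repo}", filename


if __name__ == "__main__":
    # Paths copied from the comment above.
    for uri in (
        "hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/model_last_reduced.pt",
        "hf://RASPIAUDIO/F5-French-MixedSpeakers-reduced/vocab.txt",
    ):
        print(split_hf_uri(uri))
```

The two pieces match the `repo_id` and `filename` arguments that `hf_hub_download` in the `huggingface_hub` package takes, should someone prefer to fetch the files manually.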
@NineSevenPictures8 күн бұрын
@@RaspiAudio Thank you very much.
@AndrasEliassen20 күн бұрын
Thank you for this video - very informative! I laughed so hard at the mistake: "stupid female voice" 🤣but I think it's probably safe from the "Internet police" 🚔 I will use your tutorial to see if I can train a new language with this tech 👍
@petrkolacek895820 күн бұрын
Thank you. Your video helped me a lot. I previously tried to train a language from scratch and was not successful, so I'll try your guide.
@32112358020 күн бұрын
What computer specifications are required to train a model?
@RaspiAudio20 күн бұрын
I'm using an RTX 4090, but I would like to make a Google Colab so anyone could train in the cloud on a pay-per-use basis.
@32112358020 күн бұрын
@RaspiAudio OK thanks
@mauricio958120 күн бұрын
Great video and great explanation! I hope you do more tutorials like these in the future :) Would you say F5 is the best Open Source TTS in the market?
@RaspiAudio20 күн бұрын
I think so, as it is a bit more flexible than XTTS for adding different tones. By the way, I'm not associated with the F5-TTS team, just a random guy trying to find a good TTS.
@par71al21 күн бұрын
Be honest, who's here after the Corridor video?
@Bryzerse21 күн бұрын
Corridor sent me - but whyyyy???????????
@BongoBaggins21 күн бұрын
Stop saying what.
@normioffi23 күн бұрын
Native French speaker?
@RaspiAudio23 күн бұрын
@@normioffi yes, yes
@normioffi23 күн бұрын
That's great
@TheMame8224 күн бұрын
Thank you for this work. It seems your result is closer to zero-shot voice cloning than the one Jarod trained in his video tutorial (he used ~10 hours from a single speaker). Just to get it right: were the 80k samples you used all from the same reader (single speaker)? This would mean:
1) few hours, single speaker --> model speaks the new language, but only for the reference speaker from the training data
2) many hours, single speaker --> model generalizes the new language (zero-shot capability)
3) many hours, multi-speaker, multi-language (as for the base model) --> proper voice cloning, code switching within a single text
@RaspiAudio23 күн бұрын
@@TheMame82 it's hard to draw conclusions at this point, as there is not enough data. After training on 80k samples from one speaker for consistent learning, I'm now fine-tuning with 90k samples from multiple speakers, hoping that it will help with zero-shot flexibility. I will publish the results.
@cyberbol24 күн бұрын
How long do I need to record my voice? What do you think? What is the minimum amount of training data?
@RaspiAudio24 күн бұрын
@@cyberbol the reference recording (the voice to clone) can be very short, around 10 s. But if you need to train a new language, I think you will need at least 20 hours of audio.
@cyberbol24 күн бұрын
@@RaspiAudio Oh, yes, I want to train. Thank you. The problem with cloning is that it doesn't work for languages other than English and Chinese. I want to use Polish, so I don't have an option; I think I need to train a model.
@johnscottguggenheimeАй бұрын
Most impressive is understanding that accent ^^
@1will2000will1Ай бұрын
Person 2 comes back in the room: "I stole all of the laptops internals too!"
@exthus143Ай бұрын
what? What? What? WHAT? 0:34
@InCoGNITOXX595Ай бұрын
This cannot be gallium my friend. Gallium needs at least 38 degrees Celsius to become a liquid...
@tonytwostep_Ай бұрын
Excellent option. LMS is the best open-source multi-room solution, but a dedicated Pi or TV box as the Squeezelite client has always been tough to justify keeping powered on 24/7; with a microcontroller, I feel the cost would be worth it. Does this board have an audio out port, or does it only use a specific pinout for audio (quality would be poor)?
@LilybellfanАй бұрын
Being an android and windows fan this video literally satisfies me 😅🤣🤣
@andrewyeates6811Ай бұрын
Bro how long does it take you to make a coffee 😂
@keenanandchasemonstertruck1026Ай бұрын
you are pure evil
@acizko922 ай бұрын
People: I hate Apple Gallium: I can help you with aluminium Titanium: Hold my beer
@3randomidiotsonyoutube2 ай бұрын
You broke a Mac for only 211,414 views and 4.2K likes? Liking and watching now
@tamildesan8372 ай бұрын
Thanks for sharing the configuration. Recently I made one using an ESP32 along with a DAC. It also plays very nicely.
@james_ye4h17772 ай бұрын
is a *NEEOOOOOOOOOoo* a meme?
@okra30002 ай бұрын
She's Ai generated lol
@lucaswells33552 ай бұрын
o
@marcuscarana92402 ай бұрын
Oh, it's just a MacBook casing.
@im_no_mad_official3 ай бұрын
Someone will do this to their worse. Make someone buy a new pricey mac, spill that in the box and leave for good.
@Mscool_123 ай бұрын
Noooooooooooooooooooooooooo
@bylescx3 ай бұрын
"Ever wondered how to elevate your raspberry pi experience?". I don't think 12 year olds on tiktok are your target audience so drop the crappy vertical video :/
@RaspiAudio3 ай бұрын
@@bylescx we love trying new things! But you are probably right,😊
@bylescx3 ай бұрын
@@RaspiAudio fair enough! I found the video on your website and was interested in the product, but I clicked back as soon as I saw that; it was a real put-off. Sorry!!
@febianabisya53533 ай бұрын
where is the engine though
@UnknownUserSet4 ай бұрын
Did You Know That? Gallium Can Break Down Metals. (It Doesn't React On Your Skin)
@TheFallensChannel4 ай бұрын
Use windows
@Glorious_Yell0w4 ай бұрын
Ohhh whaaaat, it ate all the insides too and made 'em disappear into thin air. Chicken shit not to use a real laptop instead of the upper cover.
@dingusboi70454 ай бұрын
now it's even more recyclable
@CNChillguy4 ай бұрын
"It's just a prank bro" The prank:
@ompoddar41204 ай бұрын
Aw jeez, man, I accidentally threw gallium on my MacBook, what can I even do right now..???
@NightcoreEmpire4 ай бұрын
looks like it's a fake macbook, there's nothing in it, you didn't turn it on either