"I don't want to train my own custom voice." I want a cool, reliable male voice to read my textbooks or novels, preferably running on my local LAN and accessible via a TTS server for Android.
@davidtindell950 3 months ago
New Subscriber. Thank You for TTS Tutorial Vids!
@swannschilling474 5 months ago
So nice to have you back! 😊
@gab9847 5 months ago
Wow, this EmoCTRL is just what I need
@megamayo2500 5 months ago
To be fair, Bark did the whole emotion TTS thing a long time ago. I still consider Bark the best for emotion TTS. The problem is that no one uses Bark, even though there's so much potential there. I think the issue is the transformer it was built around, which frequently gives random results. This should be considered good progress, as an unexpected emotion is better than a predictable emotion.
@NoidoDev 5 months ago
Suno is built on Bark. At least I remember having read that.
@mactheo2574 5 months ago
Besides the random results that do not follow the text, we can't control Bark's generation either. EmoCtrlTTS is absolutely amazing. It's like having ControlNet (for Stable Diffusion), but for voice generation. Being able to generate laughter and emotion in a controlled manner is crazy.
@NoidoDev 5 months ago
@mactheo2574 Thanks, I'll look into it. I haven't had time to watch the video yet.
@mactheo2574 5 months ago
@NoidoDev Unfortunately, EmoCtrlTTS is made by Microsoft and is closed source, with no plans for release. The video mentioned that. But I'm sure there will be open-source alternatives in a couple of years at most.
@Jarods_Journey 5 months ago
Bark was great, though it had a different use case for sure. Unfortunately, it was quite unstable and the quality wasn't great in many cases. I think something like emo-ctrl TTS will mostly be used for redubbing as opposed to TTS. The ability to do this would be great.
@agenticmark 5 months ago
Vall-E won't be released, so it can't be verified and we will never be able to use it.
@Jarods_Journey 5 months ago
For the original system, true. I do believe the reproductions are faithful to the paper, though, so we'll have to see.
@gregorymccollum9107 5 months ago
Thanks for keeping us updated on TTS software. Nicely done. 😀
@rplgrime8006 5 months ago
Thanks for the update!
@salmon_enjoyer 5 months ago
Could you make a video on how to use Luna Hook instead of Textractor for visual novels?
@donmarshal2070 5 months ago
Can you convert an AI voice into SAPI5 so you can make it the default Windows voice? If you have a procedure, let me know 🤗
@Jarods_Journey 5 months ago
Unfortunately, not that I'm aware of.
@sownheard 5 months ago
:D Yeah, I'd love to learn more
@andreaaaaaaa574 5 months ago
Does anyone know of a FREE voice trainer that still works? MANGIO does not work anymore.
@manymen314 5 months ago
Totally unrelated to your video :D but I've trained a model by fine-tuning XTTS, and I'm happy with the results: it mimics how the original speaks. Now I've been trying to run RVC over the audio generated by XTTS to change the voice, but it keeps changing the accent and how words are pronounced. Am I doing something wrong? I just want to change the voice. Is RVC the wrong thing to use?
@agenticmark 5 months ago
You need to fine-tune your RVC model on the wav files first. I also use XTTS for realistic results: XTTS -> RVC -> out.
@manymen314 5 months ago
@agenticmark The issue is that my RVC model's voice is different from the voice generated by XTTS. I'm currently using models I've downloaded from voicemodel.
@Jarods_Journey 5 months ago
Well, you need to use the same wav files you used to train XTTS to train an RVC model. If you just download an RVC model from online and try converting with it, you're going to lose many aspects of the original XTTS voice.
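One way to sanity-check that advice before training is to confirm that the RVC dataset folder holds the same wav files as the XTTS fine-tuning folder. The snippet below is only a minimal sketch of that check: the function names and the idea of two flat folders of `.wav` files are assumptions made for illustration, and it does not call XTTS or RVC at all. It compares files by content hash, so renamed copies still count as the same clip.

```python
import hashlib
from pathlib import Path


def wav_hashes(folder):
    """Map SHA-256 content hash -> file name for every .wav file in folder."""
    hashes = {}
    for path in sorted(Path(folder).glob("*.wav")):
        hashes[hashlib.sha256(path.read_bytes()).hexdigest()] = path.name
    return hashes


def compare_datasets(xtts_dir, rvc_dir):
    """Return (only_in_xtts, only_in_rvc) as sorted lists of file names.

    Both folder arguments are hypothetical dataset locations; two empty
    lists mean the two training sets contain identical audio.
    """
    xtts = wav_hashes(xtts_dir)
    rvc = wav_hashes(rvc_dir)
    only_in_xtts = sorted(name for h, name in xtts.items() if h not in rvc)
    only_in_rvc = sorted(name for h, name in rvc.items() if h not in xtts)
    return only_in_xtts, only_in_rvc
```

If either returned list is non-empty, the two models were trained on different audio, which is the situation described in this thread.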
@Cocina_animal 5 months ago
What do you say about WaveGlow and Tacotron 2?
@Jarods_Journey 5 months ago
Well, those are old papers. A lot of newer stuff uses them as a reference and has higher-quality output.
@schmutz06 5 months ago
I found the woosh effects extremely distracting. Thank you for the video, I recommend removing those transition sound effects!
@Jarods_Journey 5 months ago
Glad you noticed this, I rewatched and I also found it quite distracting, which is something I usually leave out 😅. Feedback accepted :)!
@schmutz06 5 months ago
@Jarods_Journey Yep, no hard feelings at all. Your video is very helpful. I've been playing with L3.1 8b locally, and I've got my own mic > LLM > TTS setup working, with mid-sentence interruptions and so on. I have it 'commentating' on a Breakout-style game, evaluating in-game performance and producing original feedback for the player. It's so much fun, and I've barely started toying with it. I'm not a coder, but the fundamentals make sense to me, and I have been able to make anything I want using Claude 3.5 Sonnet and GPT-4o as 'copilots'. I think that has removed two huge barriers: 1. having to know the right syntax, and 2. knowing the 'art of the possible'. Those two barriers stopped me from ever wanting to dig into coding before LLMs. I'm really having fun with it.
I've been using gTTS, a local TTS (which uses the basic Microsoft voices), and ElevenLabs. I'm looking all over for the most cutting-edge and performant local TTS options. I'm running a 3080 Ti and will absolutely grab whatever Nvidia GPU comes next (5xxx), because the prospect of locally running accelerated, performant 70b-and-better models and doing all sorts of stuff is the most exciting thing in a long time.
With your video, I think I wanted to hear many samples cleanly, with 'silence' in between, so I'd have a few seconds to process and reflect. The woosh just plugged that gap and made it feel congested! I'm also mindful that complaining about 'free' YouTube content, particularly when you've clearly made this to HELP, is a sensitive game... but in the funniest way, we are so spoilt for choice and resources these days that it brings a strange sense of 'entitlement' to flag issues. No hard feelings again! I'm absolutely grateful for this contribution and for pulling the video together.
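The "mid-sentence interruptions" part of a mic > LLM > TTS loop can be sketched roughly as follows. This is not the commenter's actual setup, just one hypothetical way to do it: the text is spoken in small chunks, and a `threading.Event` (which a microphone listener thread would set when the user starts talking) is checked before each chunk. `speak_interruptibly` and the `speak_chunk` callback are made-up names; in practice `speak_chunk` would wrap whichever TTS engine is in use.

```python
import threading


def speak_interruptibly(text, speak_chunk, stop_event):
    """Speak text one word at a time, stopping once stop_event is set.

    speak_chunk: hypothetical callback that sends one chunk to a TTS engine.
    stop_event:  threading.Event set by another thread (e.g. a mic listener).
    Returns the list of words that were actually spoken.
    """
    spoken = []
    for word in text.split():
        # Check for an interruption before starting each new chunk.
        if stop_event.is_set():
            break
        speak_chunk(word)
        spoken.append(word)
    return spoken
```

Word-sized chunks keep the sketch simple. A real setup would more likely chunk by phrase or sentence, and would also need to cut off audio that is already playing.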
@Ravisidharthan 4 months ago
Man, give me a professional-sounding TTS... offline... pls....