A New State-of-the-Art Text-to-Speech Program - Install & Testing MaskGCT

3,634 views

Jarods Journey

1 day ago

Links referenced in the video:
MaskGCT Demo Page - maskgct.github...
Phonemizer fix - github.com/ope...
Laptop that I use:
Lenovo Thinkpad P1 Gen 7 - bit.ly/4haIt46
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/430bIhy
PC Case - amzn.to/447499T
Motherboard - amzn.to/3CziMXI
Alternative prebuilds to my PC:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest recommended PC:
Cyberpower 3060 - amzn.to/3XjtZoP
Come join The Learning Journey!
Discord - / discord
Github - github.com/Jar...
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoff...

Comments: 47
@Jarods_Journey 5 days ago
Timestamps:
2:00 - Running Through Install
11:05 - Start of Testing the TTS
@dathuynh-l4k 5 days ago
Can a GTX 1660 run this program?
@SuperSigma69 4 days ago
Brokie get to work n get a 4090
@rons96 4 days ago
For MaskGCT I found that the best results come with short reference audio. So if you want multiple emotions in one audio, generate each emotion separately, then merge all the chunks. Your reference audio should be around 3 seconds and needs to carry only one emotion.
@Jarods_Journey 4 days ago
Awesome, thanks for the recommendation!
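The "generate each emotion separately, then merge" tip above can be sketched with Python's standard `wave` module. This is only a minimal illustration, not part of MaskGCT: the function name `merge_wav_chunks` and the file names in the usage note are hypothetical, and it assumes every chunk was saved as an uncompressed WAV with the same channel count, sample width, and sample rate.

```python
import wave


def merge_wav_chunks(chunk_paths, output_path):
    """Concatenate WAV files that share one audio format into a single file."""
    if not chunk_paths:
        raise ValueError("need at least one chunk")

    fmt = None      # (channels, sample width in bytes, sample rate)
    frames = []     # raw audio bytes from each chunk, in order

    for path in chunk_paths:
        with wave.open(path, "rb") as chunk:
            this_fmt = (
                chunk.getnchannels(),
                chunk.getsampwidth(),
                chunk.getframerate(),
            )
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                # Mixing formats would produce garbled audio, so stop early.
                raise ValueError(f"{path} has format {this_fmt}, expected {fmt}")
            frames.append(chunk.readframes(chunk.getnframes()))

    with wave.open(output_path, "wb") as out:
        out.setnchannels(fmt[0])
        out.setsampwidth(fmt[1])
        out.setframerate(fmt[2])
        for data in frames:
            out.writeframes(data)

    return output_path
```

A call might look like `merge_wav_chunks(["happy.wav", "sad.wav"], "merged.wav")`, where each input was generated from its own short, single-emotion reference clip.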
@StefanReich 5 days ago
Ah, these installation troubles. The hardest thing in computing today: Making pieces of software find the files they need. (I've tried mbrola etc too...)
@Jarods_Journey 2 days ago
90% of my time especially when installing things from GitHub 😅
@SyntheticVoices 1 day ago
Can't wait for fine-tuning. I've heard ChatGPT can generate training code if you provide the paper and code details.
@Jarods_Journey 1 day ago
Mmph... It's quite a bit more involved than that. You still have to architect it, and it will fail to generate code that completely works unless you know what to tweak.
@4.0.4 4 days ago
You've been my recent source of TTS news! Really neglected is voice recognition. Did anything dethrone Whisper for Japanese?
@fulldivemedia 1 day ago
Thanks again for the great work, love your channel
@JustRayen 5 days ago
16:53 Bro sounded like he manifested the voices in his head for a second XD
@Jarods_Journey 5 days ago
FR, or possibly, the model is trying to send us a subliminal message 😅
@megaaziib 4 days ago
The quality of the cloning is insanely good, but the VRAM usage is insane too. It's using 15 GB of VRAM on my RTX 4060 Ti. The generation speed is quite fast: 8-13 seconds.
@Jarods_Journey 4 days ago
Yeah, the VRAM usage is prohibitively large, unfortunately. If it were 11 GB, it might be more attractive.
@xbon1 1 day ago
We need a one-click Windows installer... but thank you for making the first tutorial on how to do this on Windows. I might actually try this now. Edit: oh god, this is way more complex than I thought with the VS Code edits. Any chance you can publish the edited files and a full tutorial somewhere?
@dhillaz 4 days ago
1:00 sounds like the narrator of youtube channel "Branch Education" (the one that does CGI teardowns of various modern technology)
@flykiller 5 days ago
Thanks again. If GPT-SoVITS weren't so undertrained (5k hours total for all 5 languages), it would probably be the best local model. It is great, but it struggles to pronounce difficult words or made-up ones like sci-fi alien words. Did you try training GPT-SoVITS further, as you did with F5?
@Jarods_Journey 5 days ago
I did do light finetuning of GPT-SoVITS and it's one of the better options out there for dialogue IMO. Not as much training as F5 though
@jasonstetsonofficial 5 days ago
Much better cloning of voice and audio quality
@rawkeh 3 days ago
Do you happen to know of any text-to-voice app or method that can blend voices? Like 50% Voice A, 30% Voice B, 20% Voice C? I want a voice that is not just a clone of someone's voice.
@dustinsuburbia 5 days ago
The style transfer could be useful if it is higher quality than RVC for generating emotion samples for F5.
@SyamsQbattar 17 hours ago
What is the difference between Amphion and F5-TTS?
@AnasKhan-r8f8m 1 hour ago
I am still waiting for your tutorial on training GPT-SoVITS in a new language.
@speergz6452 5 days ago
This might be a dumb question, but I remember someone (may have been Jarod) playing around with TTS software that would default to a random voice, or at least had that capability when not used for voice cloning. I'm trying to find TTS software that can do random voices rather than cloning, but I haven't found anything like that yet.
@BeyondPC 4 days ago
TortoiseTTS can do that.
@chronicallychill9979 4 days ago
How much time did you waste trying to get this whole thing installed? Could have just made a VM or container, no?
@Jarods_Journey 4 days ago
Hmm, probably like 2-3 hours just chasing packages around, mostly trying to get mbrola to work, to no avail, which is why I just skip it lol. It would've gone much smoother if I had just installed it in WSL2, but I wanted to see what was wrong with Windows.
@stonedape4538 5 days ago
Looking forward to finetuning it in another language 😀 Hope it will be faster than f5.
@JessicaSinclairDoomsday2024 5 days ago
Which free AI TTS sounds like a conversation instead of someone reading a book?
@py_man 4 days ago
NotebookLM
@jr-2nd 5 days ago
Still sad that there is no free Spanish TTS with that quality...
@John-zy8ut 2 days ago
Can you please show us how it's done with Google Colab or Hugging Face?
@Random_person_07 5 days ago
This is awesome!
@ItayHilel 5 days ago
real time? finetuning new language?
@GraveUypo 5 days ago
so would you rank the offline options nowadays?
@Jarods_Journey 5 days ago
Will be ranking the Top 5 in December for local options
@YannMetalhead 3 days ago
Good video.
@coc1841 5 days ago
Not too bad!
@BBZ101 5 days ago
Is an RTX 3060 enough for this model?
@Jarods_Journey 5 days ago
I don't think so, since it maxed out at around 18 GB on inference in the video (22 GB was shown, but 4 GB was already in use, so 18 GB is just for MaskGCT).
@BBZ101 5 days ago
@Jarods_Journey thank you 😊
@vixxcelacea2778 4 days ago
Oh man, same here. I hope they make a version of it that can still run. I don't care if it's slow to produce results. This is definitely one of the better ones I've heard.
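The VRAM arithmetic in this thread (22 GB shown, minus 4 GB already in use, leaves about 18 GB for MaskGCT) can be written out as a small check. This is a rough sketch only: the function names are hypothetical, the 18 GB figure is the one reading reported in the thread, and the card capacities in the usage note (12 GB and 24 GB) are illustrative assumptions rather than measurements.

```python
def model_vram_gb(peak_reading_gb, baseline_gb):
    """VRAM attributable to the model: peak reading minus what was already in use."""
    return peak_reading_gb - baseline_gb


def fits_on_card(card_total_gb, model_gb, baseline_gb=0.0):
    """True if the model's estimated VRAM fits in what the card has left."""
    return card_total_gb - baseline_gb >= model_gb


# Figures from the thread: 22 GB shown, 4 GB already used by other programs.
maskgct_gb = model_vram_gb(22, 4)
```

With these numbers, `fits_on_card(12, maskgct_gb)` is false for a hypothetical 12 GB card, while `fits_on_card(24, maskgct_gb, baseline_gb=4)` is true for a hypothetical 24 GB card with 4 GB already occupied. Actual usage can vary with input length and settings, as the 15 GB report elsewhere in the comments suggests.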
@jonogrimmer6013 5 days ago
Change your duration
@Jarods_Journey 5 days ago
Hmm, probably