Voice Cloning For Any Language | Fine-Tuning Tortoise-TTS

Voice Cloning For Any Language | Fine-Tuning Tortoise-TTS | Part 2

Views: 2,840

Martin Thissen

1 day ago

Comments: 16

@samasai8860 2 months ago

Great work, Martin! Please make a video on how to fine-tune Tortoise TTS for our own voice.

@tempertephra 1 month ago

The step starting at 3:00, "Adjust Inference Code", is missing from the Google Colab link in the video description. Could you please share your tokenizer, since you have shared your other models as well?

@tempertephra 1 month ago

Edit: the tokenizer is in the GitHub fork.

@Dheekshith-e3l 5 months ago

Great work! But I have a question: where is the voice cloning part in this video? Is the speaker name that you gave fed to the model behind the scenes, or is that a pre-trained Tortoise TTS voice?

@omaribrahim5519 9 months ago

Bro, this is great! Please make a video on how to collect your own dataset for your language from public data.

@ZYJGO 8 months ago

Hi Martin, thank you for your great work! It really cleared up a lot of my confusion. I'd like to know what your final loss was. After training, my model speaks a completely incomprehensible language, and I didn't change any parameters.

@nickk1039 7 months ago

Hi @ZYJGO. Did you find a solution? I followed the same steps without any changes, and after 6,000 steps the model still speaks a strange mix of languages. I would appreciate it if you could share the reason.

@ZYJGO 7 months ago

@nickk1039 Yes, you simply need to train longer. Around 20,000 steps gives pretty good results. Hope it helps!

@nickk1039 7 months ago

@ZYJGO Thank you so much!

@iqrabatool1814 3 months ago

Hey @ZYJGO, @nickk1039. I trained it for 10,000 steps. For the first 5,000 steps the output sounded like German. After that it started to sound like an incomprehensible language. Isn't that overfitting? Should I continue training or change the dataset?

@ondatabletstore6116 7 months ago

Hello, my friend! In the settings, if I uncheck the "delete non final-output" option, the individual audio files sound bad while the large combined one sounds good. Is there a way to make the individual files sound good as well?