AI Voice Cloning for Singing with RVC - Guide and Set-up

Views: 281,973

1 year ago

Links referenced in the video:
RVC Github - github.com/RVC-Project/Retrie...
Curate and Record Data Samples - • Complete Guide: AI Voi...
Download UVR - • Complete Guide: AI Voi...
Come join The Learning Journey!
Discord - / discord
Github - github.com/JarodMica
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoffee.com/jarodsjo...
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/3NBAsIq
PC Case - amzn.to/447499T
Mother Board - amzn.to/3CziMXI
Alternative prebuilds:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest and minimum specs recommended:
Cyberpower 3060 - amzn.to/3XjtZoP

Comments: 912

@TantuBeats 10 months ago

so much respect to everyone who is making this work.. the amount of problems I'm running into is insane, haha. I hardly know where to start after hours of being into this.

@luqmanhaqim97 1 year ago

Nice one, keep up the good work. Your instructions are very clear and helpful compared to others. 👍 ✨

@solm8212 5 months ago

thank you sooo much, all the other tutorials were so confusing and this was simple and fast, encountered some problems while running the rvc command prompt since i dont have a gpu, but i installed cuda and python and that fixed it. its like now you need to know programming and stuff but this tutorial was easy, fast and simple. keep up the good work.

@TheDailyMemesShow 10 months ago

I'm going crazy with Jarod's channel 😂 I'm that off the cliff with it that I'm running into rewatching old videos😂

@raykrislianggi 8 months ago

For those of you looking for the "weights" folder in the main RVC directory, as of RVC1006, it's inside the "assets" folder.

@pingusmcdingus5124 8 months ago

Nothing is placed here after training a model though. Do I manually copy the D_*.pth or G_*.pth over from logs, or something? If I try that and click Refresh Voice List and Index Path, the new model appears in the Inferencing Voice list, but when I select it I just see a red 'Error' all over the UI: i.imgur.com/QNQUpmq.png

@raykrislianggi 8 months ago

@@pingusmcdingus5124 In my case, the .pth file is placed there automatically if it successfully finished the training without any errors. If it's not the case for you, there might be something wrong in the middle of the process. You might want to try retracing the steps or redo it from scratch. The one thing I did differently from this video is that my audio file for training is not split up into multiple short .wav files, but I just combine them into a single 20-minute file. I've compared both the cut and uncut audio and the result is much better with the uncut 20-minute audio.

@realon 7 months ago

Thx for advice

@ohheyvoid 4 months ago

thanks! :)
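
Editor's sketch for the single-file approach mentioned in this thread: one possible way to join short clips into one long WAV using only Python's standard `wave` module. The function names `wav_format` and `concat_wavs` are made up for illustration and are not part of RVC. It assumes all clips are PCM WAV files with the same channel count, sample width, and sample rate (the `wave` module does not read 32-bit float WAVs).

```python
import wave
from pathlib import Path


def wav_format(path):
    """Return (channels, sample width in bytes, sample rate) of a PCM WAV file."""
    with wave.open(str(path), "rb") as src:
        return src.getnchannels(), src.getsampwidth(), src.getframerate()


def concat_wavs(clip_paths, out_path):
    """Join WAV clips end to end; returns the number of frames written."""
    clips = [Path(p) for p in clip_paths]
    if not clips:
        raise ValueError("no clips given")

    # First pass: make sure every clip shares the format of the first one.
    expected = wav_format(clips[0])
    for clip in clips[1:]:
        if wav_format(clip) != expected:
            raise ValueError(f"{clip} does not match format {expected}")

    # Second pass: copy the audio frames into the output file in order.
    total_frames = 0
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(expected[0])
        out.setsampwidth(expected[1])
        out.setframerate(expected[2])
        for clip in clips:
            with wave.open(str(clip), "rb") as src:
                out.writeframes(src.readframes(src.getnframes()))
                total_frames += src.getnframes()
    return total_frames
```

Whether one long file actually trains better than many short ones is this commenter's observation, not something the sketch demonstrates.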

@wektorus 1 year ago

Finally a tutorial that even I can understand. It's so stupid that most of the tutorials are made as if everyone were that tech savvy. Thank you so much.

@Jarods_Journey 1 year ago

Appreciate it 🤟🤟

@smokinmoose2 1 year ago

I wish i could say the same. I'm just a singer. I want a program that installs, I hit the .exe file, it opens, I put the source files in and voila, new voice. Don't know why that should be so hard.

@linuxtuxvolds5917 1 year ago

@@Jarods_Journey I can't stress enough how important it is to absolutely tell people that the training process will take a long time. I thought my progress was just stuck but no, it's just taking a long while!

@LovelyNyx7 1 year ago

@@linuxtuxvolds5917 I will wait as long as it takes. If it means I get to sound like someone's voice I really enjoy!

@paleguywithdonuts 11 months ago

@@Jarods_Journey it says "No supported Nvidia GPU found, use CPU instead" but it still opened

@arhythwrith 1 year ago

For those who would like to know about the harmony bit at 5:11: harmony is when more than one note is being sung at the same time. It's kinda like chords, but for vocals. HP5 helps with separating harmony, but the voice will be less clear compared to HP2. The newer RVC2 also has dereverb & deecho, which I also highly recommend using to make the vocal separation even clearer for songs where the voice has a lot of reverb / echo. I'd say just mess around with it a bit and choose to your liking depending on the song. Anyways, have a nice day :D

@Tarbard 1 year ago

Thanks for the videos, they are fascinating.

@stevecommand77 1 year ago

Well convinced after the preview. Hope you can have video on text to own vocal speech soon.😊

@matthewedwards904 1 year ago

@8:03 If your process fails when you try to process the input data, one possible explanation is that the path for your folder includes a space. That is what hung up my first couple of attempts. Make sure your file path doesn't include any spaces for easiest handling.

@Hestia3332 11 months ago

thank you! I took the spaces out of the song name and it worked for me!

@Primesky 11 months ago

Thank you m8

@ChaseEverything 10 months ago

Still not working for me. It says :(
['trainset preprocess_pipeline_print.py', 'C:\\RVC-beta-0528\\RVC- beta0717\\voice\\me', '40000', '12', 'C:\\RVC-beta-0528\\RVC- beta0717/logs/me', 'False']
C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc.
end preprocess
C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc.
end preprocess
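
Editor's sketch for the "no spaces in the path" tip in this thread: a small pre-flight check you could run on a folder path before pasting it into the web UI. The helper name `parts_with_spaces` and the example paths are hypothetical; nothing here is part of RVC.

```python
from pathlib import PureWindowsPath


def parts_with_spaces(path):
    """Return the folder or file names in a Windows path that contain a space."""
    return [part for part in PureWindowsPath(path).parts if " " in part]


# Hypothetical paths: the first one would need a folder renamed.
print(parts_with_spaces(r"C:\My Stuff\voice\me"))
print(parts_with_spaces(r"C:\RVC\voice\me"))
```

An empty list means no component contains a space. It says nothing about other causes of a failed preprocess.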

@ScorgeRudess 11 months ago

Dude, you are amazing! Thanks for your great work!

@obamabinbiden9762 1 year ago

This worked perfectly. Thank you.

@SplicerTv 11 months ago

Thanks for the great tutorial! I found a couple of things that might be helpful to others. For extracting the archive I use the official 7Zip software; it's free and open source and will save you some hassle. The next thing is regarding the batch size. I have a 3090ti, which has 24GB of VRAM. I find a value of 32 makes use of 21.7GB of the VRAM and leaves a bit for OS-related stuff. You don't want to go overboard with a batch size of 40, or the GPU will start swapping to system RAM and significantly affect the time it takes to train. Even if you have fast RAM, it's still an I/O cycle between GPU and system RAM that you can avoid. I recommend looking at Task Manager or using a tool like nvidia-smi to check the GPU VRAM use, and experimenting with batch size to find the best value for your card in order to get much faster training.
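
Editor's sketch for the nvidia-smi suggestion above: one way to read VRAM use from Python while a training run is going. It relies on nvidia-smi's `--query-gpu` and `--format=csv,noheader,nounits` options; the function names `parse_vram` and `read_vram` are made up for illustration, and `read_vram` only works on a machine where `nvidia-smi` is on the PATH.

```python
import subprocess

# Ask nvidia-smi for used and total memory (in MiB) as bare CSV, one line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_text):
    """Turn lines like '21700, 24564' into a list of (used_mib, total_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_vram():
    """Run nvidia-smi once and return the parsed memory figures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout)
```

Calling `read_vram()` at a few different batch sizes gives a rough idea of how close each one gets to the card's limit.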

@JeanIbarz 7 months ago

Thanks for sharing ! Small tip: using cut/paste instead of copy/paste allows moving the folder instantaneously ;)

@LucasMarak 1 year ago

RVC is best for me thanks Jarod take care

@RobertJene 1 year ago

12:42
1. Open File Explorer to the folder that has the file whose path you want
2. Press Alt+D
3. Press End
4. Type a backslash \
5. Start typing the name of the file, look for the autocomplete with the correct name, press down arrow until the correct file is highlighted
6. Press Ctrl+C

@Optimus97 1 year ago

Or you could Shift-Rightclick to unhide "Copy As Path" option

@RobertJene 1 year ago

@@Optimus97 I prefer to use the mouse as little as possible

@fluffsquirrel 1 year ago

@@RobertJene I can kinda see what you're saying, especially with the delay of the context menu in Windows 10/11.

@RobertJene 1 year ago

@@fluffsquirrel any keyboard sequence you do will save time not reaching for the mouse

@fluffsquirrel 1 year ago

@@RobertJene I think this is generally true, although the less sequences the better, if possible.

@the3fe245 1 year ago

thanks mate, all of the other people i looked up as tutorials were too complicated, a month ago i viewed your so vits svc fork tutorial too, you are one of the best teachers in the world, i can understand your videos perfectly and my native language isnt even english!

@gabrielmorgan3369 1 year ago

For those who are having trouble choosing where the download goes you can right click it and choose save link as

@321Engage28 1 year ago

It worked. Thanks so much!

@shaysilver203 1 year ago

Great one! Finally works!

@RobertJene 1 year ago

9:33 when I train embeddings for stable diffusion (image generation) I have it save an embedding file every 50 steps so I can check the loss and strength of them with scripts and test a few

@Jarods_Journey 1 year ago

I've been finding with these speech models that the intermediary saves don't really exhibit abilities better than the final model, so I really just save the last one only in order to save space. I haven't found one yet that has been overtrained.

@Nangel2 1 year ago

Thank you for taking the time to make this tutorial! It was so easy to follow. :) Could I ask you to make a comment or tutorial on how to re-train a previously trained voice? I can't find that information anywhere.

@Jarods_Journey 1 year ago

Let me know if this was what you were thinking about: kzbin.infoeO0gvi_RXTc?feature=share

@Nangel2 1 year ago

@@Jarods_Journey That's exactly what I was looking for, tysm!

@DPIConnor 1 year ago

oh my god this is so awesome

@warsin8641 1 year ago

This absolute legend amongst men

@animeui_es 1 year ago

Great job! I have a question for you... How much audio do you recommend for generating the model, and is it a problem if the audio has some background sound?

@Jarods_Journey 1 year ago

10 minutes or more of high quality audio. You need to split the background from the audio samples and can check my latest video on that

@nycdweller4287 1 year ago

Hi, thanks for your video. Are there already some pre-trained models for RVC? Also, is there a reason you prefer to train locally rather than on Colab?

@Jarods_Journey 1 year ago

I'm not sure about fully pre-trained models, you'll have to take a look around the internet to see. Colab is a nightmare to work with for debugging, etc and unless you made the code, trying to debug it isn't that fun. If I can work locally, I much prefer it and my hardware allows for it.

@darksydeflow 1 year ago

niiiice thank you for the video :D

@androidgameplays4every13 1 year ago

Thank you, thanks to your tutorial I finally succeeded at creating my own models! Even with only 4 GB of memory on my GTX 1650 Super.

@Jarods_Journey 1 year ago

Awesome! They do say that it can work on smaller amounts of VRAM so glad that this worked!

@schoodst6095 1 year ago

how did you get it to work on low ram? mine is eating up 6gb really quick and shuts down cause it run out, do I lower batch size?

@titrecords2294 1 year ago

Mine keeps running out of memory how did you do it? Please help

@schoodst6095 1 year ago

@@titrecords2294 lower the batch size, like a lot

@MohamedAdel-kw4hx 11 months ago

Thx , but I can't find pth file after training.

@krysidian 1 year ago

That was very nice to follow along, thanks! Any interest in showcasing bark ai? I think it's a pretty interesting way of doing tts, but I don't think it's very well explained in many places, and a lot gets left out, which kinda confused me. Especially when it comes to getting decent results. I do think the prompting idea is really intriguing though

@Jarods_Journey 1 year ago

My quick experience with bark is that it's still in very early stages, excited to see where it goes though! I might have to do a more thorough test of it, but tortoise tts by far is the most promising and easiest to use

@krysidian 1 year ago

@@Jarods_Journey That's definitely true. Tortoise is incredible! Really hope bark will update or get some cool successors with a similar but more stable approach. Making it generate laughs, sighs etc. is spooky and very fun.

@Jarods_Journey 1 year ago

@@krysidian I'm definitely interested in the laughing part. That's one additional touch to AI that is lacking in voices and when that gets fleshed out, things are gonna get interesting xD!

@Cyborg11 1 year ago

Thanks for your very good tutorial Jarod. I still have a question. What do the values "loss_disc", "loss_gen", "loss_fm", "loss_mel" and "loss_kl" mean when training? Which values are indicating a good trained model? Are lower values better?

@Jarods_Journey 1 year ago

A downward slope on the graph is better, or lower values. You wanna look for total loss and train till that's as low as possible preferably

@paarthsingh 1 year ago

can someone plz fix this error, jarod plz tell. This error: ValueError: invalid literal for int() with base 10: 'voice'. I get this error when I do process data; it's the step 2a error, when I put my local URL into the path folder

@Snackbarry 2 months ago

damn as a complete beginner coming to this channel to have it being explained like this was really..... interesting....

@michaelteuber7362 1 year ago

Thanks a lot for the video! One question: 40kHz is a pretty unusual sample rate, so I want to use 48kHz (which now also seems to work with v2). Also I slice up the training vocals manually with a DAW (Cubase) into up-to-10-second snippets. Do I have to export the snippets in 48 kHz already from the DAW, or would the usual 44.1 kHz be alright and only the output (the resulting file) would be in 48 kHz?

@Jarods_Journey 1 year ago

It'll be fine, I believe RVC resamples your audio already using ffmpeg to the correct SR. I actually haven't verified this, but since it handles my datasets when using either 40k or 48k, that means it doesnt really matter :)

@michaelteuber7362 1 year ago

@@Jarods_Journey Thanks for your fast reply! So there's a tiny bit of hope that if you feed it 48kHz already it might skip the resampling, which could probably result in higher quality output 🙂.
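
Editor's sketch for the resampling question in this thread: if you would rather control the conversion yourself than rely on whatever RVC does internally (which the reply above says is unverified), ffmpeg's `-ar` option sets the output sample rate. The helper `ffmpeg_resample_cmd` and the file names are hypothetical; the sketch only builds the command and does not run it.

```python
def ffmpeg_resample_cmd(src, dst, sample_rate=48000):
    """Build an ffmpeg command that resamples src to the given rate and writes dst."""
    # -y overwrites dst if it already exists; -ar sets the audio sample rate in Hz.
    return ["ffmpeg", "-y", "-i", str(src), "-ar", str(sample_rate), str(dst)]


# Hypothetical file names. To actually run it: subprocess.run(cmd, check=True)
cmd = ffmpeg_resample_cmd("snippet_44k.wav", "snippet_48k.wav")
print(" ".join(cmd))
```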

@joemmaama 1 year ago

For the voice me folder, is it just audio recordings of my own voice? if so how many do i need to include and what length? Thanks In advance youre a massive help dude

@Jarods_Journey 1 year ago

Yup, as shown, make sure the folder contains all of the audio files without subfolders. Then just use that path for those and you should be fine

@SirMato 1 year ago

bro tysm my brain could not process how to do that on my own

@ericleigh007 8 months ago

if you want to move the folder faster, just rename the top folder, then cut and paste the lower into the top-level. When you cut and paste the contents, explorer knows it only MOVES the folder, so no copy wait.

@denblindedjaligator5300 1 year ago

i have some questions. When I download other people's voice modules, there is a file called something like traint.index, it's a file you have to use. The same goes for total_fea. I have also seen that there are pth files in the log folder itself.

@Jarods_Journey 1 year ago

These should go into the log folder underneath the "experiment" or "speaker" name that you want to use. So if the name is john, the john.pth goes into weights and the index goes into the log/ where you have to create a john directory and place the index into.

@denblindedjaligator5300 1 year ago

@@Jarods_Journey but I mean the traint.index. And why is there a module in the log folder and a detail file

@321Engage28 1 year ago

Great tutorial! Unfortunately, I seem to be having a problem with step 2a: My attempt to process the data was unsuccessful, and the output message came up blank! What am I doing wrong?

@Yumegipsu 1 year ago

This happened to me too but it worked when I removed spaces from my experiment name. If it's not that then idk

@lalalala99661 7 months ago

Quick tip at 13:03: you can do Shift + right-click, then another menu pops up and you can click on "copy path" in the pop-up menu

@welachutmelexcel 9 months ago

Since i’m relatively new to this, how would you use this rvc for just cloning a voice? Do I just leave out the parts in model inference about the pitch and music related things?

@shep9194 1 year ago

Have you tried the realtime voice changing? Ive been trying to get that working but had some issues, i think its an svc fork though

@Jarods_Journey 1 year ago

Have not gotten to try that yet on either repos unfortunately :/

@Retro-zn2jt 1 year ago

Thanks for your video. There are a few other videos on the subject and I find that yours is better explained; nevertheless I still have to deal with several errors. First I had "Cuda Out of memory", so I lowered the batch to the minimum. Now I have another error: "RuntimeError GET was unable to find an engine to execute this computation". My audio samples are a bit long (a few minutes) and they are 32-bit float at 44.1 kHz, but I only have 4 samples... should I divide them into several parts? Thanks in advance. Edit v1: I tried many times, and also tried cutting them into different parts and reducing the size, and I still get the RuntimeError even with 2 small samples (16-bit, 44.1 kHz) of less than 10 seconds… I don't understand. Edit v2: Also I wonder if you know how to do text-to-speech with this tool?

@Jarods_Journey 1 year ago

You might have to reinstall or make sure the CUDA being installed is compatible with your GPU

@necrovolo 8 months ago

I'm having the same issue.

@AImusikindo 1 year ago

Thanks bro, from Australia

@Jarods_Journey 1 year ago

That's awesome, appreciate it!

@AImusikindo 1 year ago

@@Jarods_Journey i just made some cup of coffee for you lol

@Jarods_Journey 1 year ago

@@AImusikindo Haha thank you, each coffee keeps me going! 🤟🤟

@el-bicente 10 months ago

Thank you for this great tutorial! I was wondering if there was any tool to separate the vocals when they are different singers, because I want to apply several models. I can get clean vocals with UVR5, but I don't know what to do next. I tried to use whisperX but I think it's not really suitable for singing and overlapping voices...

@hariom2580 8 months ago

I have successfully trained a voice, but there is no index file in the voice name folder. In the weights folder the pth file is there. What to do... nice video

@Beary_TheBear 1 year ago

Hi, thanks for the tutorial. I got stuck at the training process. I received a message saying this: RuntimeError: The expanded size of the tensor (12800) must match the existing size (4040) at non-singleton dimension 1. Target sizes: [1, 12800]. Tensor sizes: [4040] Before I got this message, I was getting the "Cuda out of memory", even though I have 32GB of RAM. I cut the audio samples into smaller bits under 10 seconds, and now I have the expanded size of the tensor error. What did I do wrong?

@gabrielmorgan3369 1 year ago

same issue

@gabrielmorgan3369 1 year ago

it means that if it finishes it's going to take up too much space, so just turn the batch size down to fix it

@MrSix-1 5 months ago

CUDA memory is VRAM. It's different from regular RAM

@HyperbolicArachnid 11 months ago

Finally, a tutorial that doesn't fly 5 miles over my head

@VongolaChouko 8 months ago

Is 1000 epochs overkill? Will it have diminishing returns compared to just keeping it up to 300? I really don't see a standard recommended epoch total anywhere, the answer varies. I usually use 500, but I honestly don't know if that's fine since I just use RVC for SillyTavern and haven't tried it just on itself yet, hence I don't know how to evaluate if the results are better or not .___.

@SNYCHANNEL 1 year ago

Thank you for this video!! When I try to train I get this error: sr = int(sys.argv[2]) ValueError: invalid literal for int() with base 10: 'Yona\\Desktop\\RVC-beta\\RVC-beta-v2-0528\\voice\\Me' Do you know what I'm doing wrong?

@Jarods_Journey 1 year ago

RVC: Invalid Literal or File Not Found error
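
Editor's sketch of one plausible cause of the "invalid literal for int()" error quoted above. It assumes the command is built as a single unquoted string and split on whitespace, which is an assumption about how the script gets launched, not something confirmed in this thread. A space in the folder path then pushes part of the path into the slot where the sample rate is expected. `naive_argv`, `parse_sample_rate`, and both paths are made up for illustration.

```python
def naive_argv(script, trainset_dir, sample_rate):
    """Build a command as one unquoted string, then split it on whitespace."""
    command = f"{script} {trainset_dir} {sample_rate}"
    return command.split()


def parse_sample_rate(argv):
    """Mimic a script that reads the sample rate from argv[2]."""
    return int(argv[2])


# Hypothetical paths: one without spaces, one with a space in the user name.
good = naive_argv("preprocess.py", r"C:\RVC\voice\me", 40000)
bad = naive_argv("preprocess.py", r"C:\Users\Jane Doe\voice\me", 40000)

print(parse_sample_rate(good))
try:
    parse_sample_rate(bad)
except ValueError as err:
    # argv[2] is now the tail of the path, so int() rejects it.
    print(err)
```

If that is what is happening, moving the dataset to a path without spaces is the simplest way around it.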

@M4rt1nX 1 year ago

Those high notes though!!! We love local!!!!!!!!!!!!!!

@Jarods_Journey 1 year ago

Haha I wishhhhh xD. Local installation yields less issues, and is much easier to debug lol.

@andivax 1 year ago

Thank you very much! My Inferencing voice list is empty. Where to put the downloaded voice models? And epochs. Is it worth using 1000 epochs instead of 200 to increase the quality?

@Jarods_Journey 1 year ago

I believe downloaded voice models should go into the weights folder, as long as they're from RVC. As for epochs, if you get good results at 200, I don't see much reason to go to 1k. If you have enough voice samples, 200 should be relatively good. I would listen to them per 100 epochs and see what you think is best (as it's always dependent on your data and how much of it you have)

@EthanWinters176 11 months ago

If you can read this: .pth files go in the folder "weights" .index and others go to "logs" under the voice name ex: Logs\EthanWinters
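
Editor's sketch of the file placement described in this thread: the .pth goes into the weights folder and the .index goes into a folder named after the voice under logs. The function `install_model` is hypothetical and not part of RVC. It checks for an `assets/weights` folder first and falls back to `weights`, based on another comment here saying the folder moved under `assets` in newer builds.

```python
import shutil
from pathlib import Path


def install_model(rvc_root, voice_name, pth_file, index_file=None):
    """Copy a downloaded model into an RVC folder; returns the weights folder used."""
    root = Path(rvc_root)

    # Assumption: newer builds keep weights under assets/weights, older ones under weights.
    weights_dir = root / "assets" / "weights"
    if not weights_dir.is_dir():
        weights_dir = root / "weights"
    weights_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(pth_file, weights_dir / f"{voice_name}.pth")

    # The index file lives in logs/<voice_name>, created here if it is missing.
    if index_file is not None:
        log_dir = root / "logs" / voice_name
        log_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(index_file, log_dir / Path(index_file).name)
    return weights_dir
```

After copying, the model should show up once you refresh the voice list in the web UI, as long as the files really are RVC inference models.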

@scedolin 1 year ago

thx for this good tutorial. Unfortunately I had an error after 2 s and I don't understand what I did wrong. if data.dtype in [np.float64, np.float32, np.float16]: AttributeError: 'NoneType' object has no attribute 'dtype'

@Jarods_Journey 1 year ago

Another commenter had this issue but I haven't encountered it yet and haven't found a way to reproduce it. You might be able to find others who are looking to get this issue resolved here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues?q=is%3Aissue+AttributeError%3A+%27NoneType%27+object+has+no+attribute+%27dtype%27+is%3Aopen Could be related to the training process, trying to find files, etc

@scedolin 1 year ago

@@Jarods_Journey I applied the advice from your good short, File Not Found: feature_768, and I managed to avoid this error. Thx a lot - I started following your channel in the last few days, but your subjects are very interesting - great job

@CamelliaWings07 11 months ago

Thank u. This is the best explaination video I've ever seen in YT. Very clear☺ I successfully make it because of your detailed contents! (I failed many times before Kkkkk)

@djdocq8963 8 months ago

@13:28 you pick an index file v2, in my drop down box it only has 3 different v1 files to choose from? It doesnt seem to create an index file when I train my voice.

@KoalaTeaGuy 1 year ago

EDIT: I'm dumb. I was forgetting to include the .wav when putting the path of my vocals If I'm getting 'RuntimeError: Failed to load audio: ffmpeg error' is that an issue with the isolated vocals in the song I'm trying to use? or is it the trained model?

@aatkins2002 9 months ago

The program after a while replaces certain fields, usually the big buttons and their output fields with "Error" and a popup appears in the top right saying "Connection Errored Out" The command console isn't reporting anything unusual, but when I tried to proceed like nothing was wrong, one click training didn't seem to react well. What's causing this "Connection Errored Out" message?

@Xivlex 11 months ago

Hello thanks for the video. It piqued my curiosity and now I want to try RVC myself. Unfortunately, I'm running an AMD GPU (6800xt) but upon checking the releases an option for AMD users is present in updated0814v2. My problem now is that when I try to follow your steps, RVC does not detect my GPU. For example, at step 2b as in 8:21, the options to select a GPU are not present. The option to input a GPU index is there and I've tried putting in "0", "1", "2" and "0-1-2" but when pressing one-click training it says: "NO GPU DETECTED: falling back to CPU - this may take a while" Do you know a way for it to detect my GPU?

@Jarods_Journey 10 months ago

I'm not too sure unfortunately, you might have to check on their githubs issue area to see if anyone else is running into it.

@Jefersen 1 year ago

Hello, thank you so much for the tutorial, everything worked fine except the last button: when I click one-click training I get all these messages for every file: mp3_10.wav->Suc. and then it just stops, nothing happens any more. Any suggestions?

@samphelps856 1 year ago

Thank you

@djsaquib 1 year ago

While making the dataset, if I am taking vocals from a singer, do I need to keep the key of the vocals the same? Or can I add multiple audios of different songs to train the model of a particular singer?

@Jarods_Journey 1 year ago

As long as its the same singer, you can add as many songs from them as you like

@djsaquib 1 year ago

@@Jarods_Journey thank you for clarifying 🙏🏻

@trubyart6193 11 months ago

im having a lot of trouble... opening the go-web file doesnt show the language option, and then has lots of stuff and at the end says to press any button to continue. After i do that it closes, and when i searched up the localhost:7897 it says i cant reach the page..

@rae8379 5 months ago

Thanks for sharing. But now I run into a problem. Could I just use pretrained models instead of training models myself? But on RVC WebUI, I couldn't figure out how.

@LillianGreenHiLilly 1 year ago

Jarods Journey Why cant we just upload for example an existing split song file from inside the folder that is just the singing voice with no music. Also Why copy and paste the whole address? Please answer, because I dont usually get a response when i ask a simple question.

@kuroboticuse 9 months ago

Thanks for the tutorial! Though a slight question: is there any general advice on making the training go through the epochs faster? For me, training one epoch takes 11 minutes, and it would take a whole day and a half to reach 200. I have tried both "Train Model" and "One-Click Training", though the rate of training per epoch is still the same. Thank you

@gasparmxm 6 months ago

A faster GPU
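
Editor's sketch of the arithmetic behind the "day and a half" estimate in this thread: total time is simply minutes per epoch times the number of epochs. The function name `training_time_hours` is made up for illustration.

```python
def training_time_hours(minutes_per_epoch, total_epochs):
    """Estimate total training time in hours, assuming every epoch takes the same time."""
    return minutes_per_epoch * total_epochs / 60


# 11 minutes per epoch for 200 epochs is 2200 minutes.
hours = training_time_hours(11, 200)
print(f"{hours:.1f} hours")
```

The same formula gives a quick sanity check before committing to a high epoch count on a slower card.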

@DreamboyyHD 1 year ago

When I use Vocals/Accompaniment it always shows this message "clean_empty_cache". In my folder there is only one mp3. (I tried moving it to another drive and tried making a new one and it still does not work.) Or did I do something wrong?

@philerasmus 9 months ago

Excellent tutorial. Running the gui I have found that the inference does use the GPU but the Vocal extraction task just relies on CPU. Is there a solution? Thanks

@AIAsiaSinger 1 year ago

thanks bro generous sharing! one quick question, when we restore the previous model, how can we continue the training? do we need to go through all step 1 to step 3 ? should we update the "Load pre-trained base model G path" ?

@Jarods_Journey 1 year ago

Check this short to see if it answers your question! kzbin.infoeO0gvi_RXTc?feature=share

@AIAsiaSinger 1 year ago

@@Jarods_Journey thank you so much!!

@AI_arab_world_maroc 1 year ago

Hi Jarod, should it train less with V2 48k , what is the best combination to train a model when it comes to V1 , V2 , 40k 48k? Thank you

@Jarods_Journey 1 year ago

V2 48k is the best quality

@looooool3145 5 months ago

Hey man, thanks for the tutorial. I was wondering how to match the key of the instrumental to the output voice? I converted a male song to a female cover, but I don't know how to change the instrumental pitch to match with the female voice.

@TheAimax 1 year ago

following your advice to use RVC I have a question, if I stop training at 250 epochs, if I want to start training again to reach 500 epochs I must put a total training epochs of 250 and they will be added to the 250 that I already had or put 500 ? I know that maybe it's a silly question but the question really arose, thanks for the attention to each one

@Jarods_Journey 1 year ago

Gotta do 500. These models store checkpoints so if you did 250 (assuming you didn't delete them), it'll start up from the checkpoint

@universalator 1 year ago

I have a weaker GPU (GTX 1660 Ti) and it's taking about half an hour for each epoch. I set the settings to match the recommended starting settings (at 9:13). Is this normal? Thanks

@denblindedjaligator5300 1 year ago

can I make a recording with my screen reader where I try to explain what I do because I don't understand what happens after it has split my audio files, there is nothing in the weights folder. do you have to have a recording of a vocal before it saves the module or what? I just thought you could make a module for later use.

@Jarods_Journey 1 year ago

If you hopped in my discord, I could try and help you out there as you would be able to send recordings of the screen reader there. Towards this message, you have to specify what the output folder of the vocals and instrumentals is going to be. After it splits the vocals, they should be located in the path you specified

@Nangel2 1 year ago

Hello again! Do you know what the issue could be when the preprocess stops going in the middle? No error message or anything shows up, after successfully processing many of the vocal samples it just stops going.

@Jarods_Journey 1 year ago

Check the logs folder, should have 0 and 1 and if there are contents in there, it finished. Not sure if it stops in the middle, it would output an error.

@Nangel2 1 year ago

@@Jarods_Journey Ty for taking the time to reply again! The logs folder did have 0 and 1 in it and the feature extraction worked, so I tried to train the model but it never progressed past step 1 (ie it never reached the epoch count). I've trained several models before with no problem, so I'm not sure what the issue is. I'll try to experiment a bit and see if I can figure out what the issue is, and if I figure it out I'll report back.
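
Editor's sketch of the check suggested in this thread (the logs folder "should have 0 and 1" with contents): a small helper that counts files in the subfolders of an experiment folder whose names start with 0 or 1. The exact subfolder names are an assumption, so the helper matches on the first character only. `preprocess_outputs` is a made-up name, not an RVC function.

```python
from pathlib import Path


def preprocess_outputs(experiment_dir):
    """Map each subfolder starting with '0' or '1' to the number of files inside it."""
    counts = {}
    for sub in sorted(Path(experiment_dir).iterdir()):
        if sub.is_dir() and sub.name[:1] in ("0", "1"):
            counts[sub.name] = sum(1 for item in sub.iterdir() if item.is_file())
    return counts
```

An empty result, or folders with zero files, would suggest the preprocess step did not finish. Non-zero counts only show that something was written, not that training will succeed.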

@fsForward 3 months ago

I *love* that he says "Don't trust me blindly", good! But now I do trust you blindly😂

@Tpizzle1313 1 year ago

While extracting features Any solution for line 13, in version = sys.argv[6] IndexError: list index out of range

@luciovids9208 11 months ago

Thanks for the great tutorial Jarod. In my case it seems that "One click training" doesn't work very well. The epochs are created at a very slow pace (2.5 hours each) but if I click "Train Model" from the outset, then it works perfectly (40 seconds per epoch).

@Jarods_Journey 11 months ago

Interesting, well glad that "train model" works. That is what I use nowadays as well

@azfarmcalpha875 1 year ago

I failed to get the myself python file in the weights folder after one-click training. Should I reset the process at 7:19, and if so do I need to delete certain files? Correct me if I'm wrong, but I think this is the error? I'm not familiar with coding.
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size
98_1.wav-contains nan
9_2.wav-contains nan
all-feature-done

@Jarods_Journey 1 year ago

I would rerun the preprocess again for all of your data and then try again, but check this out here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/484

@TheBlueRage 3 months ago

Thanks. I just have to make voice samples. I guess I am supposed to sing something. Is that correct? The UVR software works great. I was able to stem Suno ai. I'm also looking at Jen music ai and Lalal ai, which uses celebrities. This was more intense than what I expected. I see that Mac has a download app. I just found an app on the Google App Store. I will look through your other videos for more lessons. Thanks.

@gummywormee41 10 months ago

Hello! One issue I've been having is that it says that it cannot find an NVIDIA GPU and to use CPU instead, but then it says that there's no GPU to support training. Do you know what would be a good solution to this?

@Tvizleyenpasa 1 year ago

for stable diffusion I was using civitai for some pretrained models and examples. Anybody knows if Is there any website that I can use for RVC?

@POPMAGStudios 1 year ago

i put all the settings and used one-click training and i got: "added_IVF985_Flat_nprobe_1_IbrahemHefny_v2.index All processes have been completed!" but i couldnt find the model from the "Inferencing voice" slider even though i found the voice model in the log file please help

@KennaLovesGouda 11 months ago

8:11 it will not process data. It starts but then stops and there is an orange line around the output any reason why?

@enricopileggi7909 9 months ago

How can I solve the error message " Unfortunately, there is no compatible GPU available to support your training" in step 2b? (My GPU is MX250). Thank you

@denblindedjaligator5300 1 year ago

i have to set my gpu index at 0 and it works. when i have trained my module i can not find it only wavfiles can i send you my project folder?

@mrdeadmemes 1 year ago

i've managed to go through the entire data training process, but i get an error at the very end when it attempts to create the file in weights. it's an 'unexpected pos' error, ("unexpected pos [long number] vs [long number]"). im not sure how to fix it i trained for 5 epochs instead of 200, and it created a file in pth (and was found in the model inference section), so this only seems to happen on higher epoch values. i'm not sure why

@Jarods_Journey 1 year ago

Hmm, I'm not too sure why this might be happening. I can imagine that maybe something got corrupted or messed up somewhere along the line, causing the position to be wrong. Have you tried training a new model with all new folders?

@ElChapoDel8 6 months ago

If I don't have any problems but I want to keep training my model, I just do the same thing that you showed at 10:50 but increase the epochs, right?

@Jarods_Journey 6 months ago

Correct :)!

@brunito2514 10 months ago

Would I be able to train with my 1050 Ti with 4 GB of VRAM? It's taking too long and I don't know if the batch size could be a reason

@pcgg-kb4eg 1 year ago

Thx for making this tutorial, it took me forever, so thx. Could I ask you to comment on this: if you have a trained voice (not trained in RVC), how do you use it in RVC?

@Jarods_Journey 1 year ago

Drag the files to the folders they need to be in, i.e. weights, and then a folder for the speaker name in logs
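
The reply above can be sketched as a small copy script. This is only an illustration under assumptions: the function name `install_model` is made up, and the guess that the `.pth` file goes into `weights` while the `.index` file goes into `logs/<speaker name>` is not confirmed in the thread. Other comments here say newer builds keep `weights` inside an `assets` folder, so the paths may need adjusting.

```python
from pathlib import Path
import shutil


def install_model(pth_file, index_file, rvc_root, speaker):
    """Copy an externally trained model into an RVC folder layout.

    Assumed layout (hypothetical, adjust to your install):
      <rvc_root>/weights/          receives the .pth file
      <rvc_root>/logs/<speaker>/   receives the .index file
    """
    rvc_root = Path(rvc_root)
    weights_dir = rvc_root / "weights"
    speaker_dir = rvc_root / "logs" / speaker

    # Create the target folders if they are missing.
    weights_dir.mkdir(parents=True, exist_ok=True)
    speaker_dir.mkdir(parents=True, exist_ok=True)

    # copy2 keeps file timestamps and returns the destination path.
    pth_dest = Path(shutil.copy2(pth_file, weights_dir))
    index_dest = Path(shutil.copy2(index_file, speaker_dir))
    return pth_dest, index_dest
```

Copying rather than moving leaves the original files untouched in case the model does not show up and a different folder has to be tried.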

@ultimamage3 8 months ago

thank you for the video, it's really informative but i have an issue: when training the voice it doesn't generate ".pth" files in the weights folder, any way to fix that?

@pingusmcdingus5124 8 months ago

The checkpoints are under logs\[YourModelName] however if you copy them to assets\weights it won't load them properly, so ¯\_(ツ)_/¯.

@ImaCreepyCreeper 1 year ago

4:13 I can't seem to get into the localhost page, also, is localhost necessary to make the custom vocal models? I haven't really gone through the whole video more or less skimmed it just to see how to get the custom voice models. -_-

@Jarods_Journey 1 year ago

-_- To get the local host page, you'll need to instantiate it via the python script

@outlast2fan535 1 year ago

Is it possible to stop training at like 100 epochs (I typed 200 epochs as goal in the web gui, as you suggested) and infer one of the checkpoint models to see if it's going well?

@Jarods_Journey 1 year ago

Well, as long as it's going to save at the 100th epoch. You can technically stop the training at any epoch, but you wanna make sure that it saved
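
A quick way to check the point in the reply above, assuming checkpoints are written every N epochs (at N, 2N, 3N and so on). The helper name `last_saved_epoch` and that saving schedule are assumptions for illustration, not something taken from the RVC code.

```python
def last_saved_epoch(stop_epoch, save_every):
    """Return the most recent epoch that would have a checkpoint.

    Assumes a checkpoint is written whenever epoch % save_every == 0.
    Returns 0 if training stopped before the first save.
    """
    if save_every <= 0:
        raise ValueError("save_every must be a positive number of epochs")
    return (stop_epoch // save_every) * save_every


# Stopping at epoch 100 with a save every 25 epochs keeps epoch 100.
print(last_saved_epoch(100, 25))
# Stopping at epoch 100 with a save every 30 epochs only keeps epoch 90.
print(last_saved_epoch(100, 30))
```

Under that assumption, stopping at epoch 100 is only safe when 100 is a multiple of the save interval; otherwise the epochs after the last save are lost.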

@ShreyasDandekar 4 months ago

Hi! I followed all the steps and trained a model, but when I try to convert a vocals file with that trained model, the converted file has a lot of noise. It sounds like a sawtooth wave embedded in the converted vocals file. Can anyone please help me rectify the problem?

@Winterbliss-sg7qg 1 year ago

Keep putting more tutorials!!!!

@mrbilly5217 11 months ago

13:40 I'm getting an error when I try to convert my audio, does anyone have an idea why? I have the vocals in a wav file and I have the correct location, as well as my inferencing voice.

@user-fg9nv9oh6z 1 year ago

hi thanks for your tutorials. i got this error: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 5.11 GiB already allocated; what should i do?

@sreshkhsreshkh3872 7 months ago

same error, any fixes?
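
For reading the numbers in the error quoted above, here is a small sketch that parses the message and does the arithmetic. The function name `parse_cuda_oom` is made up, and the regular expression only covers the part of the message that was quoted; the real message continues with more fields.

```python
import re

# Matches only the fields visible in the quoted (truncated) message.
_OOM_PATTERN = re.compile(
    r"Tried to allocate (?P<req>[\d.]+) (?P<req_unit>MiB|GiB).*?"
    r"(?P<total>[\d.]+) (?P<total_unit>MiB|GiB) total capacity; "
    r"(?P<used>[\d.]+) (?P<used_unit>MiB|GiB) already allocated"
)


def _to_mib(value, unit):
    """Convert a size string to MiB (1 GiB = 1024 MiB)."""
    return float(value) * (1024.0 if unit == "GiB" else 1.0)


def parse_cuda_oom(message):
    """Return sizes in MiB from a CUDA out-of-memory message, or None."""
    match = _OOM_PATTERN.search(message)
    if match is None:
        return None
    total = _to_mib(match["total"], match["total_unit"])
    used = _to_mib(match["used"], match["used_unit"])
    return {
        "requested_mib": _to_mib(match["req"], match["req_unit"]),
        "total_mib": total,
        "allocated_mib": used,
        "unallocated_mib": total - used,
    }


msg = ("CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; "
       "6.00 GiB total capacity; 5.11 GiB already allocated;")
print(parse_cuda_oom(msg))
```

The unallocated figure is not the same as free memory, since other programs and cached or fragmented memory also take space, so a small request can still fail. A lower batch size in the training settings generally reduces how much memory training needs.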

@RuneLightLovely 18 days ago

Why did the "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 4: invalid start byte" error appear in cmd.exe after I clicked "One-click training"?
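
What the error above means can be shown in a few lines: the byte 0xc0 can never start a character in UTF-8, so decoding stops at the position given in the message. The helper `find_invalid_utf8` is a made-up name, and which file or path triggered the error during training is not known from the comment.

```python
def find_invalid_utf8(data):
    """Return (position, byte value, reason) for the first byte that is
    not valid UTF-8, or None if the data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start], err.reason
    return None


# Four ASCII bytes followed by 0xc0 reproduce the same position and reason.
print(find_invalid_utf8(b"abcd\xc0ef"))
# Plain ASCII is valid UTF-8, so nothing is reported.
print(find_invalid_utf8(b"plain text"))
```

A byte like this usually means some text was written in an encoding other than UTF-8. One guess, not confirmed here, is a folder or file name containing non-English characters somewhere in the training paths.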

@MistahJ100 1 year ago

CAN SOMEONE PLEASE HELP ME, i am trying to train more audio but when i try to get back into the google colab, it says it cannot find the model falss or some nonsense, I did not change anything on my drive, its all where it should be but i can not get back in. I would like to train more voices and i dont understand why it wont work, This happens when i click on the web cell

@tachankafreeman3442 10 months ago

For me, for some reason it takes so long for the files to export and stuff, and when it's done, well, for my version it just shows a glimpse of a small window that appears and closes as if it can't open. idk how to fix that

@OravinCZ 10 months ago

Hello! I've started using EasyGUI RVC now, but do you have any idea where the models I've already trained with this method should be stored? I don't know how to make them appear - they don't want to appear in inference at all... 😐

@Mago497 10 months ago

In the folder you downloaded there should be one that says "weights"; that's where your models are stored. At the time of writing this there were a few extra that came with the download.
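
To check which models are actually in those folders, a short listing script can help. It looks in `weights` and in `assets/weights`, the two locations mentioned in these comments; the function name `find_models` is made up, and where EasyGUI itself reads models from is not something this thread confirms.

```python
from pathlib import Path


def find_models(rvc_root):
    """List .pth files in the weights folders mentioned in the comments.

    Checks <rvc_root>/weights and <rvc_root>/assets/weights; folders that
    do not exist are skipped.
    """
    root = Path(rvc_root)
    candidates = [root / "weights", root / "assets" / "weights"]
    found = []
    for folder in candidates:
        if folder.is_dir():
            found.extend(sorted(folder.glob("*.pth")))
    return found


# Example: print every model found under the current directory.
for model in find_models("."):
    print(model)
```

If a trained model is missing from the list, it was probably saved somewhere else, for example only as checkpoints under `logs`.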