Use AI to Clone ANY Voice & Sing ANY Song for FREE | RVC WebUI Tutorial

Views: 392,364

Comments: 1,100

@spencereaston8292 1 year ago

I made a V2 model of just a friend who does math tutoring videos. Only used 20 epochs with the v2 model. None of the training data had singing, but I ran it on an old folk song and it was WAY better than I expected.

@NerdyRodent 1 year ago

Yup, it’s quick, easy and fun!

@navierproductions709 6 months ago

@NerdyRodent DUDE, WILL IT WORK WITH ANY LANGUAGE? I'M TRYING TO CLONE AN INDIAN ACTOR'S VOICE FOR MY MUSIC PRODUCTION... WILL IT WORK?

@WaterShowsProd 1 year ago

I've been doing dubbing work for a company that uses A.I. to change our voices so a small group of us is able to dub a wide number of characters in a TV series. It's interesting to finally see the nuts and bolts of the kind of system they are probably using. We had to do 4-hour training audio for their software when we began 2 years ago. The pace of development in this technology is astounding.

@NerdyRodent 1 year ago

Yup! Can’t wait to see what stuff people do in 12 months' time!

@southcoastinventors6583 1 year ago

If it's Crunchyroll, tell them that their translations suck. Or are they bad intentionally?

@WaterShowsProd 1 year ago

@southcoastinventors6583 No, it isn't; it's another company. Actually, we handle the script adaptation ourselves. They focus on the technical wizardry. And getting the programs too, of course. I've found them to be very enjoyable projects.

@flyingstapler1241 1 year ago

What language do you guys dub?

@WaterShowsProd 1 year ago

@flyingstapler1241 I mostly do English dubbing, but I have done some dubbing in Thai for jobs requesting a foreigner speaking Thai, or doing ADR for foreign actors who couldn't speak Thai clearly enough during the shoot. The A.I. stuff I've done has all been English language, but the client has requested the studio assemble Thai, Vietnamese, and Indonesian teams as well.

@SwordofDay 1 year ago

This was amazing, Nerdy. Was cracking up at the abrupt cut from your singing moment. You needa shine bro, get that album! Anyways, off to make my own project. THANKS!

@silassimus 1 year ago

Thank you. Because of you I was able to hear the voice of my recently passed grandma again... I appreciate it

@NerdyRodent 1 year ago

Wonderful!

@R457-j2g 1 year ago

I'm sorry for your loss...

@4slight 1 year ago

Should've sung the Ponyo song yourself as a control to compare 😆

@kusog3 1 year ago

This is a good suggestion. Come on, let's hear it!

@NerdyRodent 1 year ago

Step 1. Learn a new language 😆

@southcoastinventors6583 1 year ago

@NerdyRodent That's what I was going to ask: did you need to speak Japanese to train the model? But I guess you answered it. That turned out really well.

@synaestesia-bg3ew 1 year ago

@NerdyRodent Can it be installed and used offline?

@Freezo90967 1 year ago

@southcoastinventors6583 Some people have been making jp-jp models (e.g. the Japanese Dio voice actor) do English speaking, so it seems language doesn't affect it too much

@Beetlebomb3D 1 year ago

Couldn't have chosen a better song to demonstrate this with! よくできました！素敵な声だと思います！ (Well done! I think it's a lovely voice!)

@NerdyRodent 1 year ago

Thanks! I like the song too :)

@4.0.4 1 year ago

I tested it today. Took like 30 hours on a GTX 1060 6GB but got it (batch_size 3, 80 epochs, 2+ hours of audio). It was stream-type audio with background music, but the audio separation that this has took care of it. It friggin' works - surprised a friend with it and felt like I learnt something. Thanks mate.

@NerdyRodent 1 year ago

Fun isn’t it 😉

@urasam2 1 year ago

@NerdyRodent I got as far as convert, then it seemed to just sit there doing nothing. Checked CPU and GPU in Task Manager and both were on 0%, which didn't look right to me, but maybe it was just taking its time. Any ideas?

@NerdyRodent 1 year ago

@urasam2 I’d say run through the steps again, checking for any errors each time

@urasam2 1 year ago

@NerdyRodent Thanks. Didn’t see any error messages. Maybe it’s that my system isn’t powerful enough

@NerdyRodent 1 year ago

@urasam2 Could be. Probably worth running your performance monitor while it’s going to confirm

@AiPress347 1 year ago

Absolutely fantastic explanation and visualization about what’s happening! Thank you 🙏🏻

@NerdyRodent 1 year ago

You are so welcome!

@Dante02d12 1 year ago

Any luck making it work with 6GB VRAM?

EDIT: YES, IT WORKS! The key parameter is the "batch size for every GPU": increasing the value also increases the VRAM used during training. I used 6 and it was filling 4.5GB with the 32k model, so I guess we can go up to 10. With the 40k model, though, we can't go higher than a batch size of 5. Everything is slower of course, but it's tolerable. For a training of 50 epochs, it took an hour for me (although a higher batch size should make things a bit faster). Creating the voice sample also takes three minutes instead of seconds (EDIT: when using "harvest"; but "pm" makes a 3min song in a single minute, and frankly, the quality is fine with the 40k model). For reference, I'm on a laptop with RTX 3060 and 6GB VRAM.

Also, fun facts:

1) If something bad happens in the middle of training, you can just restart it and it will pick up from where you were. In my case, I accidentally closed my laptop during training and let it run. I came back and saw an error. I was at 40 epochs out of 50, lol. I thought I had wasted an hour of my time, but it turned out the training restarted at this epoch. This is thanks to the "save frequency" feature.

2) I accidentally converted the instrumental track instead of the voice. It gave me insane beats, lol.

3) I tried using a vocaloid song, hoping I'd get a more natural song. Sadly, the cloned voice keeps the robotic vibes T_T. It makes sense, but I was dreaming of making natural covers of vocaloid songs... Oh well, I'll wait for someone to create perfect autotune. It should only take a month or two, lol.

EDIT2: BE VERY CAREFUL! Current software has a sort of "storage leak issue". I had 40GB free on my SSD after install, I've used it a few hours, and at some point the disk started to get filled up for absolutely no reason. It filled it up completely! I don't know what caused this, but I do know the 40GB was inside the "log" folder. I'll tell the devs this happened. Check your storage space regularly. If it happens to you, delete the log folder. It stores the "indexes" (I don't know what it is) but the rest will work fine, you won't lose your models.

EDIT3 (can't stop...): Ok, remember the save frequency function? That was the culprit. Backups stack up. Next to it there is another feature, called "Whether to save only the latest ckpt file to save disk space". Enable that feature and it will only save the latest backup. Also, if you delete the entire log folder, you won't be able to train any voice anymore (there are some important files in this folder, lol); you'll have to download the missing files again on GitHub.
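The disk-space advice above can be checked with a short script rather than by eye. This is a minimal sketch, assuming the checkpoint backups live under a folder named `logs` inside the install directory (the comment calls it the "log" folder; the exact name on your install is an assumption). It only reports sizes and deletes nothing, since the same folder holds files that training needs.

```python
from pathlib import Path


def folder_size_bytes(folder: Path) -> int:
    """Total size of all regular files under `folder`, recursively."""
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def largest_files(folder: Path, top: int = 5) -> list[tuple[int, Path]]:
    """The `top` biggest files under `folder` as (size_in_bytes, path) pairs."""
    sized = [(p.stat().st_size, p) for p in folder.rglob("*") if p.is_file()]
    return sorted(sized, key=lambda pair: pair[0], reverse=True)[:top]


if __name__ == "__main__":
    logs = Path("logs")  # assumed location, relative to the install folder
    if logs.is_dir():
        print(f"{logs}: {folder_size_bytes(logs) / 1e9:.2f} GB")
        for size, path in largest_files(logs):
            print(f"{size / 1e6:10.1f} MB  {path}")
```

Running it from the install folder every so often during a long training session shows whether stacked-up backups are what is eating the disk.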

@Jojo2 1 year ago

So what exactly would be the perfect settings for the 40k model? I also have a 3060

@Dante02d12 1 year ago

@Jojo2 Well, I used 50 epochs and a batch size of 5. For the rest I just followed this video. Always use "harvest", whether for training or for building an audio track. It's the best quality and doesn't take too long. It gave me wonderful results, except for whenever there are echoes/reverb in the song I try to copy. The voice itself is brilliant. I think I'll upload a few songs I made with it.

@rully8093 1 year ago

So that means it is useless for me with a 1050Ti 4GB VRAM

@elel5368 1 year ago

Hey, I have the exact same specifications as you, laptop with RTX 3060, 6GB VRAM. Did you ever get a CUDA out of memory error when trying to do inference? Because I managed to do one song perfectly, but on the next one I got this error and I don't know how to fix it:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 526.00 MiB (GPU 0; 6.00 GiB total capacity; 3.75 GiB already allocated; 0 bytes free; 5.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Did you ever get this error? And even if you didn't, do you know how to fix it? I don't know how to set what it's telling me.
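For the "how do I set what it's telling me" part: the error message refers to the `PYTORCH_CUDA_ALLOC_CONF` environment variable, which PyTorch reads when it first sets up its CUDA allocator. The sketch below shows one way to set it from Python. The value 128 is an arbitrary starting point, not a recommendation from the video, and the setting may not cure the error if the GPU simply lacks the memory for the job.

```python
import os


def alloc_conf(max_split_size_mb: int) -> str:
    """Build the value PyTorch reads from PYTORCH_CUDA_ALLOC_CONF."""
    if max_split_size_mb <= 0:
        raise ValueError("max_split_size_mb must be positive")
    return f"max_split_size_mb:{max_split_size_mb}"


# Must be in place before the first CUDA allocation,
# so set it before importing torch (or before launching the WebUI).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf(128)  # 128 is a guess to tune
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

On Windows the same effect comes from typing `set PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128` in the command prompt and then starting the program from that same window.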

@AmirSn-h7i 1 year ago

@elel5368 Hey, I have the same problem. Could you fix it?

@mjanek20 1 year ago

OH MY GOD!! Voice to Voice conversion works like something I've never seen/heard before. If you do it correctly it's almost perfect.

@waltwise2456 1 year ago

A few tips I've learned in getting this to work:

1. If you are getting CUDA out of memory errors even with a batch size of 1 and especially using Version 2... update your GPU driver. I could not get Version 2 to work (and could only get Version 1 to work with a batch size of 1) with CUDA OOM errors and just chalked it up to my video card (Nvidia 3050 Laptop) only being 4GB... however, after updating the GPU driver on this brand new laptop (the previous driver wasn't that old), Version 2 now works and even works with higher batch sizes.

2. As I believe is mentioned elsewhere, don't leave spaces in your folder and file names... you'll get an error.

3. Make sure you have the ".wav" extension in your file path... you'll get an error without it. On Windows machines the extensions are there but hidden.

4. If you are trying to train a model and have restarted the training at some point and you get a runtime error in the command window, try using a new Experiment name and rerunning it. I had restarted a training and got an error in the command window and fixed it by simply changing the Experiment name (I added a "2" to the end of the name) and rerunning the training.

5. I'm not sure if "One-Click Training" has a glitch or if it's just coincidence, but I have had better luck hitting the "Train Model" button and then, when it finishes, hitting "Train Feature Index" when using Version 2. Don't forget to hit "Train Feature Index" or you will not have an Index file.
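Tips 2 and 3 above are easy to check before pasting a path into the WebUI. A minimal sketch follows; `path_problems` is a made-up helper name, and the non-ASCII check is an extra precaution suggested by other comments in this thread rather than something the tips state.

```python
from pathlib import Path


def path_problems(path_text: str, must_be_wav: bool = False) -> list[str]:
    """Return human-readable reasons a path may trip up the WebUI."""
    problems = []
    if " " in path_text:
        problems.append("contains a space")
    if not path_text.isascii():
        problems.append("contains non-ASCII characters")
    if must_be_wav and Path(path_text).suffix.lower() != ".wav":
        problems.append("missing the .wav extension")
    return problems


if __name__ == "__main__":
    for candidate in ["C:/RVC/vocals/song.wav", "C:/My Music/song"]:
        print(candidate, "->", path_problems(candidate, must_be_wav=True) or "looks fine")
```

An empty list means none of the known pitfalls were found; it does not prove the file exists or is valid audio.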

@gkiss2030 1 year ago

Do you recommend an Nvidia card (with CUDA support) or an AMD one (with OpenCL)? If I want to train with my own voice, is a sample of a few minutes enough, or are 2+ hours needed? Thank you!

@MWcrazyhorse 10 months ago

Very good. He forgot point 5 in his tutorial. So it wasn't working. You fixed it.

@jurandfantom 1 year ago

Love you man :) You constantly bring something fresh to the table. Thanks for digging into something other than LLMs. Would it be possible to make an update about voice2text and text2voice? (voice2voice surprised me tbh)

@LoneRanger.801 1 year ago

In your experience, how does this compare with so-vits-svc? In terms of training and also in terms of the output audio quality and voice match? By the way, thanks so much for sharing this and creating all these awesome TTS and SD videos. ❤

@stylestep111 1 year ago

I find so-vits has better quality, especially for singing, but the training also takes longer. Results are also heavily dependent on the quality of your dataset.

@artesgraficasxpl 1 year ago

I'm using so-vits. I've just finished training my own Freddie Mercury model... 48 hours of training 😂. But the result is very, very impressive

@BGP00 1 year ago

@artesgraficasxpl How much data did you use? Like how many minutes/hours of voice samples?

@artesgraficasxpl 1 year ago

@BGP00 182 samples for a total of about 30 minutes, trained for 1300 epochs

@malcolmreynolds4099 1 year ago

@artesgraficasxpl Can you share your results? :)

@PrisonerFlake 1 year ago

Found out part way through I can't do it because I have a low-end machine and my GPU wasn't compatible, sadly, but I won't bash you for that. Very great to have videos like this, you're doing a great service :)

@Daring2Win 1 year ago

Thank you! My choir tracks are going to be ridiculous. And, at least for private enjoyment, I'll be able to resurrect all of my favorite singers to lead. The world is collapsing, so they've released the perfect distraction for me!

@NerdyRodent 1 year ago

Sounds like fun!

@AdamantMindset 1 year ago

lmao I genuinely chuckled for a bit when I heard your voice singing at the end. Thanks for the tutorial. Can't wait to listen to my own voice over BABYMETAL songs xD

@Jarods_Journey 1 year ago

Awesome stuff, just when people keep telling me to look into RVC, I find a solid tutorial for it! Will have to allocate some time to set this up myself and get it up and running 🙏🙏

@carlosedubarreto 1 year ago

Amazing solution and tutorial, thanks a lot for that. I've got amazing results in my own language (Portuguese-BR)

@NerdyRodent 1 year ago

Great stuff! :)

@prophetofthesingularity 1 year ago

Excellent! I like this one better than the other similar programs, this is much easier to use and quicker than others I have tried.

@NerdyRodent 1 year ago

Yup, nice and fast!

@fischer74 1 year ago

How long does it take to train a voice before you can turn it into a new song?

@NerdyRodent 1 year ago

@fischer74 just 1 hour is usually good!

@UnchartedWorlds 1 year ago

I made it, it is working amazingly! What a tech! Thank you Master Rodent!

@NerdyRodent 1 year ago

Nice work!

@chaoticdreemurr9861 1 year ago

Where it says "Inferencing voice" it says "Inferencing timbre" for me instead, and when I go through all the steps in the Train section and go back to the Model Inference section, nothing pops up in the dropdown, even after hitting the refresh button (which also says "Refresh timbre list and index"). If anyone knows how I can fix this I would really appreciate it.

@giovanniferrarisdicelle 1 year ago

Same for me

@hoi4timelapser443 1 year ago

Hey, I have a question. I did everything correctly, but for some reason, I have no option to choose from in the "Inferencing voice:" section. I don't know what to do. Please help.

@NerdyRodent 1 year ago

Make sure you didn’t have any errors during training and that your resulting weights are in the weights directory

@hoi4timelapser443 1 year ago

@NerdyRodent Thanks, I will check it out

@hoi4timelapser443 1 year ago

@NerdyRodent There was no trouble with training, everything went successfully. Every time I did training it added new logs to the "Auto-detect index path and select from the dropdown:" section. But I am not sure what you mean by "weights directory", because in the files there is nothing in the "weights" folder other than .gitignore

@myenethes1554 1 year ago

Thanks for the clear tutorial! This seems fascinating and seems like a much less tedious/less error-prone process compared to so-vits-svc! Do you know of a way to "improve"/train your existing model further with new data using this? Or would you have to restart the whole process all over again?

@NerdyRodent 1 year ago

Almost the whole process again, but you could continue from a different model

@mrguiltyfool 1 year ago

@NerdyRodent Is there a guide on how to get the GUI running under Linux?

@سعوديبلوج 9 months ago

Honestly, I appreciate your valuable info on such a valuable subject ❤

@estrangeiroemtodaparte 1 year ago

Great video, as always! How would you say this compares to so-vits?

@NerdyRodent 1 year ago

Much faster to train!

@dennisliebelt3951 1 year ago

Hey Mr Rodent, thx for another interesting video. I'm always happy when I see a new video from you popping up. I started playing with voice2voice after I saw your so-vits video. I spent a few days and tried to push it as far as I can, so in case people also struggle with certain things, here is a short summary of my experiments.

My goal was to create a Bessie Smith voice model (blues singer, died 1937). Although it's easier and quite a lot faster to use RVC, I personally had better results with so-vits. My best guess for the main reason is the lack of crepe integration for pitch extraction and inference so far. There is a fork with crepe out there, but I haven't been able to get it running correctly yet.

The audio separation in RVC works okay-ish, but only on a basic level. In my experiments it is better to install UVR5 locally, because there the ensemble mode can be used, which means different separation models can be combined and weighted. A list of good combinations of models can be found here: mvsep.com/quality_checker/leaderboard.php.

Another big thing is the training data. No clean data set = poor quality model! So, for example, if someone wants to train on a certain singer or public figure, the voice needs the best separation from instruments or background noise that can be achieved. Reverb needs to be removed or it will sound weird! There are quite a few good free plugins that can be used with Audacity, like Goyo, to achieve that.

Another thing is that the training data and most songs are normally in stereo, but the voice model is mono. To fake some pseudo stereo back in, I tried other free plugins inside Audacity (vocal doubler and wider were quick out-of-the-box solutions). I hope this helps :)
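On the stereo-versus-mono point above, training clips can be downmixed ahead of time so that what goes into the dataset is known exactly. The sketch below averages the left and right channels of a 16-bit stereo WAV using only the Python standard library. Averaging is one common way to downmix and is an assumption here, not a description of what RVC does internally; `stereo_to_mono` is a hypothetical helper name.

```python
import array
import wave


def stereo_to_mono(src: str, dst: str) -> None:
    """Average the two channels of a 16-bit stereo WAV into one mono channel."""
    with wave.open(src, "rb") as reader:
        if reader.getnchannels() != 2 or reader.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo input")
        rate = reader.getframerate()
        # Samples are interleaved: left, right, left, right, ...
        samples = array.array("h", reader.readframes(reader.getnframes()))

    left, right = samples[0::2], samples[1::2]
    mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))

    with wave.open(dst, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(mono.tobytes())
```

Calling `stereo_to_mono("vocals_stereo.wav", "vocals_mono.wav")` (both file names are placeholders) writes a mono copy and leaves the original untouched.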

@Cyber_Official 1 year ago

The UI changed a ton, but I was still able to follow. Thank you for the guide, I was running around like a headless chicken trying to figure out what to do.

@NerdyRodent 1 year ago

Yes, they've added a new option to allow selecting crepe in the inference tab now. Cool :)

@sneedtube 1 year ago

Hey Nerdy, how does this stack up against the old so-vits-svc-fork workflow, result quality wise? What are its strengths/weaknesses compared to that? Btw thanks for keeping us updated with your cherry-picked pieces of new AI tech, you manage to do the trick every time!

@NerdyRodent 1 year ago

Much faster to train :)

@sneedtube 1 year ago

@NerdyRodent Super! Will definitely give it a try, I only hope that the result won't be worse so I can pick this as my go-to audio-magic-thing :D

@PythonAndy 1 year ago

I had the same question. Thanks @9cento777 !

@sneedtube 1 year ago

@PythonAndy you're welcome!

@NightfallShadow 1 year ago

I want to use this, but GitHub, whatever that is, has the only files I can find, and it's broken up into like 30 files. How do I download this thing?

@NerdyRodent 1 year ago

To download the zip file, you can either select it from the GitHub page releases link, or from the Hugging Face space link as shown in the video and linked in the video description

@volu9913 1 year ago

9:45 I don't see any option at all there. Can somebody please help me??? I've refreshed multiple times and I am still seeing NOTHING.

@chaoticdreemurr9861 1 year ago

Yea I'm having the same issue here 😢

@GeorgeKarmas 11 months ago

“My list is absolutely huge, like all girls would agree” legend 😂❤ amazing video thanks for this

@NerdyRodent 11 months ago

;) Thanks for watching!

@raheemhussain9191 1 year ago

Hi, my training completes successfully, yet there's no inferencing timbre. Done this twice now and still not working. Any fix? Thanks

@sukhpalsukh3511 1 year ago

Tell me more, I will help

@volu9913 1 year ago

Same problem here.

@dogsushienjoyer 1 year ago

After a lot of issues (no whitespace in directory or file names) I finally made it to the last page, but now I don't understand the issue. The console gives me this:

Traceback [...] ffmpeg._run.Error: ffmpeg error (see stderr output for detail) RuntimeError: Failed to load audio: ffmpeg error (see stderr output for detail) AttributeError: 'NoneType' object has no attribute 'dtype'

(I left out the traceback stuff.) What does it mean? The audios are all correct...

@NerdyRodent 1 year ago

NoneType object typically means there is no data

@cowrac1337 1 year ago

Did you figure it out? I have the same issue

@dogsushienjoyer 1 year ago

@cowrac1337 I didn't try it yet, I wanted to reinstall the whole thing and try it again from the very beginning since I now know what to do in the first steps (I deleted some old voice folders in the process, which caused some trouble too, so I'm hoping I just deleted too much and it will work after reinstalling)

@cowrac1337 1 year ago

@dogsushienjoyer Actually I just solved the problem. I set the value of "Protect the artifact of voiceless consonant and breath. 0.5: don' use it. The number smaller, the stronger protection." to 0.5 and that worked for me

@dogsushienjoyer 1 year ago

@cowrac1337 Cool, thank you! I'll try it out later

@asciikat2571 1 year ago

First again, this is amazing, can I clone your voice Mr Rodent and make a rap battle with you and Lil Pump?

@NerdyRodent 1 year ago

I’m sure I’d sound terrible! Go ahead and clone, but do no evil 😉

@asciikat2571 1 year ago

@NerdyRodent No evil... Mawhaaaaa... ;)

@yaminrasheed 1 year ago

I'm having a problem with step 2a: when I start the process it fails and shows this message (start preprocess ['trainset_preprocess_pipeline_print.py', 'E:\\RVC\ awfiles\\', '40000', '27', 'C:\ vc\\Mangio-RVC-v23.7.0/logs/bakurube', 'False'] end preprocess)

@NerdyRodent 1 year ago

That’s a different program, so it could be worth trying with this one?

@fjrevoredo 1 year ago

Damn, the results are amazing. You definitely earned a sub

@NerdyRodent 1 year ago

Why thanks :) Hope you enjoy!

@DunceInAwhile 1 year ago

If you need an explanation on what a filepath is, you probably shouldn't be watching this tutorial.

@pogiman 1 year ago

Wow.. when you sang at the end.. I was immediately mindblown

@jackjr2694 1 year ago

I am getting an error (M1). The error message "AttributeError: module 'sys' has no attribute '_MEIPASS'" suggests that the 'sys' module does not have the '_MEIPASS' attribute. This attribute is typically used in PyInstaller-generated executables to determine the path to the bundled resources.

@shalirezaii4696 10 months ago

Any luck yet?

@coucoul38 1 year ago

I will try to use this to make my streamer friend finally sing a song as he had promised for a donation goal. Hope I can get it working to see his reaction XD

@koolturaSTEAM 1 year ago

Hahahaha!!! 2:14, love the jab at Windows!

@NerdyRodent 1 year ago

;-)

@theyakkomanguy7522 1 year ago

How exactly do you get this .pth file of your model? I've spent a good few minutes trying to figure this out but I can't, because it feels way too confusing.

@twilightfilms9436 1 year ago

Can it be used for just regular talking/conversation?

@edwin951031 1 year ago

I am stuck at 2A. When I click "process data", it does not do anything, i.e. the output message is empty. There is an error written on CMD: "ValueError: invalid literal for int() with base 10: 'PC\\Desktop\\Retrieval-based-Voice-Conversion-WebUI-main\\train\\mymodel'". How to solve this?

@NerdyRodent 1 year ago

The command says it is expecting an integer value, but you’ve passed it text. Check for odd characters in your absolute path, or just use the relative path like in the video.
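One plausible way a path ends up where a number is expected: the error text starts at `PC\\Desktop`, which is what remains if a folder name such as "My PC" is cut at its space when the command line is split into arguments. The sketch below reproduces that effect with Python's `shlex`. The path is made up, the argument order (script, folder, sample rate) is modelled on the preprocess call quoted earlier in this thread, and whether the WebUI really builds an unquoted command string is an assumption. Windows splits command lines by its own rules, so this is only an illustration of the mechanism.

```python
import shlex

# Hypothetical command line: script, training folder, sample rate
script = "trainset_preprocess_pipeline_print.py"
folder = "C:/Users/My PC/Desktop/train/mymodel"  # made-up path containing a space

unquoted = shlex.split(f"{script} {folder} 40000")
quoted = shlex.split(f'{script} "{folder}" 40000')

print(unquoted)  # the folder is cut in two, pushing the sample rate one slot later
print(quoted)    # the folder stays in one piece

# A script that reads its third argument as a number now works only when quoted;
# int(unquoted[2]) would raise "invalid literal for int() with base 10".
sample_rate = int(quoted[2])
print(sample_rate)
```

This is why moving the install and the training data to folders without spaces, as several commenters report, makes the error disappear.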

@hughsilva5655 1 year ago

Nothing is appearing in my Inferencing voice tab......

@vdbyUA 1 year ago

How to install dark theme?

@johnpope1473 1 year ago

00:01:40 - nailed it.

@raheemhussain9191 1 year ago

I don't have a .pth. I've tried making a model 3 times now and everything is successful. But still, no inferencing timbre. Any fix? Thanks!

@sukhpalsukh3511 1 year ago

You have to train, or download pre-trained models and put them in the weights folder

@ChristianIce 1 year ago

Ok, I want to know the guy who wants to use AI technology for voice to voice cloning *and* doesn't know what a "directory" is. :D

@NerdyRodent 1 year ago

I do try to cater for all levels, noob to pro! 😉

@AllisonGhost 1 year ago

Wish this tutorial covered more of what to do when it isn't working. The "process data" button isn't working at all, it just stops partway through with the text "error" and then resets. Basically impossible to debug with no information as to what is going wrong.

@NerdyRodent 1 year ago

Never had it not work as yet!

@AllisonGhost 1 year ago

@NerdyRodent Coming back to say that I got it working. Apparently it will not run if you install it to a folder that has a space in its name! Removing the space fixed it. Might be useful to add as a side note in the video

@braiazito 1 year ago

@AllisonGhost you are a life saver!! Thanks!

@ahminlaffet3555 1 year ago

Wonder what happens if you use a church organ as training data and sing some song to turn it into a church organ.. and if that works with whole philharmonic orchestras. Great video, I feel inspired to test it soon.

@NerdyRodent 1 year ago

Certainly worth a go! 😄

@MWcrazyhorse 10 months ago

Good job brah!!! Got it to work!

@Charlton222 1 year ago

To anyone who has gotten this to work correctly, maybe you could tell me what I'm doing wrong. I already have a pre-trained model (the *.pth file). I've already separated the vocals from the music using another program. So how do I get RVC to find that *.pth file inside the Inferencing voice tab? I've refreshed multiple times. Nothing. I placed the correct file path inside the "Path to training folder" box in the Train tab and then refreshed. Nothing. Any help would be appreciated! ONE HOUR LATER... OK, figured it out. Place the *.pth file into the "weights" folder. Place the *.index file into the "logs" folder, then the voice model will be found in the Inferencing voice tab.
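Since an empty "Inferencing voice" dropdown comes up repeatedly in this thread, a quick way to see what the WebUI has to work with is to list the files in the two folders named above. This is a minimal sketch based on the folder names given in the comment ("weights" for `.pth`, "logs" for `.index`); `find_models` is a hypothetical helper, and the layout may differ between versions.

```python
from pathlib import Path


def find_models(install_dir: str) -> dict[str, list[str]]:
    """List model weights and index files under an RVC install folder."""
    root = Path(install_dir)
    return {
        # Voice models are expected directly inside the weights folder
        "weights": sorted(p.name for p in (root / "weights").glob("*.pth")),
        # Index files may sit in per-experiment subfolders of logs
        "indexes": sorted(
            p.relative_to(root).as_posix() for p in (root / "logs").rglob("*.index")
        ),
    }


if __name__ == "__main__":
    found = find_models(".")  # run from inside the install folder
    print("weights:", found["weights"] or "none found")
    print("indexes:", found["indexes"] or "none found")
```

If "weights" comes back empty after a training run that reported success, the run most likely did not produce a model, which matches the advice to check the console for errors.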

@giovanniferrarisdicelle 1 year ago

I'm having the same problem

@Charlton222 1 year ago

@giovanniferrarisdicelle Placing those files in the correct folders was the fix for me so give it a try.

@giovanniferrarisdicelle 1 year ago

I solved that problem, now I have another one: it keeps loading. It was at 1400/107. Edit: yes, I’m a Windows user 😅

@djflapjack4744 1 year ago

Where can I find the .pth file and the .index file? I have the folders but where did you find the files to put in the folders?

@Charlton222 1 year ago

@giovanniferrarisdicelle What does 1400/107 mean? I'm using Windows too, so hopefully you can get yours up and running.

@eduardomartin8510 1 year ago

Greetings, complete noob here, does this upload anything or is it all local processing/storage etc? Thanks!

@NerdyRodent 1 year ago

I'm running locally, but if you want remote then try Google Colab!

@DruuzilTechGames 1 year ago

8:42 - There is no "opt" folder. I created folders of my own to place the vocals and instrumentals, but it just errors out and claims those file paths don't exist even though they do. So this step would seem to be broken in the software.

@thw1pster 1 year ago

He said you can change it if you like, meaning you can just make another one anywhere

@skyboybenji9185 1 year ago

I get errors when I use a path other than C:, so try it on C:

@DruuzilTechGames 1 year ago

@thw1pster I did that. It didn't solve the problem. The problem is the tool. You have to upload the file to the tool itself rather than supplying it a file path, then it works.

@Smashachu 1 year ago

This.. just changed harmonization in the music industry. For the 2-3 months it has left before AI is able to copy and emulate the human voice better than even the best singers we have available.. then all of our entertainment will shift from watching human beings to watching perfected interpretations of human beings. And finally our population slowly starts to decline over time because of lack of reproduction, with AI waifus taking over the market.

@sammymesnaoui7157 1 year ago

The video was playing in the background and I had to rewind to make sure I heard what I heard 9:42 :D

@NerdyRodent 1 year ago

😉

@IzeIzeBaby 1 year ago

Today I learned: even if AI generated, I can't sing :D

@agent6599 Жыл бұрын

Hey, whenever I upload all of the correct files and paths, I press process data and the process gets stuck on “converting” and never produces and output. It got up to 10 minutes before I retried and it does this every time. Any idea what the solution could be? Appreciate it!

@NerdyRodent Жыл бұрын

Not sure, try using Ubuntu 22.04?

@vidokk77 Жыл бұрын

Ponyo❤

@makinganoise6028 Жыл бұрын

I am a singer, tbh these tools dont threaten me, as, like auto tune, people use them cant sing well to start with, so are kind of stuffed when it comes to singing live, many acts simply mime along now anyway, if anything, they stop people actually doing the work to learn to sing, I use AI to remove the vocals from songs to create backing tracks to practice, but those guys who sell backing tracks cant be happy right now.

@switchpp1266 Жыл бұрын

hi when i click on inferencing voice i dont have any thing on it please help

@okanyontar Жыл бұрын

me too

@tejasathakur Жыл бұрын

how much time does it take for v2 harvest 20 epochs? i’ve started the training and the 1st epoch took around 6 hours. and it’s still going on. my specifications: asus tuf gaming fx505dt gpu- nvidia gtx 1650 ram- 16gb cpu- ryzen 5 3300h i’ve used the training data of 50 minutes (24 mp3 files), is there something wrong? or it takes this much time on usual?

@NerdyRodent Жыл бұрын

That would typically be around 30 mins on a modern GPU

@AIStarCompany Жыл бұрын

Hello. I have tried following your instructions on colab platform. But it doesn't work. help me

@foslerfer Жыл бұрын

Quick question, How do i know when the processes are done? Will it say "end process" every time or does it only say that for step 2a? For step 2b it tells me "all-features-done" in the command prompt.

@NerdyRodent Жыл бұрын

Yeah, I just look at the console output too

@jazmaan Жыл бұрын

I finally got it to run by removing all spaces from my folder names and filenames. I trained it on Jimi Hendrix speaking voice using 10 mins of interviews, then substituted Jimi's voice for John Lennon's voice on "Norwegian Wood". Here's the problem. By retaining all of Lennon's timing and pronunciation, all it really does is give Lennon a huskier timbre. No way anyone would recognize it as Jimi. They'd just think Lennon had a cold. It does do a good job of isolating vocal stems though. Also this would be more useful if you could use the trained voice to recite inputted text. Substituting one singer for another seems not very useful, especially given the limitations I described of it retaining the original singer's timing and pronunciation.

@Jajareto Жыл бұрын

My dude, thank you for pointing out the problem with the spaces.

@ScrimaGames1 Жыл бұрын

if the process button for my wav audio files does not work, does that mean there are not enough wav audio files or enough time? I assume it needs many wav audio files if that's the reason it's not working?

@NerdyRodent Жыл бұрын

I've never had it not work myself!

@Ixanol Жыл бұрын

Most probably you have spaces or some special characters either to the path of the software or to the path of the audio files, try changing the folders names that contain them to simple characters

@ScrimaGames1 Жыл бұрын

@@Ixanol thank you! I will give that a try later today!

@SupSupa10 Жыл бұрын

Nice. Is there any requirement about CPU and GPU ???

@sukhpalsukh3511 Жыл бұрын

8gb vram sufficient

@SandyGoneByeBye Жыл бұрын

You scare me at times. *winks and runs*

@NerdyRodent Жыл бұрын

Only at times? :( I will try harder...

@dimasky8671 Жыл бұрын

Step 3 When I click on the train model button there, i don't get any progress indication / update in my command prompt. I just get a "use gpus : python train_nsf_sim_cache_sid_load_pretrain. py -e model -sr 40k -f0 1 -bs 40 -te 200 -se 10 -pg pretrained_v2/f0G40k. pth -pd pretrained_v2/f0D40k.pth -l 1 -c 0 -sw 1 -v v2" About the webui output message, i get a message telling me that "training is over, you can check the console training log or train. log under the experiment folder" in japanese. When I go for the Inference step, and that i refresh my voice list and index path, I don't get any voice to choose. Please help me

@WatchNoah Жыл бұрын

Thank you for the video. One question tho: Do the audio files for training need to be in English?

@duck-headedllama9991 6 ай бұрын

Do you need a pre-trained model by language? I cloned my voice and tested it, and I sound like I'm Chinese, especially with the R/L sound. So I'm wondering if I need some extra files to match the target language (English and Spanish in my case).

@IfOnly60sand70s 8 ай бұрын

Finally got my RVC up and running. I have a very good recording set-up and painstakingly chop out any noise etc. I always use 10 mins as a WAV file with 300 - 500 epochs, and I've tried different recording levels on my own voice, with and without EQ, with and without compression, making sure no reverb is present etc. But the three voice models I've trained myself have all turned out noisy: a definite hiss which only appears when the "voice speaks". Some of the voice PTH models I've downloaded, on the other hand, turn out perfect. Any ideas what I'm doing wrong? Is this software only "reasonable" because it's free? (I've tried mucking about with all the settings too.) Cheers

@atomhero2830 Жыл бұрын

Wah damn nice, good explanation. I want to give you more than 1 like. Thanks for sharing!

@NerdyRodent Жыл бұрын

Glad you enjoyed the things!

@janslavik5284 Жыл бұрын

Awesome, works like a charm :D

@gabditullio 10 ай бұрын

Great video! I have a question: if I want to use the music and my voice, but change the lyrics of that song?

@NerdyRodent 10 ай бұрын

As it is voice to voice, the word output will be the same as the word input, only in the new voice. This means that in order to change the output words, simply change the words you input!

@bladecial9394 Жыл бұрын

I'm having problems getting the voice processed. After I click "process data" with a good path, the console reports success with "->Suc." but the process never ends, meaning the result never shows up within the program. If anything, it seems to freeze while it's processing the data and I don't know why. Is there any information about this happening? Is there any way to unfreeze it while the voice is being processed?

@musicxtn Жыл бұрын

Omg this is great! Thank you! I have an older Mac Pro. Would you recommend a newer PC or Mac as best optimal choice for this please? Thank you!

@NerdyRodent Жыл бұрын

For the best in AI, you can’t beat Ubuntu + Nvidia - that Linux PC will be far more compatible than Mac, allowing you to play with all the machine learning apps 😀

@basspig Жыл бұрын

Would be nice to see a step-by-step video on how to install this software and get it running.

@NerdyRodent Жыл бұрын

Sure thing! I can understand zip files aren’t everyone’s cup of tea. I’ll run through doing a normal “nerd” install too 😉

@mcworks6822 Жыл бұрын

I want the next video asap! Lol. This is crazy! Looking forward to the TT, an old song but quite popular, so we can hear how the pitch behaves with that particular song.

@SephConUno Жыл бұрын

This Whole thing is Exciting

@TheJsimes Жыл бұрын

Hi, can you tell me please: for this one, did you use clips of you singing, only talking, or a mix of both?

@NerdyRodent Жыл бұрын

A mix

@pr2lit458 Жыл бұрын

To make more than one model training you just do the same steps right? Don’t want to mess anything if I do another one, just wondering

@NerdyRodent Жыл бұрын

Yup! Just be sure to give it a new name 😉

@LydianMelody Жыл бұрын

Yo that track is metal as fuck

@NerdyRodent Жыл бұрын

For sure! :D

@senaprasena168 Жыл бұрын

ah ponyo song!... I give you one like even before I finish the tutorial...

@UnchartedWorlds Жыл бұрын

Hi Nerdy, me again. At 9:53 you mention we should only see one option. This is the first time I'm following this tutorial, and I see all the epochs in the format name_I_picked_eXX.pth, where XX ranges from 10-200. I also see name_I_Picked.pth. Which do I pick: name_e200.pth or name.pth? I just finished training the model for 200 epochs and separated the vocals from the instruments.

@southcoastinventors6583 Жыл бұрын

What video card are you using, and what would be the minimum amount of VRAM to be viable? I feel like I need to upgrade to use many of these new tools locally.

@canoksas2979 Жыл бұрын

Hello, awesome video man!!! I have one question though: is this also possible if I don't want to make the AI voice sing a song but just talk? Or do I need other voice cloning software for that? Thank you :))

@NerdyRodent Жыл бұрын

Yup, you can do all of the things described in the video!

@rolfeyM8 Жыл бұрын

With the audio wav files that I use to train the model, is it recommended to get samples of the voice talking, singing, or a combination of both? I would imagine this also depends on the application, whether it's being used for singing or talking itself

@andivax Жыл бұрын

Amazing tutorial! My Inferencing list is empty. How to add models to this list? I want to try already trained models from my SSD. Thank you!

@NerdyRodent Жыл бұрын

If you’re getting models from elsewhere, the weights go into the weights directory and the index goes in the logs, just like with training.
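
The placement described in this reply could be scripted roughly as follows. This is only a sketch under stated assumptions: the folder names `weights` and `logs` come from the reply, but the per-model subfolder under `logs`, the function name `install_model`, and the example paths are hypothetical, so check them against your own install before relying on it.

```python
import shutil
from pathlib import Path


def install_model(rvc_root, name, pth_file, index_file):
    """Copy a downloaded .pth and .index file into an RVC folder layout.

    Assumed layout (not confirmed by the source): <rvc_root>/weights/<name>.pth
    and <rvc_root>/logs/<name>/<index file name>.
    """
    root = Path(rvc_root)
    weights_dir = root / "weights"
    logs_dir = root / "logs" / name  # per-model subfolder is an assumption
    weights_dir.mkdir(parents=True, exist_ok=True)
    logs_dir.mkdir(parents=True, exist_ok=True)

    pth_dest = weights_dir / f"{name}.pth"
    index_dest = logs_dir / Path(index_file).name
    shutil.copy2(pth_file, pth_dest)      # the model weights
    shutil.copy2(index_file, index_dest)  # the feature index
    return pth_dest, index_dest


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    print(install_model("rvc", "singer", "downloads/singer.pth", "downloads/added.index"))
```

After copying, refreshing the voice list on the inference tab should pick up the new entry if the layout assumption holds.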

@everydaycereal2787 Жыл бұрын

NEED HELP: The Separation Vocals tab doesn't work. It says "clean empty cache". How can I get it to work?

@trueandobjective 7 ай бұрын

Can you kindly tell me how can we replace the vocal or the instrumental section of a song with a piano voice ? I really appreciate your reply🙏❤️

@dwsel Жыл бұрын

Beautiful singing in tune sir 😁👍

@NerdyRodent Жыл бұрын

Thanks 😆

@baraobuu Жыл бұрын

I have a RTX 3060 Dual 12GB OC, which batch file size should I choose? On default when I open, it's 6, should I increase more?

@NerdyRodent Жыл бұрын

Use all the vram!

@makimariami 6 күн бұрын

I've changed the lyrics of a famous song, and I want to have it in the famous singer's voice. What's the easiest way to do that? I can't sing it myself. Is there a free option to achieve what I need? I have the background music separated (instrumental version). Could you recommend a free AI?

@landscape1570 Жыл бұрын

Do you actually need to record voice samples in the original song language? 😁

@NerdyRodent Жыл бұрын

I can only speak English, but if you can sing in other languages then go right ahead!

@Elfie_Ev 5 күн бұрын

I tried to make AI Voiceover for Cyberpunk 2077 and I failed. Can't find Training model or any file for Inferencing voice. I don't know, maybe I'm dumb. Waste of time. I'm sad :(

@visivoakavalerie Жыл бұрын

Hi, before I install it on my PC: I have an AMD. Can the software run with it, or do I need an Nvidia?

@rolgnav Жыл бұрын

What's the best voice training material to use? I recorded some reading and some singing, but on some of the inference outputs, my voice just sounds like i lost my voice and couldn't sound out anything. I assumed putting some singing recordings would help teach it since how you sing is usually different from how you speak?

@NerdyRodent Жыл бұрын

I used both singing and talking myself.

@seansmall3401 Жыл бұрын

So is this something that needs to be run on some kind of paid virtual machine service or can I run this locally? I'm a bit confused on how to do this and keep getting caught up on steps where it wants to run a .exe on my mac which makes me think that this cannot be done locally on a mac?

@NerdyRodent Жыл бұрын

Yeah, you can run it locally, exactly like I do in the video, or use Google Colab (which is the best option on Mac)!