Training a LoRA Model of a Character | LoRA Training Guide | Stable Diffusion Kohya_ss A1111

  140,255 views

How to

1 day ago

Comments: 346
@JerryAyres687 • 1 year ago
I just had a bit of a breakthrough with my own LoRA training and thought I would share. I was getting OK-to-poor results up until this point. After watching this deep dive I changed my settings and got results that looked nothing like my subject, but the overall body was spot on. I was using a series of 2500 regularization images generated by my source model, so I probably did not have the ratio correct. After removing the regularization images for now, the outcome is quite good. I like using the runwayml/stable-diffusion-v1-5 model as my source because it seems to be very flexible with a diverse set of models to work from. So far I have tested this with 'Photon V1' and 'Reliberate V 10'. Wonderful results.
I will try to list my settings and technique as best I can; if I don't mention something, I didn't change it from the stock settings in Kohya_ss. My two datasets were 225 images and 79 images. The only difference is that the 225 set had 10 repeats and the 79 set had 20 repeats. I found I was not using enough repeats while trying to keep the process under 3500 steps; more repeats really got things moving in the right direction. I am training on an RTX 3080 and I have 32GB of system memory.
I ditched buckets since I don't mind prepping the images very closely. I carefully pull images that resemble what I want the outcome to look like, and I also try to select 'strange' angles if possible to give context. Once I have my images I use a program called 'Inbac' to crop to 512x512. I know buckets exist, but I am a control freak and want to make sure the parts of the image I want get looked at, and I also think it helps the run go smoother. Inbac is great because it was built for speed; you can set your aspect ratio and resolution and get what you want FAST. I can run through 225 images in minutes. Then I use the utilities to caption with WD14 and just let it rip; I don't try to exclude anything. I use BooruDatasetTagManager to edit the tags. I cut out everything I don't want to change and leave everything else in. I don't actually edit the tags on every image, just the overall tags. I especially cut out the eye color, hair color, etc., because it gets those wrong a lot and it just confuses things. Also some weird tags get in there, like 'horror (theme)', which I definitely want to get rid of.
After that it's training time! In Dreambooth LoRA I set my folders and make sure my 'class' tag is in the images directory name (man, woman, etc.). Again, I used 10 repeats for the 225-image set and 20 for the 79-image set. OK. Settings. Here we go; I will list the things I changed:
- Train batch size: 4 -- I am using a 3080 so I have enough VRAM. If you train at 1, I would reduce the learning rate or decrease the repeats. However, I believe more epochs is better, so I suggest keeping those high. You can always use an earlier epoch if need be.
- Mixed precision: bf16 -- again, using an RTX 3080; if you are using something 20xx or lower, use fp16.
- Epochs: 20 -- I don't use all 20 epochs, but I want to make sure I have enough. You really do need to see which one is the sweet spot at the end.
- Save every N epochs: 1 -- you want to have a choice at the end.
- Learning rate / Unet learning rate: 0.0001 (I use that for both).
- LR Scheduler: cosine with restarts.
- Network Rank: 64, Network Alpha: 32 -- try lower if you want; this worked for me.
- Turn buckets off if you have 512x512 images; Max Resolution: 512,512.
- Then in Advanced -- LR number of cycles: 12.
That's it. I don't have it generate an image at every epoch because it uses the base model and always looks crazy for me.
I will test the epochs on another computer as they complete if I am in a rush for some reason, but mostly I just wait for it to be done, then start trying. I usually start with epoch 10; if there are artifacts I work backward, and if it isn't cooked enough I work forward. Happy training!
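For reference, the step arithmetic behind settings like these can be sanity-checked with a small script. A minimal sketch, assuming Kohya-style counting (the exact rounding may differ, and the doubling only applies when a regularization set is present):

```python
import math

def lora_total_steps(num_images: int, repeats: int, epochs: int,
                     batch_size: int, uses_reg_images: bool = False) -> int:
    """Approximate Kohya-style step count: images x repeats x epochs / batch.
    Regularization roughly doubles the per-epoch steps, since one reg image
    is consumed per training image."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    if uses_reg_images:
        steps_per_epoch *= 2
    return steps_per_epoch * epochs

# The two datasets described above:
print(lora_total_steps(225, repeats=10, epochs=20, batch_size=4))  # 11260
print(lora_total_steps(79, repeats=20, epochs=20, batch_size=4))   # 7900
```

Saving every epoch, as suggested, then lets you pick whichever step count turned out to be the sweet spot.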
@AI-HowTo • 1 year ago
Thank you for sharing, great tips. For me, I found that regularization sets work better in most cases for realistic characters, but not in all cases; they make the LoRA more flexible under new settings such as new backgrounds/clothes. In general, though, trying to change hair color for instance can in most cases mess up the picture and push it away from the subject. I also found that using 64 for network dimension and an alpha of 32/16/8 produces better results; lowering alpha makes results finer and needs more epochs, but the results are better. Using 128 is fine for a large number of images and quick results, but it overfits faster. Great tips, thank you for sharing.
@zizaart1199 • 1 year ago
Thank you for mentioning BooruDatasetTagManager; I was wondering how I should google this kind of software, but then gave up.
@AkshaySood1 • 1 year ago
This comment deserves many more likes and praise. Not everyone shares such detailed information about what worked for them and what didn't. You, sir, have done noble work 👏
@K-A_Z_A-K_S_URALA • 11 months ago
A question about rank and alpha for a large number of SD 1.5 images: what values do people use? Would 128/32 be OK?
@AI-HowTo • 11 months ago
128x32 is generally good if you have a large number of images. For a lower number of images, such as 40 or fewer, using 64x32 or 64x16 could be better and provide smoother training; this is entirely experimental, and the data is the most important part. The higher the rank, the faster overfitting usually happens; higher alpha also overfits faster. You can also check my newest training video at kzbin.info/www/bejne/rHKVp2V_qsx-bNk, which might contain better concepts and more accurate info.
@gurilagardnr2688 • 1 year ago
The first tutorial on LoRA training that actually goes into detail on not just what settings to use, but WHY. Excellent and very useful. That's an easy L & S.
@AI-HowTo • 1 year ago
Hopefully it can be useful for some
@activemotionpictures • 1 year ago
This is the best LORA tutorial out there. Thank you for not dabbling with mumbo-jumbo explanations, but getting straight to the point. Great video!
@AI-HowTo • 1 year ago
Glad it was useful. You may also be interested in watching kzbin.info/www/bejne/rHKVp2V_qsx-bNk, which I believe is better and more concise; it shows SDXL and SD 1.5 training results for higher-resolution images and may be useful if you are interested in this area.
@Cadmeus • 1 year ago
Thank you for showing the differences between different values in detail. Very helpful! It seems to take a lot of experimentation to get good results.
@AI-HowTo • 1 year ago
You are welcome, and thank you. True: while it works well on the first attempt in some cases, in others it can take tens of trials with different values to get good results. SD in general is experimental on its own; hopefully the new SDXL version will be better and save us more time in the future.
@Spironic • 10 months ago
Really appreciate you putting your money where your mouth is by showcasing your impressive results at the beginning of the video, rather than having us just trust that you know what you're talking about like most YouTubers. There are so many inexperienced people offering bad or mediocre advice, and it's so hard to weed out the good from the bad without examples.
@AI-HowTo • 10 months ago
Thank you for the feedback, it is really useful.
@xdsp • 1 year ago
Out of the four step-by-step tutorial guides I've watched, including Olivo's, yours is by far the best and most comprehensive. Much like you, I used a subject with a 440+ image dataset, 128 network rank and 64 alpha, 10 repeats, and 20 epochs, and I noticed the best results were at epochs 4 and 5, with 6-10 overfitting and 11-20 outright breaking with lots of "skin fragmentation". Thanks for this; I was pulling my hair out in frustration. I now might give it another try with a 32 alpha and higher steps, just to eff around and find out. Experiment.
@xdsp • 1 year ago
Might I also add I created a rather massive Regularization folder using standard 1.5 pruned, 5000+ images of random women, high res, low res, different outfits and settings.
@AI-HowTo • 1 year ago
Great to hear. The process is full of experimentation, but when we keep trying, we eventually get something right; the process will be much simpler in SDXL. I also suggest you test with LyCORIS, which can produce good results too. Best of luck. I have tested hundreds of LoRAs, and the best results were obtained with larger numbers of images, such as 100+, compared to 40 or fewer; the model captures more details. Having a good dataset with different face/body poses is also the most important factor, even with a lower number of images.
@AI-HowTo • 1 year ago
Kohya uses image count x number of repeats and ignores the rest, so for 440 images and 10 repeats you would need 4400 reg images; having a large folder helps cover the majority of cases.
@abcde12a3 • 1 year ago
This is one of the most interesting videos I have seen in weeks. At the same time, I realised I will need to read more about what stable diffusion, Lora etc even actually is and then watch the video again.
@chappie__ • 1 year ago
This was the video that made certain aspects so much easier to follow and understand. Subbed 😉
@AI-HowTo • 1 year ago
Glad it was helpful!
@user-nr2vw4yq5c • 1 month ago
Thanks for sharing a detailed tutorial and helping me better understand how to make good images! Wish you all the best!
@AI-HowTo • 1 month ago
Thank you, you as well
@minimal3734 • 1 year ago
The modeling business is going to be hit hard.
@011egis • 1 month ago
No, because AI makes too many mistakes.
@SantoValentino • 1 year ago
Also, Roop changed the game. You can describe any person and add the face you want. Thanks for the tutorial!
@AI-HowTo • 1 year ago
Roop works well in most cases, but for perfect resemblance, a LoRA is still required. I think that in the future tools like Roop will replace LoRA completely, because a LoRA can take a long time to develop if we plan to make something really good.
@pastuh • 1 year ago
Changed, but try to match the hairstyle or body shape... it will be impossible :)
@irokzIA • 1 year ago
@@AI-HowTo That's why you can combine a LoRA for the same hairstyle and body ratio with Roop for the perfect face replacement.
@ddra9446 • 1 year ago
@@AI-HowTo They need to develop a merged LoRA + Roop extension that creates a rough face shape from sample images, but without LoRA's long training time; instead it would use a person's face model, like an embedding.
@LeKhang98 • 1 year ago
@@AI-HowTo I think Roop is good for Realistic images only. I tried it with 3D & Anime models and the results are not good. How do you solve that please?
@justinsmith2227 • 1 year ago
Wow, even models could be put out of business. A few pictures and the photographer can create whatever shot he wants.
@nikolaysorokin139 • 1 year ago
Thanks for sharing. After numerous training runs, I've also concluded that the dataset is key, and 20 images are not enough. Currently I use 100+ images and train with the fine-tuning method (I have also used Dreambooth, LoRA, and LyCORIS, but fine-tuning has given better results so far). Moreover, I use the same techniques: a chain of training (384, 512, 768, 1024). I am still searching for a better training setup, and I will definitely try LoRA one more time. But if I want realistic images with a face/body with all the small details, like small face scars, fine-tuning works well without breaking styles. You have to try LyCORIS; it is not supposed to be used for training people, but it works even better in terms of the person's likeness.
@AI-HowTo • 1 year ago
Thank you. I have tried LyCORIS as well, but fewer people use it; I may create a video comparing both later. In general, I also found that a higher-resolution training dataset produces more flexibility and better results; the problem is that it takes longer to train, so time is a big issue unfortunately. Lots of experimentation is required in this area, because there seems to be no exact science here; it's all about experimenting.
@EddADP • 1 year ago
What is the fine-tuning method?
@AI-HowTo • 1 year ago
In LoRA, it means just trying different parameters that may improve results, such as different learning rates or optimization algorithms like Lion, etc., or taking a suitable previously trained LoRA model and tuning its existing weights with new data, for instance. There is no one good recipe that fits all use cases, though; lots of experimentation is required.
@nikolaysorokin139 • 1 year ago
@@EddADP I meant fine-tuning for general purposes. There are multiple implementations, but I prefer EveryDream2Trainer; it gives more control over that, though I am pretty sure kohya_ss also has its own version of general fine-tuning. Dreambooth does not work with mid-size and large datasets; it overfits too fast. Also, after training it is possible to extract a LoRA/LyCORIS from a fine-tuned model, which works pretty well, but I would rather try to train as the author of this video said: more data, more steps. EveryDream2Trainer lets you get rid of the class and random token and train using just normal words like Nikolay Sorokin, and that will be the trigger word. It is also possible to put multiple people into one model. Unfortunately, it does not work properly with safetensors so far, and you have to convert models on your own. The ckpt it produces has a bug with DICTI, but it is simple to resolve (use Automatic1111 to merge the base model and the fine-tuned model with a multiplier of 1; that fixes DICTI), though you only need that if you want to extract a LoRA or train further with kohya_ss. Moreover, this type of training requires 24GB of VRAM and quite a lot of time: 2-3 hours with multiple configurations in sequence. But the result is a more accurate human appearance. It is sometimes so accurate that it is hard to tell the difference from a real photo. My models have published some photos on Instagram and nobody noticed they were generated. Of course, it depends on the audience; if they are familiar with AI, they may notice small patterns.
@Chad_Thundernuts • 1 year ago
Thanks. I've just started with LoRA training and I had doubts when people said you don't need more than 30 images for a good LoRA. Sure, you can get acceptable results, but I want impeccable results, and my intuition told me that a high-quality large dataset would of course produce a good model, though I had yet to try it for myself. Do you happen to know what the upper limit of a dataset can be?
@SantoValentino • 1 year ago
When I did my own I just did my face. Then I prompt a better looking body 😂
@AI-HowTo • 1 year ago
Yes; the images with the generated body can be fed back into a new LoRA model that is more flexible than the face alone, and can produce better full-body shots with simpler prompts.
@KeenyNewton • 1 year ago
Excellent detailed video. Will give this a try
@Bloodysugar • 5 months ago
When it comes to source image preparation, I get great results by opening the image in Photoshop, cropping it to have the whole character without much around it, automatically selecting the subject, copy/pasting it into a new layer placed above another layer filled with one solid color that has no equivalent in the model, correcting a bit with the stamp tool if needed, and rescaling to an optimal size (512x512, 1024x1024, etc., as described in Stability AI's documentation). By adding the nature of the background to the description (example: solid blue background), Kohya wastes no time generating useless noise in the LoRA, and the results are far more consistent.
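The same preparation can also be scripted for larger datasets. A minimal Pillow sketch of the idea, assuming the subject has already been cut out and saved as an RGBA PNG with a transparent background (file names and the background color are just examples):

```python
from PIL import Image

def flatten_to_solid(subject_path: str, out_path: str,
                     size=(512, 512), bg_color=(0, 0, 255)) -> None:
    """Paste a cut-out RGBA subject onto a solid-color canvas and resize."""
    subject = Image.open(subject_path).convert("RGBA")
    # Fit the subject inside the target square while keeping aspect ratio.
    subject.thumbnail(size, Image.LANCZOS)
    canvas = Image.new("RGBA", size, bg_color + (255,))
    # Center the subject; its alpha channel acts as the paste mask.
    x = (size[0] - subject.width) // 2
    y = (size[1] - subject.height) // 2
    canvas.paste(subject, (x, y), subject)
    canvas.convert("RGB").save(out_path)

flatten_to_solid("subject_cutout.png", "train/0001.png")
# The caption for this image would then include e.g. "solid blue background".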
@AI-HowTo • 5 months ago
Thank you for the tips; yes indeed, I agree. I found that to be more effective as well, especially with smaller datasets. I did something similar in some other videos here, in the object training in kzbin.info/www/bejne/rXu7ZZV4qamlb7s or kzbin.info/www/bejne/rHKVp2V_qsx-bNk; having simpler backgrounds in the training dataset seems more effective, but having complex backgrounds in the regularization set was helpful for making the LoRA rich. I made some models that worked really well with the help of refinement in Photoshop; for example, I used the background removal tool and neural filters to smooth the skin of real characters to make them look prettier, and it kept the features very well. It was fun to experiment with these things. It takes some time to fix the images properly, but once you have a good dataset, it becomes very likely you will obtain a good LoRA model.
@Bloodysugar • 5 months ago
@@AI-HowTo Thanks for your answer. :) Many people fear AI taking jobs, but the more I'm into it, the more I see that getting great results takes mastering not only AI software but also classical 2D apps like Photoshop, and even 3D apps. To me, the combo of Stable Diffusion + Photoshop + DAZ 3D + Unreal can offer insane possibilities that none could give on its own. The more tools, the more time it takes to learn and use them, but the reward is awesome.
@AI-HowTo • 5 months ago
True, one tool on its own is not sufficient. As for jobs, just like anything else, AI will create a whole new line of jobs over time, and people will use it to help with their current ideas or jobs; we will adapt and make the best use of it, and with proper regulation the future can be better. Wish you the best with your endeavour.
@Bloodysugar • 5 months ago
@@AI-HowTo Thanks, so do I !
@paveloconnor7759 • 1 year ago
Thank you, brother. It would be great if you explained how SD works inside, because I can follow your (or someone else's) video and train LoRAs, but I have no idea why it works and what is being done. Please make a video with a detailed explanation of what SD is and how it works, what a model is, how the two are connected, etc. You will get a lot of views, because I can't find a video like that.
@AI-HowTo • 1 year ago
Thank you for the tip. The problem is that SD is really complex; I will try to include a simplified explanation of it in some other video that doesn't go through the details. Hopefully people will gain a better understanding of the general concept without going into the specifics.
@imDanoush • 1 year ago
Very beautiful dataset I must say! :)
@WhySoBroke • 1 year ago
Excellent video!! Many thanks amigo!!
@AI-HowTo • 1 year ago
Thank you too
@banninghamma • 2 months ago
I like the mention of "real person" when "Olivia Casta" is not a real person (it is a face-swapped character based on a real, older model) LOL!
@AI-HowTo • 2 months ago
:) Yeah, I got that note from so many people. To be honest, at the time I didn't know she was an artificial character herself; I just wanted a sample dataset with many images that looked real. Anyway, everything still applies.
@hairy7653 • 1 year ago
This really helped my training, great vid. Hoping you might do a Dreambooth model vid next, and maybe a textual inversion one. Great stuff, thanks.
@AI-HowTo • 1 year ago
great to hear, thank you
@wendten2 • 1 year ago
0:40 No, Stable Diffusion is not a large language model, it's a diffusion model. ChatGPT is an LLM, as an LLM has text output.
@AI-HowTo • 1 year ago
You are correct, thank you for correcting me. In general, I was thinking of anything that learns from large amounts of data as an LLM, but as you said, it is a diffusion model. Thank you.
@xaiyeon_xiuzhen • 1 year ago
Ty sir, the video is quite long; not sure if it could be made more like 20 min, but watching it in real time was great :D
@AI-HowTo • 1 year ago
Thank you, true. I wanted to show more details, but it is indeed very long; later on I will do a shorter one that includes a LoRA and LyCORIS comparison, and I will make it shorter for sure.
@TheDragoy • 1 year ago
It would be interesting to get a training video from you, but with training on a particular style (not people)
@cedrickrause5469 • 5 months ago
Awesome tutorial thank you :)
@AI-HowTo • 5 months ago
You're very welcome!
@santasongs2298 • 11 months ago
Fantastic video, helped me improve my results a fair deal. My hat is off to you sir, keep up the fantastic work! (also, consider yourself subscribed)
@AI-HowTo • 11 months ago
your words are much appreciated, thank you.
@banksrobber • 1 year ago
thank you sir, very informative
@AI-HowTo • 1 year ago
You are welcome
@goldenknowledge5914 • 1 year ago
A how-to account that is actually useful 😂
@malcolmreynolds4099 • 1 year ago
This is a good tutorial, left you a like :)
@AI-HowTo • 1 year ago
Thank you :)
@ainosekai • 1 year ago
You shouldn't use Chillout; it's a really outdated model. Also, most realistic models are good at clip skip 1. Isn't 33,000 steps too high? As you can see, it produced weird faces at the last epochs. In my experiments I use about 40 images and only about 2,500 steps, and it can produce really good results. Nice tutorial btw.
@AI-HowTo • 1 year ago
Thank you for your input. True, for real people clip skip 1 could produce even better results; I applied 2 in the tests for this video but mentioned in the audio only that 1 could be better :) I think that when applying regularization, training takes many more steps to get good results, based on my experiments, while without regularization it can sometimes produce good results in a few thousand steps but seems to generalize less. The more images we have, the more training is required to learn them well, though 40 good images could also be better than 100. The biggest problem I saw in training is producing full-body shots: smaller numbers of images allowed good portraits, but not good full-body shots, while with this large number of images I was able to get full-body shots with good (or good enough) faces at hi-res in almost every generated image without using inpaint, which is the biggest problem of LoRAs for real people. LyCORIS also produced good results at different settings, though. Hopefully in the future this will become easier and require less experimentation.
@hindihits9260 • 1 year ago
I think a good dataset also helps a lot in creating a great LoRA model. The dataset you chose looks fairly complete and covers all the angles and perspectives of the person. I tried to replicate this in my latest LoRA model, but because I didn't have a lot of good images, the training went only okay. Although the body proportions came out pretty good, the face just lacks the required details. I used the default learning rates, and I think they were too low for my dataset. Do you think it would help to use DAdaptation instead of a constant learning rate? Also, I really liked this video; please make more LoRA experiment videos if you can. It really helps to get better at training. For the first time I actually trained a person successfully using LoRA, which I previously lacked the ability to do.
@AI-HowTo • 1 year ago
If you have a good GPU, I think other optimizers like DAdaptation and Lion could be better; they are not suitable for my laptop. In addition, LyCORIS/LoCon may produce slightly better results than standard LoRA (I will make a video soon comparing both), and sometimes you may get better results without regularization. It's really mostly about experimentation; there is no fixed rule of thumb for what works in all cases and what doesn't. A good dataset, however, generally produces good results.
@hindihits9260 • 1 year ago
@@AI-HowTo You can try Colab if you can't do it locally. I think regularization images help when we train a person; concepts don't need regularization images, since they are not in Stable Diffusion, which is why we train them.
@AI-HowTo • 1 year ago
will do, thanks
@fernandodiaz8231 • 1 year ago
Your video was a very clear explanation with a detailed step-by-step procedure. Could you confirm that your method also works with Stable Diffusion in Colab?
@AI-HowTo • 1 year ago
The same principles apply regardless of where you are running Kohya, despite minor changes between UI versions; it can also be used in Colab, with some differences in the GUI.
@matthewkeating-od6rl • 1 year ago
The girl with the Instagram must love this video.
@logikaone2711 • 5 months ago
Excellent video, thanks for sharing. Something unrelated caught my attention though: why the sneaky voice when talking about "non-commercial use"?
@funnerisaword • 1 year ago
You are a saint. I was lost without this video, but now thanks to you I have found the kingdom of heaven. 🙏🏼
@AI-HowTo • 1 year ago
Glad I could help!
@K-A_Z_A-K_S_URALA • 1 year ago
Respect and a subscription from Russia, brother, from the city of Chelyabinsk))
@AI-HowTo • 1 year ago
Thank you, and you are welcome.
@ZadokOvadia • 1 year ago
I have tried for the past 2 days to achieve the same and it is not working. I suspect the only reason this worked for you is that you chose a trigger word that is a combination of tokens the model already knows and was trained on (when you write oliviacasta-v1, the model breaks it down into 3 tokens: olivia, casta, v1). My tests with your settings (300-400 images, around 40K steps) have not yet produced a decent LoRA. I am still trying 🙂
@AI-HowTo • 1 year ago
I think the data is the key, and lots of experimentation is required. In any case, when the generated sample images start to look broken, you should stop training to save time. I also suggest trying without regularization, and without captioning too, and seeing which results are more suitable for you. Best of luck.
@GomezBro • 1 year ago
Can you provide that Lora please? Is it on Civitai or Mega? I would love to try this out!
@samkellman • 1 year ago
That girl already looks like AI to begin with
@matuskokoska312 • 1 year ago
Very helpful, especially the regularization bits, cheers man. Still experimenting with training better LoRAs. The biggest challenge I have is that the character I am using for training has a very specific hairstyle: very distinct side locks and long twin braids in the back. My dataset consists of one half screenshots of the character's 3D model in various poses and the other half pictures already generated by previous versions of my LoRA model, mostly anime style. Any other things I should look out for?
@AI-HowTo • 1 year ago
Cool. It's better then not to include anything related to the hairstyle or features you want fixed in the LoRA, so captions must be pruned. You can also try training the model without captions at all; this often works for single subjects, and you might get better results if you want more distinct features. If your GPU is good, also use higher-res training images such as 768x768 or higher; this produces better results.
@matuskokoska312 • 1 year ago
@@AI-HowTo thanks a ton, I will try it out ❤️
@gu9838 • 1 year ago
God, the thumbnail images are gorgeous. I love SD and creating AI-generated images, but I sure hope one day I can find a gorgeous woman to take real-life photos of. But hey, this is a good start until that day lol
@MrCcfly • 1 year ago
lora like LOOOOOL
@JerryAyres687 • 1 year ago
Very interesting. I started my LoRA journey and had beginner's luck with a large dataset and careful tagging, but a very low size. Then I watched other tutorials that said you should never exceed 3500 steps, and I tried to keep to that, with worse results at a 32/16 size. I am still a little confused about the size of the LoRA and how it affects the outcome. My initial ones were 9 MB (8/1) and outperformed ones with a much larger size (32/16) but fewer steps. From your lecture I learned that you can run many more steps if you use more epochs and a larger size. If I may ask a question: what settings would you use for a 'style' LoRA vs a 'person' LoRA? For style I would have a dataset of around 300-600 carefully cropped and tagged images.
@AI-HowTo • 1 year ago
Unfortunately, I have not done style tests, so I can't give you an answer for that. Anything related to SD seems mostly experimental; even the developers of Kohya_ss don't seem to know which settings work best, because no single set of settings works for everything. I have read that some people used 1000 images for style, so the more images the better for style too, and that they used LyCORIS for styles; it can also be used for characters and may produce slightly better results.
@tengdahui • 1 year ago
This is the most detailed explanation I have seen, but I have a question for you. I have a set of clothes pictures taken on mannequins, and I took about 30 pictures to train a LoRA, but the results either included the mannequin or the clothes were different from the original pictures. I mainly want the same clothes; I hope you can answer. What should I pay attention to at each step, such as parameter settings, labels, and so on? I hope to get your answer!
@AI-HowTo • 1 year ago
Hopefully kzbin.info/www/bejne/rXu7ZZV4qamlb7s covers all your questions; cropping the head of the mannequin out is necessary in some of the pictures. Please check the details of the video, they should help you out.
@Gimpad • 1 year ago
I had a lot of trouble with Kohya crashing on me, but thanks to you describing the settings I was able to narrow it down and fix it. I wasted so much time checking that everything was correctly installed... A request: would it be possible to provide a finished dataset for one of these guides? Starting with the same data and following similar procedures would help rule out potential errors in software/installation/configuration.
@AI-HowTo • 1 year ago
I can only share the reg set (see description) but cannot share copyrighted images; I only use them to make videos. You can download the images as I explained, because they are public on Instagram. But any set is OK as long as it has variety. One problem with this set, by the way, which I only noticed after completing the guide, is that it doesn't have enough face variation; the face must be seen from different angles, which makes the LoRA's results more efficient. So I suggest you pick a new dataset and experiment with it: with regularization, without, dimension 64 with alpha 32, 16, 8, etc.
@aiartbx • 1 year ago
Thanks for the in-depth tutorial, really helpful! Two questions. 1. If you train only on face shots of someone with no body shots, does that mean that when using the LoRA the body can be very flexible, obeying whatever body length and shape we put in the prompt and only retaining the face? Or will it just not be able to do any full-body shots? 2. This is great for replicating a real-life person, but what if you want to make your own AI model LoRA, like the KoreanDollLikess LoRA? Is that created by just dumping in a bunch of random people, with the LoRA creating an average of the whole dataset?
@AI-HowTo • 1 year ago
1. No, unfortunately. Using face shots only will not make the LoRA generate good body shots; it will rely on other body shots in SD outside the LoRA and diffuse your face with the face and body from another image, which is why similarity decreases (unless by accident, e.g. it took the lower cropped part of a body and connected the face to it). Diffusion results in decreased similarity, so inpainting becomes necessary, but even with inpainting the resemblance will not be perfect. Hopefully this improves in the next SD releases. So unless it saw body shots in the dataset, it will be difficult to generate suitable body shots with the same face. 2. Most Korean models are based on 3D models, which allows perfect training on hundreds or thousands of images, where you can take the shot from every angle. If we use a bunch of random people, it will try to generate some average of them, but it won't be stable, and each generation will lean toward one of the images in the dataset. SD is just diffusing images, really; it is not yet smart enough.
@aiartbx • 1 year ago
@@AI-HowTo Thank you for the reply, definitely helpful! One more question: in your video's sample prompt, what is "--d 1 --l 7 --s 30" at the very end, after the dimensions?
@AI-HowTo • 1 year ago
--d: specifies the seed of the generated image.
--l: specifies the CFG scale of the generated image.
--s: specifies the number of steps in the generation.
--n: negative prompt, up to the next option.
--w: specifies the width of the generated image.
--h: specifies the height of the generated image.
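Put together, a sample-prompt line using these flags might look like the sketch below (the prompt text, values, and the sample_prompts.txt file name are illustrative, not taken from the video):

```python
# One line per sample image in the prompts file Kohya reads during training.
# Everything before the first "--" flag is the positive prompt.
sample_prompt = (
    "photo of a woman, upper body, looking at viewer "
    "--n low quality, blurry "   # negative prompt
    "--w 512 --h 768 "           # width / height
    "--d 1 --l 7 --s 30"         # seed / CFG scale / steps
)
with open("sample_prompts.txt", "w", encoding="utf-8") as f:
    f.write(sample_prompt + "\n")
```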
@DeezoAce • 1 year ago
@@aiartbx If you want to do that, I think a deepfake/faceswap tool like Roop is the solution.
@aiartbx • 1 year ago
@@DeezoAce thanks
@kloklojul • 1 year ago
Saved to "watch later".
@AAX1975 • 1 year ago
I keep getting nothing but black images. I have an RTX 3080 Ti with 12GB of VRAM and I have followed your guide to the letter several times to create my likeness. It just gives me black images at every epoch. It's a well-produced video though... just wish I could figure out what is causing the black images. I also get an error saying "a tensor with all NaNs was produced in Unet". I'm working with 300 images of myself.
@AI-HowTo • 1 year ago
Not sure; this has never happened to me. It seems to be a problem related to the GPU driver installation on your system, or a problem with the PyTorch and xformers installation. Normally this graphics card is perfect and works with bf16 as well. I would suggest trying to run with fp16 precision, for instance, and disabling xformers, but you should look more into the driver installation to run this GPU at full potential. You can also test with only 1 image, make sure its dimensions are under 1024, and start playing with the settings to figure out the source of the problem, and google anything driver-related.
@AAX1975 • 1 year ago
@@AI-HowTo thank you for the advice. I will give it a shot.
@oImighty • 11 months ago
Did it help? @@AAX1975
@zikkk7578 • 7 months ago
If I train a model on photos of apartment interiors, do I have to use 1:1 (square) photos? Most of the photos have different resolutions: 5000x3050, 5000x2700, etc.
@AI-HowTo • 7 months ago
Kohya can train on images of different sizes; 1:1 is not needed. In this video, kzbin.info/www/bejne/rHKVp2V_qsx-bNk, I trained on 1024x1024 and 640x1024 images. When we enable buckets, we can set the bucket size and train on any image size, as long as our graphics card allows it.
@MrPlasmo • 1 year ago
Thanks for the great tutorial. In general, what would be the “minimum” graphics card requirements to run this successfully?
@AI-HowTo • 1 year ago
6GB can be used; the more the better, and an NVIDIA RTX card is best for this app. However, you can use Google Colab, which is free as well, but your images have to go onto their cloud, and processing is done there with limited weekly processing time. It is the best option if you have a weak graphics card.
@pdub420 • 1 year ago
It's ironic that you chose an AI model to train an AI model to make an AI model.
@malignentsin • 1 year ago
I came to make this comment 😂
@NowOrNeverAI • 1 year ago
what do you mean?
@pdub420 • 1 year ago
@@NowOrNeverAI Olivia Casta is a fake person. Or at least the face is just AI; the facial expression and lighting are always the same.
@Kurzrein • 1 year ago
@@NowOrNeverAI It's rumored that Olivia Casta doesn't exist, that her pictures are AI generated or at least AI enhanced.
@fludblud • 1 year ago
@@NowOrNeverAI Olivia Casta is not a real person. She is in fact a 40+ year old Russian model named Maria Tretjakova who basically gave her career a second life by using a teen face filter, creating the identity of a 20-year-old for a new generation of men to thirst over. Olivia Casta is quite literally the world's first commercially successful deepfake model, and she has a whole team of guys who reply to her fan messages, though at this point I wouldn't be surprised if they are now bots too.
@kimweeng5358 • 1 year ago
Hey man, thanks for the great and in-depth video. I've been trying out LoRA training for myself, but even following your settings and instructions the LoRA comes out very poor, even with 360 images, especially the human proportions, facial expressions, and eyes. Any advice on improving the LoRA training?
@AI-HowTo • 1 year ago
The process is experimental in general. You could test reducing the network dimension down to 64, for instance, and alpha down to 16 or even 8; remove repetitive or low-resolution images from the training dataset; also test without regularization (for some datasets it seems to work better without it) and with different prompting; or test LyCORIS as in kzbin.info/www/bejne/mobPnJxnfqZ1Ztk, which I found to work better than standard LoRA. If you followed the general guidelines and had a good dataset that captures the face from different angles plus the body, the results should be good. Honestly, the process is chaotic to some extent. Sometimes I use the same settings and test the results in different checkpoints, and some checkpoints work better than others; for example, I tested on MajicMix v4 and Photon and they worked better than Realistic Vision and Chilloutmix for the same dataset and same settings, despite MajicMix being based on Asian models while my character was not Asian. So to achieve a good LoRA, sometimes it works on the first shot; other times it takes more than 20 different models and tests to get it right, unfortunately. Hopefully that will get better in SDXL 1.0.
@jason54953 • 1 year ago
Although I could get Dreambooth through Stable Diffusion operational, I could never get past "TensorRT not found" when trying to use Kohya_ss on its own.
@LeKhang98 • 1 year ago
Very useful video, thank you very much. Which model do you think is best for training a realistic face? I tried some models, and I think a model can be good for generating images while not being suitable for training.
@AI-HowTo • 1 year ago
Thank you. Realistic Vision could be more suitable for more realistic portraits; Chilloutmix seems to give more aesthetic features, which I personally prefer, and better full-body shots. In general, all of these models lack a certain flexibility, for instance limited body shots/poses, etc. Hopefully in the upcoming months SDXL will be a better option than all of them.
@Chad_Thundernuts • 1 year ago
I can't remember what I'm using off the top of my head but I'll get back to you with the model I'm using. It seems pretty good for photo-realistic images, especially when paired with a quality LoRA
@santos.2020 • 27 days ago
can you PLEASE, PLEASE update this tutorial?
@AI-HowTo • 25 days ago
kzbin.info/www/bejne/rHKVp2V_qsx-bNk is a more recent tutorial; the info in it should be more accurate than in the older one.
@RubenSterlingS • 1 year ago
Thank you so much for sharing your knowledge! I have a question, if you don't mind. I want to create a LoRA focused 100% on her face, because I want to be able to change the body proportions at any time. Should I create a dataset of only her face, or is it a good idea to have a few cowboy shots or full-body shots in the images as well?
@AI-HowTo • 1 year ago
Hopefully this was helpful for some. Unfortunately, faces alone won't work: LoRA doesn't understand faces, and Stable Diffusion will try to mix the face you trained with another person's body and face from its model, which means the produced image will be slightly different from the original subject; diffusion means diffusing features from different images. With hires fix the generated image might in some cases come closer to the target, and inpainting can also help, but not enough to get proper resemblance. It will also become very difficult to generate body shots in general: if the LoRA has no body shots, it will fail to produce good ones. This is disappointing, but SD and LoRA are simply not that smart; without good body shots in the data, it is almost impossible to produce good body shots with clear face details without using inpaint.
@AI-HowTo • 1 year ago
A solution that may help you can be seen in this video: kzbin.info/www/bejne/jaWqqaCkYpWkidk. Using the After Detailer extension, you can generate new body shots with a lower LoRA weight, such as 0.7; After Detailer will then automatically inpaint your pictures and help you get what you want without using inpaint each time.
@borrowedtruths6955 • 1 year ago
It looks like you may have an obsession you should think about dealing with.
@theironneon • 1 day ago
In 2024 Kohya looks different and doesn't work.
@monkeysit7826 • 3 months ago
I have questions about photo selection. For example, I want to create a real character with all the different face angles, including face close-ups and face-with-upper-body shots, because it seems that if the training images include too many face close-ups and only a few upper-body shots, the generated upper-body images will have failed faces, whereas close-up generations will be fine. So is the ratio or proportion of the different kinds of images important, to prevent overfitting of one type and increase photo diversity? In general, how many photos of each type, and how many training steps per type, would give good flexibility as well as good quality? To make it more understandable: let's say I just need a good close-up 45-degree face and a 90-degree side face for my whole project; how many photos and training steps should I use, in general?
@AI-HowTo • 3 months ago
There is no rule: everything is purely experimental; even the creators of Stable Diffusion do not know. I think close-up and upper-body shots should be equal in number, to keep the training balanced; we would usually have only a few full-body shots, around 10%, with most shots being portraits/close-ups/upper body. You might be interested in watching this newer video, kzbin.info/www/bejne/rHKVp2V_qsx-bNk, which uses a smaller number of images. As far as I have seen, full-body shots are difficult to reproduce with high quality, which is why we often use After Detailer to repaint the face (with a prompt that describes the face, with the LoRA inside). Some people train close-ups in one LoRA and upper body in another to get better results. As for full-body shots, since they will never work as perfectly as one wants, they are best kept at a small ratio, such as under 10% (for realistic models, where details matter). Experimenting is the key eventually: some models work on the first experiment, others might take tens before you get something good; even using different regularization images can affect the output greatly.
@monkeysit7826 • 3 months ago
@@AI-HowTo Thank you. It's helpful.
@kovanjola • 1 year ago
Thanks for the lesson, but I hope you do a lesson on making a new model, because our costumes are not in there and we can't make a picture of our model.
@AI-HowTo • 1 year ago
It is possible to generate similar pictures of virtual people using SD, then fix the seed to try to obtain the same person in different poses, or use ControlNet to create new poses (there are some videos on YouTube about this; I have not created any yet, but will try later), then train on the generated data to get a new LoRA for a custom model. The clothes can be changed using SD, but reproducing the same face with high accuracy is really difficult without hundreds of high-quality images, so it may take days or weeks of trial and error to get something proper in such a case.
@kovanjola • 1 year ago
@@AI-HowTo Thank you for the reply. The problem is the clothes; our clothes do not resemble anyone else's.
@AI-HowTo • 1 year ago
In that case, you should be training the clothes themselves, not the person. For example, if a LoRA sees different people wearing the same clothes, it will learn the clothes, not the people: what repeats in the training data is what is learned. That requires having no repeating faces, or training the clothes with the head cropped out of the pictures; what repeats in the images is what the LoRA will memorize. The process takes a lot of work and may not always give exactly what you want. However, if the images contain the same repeating clothes, you will be able to learn them and dress them on various models in SD; many trials might be required along the way.
@AI_FOR_VFX • 1 year ago
Your post is very helpful, I appreciate it. Can you make a video about training Dreambooth in Kohya? Is training it the same as the LoRA training you do? If I have several different models, should I train Dreambooth or LoRA?
@AI-HowTo • 1 year ago
I may do so later. Dreambooth training follows almost the same principles in terms of dataset/regularization, but Dreambooth results are better because you can leverage the entire SD model; it requires more GPU and more disk space. Dreambooth is often better if you want to tune a single person privately and produce quality results, but ending up with a 2GB+ model, along with the GPU requirements, makes it less practical, while LoRA and LyCORIS can train any number of concepts easily and are easy to move around, since they have smaller file sizes and do not affect the original SD model.
@AI_FOR_VFX • 1 year ago
@@AI-HowTo Thanks for answering. I trained a character with LoRA as you do, and I want that character to have a style of its own, a style I create myself. That is, should I train that style in Dreambooth to create a checkpoint file, then use that checkpoint to train the LoRA? Is it OK to do so?
@AI-HowTo • 1 year ago
In general, a LoRA is affected by the checkpoint and its related styles, so what you are suggesting sounds logical. But a LoRA will only be as good on the checkpoint it was trained on; it becomes less effective on other checkpoints when dealing with realistic images. Anime, however, is more adaptive.
@athecal6878 • 1 year ago
When you compare the models in the set with 128 network rank, and one looks very similar to the subject but is not her: I have noticed when trying to train a LoRA of myself that those off models tend to look exactly like my little sister 😂
@AI-HowTo • 1 year ago
Possible. Simple "1girl" prompts seem in general to produce a younger version of the character if no extra prompts specifying more details of the original subject are present.
@marks7502 • 1 year ago
Thank you. What happens if you use low resolution images (which is all I have)?
@AI-HowTo • 1 year ago
You are welcome. You should try using other software to improve their resolution, such as Topaz Gigapixel AI or Photoshop's neural filters, to increase image quality/size and remove artifacts; otherwise, the LoRA will learn the defects and artifacts in the images as well. You can still try with lower-res images, and it may produce acceptable results, but it is unlikely to give you quality results.
@marks7502 • 1 year ago
@@AI-HowTo Thank you for the reply. I have looked at resolution software. I was wondering if it would be better to improve resolution before making a LoRa, or make a fuzzy LoRa and sharpen the results. In either case, the resolution software does not seem to work well on faces.
@AI-HowTo • 1 year ago
Sorry for the late reply, but using smoothing software for faces/images, such as Topaz Photo AI or Photoshop neural filters, will improve the LoRA's results; doing that before training is better than after, to prevent the LoRA from learning bad noise/patterns.
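If dedicated tools aren't available, even a plain resample before training beats feeding tiny images in directly. A minimal Pillow sketch under that assumption (file paths are examples; dedicated upscalers like Topaz or ESRGAN-family models recover far more facial detail than this):

```python
from PIL import Image

def upscale_to_min_side(path: str, out_path: str, min_side: int = 512) -> None:
    """Lanczos-upscale an image so its shorter side reaches min_side."""
    img = Image.open(path).convert("RGB")
    scale = min_side / min(img.size)
    if scale > 1:  # never downscale here
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    img.save(out_path, quality=95)

upscale_to_min_side("lowres/face_001.jpg", "train/face_001.jpg")
```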
@marks7502 • 11 months ago
@@AI-HowTo Thanks !
@AgileIntentions • 2 months ago
Hello. May I ask about your hardware? I have a 4070 Ti and my training speed is around 6-8 seconds per iteration. I see your speed is around 3 iterations per second! Very interesting and curious.
@AI-HowTo • 2 months ago
The Ti is more powerful than mine (a laptop 3070), so you should get better speed than me. If you are training on the same image sizes as me at that speed, then the xformers option might not be turned on, or possibly the drivers require an update.
@TheMarouuu • 1 year ago
I am almost certain, almost, that I read somewhere that this girl is an AI model to begin with :D At least her face. You might be training AI with AI.
@AI-HowTo • 1 year ago
So I was told; the concepts remain the same.
@TheMarouuu • 1 year ago
@@AI-HowTo The wild part is, I think she became popular before all of this AI stuff blew up. Way ahead of everyone :)
@AI-HowTo • 1 year ago
True. Good for her and the creators; her pictures are high quality and pretty.
@TheMarouuu • 1 year ago
@@AI-HowTo Great video btw, haven't had the time to follow-along, just watched it for now, but pretty interesting stuff. I'll be sure to give it a try.
@MrTortao • 1 year ago
Thank you so much! Could you advise how to resume training from the last saved state?
@AI-HowTo • 1 year ago
On the LoRA Training Parameters tab, the second option is "LoRA network weights" ("Path to an existing LoRA network weights to resume training from"); click the notebook icon next to it and select the last epoch's .safetensors file. It will load the weights from your recent LoRA and resume from the last trained epoch. This is better than saving the entire state, which can take gigabytes per state.
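For anyone running the underlying sd-scripts from the command line instead of the GUI, the equivalent is the network-weights argument. A sketch under the assumption of a standard kohya_ss setup (paths and the other arguments are placeholders for your own run):

```python
import subprocess

# Resuming continues training from an existing LoRA's weights rather than
# from scratch; pass the last saved epoch file (paths are hypothetical).
subprocess.run([
    "accelerate", "launch", "train_network.py",
    "--network_weights", "output/my_lora-000010.safetensors",
    # ...plus the same dataset/config arguments used for the original run
])
```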
@MrTortao • 1 year ago
@@AI-HowTo Thanks, will try it
@gaborkiss1425 • 8 months ago
After having a well-trained LoRA, can you put it into different poses using ControlNet or similar? For comics, of course.
@AI-HowTo • 8 months ago
Yes. For consistency, training a LoRA for a character (or using an existing one) is the best option; ControlNet is really useful for controlling exactly which pose you need, and you can repaint the face with After Detailer in full-body shots to improve face quality.
@gaborkiss1425 • 8 months ago
@@AI-HowTo Many thanks!
@mayorc • 1 year ago
What does "64 rank model" mean?
@AI-HowTo • 1 year ago
It is the "Network Rank (Dimension)" setting; it is like the size of the neural network being trained.
@mayorc • 1 year ago
@@AI-HowTo And what does it mean in this specific case? I mean for practical use.
@juridjatsenko2013 • 10 months ago
I get "CalledProcessError: Command '['C:\\Users\\grome\\stable-diffusion-webui\\cd kohya_ss\\venv\\Scripts\\python.exe'," although the script activates well from the folder. I can't locate the command line within Kohya to change the command path. What am I doing wrong?
@user-uu6tm6wz4c • 7 months ago
Sorry, I'm a newbie and have a question! I'm new to this world of AI. What's the difference between Stable Diffusion and Fooocus? Which is better, and don't they do the same thing? In which scenario should I use which one?
@AI-HowTo • 7 months ago
Fooocus is built on top of Stable Diffusion models, so Fooocus is like A1111 but easier. A1111 has more community support and tools. ComfyUI is workflow-based and gives the most flexibility and best performance, but it is not as easy to use as the previous two. I have not used Fooocus extensively enough to say whether it is better than A1111, however.
@relaxation_ambience • 1 year ago
Hi! How do you stop/interrupt the training process? I didn't find any kind of STOP button. Usually I just close the web UI and the cmd window, which doesn't really look right...
@AI-HowTo • 1 year ago
I don't think there is one. We can pause the training by clicking in the command line, or press Ctrl+C to stop it completely. It doesn't sound right, but I don't think there is a command for it; if there is, it would be great to know. We can also resume training after stopping and closing, using the most recent safetensors file generated, which is also a positive.
@doords • 1 year ago
Just make sure your source pictures are all from the same age, with no weird makeup where the person looks like someone else in another photo.
@AI-HowTo • 1 year ago
Thanks for the input, will do. I won't use this dataset in the future; its content is rather controversial and has more inconsistencies than I initially realized.
@lukasdaus1729 • 1 month ago
Olivia Casta
@cleverestx • 1 year ago
None of my data has the back in the images... so do I leave "nose", "lips" in the UNDESIRED TAGS at 9:54? You added them first and then removed them... I don't want the face to change; I want it to be as accurate as possible to the actual face...?
@AI-HowTo • 1 year ago
I actually remove 1girl, nose, lips, eye color, realistic, and all the features I want to keep in my LoRA. In this video I didn't want to spend time refining and fine-tuning the captions, so I just made the note without applying it, to avoid retraining; I never use this model in real life, so I didn't tune it, I only prepared it for this video. So yes, I suggest removing those features from your captions if they appear in all your images; that is better, and it worked well for my other characters, which produced better results.
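This kind of pruning can also be scripted across the whole caption folder. A minimal sketch, assuming WD14-style .txt captions with comma-separated tags sitting next to the images (the folder name and the tag list to drop are just examples; adjust UNDESIRED to your own subject):

```python
from pathlib import Path

# Tags describing features the LoRA itself should own, so they are
# removed from every caption file rather than left for SD to vary.
UNDESIRED = {"1girl", "solo", "nose", "lips", "brown hair", "blue eyes"}

for txt in Path("train_images").glob("*.txt"):
    tags = [t.strip() for t in txt.read_text(encoding="utf-8").split(",")]
    kept = [t for t in tags if t and t not in UNDESIRED]
    txt.write_text(", ".join(kept), encoding="utf-8")
```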
@narulovechannel2249 • 1 year ago
Came for the LoRA, stayed for the booba. Edit: it's meant to be a joke, I'm interested in both, so don't worry. It's what LoRA is for: creating accurate boobas lol
@AI-HowTo • 1 year ago
:) I should have chosen a different subject. I edited this video and trimmed it down a lot to reduce such displays, but where can one escape! :) I definitely shouldn't have used this subject, but I had trained it a couple of months before this video as my first LoRA and saw it working to an acceptable level; its data was just saying "use me", I thought! So, let's make a video about it, and here we are. I don't want to delete the video either, because many people have found it useful, even though the newer videos are more accurate and more practical than this one.
@narulovechannel2249 • 1 year ago
@@AI-HowTo Whatever you're doing, keep it up, because it's working. Good job!
@AI-HowTo • 1 year ago
Thanks.
@user-gq2bq3zf1f • 7 months ago
I've been following the tutorial video, but it seems like bf16 doesn't work with the web UI when executed through Colab. For local execution, I guess better hardware is required, right? I have an Intel GPU.
@AI-HowTo • 7 months ago
True. An RTX-series card, or another powerful GPU, is essential to do the training locally; bf16 is basically for RTX GPUs, and fp16 can be used if bf16 isn't working.
@Derek_H_360 • 1 year ago
Thank you!!!! How many reg images per training image?
@AI-HowTo • 1 year ago
Around 1024; I have seen some use more or less. I think the number, quality, and content of the reg images can slightly affect the output; a lot of testing is required in this area alone.
@AI-HowTo • 1 year ago
According to the console info, Kohya_ss only utilizes (number of images being trained x image repeats) regularization images, so extra images in the class folder will be ignored.
@PrimeCandi • 11 months ago
I downloaded Kohya as a zip archive for my Mac and I'm struggling to even start it. Can anyone please help me?
@AI-HowTo • 11 months ago
Sorry, I didn't try it on a Mac. You may want to check this channel; she is doing it on a Mac: kzbin.info/www/bejne/qovNnYemib9rgsU. Hopefully it helps. Also, if you stick to the exact steps on the Kohya page it should work out; double-check the issues page if you are facing trouble, since maybe someone else went through similar problems and was able to solve them: github.com/bmaltais/kohya_ss#linux-and-macos
@angloland4539 • 1 year ago
@kfayayay2674
@kfayayay2674 6 months ago
What about LoRA training for game texture upscaling? Is there a way to train a good metal/rust LoRA?
@AI-HowTo
@AI-HowTo 6 months ago
I am sorry, I don't have expertise in that, but I think upscale models are trained differently, using a different neural network architecture and different data sets. If you want to train a metal/rust style, then training a style LoRA could be what you are looking for, such as in this video: kzbin.info/www/bejne/iIWVm51jatlrrpo . If you want something simpler based on a single image's style, then IP-Adapter could be what you are looking for, as in this one: kzbin.info/www/bejne/qZnGaWaKlragp6M
@kfayayay2674
@kfayayay2674 6 months ago
@AI-HowTo Thank you for taking the time to explain 👍 I'll give it a try
@ReinhardMozes
@ReinhardMozes 4 months ago
Since when does DreamBooth appear there???? I can't understand this :(((
@AI-HowTo
@AI-HowTo 4 months ago
This video is from a few months ago. Back then, DreamBooth training was easier on A1111, and I think it appeared in the GUI by default, if that is what you are asking about.
@Zibbibbo
@Zibbibbo 1 year ago
I don't understand one thing: in the "undesired tags" you left 1girl and solo so they wouldn't end up inside the captions, but when you show the captions they start with "solo, ...". So do I have to delete "solo" (and also 1girl/woman/etc.) from the captions, or do I need them in the captions? You say one thing but show another; I don't understand.
@AI-HowTo
@AI-HowTo 1 year ago
Sorry for the confusion. In these tests I removed nothing, for simplicity and speed. Ideally we should remove things like the hair color, the eye color, the hair style, or anything that we want to be part of the LoRA and don't intend to change later. Kohya already shows that we can exclude 1girl, solo, etc., because these repeat in all captions and we already know this is 1girl / class woman, so there is no need for them to be there. If we don't remove them, things might still turn out fine.
@Zibbibbo
@Zibbibbo 1 year ago
@AI-HowTo Ooookkkey, thanks
@lwangacaleb2729
@lwangacaleb2729 7 months ago
If I can't get SD Automatic1111 to generate regularization images, can't I use, say, 500 stock photos of women instead?
@AI-HowTo
@AI-HowTo 7 months ago
Yes. In my opinion, real high-quality images are a lot better than images generated by SD; it only takes a while to collect them. The better the quality of the regularization images, the more beautiful and rich the LoRA model can get.
@DannySi
@DannySi 1 year ago
The Kohya GUI is different now; it just says LoRA and not DreamBooth LoRA. Are the steps still the same?
@AI-HowTo
@AI-HowTo 1 year ago
The same principles apply despite minor changes in the GUI; the parameters have not changed since its inception. Even with SDXL only minor changes apply (checking the SDXL checkbox and using 1024px training images); all the others are the same.
@diggydretv
@diggydretv 1 year ago
For some reason I get the face to look correct, but it is giving me some really horrible hands. Is there anything I can do about that?
@AI-HowTo
@AI-HowTo 1 year ago
Unfortunately this is a common problem with SD. Without a large number of correct hands, with fingers, in the training data, the model will not learn them properly. You can use a textual inversion like those on Civitai, which may reduce some of the defects, such as badhandv4 in the negative prompt; it was originally designed for anime but can help with realistic pictures as well in some cases: civitai.com/models/16993/badhandv4-animeillustdiffusion or civitai.com/models/56519/negativehand-negative-embedding . They might, however, change the generated image slightly.
@AI-HowTo
@AI-HowTo 1 year ago
So you either have to generate more images and select a good one, or use inpainting. Only when the data has a good number of hands/fingers will the model learn to generate good results (or at least have a higher probability of doing so, because the problem is with SD in the first place). Hopefully this is improved in SDXL, which will be officially out soon.
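For illustration, a hedged sketch of dropping such an embedding into the negative prompt through A1111's local web API (requires launching with --api; the embedding names are triggered simply by writing them in the prompt text and must match the files you placed in the embeddings folder):

import requests

payload = {
    "prompt": "photo of a woman, upper body, hands visible, detailed fingers",
    # Hypothetical embedding names; badhandv4 is the one linked above.
    "negative_prompt": "badhandv4, negativehand, lowres, blurry",
    "steps": 25,
}
resp = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
resp.raise_for_status()
images_base64 = resp.json()["images"]  # base64-encoded PNGs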
@nowherebrain
@nowherebrain 1 year ago
You need these for what, exactly?
@Xieken
@Xieken 1 year ago
Not sure if it was mentioned, but what's the minimum VRAM needed for this method of training?
@AI-HowTo
@AI-HowTo 1 year ago
The more the better. I am running on an 8GB RTX 3070, but I think some people use 4GB and it works; there are options you can look up for lower VRAM. There is also Google Colab, which can be used for online training if your GPU is not good enough; there are some videos out there about using Colab, I think.
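A quick sketch for checking what your GPU reports, again with PyTorch:

import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA GPU detected; consider Google Colab for training.")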
@Xieken
@Xieken 1 year ago
@AI-HowTo Ah okay, perfect, we have the same GPU. I've seen too many methods requiring at least 16GB of VRAM. Thanks, and great video!
@witekwitkowski6022
@witekwitkowski6022 1 year ago
Would the setup work the same way for the Stable Diffusion DreamBooth extension?
@AI-HowTo
@AI-HowTo 1 year ago
Similar in principle, in terms of regularization and having a good data set. But in DreamBooth we train the entire model with our data, so all the model's weights are updated; this means DreamBooth results can be even better. DreamBooth training, however, takes more time, a better GPU, and more disk space, because you are creating a new checkpoint that is at least 2GB in size for a single subject.
@stablefaker
@stablefaker 1 year ago
I'm trying to figure out how to continue training a LoRA. I put the path to the last outputted safetensors file in the "network weights" field you mentioned, but it seems like it starts from scratch.
@AI-HowTo
@AI-HowTo 1 year ago
Not sure why. With LyCORIS it starts over, but it worked with LoRA when I tried it before: it read the learned weights and continued from where it ended, without needing to save the entire training state, just from the LoRA file.
@stablefaker
@stablefaker 1 year ago
@AI-HowTo Ah well, it's LyCORIS, so that explains it. What a shame :/
@levuhoang2228
@levuhoang2228 1 year ago
Thank you for sharing. Can you share the "woman" folder?
@AI-HowTo
@AI-HowTo 1 year ago
This folder is slightly large, but I used some of its pictures in the training; you can remove some of the images if you like, and that won't affect the results in general: huggingface.co/datasets/AIHowto/Chilloutmix_woman_regset1/blob/main/1340chilloutmix_class_woman.zip . Basically any data set works as a regularization set. It is recommended to generate it with SD from the same checkpoint you train on, but filtering the data set is not necessary (it can be done), using data from other checkpoints is also OK, and even using real pictures is fine.
@AI-HowTo
@AI-HowTo 1 year ago
Please note that Kohya_ss will only use (number of training images * image repeats) of the class images; extra images in the class folder will be ignored, so having more images is OK to cover all cases. I will explain that in another video soon.
@justinjohngonzales7165
@justinjohngonzales7165 1 year ago
I'm using a Google Colab Stable Diffusion notebook (Automatic1111); how do I do this?
@AI-HowTo
@AI-HowTo 1 year ago
Kohya can also be used in Colab; I guess you may find some videos out there about this. The same principles/data/settings apply.
@TentationAI
@TentationAI 1 year ago
Is it the same process for LyCORIS models?
@AI-HowTo
@AI-HowTo 1 year ago
For LyCORIS models with the LoCon algorithm, just reduce the network dimension down to 32, for example, with alpha 16, increase the convolution dimension up to 8, and use conv alpha 1 with a clip skip of 1, for instance. For LoHa, use a network dim of only 12, or the square root of what you are used to in a standard LoRA. I have not done enough tests on these methods, though; they require a lot of testing, and no one knows exactly what produces the best results. Everyone is just experimenting, I think, because anything related to neural networks is basically experimental; there is no one good recipe for all. But the general rule holds: good data will most likely produce good results.
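A sketch of how those heuristics map onto the training parameters. The numbers are just the starting points from the reply above, and the network_args names follow LyCORIS's Kohya integration as I understand it, so treat the whole thing as an assumption rather than a recipe:

# Illustrative starting points only.
locon_config = {
    "network_module": "lycoris.kohya",  # LyCORIS backend for Kohya's scripts
    "network_dim": 32,
    "network_alpha": 16,
    "network_args": ["algo=locon", "conv_dim=8", "conv_alpha=1"],
    "clip_skip": 1,
}
loha_config = {
    **locon_config,
    "network_dim": 12,  # or roughly the square root of your usual LoRA rank
    "network_args": ["algo=loha", "conv_dim=8", "conv_alpha=1"],
}
print(locon_config, loha_config, sep="\n")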
@TentationAI
@TentationAI 1 year ago
@AI-HowTo Thanks a lot
@TentationAI
@TentationAI 1 year ago
@AI-HowTo Do you have Telegram or Discord?
@AI-HowTo
@AI-HowTo 1 year ago
No, sorry, not for the time being; time limitations.
@TentationAI
@TentationAI 1 year ago
@AI-HowTo Can you just re-explain the part on the regularization images? I have to generate, with Stable Diffusion using the same model I will use for my LoRA training, 100+ portraits of different women, right?
@yhuna3365
@yhuna3365 1 year ago
Hey! Can someone throw some light on my problem? Even if I set the weight to 1 it doesn't work; my character isn't showing. Also, in the model-samples folder, the pictures I get turn out completely black. I've tried many tutorials before and none worked for me. I follow the steps exactly, with captioning too, and it takes 8 hours or more to make the LoRA (I have 8GB of VRAM) and I still get nothing. The resulting LoRA is about 140 MB, so I'm guessing I'm missing something... Any clue? Thanks!
@AI-HowTo
@AI-HowTo 1 year ago
In that case you should try with just 1 picture first, until you figure out the problem. I guess this is a problem with the GPU driver, because training is not happening to begin with. Possibly try switching between bf16 and fp16, for instance, and check your GPU compatibility or GPU driver installation. Not sure; hopefully someone else who faced the same problem can help out on this.
@AI-HowTo
@AI-HowTo 1 year ago
Or use Google Colab first; this problem won't happen there, I guess, in case the GPU was the source of the problem and other solutions didn't work out for you.
@yhuna3365
@yhuna3365 1 year ago
@AI-HowTo OK! Thanks, I will try
@yhuna3365
@yhuna3365 1 year ago
@AI-HowTo Actually, I used it (Google Colab Kohya) and it worked; I didn't remember until now
@daniilsharp3443
@daniilsharp3443 1 year ago
Hello, can you give a link where we can download your LoRA?
@AI-HowTo
@AI-HowTo 1 year ago
Sorry, I'm not making any available at the time being, because I only prepare these for the videos and they are not fine-tuned enough; good character LoRAs require more trials and testing and produce better results than the ones in these videos. I will consider uploading some later, in relation to the videos, for testing purposes, if anyone finds them useful for learning, while also considering copyright issues, which might be a problem for this video, for instance.
@photobackflip
@photobackflip 1 year ago
25:13 - I see why you had to make an edit there. lol
@AI-HowTo
@AI-HowTo 1 year ago
:) True, yes, I was afraid KZbin would ban the video or make it 18+ for showing some transparent clothing.
@xehanort3623
@xehanort3623 1 year ago
Hey, I ran into a problem. I have a 400-image dataset and I'm using your settings, but every time I run the LoRA training the estimated time says 16 hrs. Is there a way to fix this?
@AI-HowTo
@AI-HowTo 1 year ago
Depending on the GPU, 16 hours for 400 images can be normal if you are running an RTX 3060 or a lower GPU model with a higher number of epochs. To speed things up, make sure xformers is checked, reduce the number of images, reduce the number of epochs, and skip the regularization images, which cuts the time in half; in many cases you can still get results as good as with reg. In my case I only have an RTX 3070; it sometimes takes me 8+ hours to train a LoRA with 300+ images to get the best results, but it can take 1 hour for 20 images, for instance. It depends on how good I want the model to be and on the goals of the LoRA. By the way, a higher-resolution data set such as 768x768 makes the results even better, but also makes training take longer. Using LyCORIS can also improve results, but that too takes longer! GPU power and time are a big issue.
@xehanort3623
@xehanort3623 1 year ago
@AI-HowTo How do you use LyCORIS in the training process?
@AI-HowTo
@AI-HowTo 1 year ago
I will upload another video soon on using LyCORIS. It is a similar process: in the training parameters, choose LyCORIS/LoCon from the first drop-down box, then use a smaller rank such as 32-48 with a smaller alpha such as 16. A convolutional dimension of 4 to 8 with convolutional alpha 1 seems to work well in general and produces slightly better results, with more detail, than a standard LoRA.
@xehanort3623
@xehanort3623 1 year ago
@AI-HowTo Thank you
@headninjadog8120
@headninjadog8120 1 year ago
You know Olivia Casta isn't actually real; it's an AI face on a woman's body. I find this very funny: creating an AI model of an AI model 😂
@AI-HowTo
@AI-HowTo 1 year ago
So I was told recently :) Anyway, what matters is the training method. Whoever created the character is smart; good for them.
@RyokoChanGamer
@RyokoChanGamer 1 year ago
Hello, how are you? How can I improve my LoRA? Add more images to train on, change the steps, etc.? Do I have to start over, or is there a way to take an existing one and improve it? Another doubt: if I create a LoRA of a character and want it to create NSFW images of that character, do I need to do something specific, e.g. put NSFW images of that character in the training set? Thanks.
@AI-HowTo
@AI-HowTo 1 year ago
Usually, if you created a LoRA that doesn't produce good enough images, a redo is necessary. Lots of images is not a must; it is possible to create good LoRAs even with a smaller number of images, on condition that the images are of high quality. I will post a video today showing better results than this video using only 23 images, but at higher resolution; the best results come from training a LoRA on 1024x1024 images, for instance. Regarding NSFW images, there is limited flexibility in what can be done, depending on the checkpoint you are using, your data set's content, and how flexible your model is. In general, using ADetailer is necessary for full-body shots regardless of subject; for instance, use a weight of 0.7, then use ADetailer to inpaint the face, because SD in general is not that great with full-body shots unless it has a high-res data set to begin with.
@RyokoChanGamer
@RyokoChanGamer 1 year ago
@AI-HowTo Thank you very much for answering... I thought the more images the better... I'm still a bit lost: I don't know how many steps count, whether many epochs also matter, and I also didn't understand the part that explains the loss during the process... I'll be waiting for the video you're going to post to learn more... thanks
@AI-HowTo
@AI-HowTo 1 year ago
Sorry for the confusion; it's entirely experimental, and conclusions keep changing over time with new tests and information. I often get better results with more images and use more than 70 images in the data set even for people, but with higher image resolutions even a smaller number of images, such as 20, can give great results too, as in this video: kzbin.info/www/bejne/rHKVp2V_qsx-bNk . How many steps? Not conclusive. In some cases I get great models at 15,000+ steps, sometimes from 3,000 steps; it depends on the data, the number of images, and the complexity of the data set. So I monitor the sample images Kohya generates and decide when to stop training. There is no rule that, say, 3,000 steps is more than enough.
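For orientation, a rough sketch of how the total step count falls out of the other settings in Kohya-style training; the doubling when a reg set is present matches how Kohya counts steps as far as I know, and all numbers here are hypothetical:

num_images = 70
repeats = 10       # numeric prefix of the training folder, e.g. "10_xyzinstance woman"
epochs = 10
batch_size = 2
use_reg_images = True  # a reg set doubles the steps per epoch

steps_per_epoch = num_images * repeats // batch_size
if use_reg_images:
    steps_per_epoch *= 2
print("Total training steps:", steps_per_epoch * epochs)  # 7000 here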
@RyokoChanGamer
@RyokoChanGamer 1 year ago
@AI-HowTo I understand... no need to apologize, you're doing a great job and helping me a lot... I'll watch your new video now... but first, one more question (sorry for the inconvenience): whenever I add regularization images my character simply disappears during training. I've tried with a few regularization images and with many, and also without them. Without them the training goes on normally (I can see the samples created at each epoch), but when I use regularization images the character never appears, even after finishing the process with more than 4000 steps; not even a trace of it appears, and the generated images resemble only the regularization images... what could be wrong?
@AI-HowTo
@AI-HowTo 1 year ago
Based on my new tests with regularization, you should use the instance prompt (or trigger word), such as "young woman xyzinstance"; otherwise it will take a lot more than 4000 steps (it could take 30,000+, depending on the dataset) to differentiate between your character and the "woman" class in the reg set, because Kohya learns the regularization images along with your data. See how I did it in the new video with regularization included: I added the instance prompt to the beginning of each caption, then added the instance prompt to the sample prompt generated by Kohya, and also removed 1girl from the captions. Without regularization, however, even omitting the trigger word can still produce the LoRA effect, since nothing else is trained alongside our data.
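A minimal sketch of that caption edit, assuming one .txt caption per image; the trigger word and folder name are hypothetical:

from pathlib import Path

TRIGGER = "xyzinstance"  # hypothetical instance token
caption_dir = Path("img/10_xyzinstance woman")  # hypothetical dataset folder

for txt_file in caption_dir.glob("*.txt"):
    caption = txt_file.read_text(encoding="utf-8").strip()
    if not caption.startswith(TRIGGER):
        # Prepend the trigger so every caption starts with the instance prompt.
        txt_file.write_text(f"{TRIGGER}, {caption}", encoding="utf-8")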
@KIRRRRU19
@KIRRRRU19 1 year ago
I tried doing this but I keep blowing it.
@AI-HowTo
@AI-HowTo 1 year ago
The process is experimental in general. You could test reducing the dim down to 64 and the alpha down to 16, dropping the reg set, trying LyCORIS, or using a different checkpoint. Just keep trying; sometimes it works on the first attempt, other times you need dozens of attempts to get a good model.
@joshua11111
@joshua11111 1 year ago
Hi there! How can I get in touch with you for a business query?
@AI-HowTo
@AI-HowTo 1 year ago
Sorry, not for the time being.
@joshua11111
@joshua11111 1 year ago
@AI-HowTo No problem