A Great New IPAdapter with Licensing Issues: Kolors

Views: 4,947

Comments: 54

@risunobushi_ai 1 month ago

I did a whoopsie and read the face analysis data wrong, and since most of the dists are at 1, I might have hooked them up incorrectly. I'm going to check that out now and correct the workflow accordingly. Regardless of that, everything else still makes sense. Sorry about that! UPDATE: here are the new comparisons. Apparently insightface was not working correctly, so I switched to dlib. tl;dr: the results are basically the same, but with the correct values: imgur.com/a/0MJtwn7

@MilesBellas 1 month ago

Update video?

@risunobushi_ai 1 month ago

unfortunately you can't update YT videos afaik

@TheDocPixel 1 month ago

Of course you can update videos. Make a new one with correct info and take this one down.

@MilesBellas 1 month ago

@risunobushi_ai The space changes in hours. 😊 Blackforest Flux by former Stability employees = wow! 😁

@risunobushi_ai 1 month ago

@TheDocPixel yes and no. It's not as simple as just making a new one, mostly because it'd take me half a day / almost a day to record the new footage, edit the video in DaVinci, get the upload ready on YT, etc. Since this is not my full-time job, I have limited time to invest in it. That's why I pinned a comment with the correct methodology and results, and since only a fraction of the video follows the incorrect methodology, I'd rather do that instead. I know it's not the best solution possible, but I hope you get the reason for it.
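
For context on the "dists" in the pinned comment at the top of this thread: face analysis tools typically reduce each face to an embedding vector and report a distance between the reference face and the generated one. The thread does not say which metric the node used, so the sketch below only shows two common candidates, Euclidean and cosine distance, on toy vectors. The function names and example vectors are hypothetical and are not taken from insightface, dlib, or the workflow.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0 = identical)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity: 0 = same direction, 1 = orthogonal."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare an all-zero embedding")
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-d "embeddings" (hypothetical; real face embeddings are much longer).
reference = [1.0, 0.0]
unrelated = [0.0, 1.0]

print(euclidean_distance(reference, reference))  # identical vectors
print(cosine_distance(reference, unrelated))     # orthogonal vectors
```

Whatever the metric, a column of identical values is a good reason to double-check the wiring before reading anything into the numbers, which is what the pinned comment describes.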

@latentvision 1 month ago

Kolors can be used with the "Community Unified Loader" node. The PLUS models are not trained on face embeds but on generic openai and laion models. Using the face analysis tool for a comparison in this case is probably not the best option. A more reliable way would probably be to compare them for composition only (especially precise composition) or style. But one uses the openai clipvision at 336 resolution and the other the laion one at 224, so an apples-to-apples comparison is very difficult to do. Anyway, great content. I didn't know your channel, keep up the good work and contact me if you have any doubts about the IPAdapter. PS: I totally agree on the license

@risunobushi_ai 1 month ago

Hi! Huge fan, thank you for what you do. Being a fashion photographer by trade I’m halfway between the technical side and the “creative” side of things, so it’s great to get more insights from you. I wish I understood more of the technical side, but I’m trying to catch up. I agree that it’s not the greatest way to compare them, and while I do say so during the video, I still felt I couldn’t just go with the gut feeling of “I like this one more because I like it”. But it was a call I made arbitrarily nonetheless, so I don’t know if it was the right one. Didn’t know about the community unified loader, I’m glad I discovered something new today! EDIT: forgot to mention that I thought of using the face analysis tool mainly because they boast "better face adherence" on the project page

@latentvision 1 month ago

@risunobushi_ai yeah they probably used a lot of faces for the training, but they said they are training Kolors FaceID. That will be interesting and could probably be a more reliable comparison, depending of course on the clipvision model they used. They talked about FaceID so I'm afraid they are using insightface, which means... licensing issues.

@gwanyip345 1 month ago

Super helpful and informative as always, thank you for your work Andrea!

@risunobushi_ai 1 month ago

thank you!

@AB-wf8ek 1 month ago

9:30 Apache is pronounced ah-PA-chee. It's the name of a native group in the Southwestern part of the US. They were known for their intense warrior skills on horses, and the name was adopted to reflect the stalwart attitude of developers maintaining network open source projects in the face of giant corporations like Microsoft, who work to privatize resources for profit.

@risunobushi_ai 1 month ago

Thanks! Being Italian I kept hearing Ah-pa-sh while growing up, so I realized that only while I was editing the video, like it didn't sound quite right in English.

@AB-wf8ek 1 month ago

@risunobushi_ai No worries, good excuse for me to look up the history of the license :)

@ImAlecPonce 1 month ago

Very interesting!!! Thanks. The contract does seem strange. Either be open source or be licensed. It looks like they want to license it but also take advantage of the open source community's input.

@risunobushi_ai 1 month ago

yeah not a fan of the licensing and what's happening in general in the space

@goodie2shoes 1 month ago

You and Matteo have taught me a lot!

@risunobushi_ai 1 month ago

Matteo is an irreplaceable source of knowledge!

@mohammad.m6360 1 month ago

What is the name of Matthew's YouTube channel? I don't know him

@risunobushi_ai 1 month ago

it's called latent vision!

@mohammad.m6360 1 month ago

Nice❤ Your YouTube channel deserves more views and subs, great videos

@risunobushi_ai 1 month ago

Thank you so much 😀

@flyingdogsoft 1 month ago

Great! You seem to be one of the few YouTubers who takes licensing seriously. Most others always hype up a new AI and never mention the topic. Yet there are licensing issues with many projects: for example, all those using Insightface (Face-ID and many others). But SUPIR is also such a project. Often, it is much too difficult to even obtain a license. I wish this topic was always addressed by all YouTubers. This would finally spark a necessary discussion about how to solve this - for example, to build comparable projects like Insightface.

@risunobushi_ai 1 month ago

Thanks! I guess it’s because I’m a professional in the field first, and a YouTuber second. I started doing YouTube videos because I couldn’t find the content I was looking for, addressed specifically to professionals, and that’s why I care so much about “niche” things like licensing, scalability, developing custom workflows, etc. Unfortunately the space is still really young, so a certain amount of “grey areas” is to be expected, but making it more complicated than it should be doesn’t help.

@flyingdogsoft 1 month ago

@risunobushi_ai I fully agree. Transitioning from an AI enthusiast to professional image editing makes this issue very important. Something significant is coming soon for ComfyUI, and addressing this would be crucial.

@denisquarte7177 1 month ago

Great and informative. In fact there is a 'wave' of new base models. I really hope Auraflow is doing well. But a general question: isn't style transfer explicitly meant for transferring style and getting rid of the subject, so that in that sense the Kolors IPAdapter would be doing a far worse job than Matteo's IPAdapter?

@risunobushi_ai 1 month ago

Thanks! I guess it kinda depends on what's the goal of the tool - the usual IPAdapter, if I had to guess, I'd say aims to do every single thing differently and because of that design philosophy (which I agree with) it has various modes that are very different from one another. Whereas the Kolors one seems like it's trying to do what MidJourney does - a single tool for most of the common stuff that the normal user wants, without too much time spent understanding what does what. It's more of a "different strokes for different people" kinda thing I think

@vitalis 1 month ago

It would be hard to enforce licensing restrictions. If one uses it as an online service and it gets big, then maybe these companies would first find out about it and then spend the energy analysing the data to confirm it. As a normal Joe, one could just pass it through Photoshop or do another image-to-image pass to avoid potential AI fingerprinting.

@risunobushi_ai 1 month ago

as a regular Joe sure, I can't endorse it (obviously), but no one's going to find out. for professionals and brands, there comes a certain point where licensing is pretty important, even if it's hard to enforce, that's why clear licensing paths should be preferred

@vitalis 1 month ago

@risunobushi_ai yes, agree 100%. We have seen the backlash against Adobe as well, and for good reason.

@AbsolutelyForward 1 month ago

You speak from my soul - it's so annoying to have to deal with vaguely worded licenses or to have to go into a personal exchange to negotiate "special conditions". Name the price and that's it. But the topic of licenses in connection with ComfyUI has been on my mind for a while - does anyone know of a source that breaks down all the individual components (nodes, models, preprocessors, etc.) by license? I'm trying to create workflows at work that are allowed for commercial use, but I keep discovering things in my workflow that can only be used for private use (e.g. SUPIR, some controlnet models, insightface models, etc.).

@risunobushi_ai 1 month ago

I wish there was. Up till now I've been going backwards and looking at the license of each single building block of the most used nodes and the models they use. Maybe one day someone will have the time to compile a library of commercially usable nodes.
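
As a starting point for the kind of per-component breakdown asked for above, here is a minimal sketch of a license inventory that flags anything not cleared for commercial use. Everything in it is an assumption for illustration: the `Component` class and `blocked_components` helper are hypothetical, the two "non-commercial" entries only repeat what the comment above reports (they are not verified license readings), and `example_node` is a placeholder. Each entry should be checked against the actual license text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str            # node, model, or preprocessor name
    kind: str            # "node", "model", "preprocessor", ...
    license: str         # license label as read from the project page
    commercial_ok: bool  # your own (or your legal team's) reading of that license


def blocked_components(components):
    """Return the components that are not cleared for commercial use."""
    return [c for c in components if not c.commercial_ok]


# Hypothetical inventory. The first two flags repeat the comment above;
# the last entry is a made-up placeholder, not a real node.
inventory = [
    Component("SUPIR", "model", "reported as private use only", False),
    Component("insightface models", "model", "reported as private use only", False),
    Component("example_node", "node", "PLACEHOLDER: fill in after checking", True),
]

for component in blocked_components(inventory):
    print(f"{component.kind}: {component.name} ({component.license})")
```

Keeping the license label and the commercial-use decision as separate fields makes it easier to revisit a decision when a project changes its terms.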

@akasht5 1 month ago

Thank you for the great video! Was able to get the links and run the example... like holy smokes, is Kolors just a base model?? Has it been fine-tuned already by its creators? It seems way better than the SDXL base model... and even better than Juggernaut, to me at least. I played around with setting CFG to 0 or 0.1 and including perturbed attention guidance at 0.8; 20 steps and euler ancestral CFG++ with sgm_uniform sampling produced some very cool variations. Very very nice workflow!

@ChrisDermody 7 days ago

Any tips for using this with a pose or depth controlnet, or with an inpainting workflow perhaps? Every time I try to use a controlnet in the latent I get a KSampler error: "KSampler mat1 and mat2 shapes cannot be multiplied". Kolors is all SDXL based, right? I'm using SDXL controlnet models 🤔

@SnowTung999 1 month ago

I have tried to use Kolors + IPAdapter. With the new settings, I can't reference a face shape without also getting the original face color; if I use style reference, it references the face color too. That happens with both the FaceID and the Plus model. Kolors Plus works well for changing clothes, but I still can't control the face well.

@alecubudulecu 1 month ago

Pretty wild age we live in. A Chinese group concerned about copyrights …

@AB-wf8ek 1 month ago

Yea, and American groups concerned about genocide. Irony abounds.

@alecubudulecu 1 month ago

@AB-wf8ek that’s not really irony. Americans genociding is a new thing, only the past 50 years. Maybe 20. But China has been blatantly stealing for hundreds of years. Since gunpowder they shifted to copying and stealing. (Prior to this I’ll admit most inventions came from China)

@Jacck 1 month ago

Hey Andrea, what are your recommendations for a model to restore hands in addition to your product photo relight workflow?

@risunobushi_ai 1 month ago

Honestly I’m not great at inpainting in general, and hand restoration is a branch of inpainting. I usually look for any good all around SDXL model since XL has inpainting capabilities by default.

@plagiats 1 month ago

Hi! Thanks for the tutorial. I get this error: Error occurred when executing CLIPVisionLoader: Error(s) in loading state_dict for CLIPVisionModelProjection: size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([577, 1024]) from checkpoint, the shape in current model is torch.Size([257, 1024]). Any help much appreciated!

@risunobushi_ai 1 month ago

Are you using the same models I linked?

@IOVART 1 month ago

Getting this error; !!! Exception during processing!!! Error(s) in loading state_dict for CLIPVisionModelProjection: size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([577, 1024]) from checkpoint, the shape in current model is torch.Size([257, 1024]).

@risunobushi_ai 1 month ago

are you using the same nodes as I am or did you change something? I was getting that error with another IPAdapter node.

@IOVART 1 month ago

@risunobushi_ai Yes. The same nodes as yours.

@risunobushi_ai 1 month ago

Care to send me your json file over at andrea@andreabaioni.com? I’ll see if I can reproduce the issue

@risunobushi_ai 1 month ago

hi! you downloaded the fp32 version of kolors instead of the fp16 version

@IOVART 1 month ago

@risunobushi_ai Thanks!
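
A note on the two numbers in the size-mismatch errors above, [577, 1024] versus [257, 1024]: they are what you would expect from a vision transformer that splits the image into 14-pixel patches and adds one class token, at 336 and 224 input resolution respectively. Those are the same two clipvision resolutions mentioned earlier in the comments. The sketch below only shows that arithmetic; the patch size of 14 is an assumption (it matches CLIP ViT-L/14, whose hidden size of 1024 is consistent with the error), and `vit_position_count` is a hypothetical helper, not a ComfyUI function. It does not by itself say which file was wrong; in this thread the fix was switching from the fp32 to the fp16 download.

```python
def vit_position_count(image_size: int, patch_size: int = 14) -> int:
    """Number of position embeddings for a square ViT input: patches plus one class token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1  # +1 for the class token


print(vit_position_count(336))  # 577, the shape found in the checkpoint
print(vit_position_count(224))  # 257, the shape the loaded model expected
```

So the error reads as a 336-resolution vision checkpoint being loaded into a model configured for 224.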

@mujahidanshary 1 month ago

I can't find the LLM folder on my PC

@risunobushi_ai 1 month ago

you can make a new folder inside of ComfyUI/models/ and name it "LLM"
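
If you would rather script that step, here is a minimal sketch that creates the folder described above. The helper name `ensure_llm_folder` is hypothetical, and the ComfyUI root path is an assumption you need to replace with your own install location; the demo at the bottom runs against a throwaway temporary directory so it does not touch a real install.

```python
import tempfile
from pathlib import Path


def ensure_llm_folder(comfyui_root):
    """Create <comfyui_root>/models/LLM if it is missing and return its path."""
    llm_dir = Path(comfyui_root) / "models" / "LLM"
    # parents=True also creates models/ if needed; exist_ok=True makes reruns harmless.
    llm_dir.mkdir(parents=True, exist_ok=True)
    return llm_dir


# Demo on a temporary directory standing in for a real ComfyUI install.
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_llm_folder(Path(tmp) / "ComfyUI")
    print(created.name, created.is_dir())
```

For a real install you would call `ensure_llm_folder` with the path to your own ComfyUI directory instead of the temporary one.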

@afrosymphony8207 1 month ago

Aswearugud, of all communities to be obsessed with licensing, the AI community would be the one I least expected to be moaning over licensing. It's fookin bizarre to me that you won't use this commercially but will happily use Stable Diffusion as well. Last I checked, Stable Diffusion models were built without artist permission / licensing agreements. If this licensing nonsense was such a big deal to you ethically, you wouldn't be using Stable Diffusion in any way, shape or form. I just can't understand the incessant need for borderline moronic righteous validation. How d fook would the Chinese company know you generated the image you're selling with Kolors??? I mean, how can you sit there and say releasing it as closed source with a clear commercial license is better than this, when you're using a Juggernaut model trained on images they didn't license. Unfookingbelievable.

@risunobushi_ai 1 month ago

It’s a fair bit more complex than that, though. Personally, I don’t care if people don’t pay for Photoshop - but I wouldn’t advise a client to bypass an Adobe subscription. I’m not too invested in the ethics of generative AI, mainly because as fashion photographers we’re brought up from day 1 to do something very much akin to dataset compiling in the form of moodboards (but I guess they’re “inspirations”, so that’s fine?), something that the whole fashion industry does. Apart from that, brands and structured clients have legal teams asking about commercial licenses, how much they cost, and how they work. The industry won’t ever be ready if it doesn’t get this sorted out, and I speak from experience. Clients who want to pour 10s and 100s of Ks into genAI R&D want clear licensing terms.