Thank you, sir. This video helped me understand this model on the very first watch.
@nextGenAIGuy490 · 2 months ago
Glad it helped
@georffreyarevalo3067 · 2 months ago
Good video. How can I test the model that I pushed to Hugging Face? Could you please share an example?
@nextGenAIGuy490 · 2 months ago
Thanks. You can use AutoModelForVision2Seq to load your model. You need to pass your model path and a Hugging Face access token.
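A minimal sketch of what that looks like. The repo id, token, and image path below are placeholders, not values from the video; replace them with your own.

```python
# Minimal sketch: load a fine-tuned vision model from the Hugging Face Hub.
from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image

model_id = "your-username/llama-3.2-vision-finetune"  # hypothetical repo id
hf_token = "hf_..."  # your Hugging Face access token

model = AutoModelForVision2Seq.from_pretrained(model_id, token=hf_token)
processor = AutoProcessor.from_pretrained(model_id, token=hf_token)

# Quick smoke test: ask the model to describe a local image.
image = Image.open("test.jpg")  # any test image you have on disk
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(output[0], skip_special_tokens=True))
```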
@soulaimanebahi741 · 1 month ago
Thank you for the demonstration. Do you think we can fine-tune this model on video data?
@nextGenAIGuy490 · 1 month ago
@soulaimanebahi741 No, we can't.
@babusd · 17 days ago
Absolutely wrong! If you don't know, say "I don't know." Don't mislead him; fine-tuning on video is possible.
@nextGenAIGuy490 · 17 days ago
@babusd Relax, bro. Do one thing: rather than just saying it, show the proof. Llama 3.2 Vision was trained on image and text pairs; read the model architecture. And if you know otherwise, show me where they have written that we can fine-tune the vision model on videos.
@tamilselvan3525 · 14 days ago
How long will the whole process take?
@nextGenAIGuy490 · 14 days ago
@tamilselvan3525 I haven't trained to completion because of GPU limitations, so I can't answer that. I just wanted to show that training is possible and how to do it. Training time depends on the dataset, the hardware (GPU configuration), and the number of epochs you train for.
@tamilselvan3525 · 14 days ago
@nextGenAIGuy490 Okay, thanks.
@mohammadaqib4275 · 2 months ago
We are fine-tuning the Llama 3.2 Vision model, but the collate function was using Qwen2. Is it fine to use the Qwen model in the collate function while fine-tuning Llama 3.2?
@nextGenAIGuy490 · 2 months ago
By customizing collate_fn, we control how the data is prepared: batching, padding, and bringing the data into the format the model expects for training. It's fine to use it.
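For reference, a minimal sketch of such a collate function, using the processor of the model actually being fine-tuned. It assumes `processor` is already loaded, and the "image" and "text" field names are hypothetical; adjust them to your dataset.

```python
def collate_fn(examples):
    # Gather images and texts from the batch (hypothetical field names).
    images = [ex["image"] for ex in examples]
    texts = [ex["text"] for ex in examples]

    # The processor tokenizes the texts, preprocesses the images, and pads
    # everything in the batch to a common length.
    batch = processor(images=images, text=texts, padding=True, return_tensors="pt")

    # Standard causal-LM labels: copy input_ids and mask pad tokens with -100
    # so they are ignored by the loss.
    labels = batch["input_ids"].clone()
    labels[labels == processor.tokenizer.pad_token_id] = -100
    batch["labels"] = labels
    return batch
```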