Llama 3.2 Vision + Ollama: Chat with Images LOCALLY

10,347 views

Leon van Zyl

Days ago

Comments: 42

@leonvanzyl 7 days ago

Thank you guys for the incredible support! Remember to like and subscribe to help this channel out 🙏

@BirdManPhil 4 days ago

Leon I need your expertise good sir

@Aureliusus 2 days ago

Thank you for helping us get started. You saved me quite some time figuring out how to enhance my local Llama with vision capabilities.

@leonvanzyl 2 days ago

You're welcome 🤗

@chizzlemo3094 7 days ago

Super helpful. It's quite incredible how lacking in notes and examples these models are when released, so thanks very much.

@leonvanzyl 7 days ago

I know!

@rogerthao588 7 days ago

Your tutorials are so helpful for me. Also, simply subscribing to you keeps me updated on new AI releases and tools. I learned about Flowise and Langflow from you. I also learned about the release of Llama 3.2 Vision (this video!) from you as well! Thanks!

@leonvanzyl 7 days ago

That's awesome to hear. Thank you

@TeamUnpro 4 days ago

Ty~ this will help greatly, was so tired of copypasting from a terminal lol

@leonvanzyl 4 days ago

You're welcome 😁

@dgitalnarrative 7 days ago

I would love to see a NextJS app with Ollama. Cherry on top would be agents looking into the images and categorizing them or something. Thank you for your amazing content @Leon

@leonvanzyl 7 days ago

Awesome suggestion

@BadBite 7 days ago

very useful Leon, like everything you are posting! the best channel on the subject

@leonvanzyl 7 days ago

Thank you!

@angelochu3156 5 days ago

Hi Leon, How much VRAM do you have on your computer to run this 9B vision model?

@tarassvystun466 7 days ago

Thanks for the video. I would like to see another video tutorial with Streamlit.

@leonvanzyl 7 days ago

Will do

@unokometanti8922 7 days ago

Tried both the 11b and 90b models. The 11b seems to be uncensored while the 90b is censored (first shortfall…). On top of that, it looks like multimodal models cannot support parallel streams of actions (i.e. extract info from an img via OCR and then perform a search on the extracted contents). Last but not least, they seem to be able to process only 1 img at a time… The resulting capabilities appear to be far behind “commercial” models, unfortunately. Does anybody know if an uncensored version of a decent vision-enabled LLM has already been created?

@ShuaibShahzan 4 days ago

Thanks again for the great tutorial Leon. Please create a Next.js app.

@leonvanzyl 4 days ago

Will do

@ShuaibShahzan 4 days ago

Thanks again, Leon. Another idea for a video: can we call one Agentflow/Chatflow from another Agentflow/Chatflow? The rationale is to break complex flows into smaller flows.

@Col-pd2zd 2 days ago

Does Flowise allow us to use this with Ollama chat?

@leonvanzyl 2 days ago

Not yet, but I think they'll release the feature SOON. Will create a video on it as soon as it's available.

@leanprogrammer 7 days ago

Nice! A Next.js Ollama client would be really cool to see. I also wonder how good this model is at web design, i.e. converting a design to code.

@leonvanzyl 7 days ago

Ooh, interesting idea.

@rickyS-D76 7 days ago

Thank you, I'd like to see this model integrated with Flowise soon ❤

@leonvanzyl 7 days ago

Oh, trust me. I'll definitely create a FW video on this

@grahamharris7010 7 days ago

I hope this will be usable with LM Studio eventually. Hello from a fellow SAfrican xD

@leonvanzyl 7 days ago

Howzit! I seriously need to create LMStudio videos as well

@grahamharris7010 7 days ago

@leonvanzyl Oh yes! I have a cool multi-AI-agent chatroom running with an admin backend to control their convo, and humans can partake in the chat. Seriously believable chat agents, and all running off Llama 3.2 Instruct 3B and LM Studio. Cheers on the content and subbed!