Ollama with Vision - Enabling Multimodal RAG

Views: 12,656

Prompt Engineering

1 day ago

Comments: 18

@throwaway-g9f 1 month ago

This video got me hyped; I was waiting for ollama multi-modal for a long time.

@TheRealHassan789 1 month ago

this video and tools have so much value! ...people will sleep on it tho

@MM-24 13 days ago

Did you need to change the model context at all? Great video. Thank you for the clear, no-nonsense presentation. Excellent job!

@HappyDancerInPink 1 month ago

Nice, what GPU do you use for these tests?

@engineerprompt 1 month ago

I have a MacBook Pro M2 Max with 96GB unified memory

@muntus 21 days ago

@engineerprompt Is an M4 Pro with 48GB enough for this as well, or will it struggle?

@Masoud2xm 1 month ago

I am getting an error during indexing saying "Torch not compiled with CUDA enabled". I am using Mac M4. Could you help with this, please?
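For context on the error above: Apple Silicon Macs have no CUDA support, so PyTorch code that hard-codes the "cuda" device fails there; PyTorch's Metal backend ("mps") or the CPU is the usual fallback. The snippet below is only a sketch of that fallback logic. The helper names `pick_device` and `detect_device` are hypothetical, and where the project sets its device is not stated in this thread, so that part is an assumption.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch device string, preferring CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the local PyTorch install; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent on very old PyTorch builds, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    # Hypothetical usage: pass this string wherever the code calls .to("cuda").
    print(detect_device())
```

If the indexing code exposes a device setting, using the detected value in place of "cuda" is the likely fix; otherwise the hard-coded call would have to be changed in the source.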

@timstevens3361 1 month ago

Get an RTX 3060 12GB GPU or an RTX 4060 16GB; they run a lot of different models really well!

@samsquamsh78 1 month ago

Great video and cool project! Will check it out! Thanks!

@Know_Ur_World 1 month ago

Can you help me with my use case? I need to extract the relevant text and images from a PDF. When a prompt is given, the relevant text should be displayed along with the images in sequence in the response, not with images and text separated.

Query: Give the steps in RSA agent installation.
Answer:
1. Text1 Image1
2. Text2
3. Image2 Text3
4. Image4 Text4
5. Image5 Image6 Text5

@ChristopherMcKinley-c1s 1 month ago

Is API access planned for this project? I would love to use it as a replacement for, or upgrade from, fine-tuning models and running them from Ollama.

@truthwillout1980 1 month ago

???

@ChristopherMcKinley-c1s 1 month ago

@truthwillout1980 The idea in my head is that I can host this on the LAN and have other programs just make an API call, so they don't have to go through a GUI. Is that already an option and I missed it?

@truthwillout1980 1 month ago

@ChristopherMcKinley-c1s Yes, you should already be able to do that. I think there's a section in the video that explains it, in fact (though I'm going from memory; I haven't watched it again). Just spin it up on a port number and call it.
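To illustrate the "spin it up on a port and call it" idea, here is a rough sketch that talks to a stock Ollama server over its HTTP API rather than to the project itself, since the project's own endpoints are not described in this thread. It assumes Ollama's default port 11434 and a vision-capable model such as `llama3.2-vision` already pulled; the helper names `build_payload` and `ask` are made up for the example.

```python
import base64
import json
import urllib.request

# Assumption: a stock Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt: str, image_bytes: bytes, model: str = "llama3.2-vision") -> dict:
    """Build the JSON body for /api/generate; images are sent base64-encoded."""
    return {
        "model": model,  # assumed model name; use whichever vision model is installed
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a token stream
    }


def ask(prompt: str, image_bytes: bytes, url: str = OLLAMA_URL) -> str:
    """POST the request and return the generated text."""
    body = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

A caller would use it roughly as `ask("Describe this page", open("page.png", "rb").read())`. For access from other machines on the LAN, Ollama has to listen on a non-loopback address (for example by setting the `OLLAMA_HOST` environment variable), and the URL would point at that host instead of `localhost`.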