Real-World Multimodal AI Agents use case | Phidata and Gemini 2.0

Views: 3,078

1 day ago

Comments: 29

@hemkeviv 4 days ago

Great video, Tarun. Just one fix: the upload-image and camera-image options in the code will always return the wrong output, because the Streamlit uploader object you pass to analyze_image() is not a file path but a BytesIO-like object. As a result, even a legitimate image can never be analysed. We need to save the image to a temporary file and then pass its path. This can be done by adding: with NamedTemporaryFile(dir='.', suffix='.jpg') as f: f.write(uploaded_file.getbuffer()); analyze_image(f.name) Don't forget --> from tempfile import NamedTemporaryFile Great work BTW, keep rocking.
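
A minimal sketch of the temp-file fix described above, assuming the uploaded object exposes getbuffer() the way Streamlit's UploadedFile does. save_upload_to_temp is a hypothetical helper name, and a plain BytesIO stands in for the real upload so the snippet runs without Streamlit. Unlike the one-liner in the comment, it uses delete=False so the bytes are flushed and the path stays readable after the with block; the caller removes the file afterwards.

```python
import os
from io import BytesIO
from tempfile import NamedTemporaryFile


def save_upload_to_temp(uploaded_file, suffix=".jpg"):
    """Write an in-memory upload to disk and return the file path.

    Hypothetical helper: works with any object that has getbuffer().
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so the path can be handed to code that reopens it by name.
    with NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(uploaded_file.getbuffer())
    return tmp.name


# Stand-in for the object returned by st.file_uploader / st.camera_input.
fake_upload = BytesIO(b"\xff\xd8\xff fake jpeg bytes")

path = save_upload_to_temp(fake_upload)
try:
    # In the app, this is where analyze_image(path) would be called.
    with open(path, "rb") as f:
        saved = f.read()
finally:
    # Clean up the temporary file once the analysis is done.
    os.remove(path)
```

Closing the file before passing the path also avoids problems on platforms where a temporary file cannot be opened a second time while it is still open.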

@AIwithTarun 4 days ago

Good catch. Let me fix this ASAP 😅. Thank you for pointing this out. I will make the changes and try to pin this comment or add a description of the changes made to the code. Thank you.

@hemkeviv 4 days ago

@@AIwithTarun Also add "Please do not analyse any other type of images." to the system prompt, otherwise it will analyse any type of image. 😊 Nothing wrong with that though, it can be generalised.

@AIwithTarun 4 days ago

The app is updated: github.com/lucifertrj/Product-Ingredient-Agent/ Thank you :)

@hemkeviv 4 days ago

@@AIwithTarun Great! Off topic, what software do you use to create videos? I want to create a few videos as well, if you don't mind telling me.

@AIwithTarun 4 days ago

@@hemkeviv Sure. I use an iPhone to capture my face, QuickTime Player to record the screen, and Final Cut Pro to edit or merge the video. Initially I was using Zoom.

@salesforcecafe101 8 hours ago

Great work!! Keep helping with your rocking videos!

@tora3492 2 days ago

Which IDE are you using, and how do I get the kind of terminal shown at 17:06?

@AIwithTarun 2 days ago

I am using VS Code. The terminal shell is ZSH. When you run print_response, you get that kind of output.

@תמרכהן-כ4ק 6 days ago

Thank you for this series! You are a great teacher 🫶

@AIwithTarun 6 days ago

Thank you 🚀 We are just getting started. It's only been 3 videos so far. More videos are on the way. Keep supporting.

@ghulamjunudchishti3440 6 days ago

Thank you brother, I always watch your videos.

@AIwithTarun 6 days ago

Thank you brother. Keep supporting and watch the videos. I hope you build some cool projects with this 🚀

@ghulamjunudchishti3440 6 days ago

@@AIwithTarun Yes brother, keep uploading videos regularly.

@ramareddymathsacademy 17 hours ago

Great video, Tarun! It’s really helpful. Can we have a mobile version where we can scan product details and get reviews/ratings to decide whether to accept or reject the product when I’m at Dmart? The details should be available with just a button click, so I can read them later. This should follow the quality control guidelines set by the Government of India. Just sharing some ideas.👋👍

@AIwithTarun 15 hours ago

Thank you. Yes, the app is deployed and can be tested directly on mobile as well (ingredients-analyzer.streamlit.app/). You need to upload the image to get the results. Reviews and ratings are not implemented as of now, but yes, that is easy to achieve, along with implementing the quality control guidelines.

@stalinmurugesan9744 4 days ago

Thank you very much, Tarun. When I try the first approach, I get only a few lines of output, like "The image shows a product package for Bournvita, a nutritional supplement. The package is primarily orange and white," etc., not the full details shown in your output. Can you let me know why? I used the same System_prompt, Instructions, etc.

@AIwithTarun 4 days ago

Interesting. Can you set temperature = 0? And may I know which LLM you are using?

@stalinmurugesan9744 4 days ago

@@AIwithTarun Thank you for your prompt response. I used Gemini Flash 2.0.

@AIwithTarun 4 days ago

@ Can you set temperature = 0 and rerun the code?

@subhashchandra3318 1 day ago

Please make more end-to-end RAG LangChain projects with Groq, the Gemini API and Ollama.

@AIwithTarun 1 day ago

@@subhashchandra3318 We already have 8-9 videos on that on my channel. But yes, project-based videos are pending. Maybe in the second week of January.

@nikith15 4 days ago

Do you provide AI agent building services?

@AIwithTarun 4 days ago

@@nikith15 As of now, no. Maybe from next month or February.

@ArpitSingh-wp6yx 4 days ago

Bro, any video on building agents for CRM work?

@AIwithTarun 4 days ago

I haven't planned for it yet; I need to think about it. Meanwhile, if you have any questions about building it, join our Discord channel and we can discuss it there.

@SuhasB-ke6mu 4 days ago

I want to create a LocalRAG system (chat with PDF) using Llama 3.2 and text embeddings. However, the results often include hallucinated information. Do you have any suggestions on how to train and test the model to ensure the system provides accurate answers?

@AIwithTarun 4 days ago

There are various factors to check when you are working on RAG with open-source LLMs:
- Have you used the prompt template that Llama 3.2 expects? If your context is being extracted correctly, you need to augment your prompt to reduce hallucinations (this is not 100% reliable, but it reduces the risk).
- On the retriever side, check whether the relevant documents are actually retrieved for the user query. This is where you should try CRAG or re-ranking to improve performance.
You can join our Discord server and we can take this discussion further to see where things are going wrong.
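
The two checks above can be sketched roughly as follows. This is only an illustration under stated assumptions: rerank and build_grounded_prompt are hypothetical helper names, the word-overlap score is a toy stand-in for a real re-ranker (such as a cross-encoder) or for CRAG, and the prompt wording is one possible way to ask the model to stay within the retrieved context. It does not apply the Llama 3.2 chat template, which would be added by whatever library serves the model.

```python
def rerank(query, chunks, top_k=2):
    """Order retrieved chunks by shared words with the query (toy scoring)."""
    query_words = set(query.lower().split())

    def overlap(chunk):
        # Count the words the chunk has in common with the query.
        return len(query_words & set(chunk.lower().split()))

    # sorted() is stable, so ties keep their original retrieval order.
    return sorted(chunks, key=overlap, reverse=True)[:top_k]


def build_grounded_prompt(query, chunks):
    """Augment the prompt so the model is told to answer from context only."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up chunks standing in for what a retriever returned from a PDF.
chunks = [
    "The warranty period is two years from purchase.",
    "Our office is closed on public holidays.",
    "Warranty claims need the original receipt.",
]
query = "How long is the warranty period?"

top_chunks = rerank(query, chunks)
prompt = build_grounded_prompt(query, top_chunks)
print(prompt)
```

Printing the retrieved chunks before they reach the model is a quick way to tell whether a wrong answer comes from retrieval or from generation.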

@AIwithTarun 4 days ago

Here is my repo: github.com/lucifertrj/Awesome-RAG/ Most of the Colab notebooks there use open-source LLMs.