How to Use LLM Vision to Analyze Camera Images and Video in Home Assistant

Views: 11,586

Michael Leen

1 day ago

Comments: 74

@michaelsleen 1 month ago

👉 Let me know any creative ways you’re using LLMs or would like to use them in your home automations!

@brian7android985 14 days ago

I am working towards the house finding a book for me. It should then flash the nearest LED. [I have lots of books, but also lots of LEDs.]

@michaelsleen 4 days ago

Sounds like an interesting project. Good luck!

@dandixonus 26 days ago

Identify birds at a feeder

I just used this to process images from a camera I pointed at my bird feeder to identify the common bird name (and the scientific name). When motion is detected, I capture an image from my Reolink camera and send it to Google for processing. When the bird name comes back, I copy the file to a new file with the bird's name in it. My prompt is this:

- If there's a single bird in this photo, write the bird type, scientific name
- If more than 1 bird, for each type: write bird type, scientific name, number of this type
- If there is no bird, write "No Bird Found"
- If you don't know the bird type, write "Bird Type Unknown"

This video was super helpful. Thank you!
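For readers who want to try the prompt above, here is a minimal sketch of how it might be passed to LLM Vision. It assumes the `llmvision.image_analyzer` action with `provider`, `image_file`, `max_tokens`, and `message` fields; the provider ID and the snapshot path are placeholders that do not come from this thread, and your installed version may require additional fields.

```yaml
# Sketch only: provider ID and file path are placeholders, not from the thread.
# Check the field names against the LLM Vision version you have installed.
- action: llmvision.image_analyzer
  data:
    provider: YOUR_PROVIDER_ID                   # placeholder for your configured provider
    image_file: /media/images/bird_feeder.jpg    # hypothetical snapshot path
    max_tokens: 100
    message: |
      - If there's a single bird in this photo, write the bird type, scientific name
      - If more than 1 bird, for each type: write bird type, scientific name, number of this type
      - If there is no bird, write "No Bird Found"
      - If you don't know the bird type, write "Bird Type Unknown"
  response_variable: response                    # later actions can read the result from here
```

The fixed phrases "No Bird Found" and "Bird Type Unknown" give later steps something exact to test for before renaming or copying the file.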

@michaelsleen 26 days ago

That's awesome, and thanks!

@oxxide 4 days ago

This is amazing, but my issue is that it doesn't show the full response. I can see full responses in notifications from other apps, so I don't think it's a phone issue. It cuts off; for example, my last notification read "The image shows a front porch with a brick wall, a door with a securit....", which I fixed by sending two notifications. For some reason, though, the date and time aren't working for me. I also have a G4 Doorbell, and last_changed doesn't return a date in the automation, but it does when I check it with templates.

@michaelsleen 3 days ago

What you’re describing may just be the expected behavior. You can try telling the LLM in your prompt to limit the description to a certain number of characters. But if it’s too long, long pressing on the notification on iOS at least should reveal the full message and show a larger image of the snapshot photo. To my knowledge, there is nothing on the HA side that would limit the amount of text shown - this is enforced by Apple. See here for an example: drive.google.com/drive/folders/1KwET0jCuqH9LbVqm8VUQTJprmTYaZtus

@AlanJones-vc9yr 4 days ago

Hi. Great video and thank you. I am using a Ring doorbell camera. All works fine except the notifications via the HA app on my iPhone are truncated, so I only get the first 100-ish characters of the description. Help please. TIA

@michaelsleen 3 days ago

@AlanJones-vc9yr 3 days ago

@michaelsleen Hi. Thanks for the reply. The message is actually truncated before it reaches the HA app on the iPhone. It stops mid-word. I have tried requesting a shorter response, which reduces the character count, but the message is still truncated??? Flummoxed :-(

@michaelrich4872 22 days ago

Got it up and running thanks to your tutorial. Hopefully I can get it up and running for vehicles too, just waiting to trigger it. Next step is figuring out how to get it to ignore our vehicles, coming home is relatively easy, leaving not so much.

@michaelsleen 21 days ago

I am glad to hear the tutorial was helpful!

@robinpillekers4871 14 days ago

Hi, great vid you have here. Do you know a way to retrieve the message variable input / metadata (from the normal HA automation flow) into a Node-RED flow (payload)?

@michaelsleen 14 days ago

You know despite years of playing with HA, I’ve yet to dive into Node-RED.

@leightonevans1071 4 days ago

How do I know what file path to use for a snapshot image? Can someone give a step-by-step guide? On a side note, I find most HA videos assume learners know the steps in between!

@michaelsleen 4 days ago

It's really up to you to decide. As an example, you could use the path /media/images/doorbell_snapshot.jpg for the camera.snapshot action, and then the path /media/local/images/doorbell_snapshot.jpg for the action that sends the image in a push notification to your phone. You can use something other than 'images' or 'doorbell_snapshot.jpg' if you prefer another name or organization structure.
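As a minimal sketch of the two paths described above, the pair of actions might look like this. The camera entity `camera.doorbell` and the notify target `notify.mobile_app_my_phone` are hypothetical names used for illustration; substitute your own.

```yaml
# Sketch only: entity and notify names are placeholders, not from the thread.
- action: camera.snapshot
  target:
    entity_id: camera.doorbell                   # hypothetical camera entity
  data:
    filename: /media/images/doorbell_snapshot.jpg          # where the snapshot is written
- action: notify.mobile_app_my_phone             # hypothetical Companion app target
  data:
    message: "Someone is at the door."
    data:
      image: /media/local/images/doorbell_snapshot.jpg     # same file, as referenced in the notification
```

Because the filename is fixed, each new snapshot overwrites the previous one.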

@88Snipi88 3 days ago

Is it possible to send a message only when the postman has arrived?

@michaelsleen 3 days ago

In the LLM description you can tell it to only comment if the mail carrier is seen (it correctly identifies ours every time). Then you would need some kind of condition to only proceed with the rest of the automation if the response variable for the LLM contains a response. I haven’t tried this exactly.
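The condition described above could be sketched as a template condition placed between the LLM Vision action and the notification. This assumes the response variable is named `response` and that the text is exposed as `response.response_text`; both are assumptions to check against your own automation, and like the reply above, this is untested.

```yaml
# Sketch only: assumes the LLM Vision action stored its result in `response`
# and that the generated text lives in `response.response_text`.
- condition: template
  value_template: >-
    {{ response is defined
       and (response.response_text | default('') | trim | length) > 0 }}
```

Models do not always return an empty string when told to stay silent, so it may be more reliable to ask for a fixed phrase when no mail carrier is seen and test for that phrase instead.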

@wscottfunk 1 month ago

Hey Michael, is the image retention automatically managed or is there a "purge" option to set a maximum number of snapshots to retain or to overwrite the oldest? I could see this taking up file storage if the images aren't deleted/overwritten. Nice job with the tutorial. Much appreciated.

@michaelsleen 1 month ago

Each image overwrites the prior. Thanks!

@Shaq2k 1 month ago

Nice. Can you train the language models? If you have 2 cats, for example, can the language model be trained to know their names and differentiate them?

@michaelsleen 1 month ago

There was a similar discussion about using this to train facial recognition in a post I made on Facebook. See here: facebook.com/share/p/12BUp2ESnMz/?mibextid=WC7FNe

@VWTesla 1 month ago

So, I'm intrigued by this video. My use case is to characterize my incoming USPS mail. I'm using the "Mail and Packages" HACS integration to generate an MP4 containing the USPS mail being delivered (USPS Informed Delivery). Now, I already see the MP4 on my Home Assistant dashboard upon hitting a button. What I want to do is to have AI read the images and let me know if they're addressed to "Resident" or "Home Owner" versus myself, my wife, or my kids. I'm currently using a home grown solution in Python (Pytesseract) but I believe there might be a better solution?

@michaelsleen 1 month ago

Interesting. If you feed the images to the AI model you can try giving it a prompt like that.

@FawziBreidi 17 days ago

I was trying to make the voice assistant run a vision LLM of some sort, so that it could pass the prompt to the vision LLM after it captures the image, but I was unable to make the assistant trigger a script. For example, I would like to ask my assistant if there is any car parked outside in my garage and have it analyze the image. Sorry for the long message, but if this is possible, please let us know!

@michaelsleen 16 days ago

What is the trigger for your automation? Do you have a camera with vehicle detection? You can try giving LLM Vision a prompt like, "Tell me if there is a car in the image. If so, describe what it looks like in one sentence."

@FawziBreidi 16 days ago

@michaelsleen I was researching it, and it looks like we need to create new intents to instruct it to run the vision LLM. Might be a good idea to research and do.

@alexfroehlich711 1 month ago

How do you generate a preview (snapshot) of the camera feed in the Home Assistant notification? I use the exact same camera with Protect, and am on iOS. I can long-press the notification, which opens up the Protect app, but no previews. I bought your code and followed it line for line. Everything is working besides that. Thanks!

@alexfroehlich711 1 month ago

Disregard. Realized I have to use my domain link like you did (through my Cloudflare tunnel); I couldn't just use my internal IP. Thanks

@michaelsleen 1 month ago

Glad to hear it’s working!

@alexfroehlich711 1 month ago

@michaelsleen “A barefoot dude, looking like he just lost a bet, flips off the security camera with a surprised Pikachu face.” I love these responses 😂 This is quickly becoming my favorite automation already.

@mutley247365 6 days ago

Hey, I've followed this to the letter. First off, I'm not getting any image through to my phone (Samsung), but I am getting the AI-generated response. Is this possible for Android devices?

@michaelsleen 6 days ago

Yes, it's possible. The only Android nuance I'm aware of is the AI-generated response may be cut-off if it's too long. Check to see if the camera image is being saved to the folder path you put in your automation for generating the camera snapshot.

@samiam732 13 days ago

Do you think HA will ever be made easier to use? I resist it and don't like it because it doesn't seem very user friendly. I like the way Homey seems but it's expensive.

@michaelsleen 13 days ago

Yes, I expect such things to get easier to use over time. In my ~3 years using Home Assistant, so many things have gotten easier. It’s really come a long way.

@kevinallen500 8 days ago

Define easy. Even as they improve it, it's not an off-the-shelf plug-in product. You need to have enough tech experience to use it. Plug and play is likely years away. There are a ton of videos on how to set it up, but again, you need to have some tech experience.

@alingabrielafloarei3499 1 month ago

Great video. Is this not the same as generative AI?

@michaelsleen 1 month ago

Generative AI is a broader term that includes LLMs. So, the use of LLMs as shown in this video falls within the scope of Gen AI.

@iMazTV 1 month ago

Let’s go!! 🔥

@marcusagren2838 8 days ago

On my OnePlus 9 Pro (Android), the LLM text doesn't really fit in the notification if I also attach the snapshot. I only see "Here is a description of the image in one sentence..." There is no way to expand the notification so I can see the full description.

@michaelsleen 8 days ago

I’ve seen others with Android say something similar. This is not an issue on iOS, and I don’t have an Android phone.

@marcusagren2838 7 days ago

@michaelsleen Yeah, maybe a bug in the Android companion app or OS. For now, I split the notification in two: one for the snapshot and one for the LLM text.
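The two-notification workaround described above might be sketched as follows. The notify target and snapshot path are hypothetical placeholders, and `response.response_text` is assumed to hold the LLM Vision output.

```yaml
# Sketch only: notify target and path are placeholders, not from the thread.
# First notification carries the snapshot with a short caption.
- action: notify.mobile_app_my_phone
  data:
    message: "Snapshot from the front door"
    data:
      image: /media/local/images/doorbell_snapshot.jpg
# Second notification carries only the LLM text, with no image attached.
- action: notify.mobile_app_my_phone
  data:
    message: "{{ response.response_text }}"
```

Keeping the image out of the second notification is what leaves room for the longer description.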

@dimitrisdimitriou9769 1 month ago

Thank you. I set everything up and it is working very well, but the notification on Android has a problem: when the notification has an image, only two lines of the message are shown.

@michaelsleen 1 month ago

I’ve heard others with Android also talk about this limitation. I’m on iOS and do not have this limitation. Otherwise I’m glad it’s working for you!

@mazi2be 13 days ago

Is the Gemini API free? Or is there a limited number of free prompts? How does it work?

@michaelsleen 13 days ago

I am using the Google Gemini API Free Tier, so it doesn’t cost me anything. There are rate limits, but I’ve yet to hit them.

@PhilBlancett 8 days ago

Are you deleting images after so many have accumulated?

@michaelsleen 8 days ago

I believe only the most recent image is saved.

@PhilBlancett 8 days ago

@michaelsleen You should check that, because I believe you need to add another script to make sure, or your drive is going to fill up (eventually).

@michaelsleen 7 days ago

I double-checked. Each image overwrites the prior.

@Shunopoli 1 month ago

I bought the YAML and no matter what I do I get: Message malformed: template value is None for dictionary value @ data['actions'][3]['data']

@michaelsleen 1 month ago

Reach out on the Contact page, and I'll get you sorted out: shop.michaelsleen.com/pages/contact

@fightingmajor 1 month ago

Getting this error from using your code. Error rendering data template: UndefinedError: 'response' is undefined

@michaelsleen 1 month ago

Did you try naturally triggering the automation? For example, if your automation is set to trigger based upon motion or a person detected at the camera, try re-creating that by walking in front of the camera and see if it works. If I just click “run” to test out the notification, I also get that error because no response variable exists yet from LLM Vision. But the automation works perfectly for me every time it is naturally triggered by a person being detected at my front video doorbell. And I know several others are using my code successfully. Let me know so I can get you sorted out!

@RakshitPithadia 28 days ago

@michaelsleen Was facing this same error and realized the actual automation works :) Thanks for making this detailed video!

@eierund 19 days ago

@michaelsleen I'm confused. Why would the response variable not exist if the automation is triggered manually? It would still run the LLM integration first and therefore create the response variable, no?

@michaelsleen 19 days ago

@eierund You can run the entire automation, or you can run specific actions within the automation. If you run the entire automation, it should work. But if you run just the action that sends a notification to your phone, it will not work and will instead present the error message: Error rendering data template: UndefinedError: 'response' is undefined. Regardless, the automation itself still works when triggered.
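One way to keep the notification action from failing when it is run on its own is to guard the template, as in this minimal sketch. The notify target is a hypothetical placeholder, and `response.response_text` is assumed to be where the LLM Vision text is stored.

```yaml
# Sketch only: notify target is a placeholder; `response` is the variable
# that the earlier LLM Vision action is assumed to populate.
- action: notify.mobile_app_my_phone
  data:
    message: >-
      {{ response.response_text if response is defined
         else 'No description available (action was run on its own).' }}
```

With the guard in place, the fallback text is sent whenever `response` has not been set by an earlier action, while a normally triggered run still sends the LLM description.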

@h3ld3rk1d 7 days ago

Hello, I'm trying to send it to Telegram but no luck. Can you send the response to Telegram? How? Thanks

@michaelsleen 7 days ago

I do not use Telegram so cannot comment on it.

@h3ld3rk1d 7 days ago

@michaelsleen Thanks, my friend. I will try to find a solution.

@TheRealDanielsan 5 days ago

I can't believe you paywalled the YAML...

@michaelsleen 4 days ago

I share everything you need to know both in my video and in a written article on my website, all for free. It is not necessary to pay a small fee for the code, but for those who want it to be as quick and easy as possible, I make that option available, and I'm not the only one to do so. Thanks for watching.

@aijii 1 day ago

I can't believe you're complaining about a few bucks

@clsferguson 1 month ago

Selling the automation? Really?

@michaelsleen 1 month ago

Producing quality reviews and tutorials requires a large investment of time. My videos are free and show everything you need. To make it even easier for others, I invest additional time in creating and sharing the Blueprints, code, etc., and you can access these for a small fee.

@clsferguson 1 month ago

Better to invest in yourself. Quality generates subscribers/views. Rely on the potential ad revenue/sponsors. More people will watch/share/continue to watch if you don't put the YAML behind a paywall. This is a hot topic right now, and someone else will outrun you with views because of it.

@EricHernandez91 1 month ago

@clsferguson What's wrong with him selling a shortcut to people who don't want to sit through an entire video that shows you exactly how to do it for free?

@Delyn 1 month ago

@clsferguson Tell me you’re not a creator, without telling me you’re not a creator.

@clsferguson 1 month ago

@Delyn Hmm.. curious, do you charge for any Home Assistant automations you have written?