Multimodal AI Agents Are Revolutionising Image & Video Analysis!

4,559 views

Mervin Praison

Days ago

Comments: 54
@RizwanRizwan2R 18 days ago
Absolutely brilliant, Mervin. Thanks for sharing the knowledge 🎉 👍
@MervinPraison 18 days ago
Thank you
@RizwanRizwan2R 18 days ago
@MervinPraison Please look, I am getting this error: Traceback (most recent call last): line 15, in task1 = Task( TypeError: Task.__init__() got an unexpected keyword argument 'images'
@MervinPraison 18 days ago
@RizwanRizwan2R Please upgrade to the latest version: pip install -U praisonai
@RizwanRizwan2R 17 days ago
@MervinPraison Thanks a lot, Mervin, it worked 😃
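Editor's note: the TypeError in this thread means the installed Task class predates the images parameter, which is why upgrading resolved it. A minimal sketch of how one might check for keyword support before calling a constructor is shown below. OldTask and NewTask are hypothetical stand-ins written for this example, not PraisonAI code; with the real library you would pass the imported Task class instead.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if cls.__init__ accepts `name` as a keyword argument."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for an older release without image support.
class OldTask:
    def __init__(self, description, agent=None):
        self.description = description
        self.agent = agent


# Hypothetical stand-in for a newer release that accepts images.
class NewTask:
    def __init__(self, description, agent=None, images=None):
        self.description = description
        self.agent = agent
        self.images = images or []


if __name__ == "__main__":
    for task_cls in (OldTask, NewTask):
        supported = accepts_keyword(task_cls, "images")
        print(f"{task_cls.__name__} accepts images: {supported}")
```

If the check fails for the real class, upgrading the package as suggested in the reply above is the fix.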
@SonGoku-pc7jl 15 days ago
Thanks, this framework is a marvel :) Amazing!
@finnpoitier 15 days ago
Great video, thanks! Question about analyzing a video: do you think the LLM could create timestamps for the individual scenes it identifies? This could be useful for automatic video cutting and repurposing.
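Editor's note: whether the model itself can emit reliable timestamps is not answered in this thread. As a point of comparison, a common non-LLM way to get scene boundaries is to threshold a per-frame difference score. The sketch below assumes such scores have already been computed by some other tool; the function names, the scores, and the threshold are illustrative and are not part of PraisonAI.

```python
def format_timestamp(seconds):
    """Format a number of seconds as MM:SS."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def scene_starts(diff_scores, fps, threshold):
    """Return scene start times in seconds.

    diff_scores[i] is an assumed measure of how much frame i + 1 differs
    from frame i. A score above `threshold` is treated as a cut, so a new
    scene starts at frame i + 1.
    """
    starts = [0.0]  # the first scene always starts at the beginning
    for i, score in enumerate(diff_scores):
        if score > threshold:
            starts.append((i + 1) / fps)
    return starts


if __name__ == "__main__":
    # Made-up difference scores for a six-frame clip sampled at 1 frame per second.
    scores = [0.1, 0.9, 0.1, 0.1, 0.8]
    for start in scene_starts(scores, fps=1, threshold=0.5):
        print(format_timestamp(start))
```

The resulting start times could then be handed to a cutting tool, or passed to an agent as context for describing each scene.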
@orafaelgf 15 days ago
Great, but wouldn't it be better to use the CrewAI framework directly?
@yazanrisheh5127 18 days ago
How exactly does it understand videos? Does it transcribe the video, or does it cut the video into frames and connect each frame image with text?
@maxcurrent485 18 days ago
There's probably more than one way, but for a good idea of how it's likely being done, look up the Divot research paper that Tencent released along with the Divot LLM earlier this month on Hugging Face or arXiv.
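Editor's note: the thread does not confirm how the tool processes video. One widely used approach for vision models, and the second option raised in the question above, is to sample a handful of frames and send them as images. The sketch below shows only the index selection step under that assumption; the function name is illustrative, and decoding the frames would need a video library that is not shown here.

```python
def sample_frame_indices(total_frames, num_samples):
    """Pick num_samples frame indices spread evenly across a video."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    # Take the middle frame of each equal-width segment.
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # A 100-frame clip reduced to 4 representative frames.
    print(sample_frame_indices(100, 4))
```

Sampling from the middle of each segment avoids always picking the first frame, which is often a fade-in or title card.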
@FREDDYHulsey 15 days ago
The price movement of Web3 Infinity has been positive. It's building up speed!
@orkutmuratyilmaz 18 days ago
Awesome! Can you make a video about connecting a Streamlit UI with Praison agents?
@MervinPraison 18 days ago
Sure
@60pluscrazy 18 days ago
Praison 🎉🎉🎉
@adamchan4403 17 days ago
Love this ❤
@brianWreaves 16 days ago
🏆 Mate you've really built an impressive tool! I take it voice is on the way???
@RDZ333 6 days ago
Hey, thanks for the info! I'm brand new to all of this. I found this because I'm looking to run an LLM locally and have fluid TTS conversations while watching a YT video, for example, or listening to a podcast and discussing it live together. Is this possible yet with low latency? I'm chatting with GPT about it and it says yes, but I'd like to ask you: is a multimodal split possible, where the model contextually processes audio and video from a CPU source while recognizing my voice separately and carrying on a fairly complex conversation? I'm running a 4080 mobile card, which I guess can run up to 13B parameters well, but I'm eyeing the new 5080 too. Although it can't handle a lot more parameters, I'm wondering if the latency will be drastically better due to the architecture. Hope this makes sense!
@abdulahadashraf8142 17 days ago
@MervinPraison I tried PraisonAI on Windows 10. I set the system environment variables, and also set them in the terminal using the 'SET' command. I tried 'gpt-3.5-turbo-0125', but PraisonAI always uses gpt-4o. How can I use different models? Thanks!
@FutureAIUpdates 18 days ago
Hi Mervin, thanks for sharing. Can it do 3D segmentation, for example on an .fbx or .obj file? Can you guide me, please?
@TomikoSturgeon 15 days ago
Web3 Infinity is about to go off the rails.
@verifili 16 days ago
Does PraisonAI support a multi-vector retriever?
@motouman3240 16 days ago
For the vision agent, if I change the LLM to llama3.2 or llava, would it still work? If it does, do I still need to use the OpenAI APIs?
@motouman3240 11 days ago
Hi @MervinPraison, any answers?
@adamchan4403 17 days ago
Does the video analysis work with gpt-4o-mini?
@MervinPraison 17 days ago
Yes
@SejalDatta-l9u 17 days ago
Hi Mervin. Is your solution capable of: 1) Dynamic inter-agent communication: a mechanism for creating a dynamic, conversational flow between agents, in other words, agents talking to each other? 2) An iterative process in which agents refine their output based on feedback from other agents and/or the user? 3) Short- and long-term memory? Thanks
@brianWreaves 16 days ago
In a previous video he covers #1 & #2.
@SejalDatta-l9u 16 days ago
@brianWreaves Thanks, Brian. Could you share the link to the video that you're referring to? Either I've missed something, haven't explained myself quite right, or the solution that Mervin has made wasn't quite fitting. Either way, happy new year to you all!
@Tech--Sphere 18 days ago
Thanks for the guide! Can you also include code to integrate these with Gemini and Groq APIs?
@MervinPraison 18 days ago
Sure will do
@ibrahimVolkan-i8e 15 days ago
Web3 Infinity is paving the way in self-regulated crypto! 💎💡
@RajSingh-of1fs 17 days ago
Can you make a video explaining your GitHub repo and how we can clone it and then use the agent? Also, please help me use a Groq model; I'm not able to get it working. Please make more videos on coding.
@JonJon-nc1nb 17 days ago
DUDE! This is so good! But what choices exist for piping this through a UI? Even before there is a perfect (Praison UI) solution, how can we use something visually intuitive? I know you're working on it ;)
@brianWreaves 16 days ago
I think he covered a process to add a UI in a previous video, but I'm not certain...
@JonJon-nc1nb 16 days ago
@brianWreaves He said "Thinking of integrating to UI effectively" in the comments. I'm guessing that means it has not been done yet but is in the works.
@nasamind 18 days ago
Awesome
@rohithtechguy8991 18 days ago
It's super. Can you do it with the Gemini multimodal one, please?
@MervinPraison 18 days ago
Yes
@prasanthm888 16 days ago
Daaammmnn!!
@EmrahSeyfullah 15 days ago
Web3 Infinity is 🔥🔥🔥
@MireilleElkins 15 days ago
Web3 Infinity is Web3 Infinity.
@TERRENCEChrist 15 days ago
There are much better and safer alternative currencies, including Web3 Infinity, thus I'm no longer interested in those ones.