5 Problems Getting LLM Agents into Production

11,306 views

Sam Witteveen

days ago

In this video I discuss 5 common problems in building LLM Agents for production.
🕵️ Interested in building LLM Agents? Fill out the form below
Building LLM Agents Form: drp.li/dIMes
👨‍💻Github:
github.com/samwit/langchain-t... (updated)
github.com/samwit/llm-tutorials
⏱️Time Stamps:
00:00 Intro
00:58 Reliability
02:46 Excessive Loops
04:36 Tools
07:59 Self-checking
09:22 Lack of Explainability
10:09 Bonus: Debugging an Agent

Comments: 58
@BuildNewThings 20 days ago
crewAI creator here 👋 Thanks, great feedback! We are addressing a lot of that internally :) Great video!
@samwitteveenai 18 days ago
Hi João, big thanks for chiming in here. It really shows you are committed to what you are doing. I didn't want to sound like I was dumping on CrewAI. I really like what you are doing, and I think the issues raised in the video apply to all the "Agent" frameworks currently. Really look forward to seeing what you add to CrewAI.
@BuildNewThings 17 days ago
@samwitteveenai No worries, all feedback is welcome! As a matter of fact, I would love to jump on a call and show you some of the work we are doing to address this. How is next week for you? :)
@samwitteveenai 14 days ago
@BuildNewThings sounds good, let's connect on LinkedIn
@MrKrzysiek9991 18 days ago
This is hands down the best channel about AI. No hype-building or irrelevant tests. Just pure practice. Thank you!
@samwitteveenai 18 days ago
Thanks, I appreciate that!
@MartinBlaha 20 days ago
This completely reflects my experience, and I appreciate your summary a lot. Thank you for all the effort! I started with CrewAI and very quickly ended up spending 10 USD in one day on API calls. Then I realized I cannot really debug it. I also missed transparent, standardized AI DevOps, including tests. I'm currently experimenting with LangGraph, but I have started wondering why to use a framework at all. Not ready to decide yet. Anyway, thanks again, looking forward to following your journey 😊🤗
@tonyppe 20 days ago
AgentOps is your friend.
@squiddymute 20 days ago
LLMs are nice for subjective stuff like creative tasks (which is the last thing we want to use them for) and for having fun with things like chatbots etc. For real-world applications that require concrete results and some overall robustness, LLMs are like a time bomb. We are not there yet. There are people who trust that LLMs can be part of a robust application, which is what will cause the most issues with so-called AI in the long term. Skynet won't happen; stupidity will kill us first.
@tovanhai4930 20 days ago
Can't wait to see the next videos about building LLM Agents with LangGraph
@samwitteveenai 20 days ago
Hi, I have already made a few vids about building with LangGraph on the channel.
@silvias4808 10 days ago
OMG, your point is spot on! That's exactly the problem I had to deal with in my project.
@mrka_ 20 days ago
Very honest and to the point about the actual status of multi-agent AI to date.
@bpraghu 19 days ago
Thanks for the video. It just confirms my thoughts after fiddling with LangChain, Agents and Tools for the past year: the best thing to come out has been LangGraph. Loved your video and you have a sub.
@jamesyoungerdds7901 20 days ago
Amazing, thanks Sam! Huge fan, really loving and devouring your content. My first reaction was "ok good, it's not just me" 😅. But to your point, I think fine-grained agentic decisions with formatted/constrained outputs are key to mitigating some of those problems 💪. Even the idea (with the right models and speed/costs allowing) of having a few agents work on the same task (maybe with slight prompt variations?) and then using consensus/voting on the constrained output passed to the next agent is a decent approach.
@samwitteveenai 18 days ago
The "ok good, it's not just me" comment really makes me feel the video did its job. I talk agents all day with my team, and I realized that externally people were starting to think this was a solved area, which is so far from the truth. Totally agree on "formatted/constrained outputs" and will hopefully make some more vids about this soon.
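The consensus/voting idea raised in this thread can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular framework does it: `ALLOWED_LABELS`, `vote`, and `make_stub` are hypothetical names, and the "agents" are plain Python callables standing in for real LLM calls.

```python
from collections import Counter
from typing import Callable, Iterable, Optional

# Hypothetical constrained output space: every agent must answer with one of these.
ALLOWED_LABELS = {"approve", "reject", "escalate"}


def vote(
    agents: Iterable[Callable[[str], str]],
    task: str,
    min_agreement: float = 0.5,
) -> Optional[str]:
    """Return the label chosen by more than `min_agreement` of all agents, else None."""
    agents = list(agents)
    votes = []
    for agent in agents:
        answer = agent(task).strip().lower()
        # Anything outside the constrained set is discarded rather than interpreted.
        if answer in ALLOWED_LABELS:
            votes.append(answer)
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    # Agreement is measured against all agents, so invalid answers count against consensus.
    return label if count / len(agents) > min_agreement else None


def make_stub(answer: str) -> Callable[[str], str]:
    """Build a fake agent that always returns `answer` (stand-in for an LLM call)."""
    return lambda task: answer


decision = vote(
    [make_stub("Approve"), make_stub("approve "), make_stub("reject")],
    "Should this refund be approved?",
)
print(decision)
```

Returning `None` when there is no clear majority gives the calling code an explicit point to fall back to a human or a retry instead of passing a weak answer to the next agent.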
@homeautomationlab1533 18 days ago
Great video, thanks.
@micbab-vg2mu 20 days ago
Sam, great video - I totally agree with the content. In my business (I work for a pharmaceutical company), the minimal reliability threshold is 95%. At the moment, I use agents as part of the experimentation process with AI, not for production. I think in a few months with GPT-5, they will work better and be more useful.
@samwitteveenai 20 days ago
GPT-5 might not be as close or as good as you think, from what I am hearing 😀
@enceladus96 7 days ago
2:26 just got trippy real quick
@MeinDeutschkurs 20 days ago
I'm sure that one day it will simply work to write background stories and a team that does whatever. But currently I can only have my own functions, producing single-word outputs, to decide how to use and/or work with the current user prompt. Similar to dialects, models cannot all be prompted the same way. A Prompt delivers on models p * (n + m )^( n + m)-1 results. And this is just the input. It is tedious, but a start. And I agree on the decision point thing.
@waneyvin 20 days ago
I would suggest introducing critics and SOPs to the agent operations. What do you think?
@hawa7264 20 days ago
I really like your channel, but recently actually *watching* it has become a lot less enjoyable with all the nonsensical stock videos in the background. I would much rather have the classic talking head or even just a still frame than those stock videos that don't add anything to the actual content.
@diallo9149 20 days ago
AI-generated video!!!
@willjohnston8216 20 days ago
I completely agree. It is so distracting to have the cheesy stock videos. Please just have an ocean and beach or something that is not distracting in the background.
@samwitteveenai 20 days ago
Sorry guys. I gave the editor an audio-only track and that's why it came out like this. Interesting points about the ocean, beach, etc. I am finally back in my main home so will stick my head on some of the vids soon.
@unclecode 20 days ago
Indeed, you are pointing out a real issue. With your permission, I'd like to add one more. The tech ecosystem is currently hyped around large language models (LLMs), causing people to forget that many tasks used to be handled by smaller, task-specific models. The push from VCs and the hype around LLMs lead to the misconception that LLMs are needed for everything, creating immature products and numerous issues. Most problems that people try to solve with LLMs and so-called agents could often be addressed with small models or even without machine learning. A better approach is to integrate LLMs as part of a larger system, allowing for incremental improvements as new research emerges, rather than building entire systems based on LLMs due to hype. So, to add to your five problems, the sixth is: Don't build agents out of hype, and don't design entire systems on LLMs.
@samwitteveenai 18 days ago
Totally agree!! The small, specific models point is such a big issue. Maybe I will make a vid just about this. I see startups and companies using 7B+ models to do reasonably simple things like NER, sentiment, etc. Really quite crazy.
@runmicteam 20 days ago
"Alright"? What happened to the iconic "Okay" intro? I even used it as a tribute to you in my demo video.
@kakashisensie100 20 days ago
I thought I opened the wrong video
@midnightmoves7976 20 days ago
We have just implemented this kind of service. We needed 6 tools and two agents. Not an easy job. Lots of time was spent writing logic for errors.
@samwitteveenai 20 days ago
Yeah, I feel people are underestimating the work that goes into something like that.
@Aidev7876 20 days ago
Was it worth it, or are agents just kind of trendy and cool for the resume?
@midnightmoves7976 20 days ago
@Aidev7876 We felt that agents were needed to direct the flow of the logic. Agents are just the name given to more than one prompt the LLM can follow.
@JacuelineSatu 20 days ago
@midnightmoves7976 I like the way you defined the word agent. Nice.
@ChrisDermody 19 days ago
Really useful video, thank you. Is that website diffing tool open source somewhere? You reminded me I need something similar.
@samwitteveenai 18 days ago
It isn't currently, but I guess since I have shown it I might make it open. I plan to make a video about building custom tools, so I might go in depth with that one.
@ChrisDermody 18 days ago
@samwitteveenai Awesome, looking forward to that video. Subscribed
@khaledsaud6677 2 days ago
Really insightful video, Sam! Glad to find your channel. Could you share your opinion on Haystack? I haven't seen a comparison between Haystack and LangGraph for real product development. Which would you recommend for creating scalable and complex LLM agents with more than 30 nodes?
@samwitteveenai 3 hours ago
I haven't used Haystack in a long time. Have they added more stuff related to Agents? I will try to take a look and see what they have been up to. For something complex nowadays I would either use LangGraph or code it from scratch.
@shApYT 20 days ago
Replacing your employee with an LLM is like hiring a virtuoso teenager on LSD.
@samwitteveenai 18 days ago
Congratulations!! You win the internet today!!😀
@willjohnston8216 20 days ago
Sam, your channel is my favorite on AI, but the stock video footage is very distracting. Keep up the great work!
@samwitteveenai 20 days ago
Sorry, I just gave the editor an audio track. Your idea about an ocean and a beach is interesting. What if it was just some slides with my notes etc.? How would you feel about that?
@sergeziehi4816 20 days ago
In the not-so-distant past there were bots called RPA bots, very powerful for repetitive processes based on predefined steps. They excelled at reliability in a stable context but lacked reasoning capabilities and were not capable of any kind of self-reflection. The dream was to have cognitive capability one day in the future, maybe. One day generative AI with LLMs and reasoning emerged. Suddenly the automation industry started to think about how to automate work with this, and RPA bots were forgotten. But agents created with LLMs started showing their limits in real business: they hallucinate, and they sometimes reflect wrongly and too much where a straight action is enough. Then someone recalled that straight actions are exactly where RPA bots shine. The roles for GenAI LLM-based agents and RPA started to become clear: orchestration with reasoning capabilities seems to be the place for agents, with actions or tooling devoted to RPA. It's the dawn of a new era of possibilities, where a smart combination of GenAI action-orchestrator agents and RPA bots could create the best possible business process automation solutions for corporations.
@samwitteveenai 20 days ago
I worked on building RPA tools in the past using LLMs with CV models and agree with the sentiment here.
@tonyppe 20 days ago
Where can people like us go to chat and discuss how to tackle the problems raised in this video? I think there are a ton of us that are trying to use these tools and failing to achieve reasonable outcomes. I am actively focusing on CrewAI usage and improvements, and I currently have a crew with 7 agents and 8 tasks which works...... unreliably. I was also trying to find out what the best practices are for defining the agent's description (i.e. personality) as well as the specific task details, such as what should be contained, what should be avoided, and how it should be structured.
@samwitteveenai 18 days ago
You think I should start a Discord? The main reason I haven't is I don't have a lot of time etc.
@gordoncandelin7688 20 days ago
I feel seen! 😅
@curtkeisler7623 20 days ago
Truth
@tonyppe 20 days ago
CrewAI has some flaws where the prompts sent to the AI agents actually confuse them. It works better with ChatGPT, but even ChatGPT 4o gets confused and sends back a response which doesn't exactly match the tool schema, so CrewAI sends the agents around in loops where they're just chatting BS amongst themselves trying to figure it out. Agentic chatting needs to be extremely clear and specific so that the response can invoke tools or assign work to another agent.
@samwitteveenai 18 days ago
I think all the "pre-made" agent frameworks are similar: they were usually written for OpenAI GPT-4 and need some customization for other models.
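The failure mode described in this thread, a response that does not match the tool schema followed by an open-ended loop, can be guarded against with strict validation plus a hard retry cap. The sketch below is a minimal illustration using only the standard library. It is not CrewAI's API: `SEARCH_TOOL_SCHEMA`, `parse_tool_call`, `call_with_retries`, and `fake_llm` are all hypothetical names, and the model is simulated with canned replies.

```python
import json
from typing import Callable, Optional

# Hypothetical tool schema: required argument names mapped to expected Python types.
SEARCH_TOOL_SCHEMA = {"query": str, "max_results": int}


def parse_tool_call(raw: str, schema: dict) -> Optional[dict]:
    """Return the arguments if `raw` is a JSON object matching the schema exactly, else None."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Reject non-objects and any missing or extra keys.
    if not isinstance(args, dict) or set(args) != set(schema):
        return None
    # Reject values of the wrong type.
    if not all(isinstance(args[key], expected) for key, expected in schema.items()):
        return None
    return args


def call_with_retries(
    llm: Callable[[str], str],
    prompt: str,
    schema: dict,
    max_attempts: int = 3,
) -> dict:
    """Ask for tool arguments, re-prompting on invalid output at most `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        args = parse_tool_call(llm(prompt), schema)
        if args is not None:
            return args
        # Tell the model exactly what was wrong instead of letting it chat freely.
        prompt += (
            f"\nAttempt {attempt} did not match the required keys {sorted(schema)}. "
            "Reply with a JSON object only."
        )
    # Fail loudly rather than looping forever.
    raise RuntimeError(f"No valid tool call after {max_attempts} attempts")


# Simulated model: first reply is chatty and invalid, second reply is valid JSON.
replies = iter(["Sure! I'll search for that.", '{"query": "crewAI docs", "max_results": 3}'])
fake_llm = lambda prompt: next(replies)

result = call_with_retries(fake_llm, "Find the crewAI docs.", SEARCH_TOOL_SCHEMA)
print(result)
```

Capping attempts and raising an error turns a silent, costly loop into a visible failure that can be logged and debugged.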
@s11-informationatyourservi44 20 days ago
I've noticed that the biggest bottleneck, in my experience, is that people forget operations planning principles don't and won't change. A bad manager will still struggle with AI agents just as with humans, but it doesn't hurt as badly as losing capital on lost salary.
@edellenburg78 20 days ago
Your audio is a little low for TV watchers. It's not horrible, but I would prefer it a little louder from the editor.
@dr.mikeybee 12 days ago
I've been very disappointed with all these frameworks. They do some things well, but they all handle context assembly very badly, and I think context assembly is the most critical part of agent building.
@teprox7690 20 days ago
Are you sure that you discuss it? Or is an agent discussing it for you? Is it AI generated?
@samwitteveenai 20 days ago
I wish it was AI generated. It would be much less of a pain with the editing issues etc.