Using LangChain Output Parsers to get what you want out of LLMs

37,189 views

Sam Witteveen

1 day ago

Output Parsers Colab: drp.li/bzNQ8
In this video I go through what output parsers are and how to use them in LangChain to improve the results you get out of your models.
My Links:
Twitter - / sam_witteveen
Linkedin - / samwitteveen
Github:
github.com/samwit/langchain-t...
github.com/samwit/llm-tutorials
00:00 Intro
04:56 Structured Output Parser
12:26 CommaSeparatedList OutputParser
14:13 Pydantic OutputParser
19:00 OutputFixingParser
21:26 RetryOutputParser
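
For reference, here is a minimal sketch of the structured-output pattern the video walks through, assuming the LangChain API of the time (mid-2023); the schema and question are illustrative:

```python
from langchain.llms import OpenAI
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain.prompts import PromptTemplate

# Describe the fields you want back; the parser generates the format instructions.
response_schemas = [
    ResponseSchema(name="answer", description="answer to the user's question"),
    ResponseSchema(name="source", description="source used to answer the question"),
]
parser = StructuredOutputParser.from_response_schemas(response_schemas)

prompt = PromptTemplate(
    template="Answer the user's question as best you can.\n{format_instructions}\n{question}",
    input_variables=["question"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

llm = OpenAI(temperature=0)
output = llm(prompt.format(question="What is the capital of France?"))
print(parser.parse(output))  # -> {'answer': ..., 'source': ...}
```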

Comments: 91
@magicofjafo · 1 year ago
Dude, I was having exactly this output parser problem just last night. This is exactly what I needed. Thanks.
@pankymathur · 1 year ago
Thanks a lot Sam, I really like the way you went deep into explaining all the different types of parsers with examples. This is definitely one of the most top-notch videos you've released, keep it up 😊
@rashidquamar · 1 month ago
Thanks Sam, I was struggling with output parsers and you helped right on time.
@kuhajeyangunaratnam8652 · 2 months ago
Thanks a lot mate. This is as invaluable as it gets: a code walkthrough with all the explanations. Not to mention the code itself is well documented.
@toddnedd2138 · 1 year ago
Thank you. There are pros and cons with LangChain. It is a powerful framework, but sometimes it is (imho) a little too verbose when it comes to prompt templates. This adds up if you make a lot of requests to the model and costs unnecessary tokens (in production-ready applications). Therefore I use my own hand-written prompts; the crucial thing is finishing the prompt with the final instruction. Here is an example:
---
The result type 2 should be provided in the following JSON data structure:
{
  "name": "name of the task",
  "number": number of the task as integer, // integer value
  "tool": "name of the tool to solve the task", // omitted
  "isfinal": true if this is the last task, // boolean value
}
Respond only with the output in the exact format specified, with no explanation or conversation.
---
So far, this has always worked reliably (over 500 calls). I found this in a lot of papers, so the credit for this goes to some other intelligent guys. From my experience, the names of the fields and also the order of the fields can make a difference.
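
For illustration, a minimal sketch of this hand-rolled approach: the prompt ends with the JSON format instruction and the reply is parsed with plain json.loads. call_llm is a hypothetical stand-in for whatever model client you use:

```python
import json

FORMAT_INSTRUCTION = """
The result should be provided in the following JSON data structure:
{
  "name": "name of the task",
  "number": 1,
  "tool": "name of the tool to solve the task",
  "isfinal": true
}
Respond only with the output in the exact format specified, with no explanation or conversation.
"""

def ask_for_task(task_description: str, call_llm) -> dict:
    # Finish the prompt with the final format instruction, as described above.
    prompt = task_description + "\n" + FORMAT_INSTRUCTION
    reply = call_llm(prompt)   # thin wrapper around your model API (assumed)
    return json.loads(reply)   # raises ValueError if the model drifted off-format
```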
@MichaelNagyTeodoro · 1 year ago
I did the same thing; it's better to do the parsing outside of LangChain's methods.
@thatryanp · 1 year ago
From what I see of LangChain examples, someone with development experience would be better served by some basic utility functions than by taking on LangChain's design assumptions.
@toddnedd2138 · 1 year ago
@thatryanp & @Michael Nagy Teodoro There can be a disadvantage to writing your own solutions when it comes to updating to a newer underlying model. Maybe not critical today, but one day it might be an issue. My guess is the LangChain community will be fast to provide updates.
@jdallain · 1 year ago
Thank you very much! This is super helpful and something I’ve struggled with
@jasonlosser8141 · 1 year ago
Great video. I'm using plain Python to parse right now, but I'll incorporate output parsers in my next rebuild.
@ugk4321 · 1 year ago
Super content, explained well. Thank you.
@MariuszWoloszyn · 1 year ago
I have the model take input and produce output in YAML. It's more human-readable than JSON, hence it works better with LLMs. No missing colons or stuff like that.
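
A sketch of this YAML variant, assuming PyYAML is installed; the reply string stands in for a model response:

```python
import yaml

format_instruction = (
    "Return the result as YAML with the keys name, rating and reasons "
    "(reasons is a list). Output only the YAML, nothing else."
)

reply = """
name: Acme
rating: 4
reasons:
  - strong brand
  - memorable
"""  # stand-in for a model response

data = yaml.safe_load(reply)
print(data["reasons"])  # ['strong brand', 'memorable']
```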
@____2080_____ · 1 year ago
Awesome and thank you for teaching
@jolieriskin4446 · 1 year ago
Another variation I've been using is to have a separate JSON repair method. I usually use a similar technique of showing the example JSON and immediately call my validation routine afterwards. If there is an error, send the JSON error and line number it's on as a separate call and try up to 3x to repair the output. The nice thing is you can use a lot fewer tokens on the repair call and potentially call a more specific or faster model that is tailored towards just fixing JSON (rather than wasting an expensive call to GPT4 etc...).
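
A rough sketch of the separate repair call described above: validate first, and on failure send only the bad JSON plus the error and line number back to a (cheaper) model, up to three times. call_llm is a hypothetical stand-in for your client:

```python
import json

def repair_json(raw: str, call_llm, max_attempts: int = 3) -> dict:
    for _ in range(max_attempts):
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            # Send only the error, its line number, and the bad JSON back for repair.
            raw = call_llm(
                f"Fix this invalid JSON. Error: {e.msg} at line {e.lineno}.\n"
                f"Return only the corrected JSON.\n{raw}"
            )
    raise ValueError("could not repair JSON after retries")
```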
@RobotechII · 1 year ago
Wonderful content! I'm sending it to my team
@bingolio · 8 months ago
Sam, you are my unsung hero of AI. THANKS!
@TomanswerAi · 1 year ago
Great explanation, thank you!
@galkim1 · 10 months ago
This is great, thanks
@tubingphd · 1 year ago
Thank you Sam
@vikrantkhedkar6451 · 5 days ago
This is really important stuff.
@pranavmarla · 1 year ago
I've been playing around with LangChain for a couple of days and this is really helpful! Output parsers would be great when dealing with tools that need to interpret the response. I hope this gets integrated into SimpleSequentialChains too, because currently SimpleSequentialChains only accept prompt templates which have a single input.
@vijaybudhewar7014 · 1 year ago
That is something new, I did not know this... as always, you did your job at its best.
@oz4549 · 11 months ago
I have an agent which goes through a list of tasks. I want the output structure to be different depending on the question asked. Maybe in one instance I just want to return JSON, but in another instance I want to return Markdown. I tried to do this with prompts but it is not consistent. Is it possible to do this?
@alihosseini592 · 9 months ago
As you also mentioned in the video, the CommaSeparatedListOutputParser does not really work well (for example, there was a dot at the end of the LLM's response). Is there any other way to get the model to output only a list?
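
One workaround for the trailing-dot problem mentioned above is to post-process the parsed items; a sketch assuming the LangChain API of the time:

```python
from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()
raw = "red, green, blue."   # stand-in for a model response with a stray dot
items = [item.strip(" .") for item in parser.parse(raw)]
print(items)  # ['red', 'green', 'blue']
```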
@redthunder6183 · 11 months ago
How does the output parser actually parse the output that it gets back, though? Is it just regular code, or is it something more? For example, what if the model forgets the second " to end a string?
@anindyab · 1 year ago
Thanks for this, Sam. Your videos on LangChain have been incredibly informative and helpful. Here's a request: can you please do a video on creating LangChain agents with open source/local LLMs? The agents seem to require a specific kind of output from the LLMs, and I think that can be a nice follow-up to this video. In my brief experience, open source LLMs are not easy to work with when it comes to creating agents. Your take on this will be very helpful.
@samwitteveenai · 1 year ago
The big challenge is most of the released Open Source models can't return the right format. I have a new OpenAI one coming and will try to convert that to open source to show people.
@anindyab · 1 year ago
@samwitteveenai This is great news. Thank you!
@RobvanHaaren · 3 months ago
Sam, I love your videos, I'm a huge fan. The only feedback I have to make your channel better is to fix your typos, both in your template strings (not a big deal, since the LLM will understand regardless) and in your video titles (e.g. "Ouput" at 4:57), as they may affect your credibility. All the best and keep up the good work!
@samwitteveenai · 3 months ago
Thanks, and sorry about that. I have tended to record these on the fly and put them out. I have someone editing now who will hopefully catch them as well.
@giraymordor · 1 year ago
Hello Sam, I have a question: I aim to send a sizable text to the OpenAI API and subsequently ask it to return a few select sections from the text I've dispatched. The text I intend to send consists of approximately 15k tokens, but the token limit for gpt-3.5-turbo is merely 4k. How might I circumvent this limitation and send this text to OpenAI using the API? This is not for the purpose of summarization, as there are ample examples of that on YouTube. My goal is to send a substantial amount of text to OpenAI within the same context, and for the model to retain what I previously sent. Following this, I would like it to return a few parts from the original text, preserving the integrity of the context throughout these operations. Thank you in advance for your guidance!
@ohmkaark · 10 months ago
Thanks a lot for the great explanation!
@Darkhellwings · 1 year ago
Thanks for the explanations. What I still miss from this tutorial (and some others of yours) is how to customize LangChain's API to go beyond what is provided at the moment. For instance, a simple question raised immediately after watching this would be: how to implement a custom output parser for a custom format that is not JSON or lists? Is it possible to make something for tables? Thanks anyway, that was still great!
@samwitteveenai · 1 year ago
Almost all customization is done at the prompt level. If you are doing something for a table, you would want to think through first what the LLM would return as a string. A CSV? How would it represent a table, etc.? Then work on the prompt that gets that, and lastly think about an output parser. You raise an interesting issue; maybe I should make a video walking through how I test the prompts first and get that worked out. If you have a good use case, please let me know. One issue I have is that I can't show any of the projects I work on for clients.
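
A sketch of a custom parser along those lines: decide on a string representation first (here CSV), then subclass BaseOutputParser. This assumes the LangChain API of the time; the class and format are illustrative:

```python
from langchain.schema import BaseOutputParser

class CSVTableOutputParser(BaseOutputParser):
    """Parses a model response formatted as CSV into a list of row dicts."""

    def get_format_instructions(self) -> str:
        return ("Return the table as CSV: one row per line, columns separated "
                "by commas, first line is the header. Output only the CSV.")

    def parse(self, text: str):
        lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
        header, *rows = [ln.split(",") for ln in lines]
        return [dict(zip(header, row)) for row in rows]
```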
@tomaszzielonka9808 · 11 months ago
How does giving a specific role (in this example, a master branding consultant) improve (or impact, in general) the outcome of a prompt? LLMs make predictions based on sequences of words, and I'm trying to reconcile role-playing with the model's output.
@Aroma_of_a_Roamer · 10 months ago
Love your content Sam. I was wondering, have you ever gotten classification/data extraction working with an open-source LLM such as Llama 2? Would love to see a video on this if you have. Thanks, keep up the great work.
@samwitteveenai · 10 months ago
I have been working on this with mixed results; hopefully I can show something with Llama 2.
@Aroma_of_a_Roamer · 10 months ago
@samwitteveenai You are an absolute champion. I think almost all app development is done exclusively with ChatGPT, since a) it is superior to open-source LLMs and b) app and library developers such as LangChain have geared their development towards it, using their own prompt templates. Each LLM has its own nuances as to how to format the prompt in order to make it work correctly.
@imaneb4073 · 1 month ago
Hello, thank you so much for such valuable and creative content, it helps us a lot. Please, I have a question: I am using the Pydantic output parser on structured PDF documents to generate a dataset (where I will select only specific fields). I used OpenAI as the LLM, but the problem I face is that I am working with a folder of 100 PDFs, so the code suddenly gets interrupted due to OpenAI's daily request rate limit. How can I handle this, is there a trick? Or another alternative?
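
One common way to handle that kind of rate-limit interruption is to retry each document with exponential backoff instead of failing the whole folder. A sketch using the tenacity library; extract_fields and pdf_paths are hypothetical stand-ins for your per-PDF parsing pipeline:

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(wait=wait_exponential(min=4, max=60), stop=stop_after_attempt(6))
def extract_with_retry(pdf_path: str) -> dict:
    # Your PydanticOutputParser call against the OpenAI API goes here (assumed).
    return extract_fields(pdf_path)

results = [extract_with_retry(p) for p in pdf_paths]
```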
@elikyals · 8 months ago
Can output parsers be used with the CSV agent?
@askcoachmarty · 1 year ago
Great vids, Sam! So, is this awesome Pydantic output parser available for Node? I'm finding shaky info in the JS docs. I'm currently using the StructuredOutputParser, but I'm creating some agents that I want to output Markdown. Is it best in JavaScript to just post-process and convert to Markdown? Any pointers or thoughts would be greatly appreciated!
@samwitteveenai · 1 year ago
Pydantic is a Python thing so maybe not in the JS version, but my guess is they will have something like this soon. Technically you could just make it yourself as it is all just setting a prompt. I have a new vid coming out in an hour which shows another way to do the same thing.
@askcoachmarty · 1 year ago
@samwitteveenai Cool, I'll look for that video!
@cloudprofessor · 3 months ago
How can we use output parsers with RetrievalQA?
@BrianRhea · 11 months ago
Thanks Sam! Would using an output parser in combination with Kor make sense? Is that worth a video on its own?
@samwitteveenai · 11 months ago
At the moment all of this is changing with the OpenAI functions (if you haven't seen them, I have a few vids about this). Currently LangChain also seems to be rethinking this, so I will revisit some of these. One issue is going to be whether we end up with two very different ecosystems, i.e. OpenAI vs everything else. I am testing some of the new things in some commercial projects, so let's see how they go and then I will make some new vids.
@sethhavens1574 · 1 year ago
I've noticed recently that with turbo 3.5 there are quite often issues with the model being overloaded. Using LangChain (at least I assume that is where this comes from), the chain will usually retry the LLM query. Is there a way to control the number of retries and the interval between them? And thanks for the awesome content, super useful stuff! 👍
@samwitteveenai · 1 year ago
I think there is a PR submitted to control the number of retries, but I don't think it has landed yet.
@sethhavens1574 · 1 year ago
@samwitteveenai Cool, thanks for the feedback, dude.
@RedCloudServices · 1 year ago
There is no LangChain plugin in the ChatGPT plugin store. Did they remove it?
@bleo4485 · 1 year ago
Hi Sam, thanks for the video. You should set up a Patreon or something; your videos have helped a lot. Thanks and keep up the good work!
@samwitteveenai · 1 year ago
Thanks for the kind words.
@easyaistudio · 1 year ago
The problem with trying to do the formatting in the same prompt that does the reasoning is that it impacts the result.
@samwitteveenai · 1 year ago
You can get the model to give its reasoning beforehand as part of the output. Ideally you want the reasoning instructions earlier than the output instructions.
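
A tiny sketch of that ordering: the reasoning instruction comes first, the format instruction last, with a reasoning field in the output itself. The wording and field names are only an example:

```python
# Hypothetical prompt skeleton: reasoning instructions first, output format last.
template = """You are a branding consultant.
Think step by step about why each name fits the brief, then give your answer.

Brief: {brief}

Return JSON with the keys "reasoning" (your step-by-step thoughts)
and "names" (a list of 5 proposed names). Output only the JSON.
"""
```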
@PaulBenthamcom · 1 year ago
With regard to the Pydantic output parser, when it gets badly formatted output, do you get that as your prompt result, or does the parser feed the error back to itself to correct it until it has well-formatted output to return to the user?
@samwitteveenai · 11 months ago
It will give an error and you can set that to trigger an auto retry etc.
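
A sketch of that auto-retry, using OutputFixingParser to wrap a PydanticOutputParser; this follows the LangChain API of the time:

```python
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser, OutputFixingParser
from pydantic import BaseModel, Field

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

parser = PydanticOutputParser(pydantic_object=Joke)
fixing_parser = OutputFixingParser.from_llm(parser=parser, llm=ChatOpenAI())

bad_output = "{'setup': 'Why did the chicken cross the road?'"  # malformed
joke = fixing_parser.parse(bad_output)  # a second LLM call tries to repair it
```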
@MrOldz67 · 1 year ago
Hey Sam, thanks again for all these useful videos. I was wondering, would it be possible to use the same output parser to get a JSON file that we could later use as a dataset to train our language model? If yes, would it be possible to bypass OpenAI in this process and maybe use another LLM, from a privacy perspective? Thanks a lot.
@samwitteveenai · 1 year ago
Yes, absolutely you can use it to make datasets; lots of people are doing this. It will work with other LLMs, but most of the open-source ones won't give good outputs, so they often fail etc.
@MrOldz67 · 1 year ago
@samwitteveenai Thanks for the answer, I will try to find a way to do that. But in the meantime, if you would like to make a video, I'd be really interested :) Thanks in advance.
@ElNinjaZeros · 9 months ago
When I try to apply this parsing with models called through LangChain, sometimes it works and sometimes it doesn't. Same with LangChain's Pydantic parser.
@blackpixels9841 · 1 year ago
Thanks Sam! Is it just me, or do you also find that the API is slower to return JSON 'code' than plain text? I'm getting upwards of 30 seconds per API call to parse a PDF table into 250 tokens of JSON.
@samwitteveenai · 1 year ago
Interesting. I haven't noticed that; it shouldn't be any slower.
@mytechnotalent · 1 year ago
New to LangChain, Sam, and I appreciate this video. Really looking for how to do this properly with open-source Hugging Face models rather than the paid OpenAI API.
@user-wr4yl7tx3w · 1 year ago
By any chance, does LangChain have an implementation of Tree of Thoughts?
@samwitteveenai · 1 year ago
Not yet, but I have been playing around with it. I want to make sure it works for things beyond what's in the paper before I make a video.
@picklenickil · 1 year ago
This is what you call... more than a party trick.
@xiam19 · 1 year ago
Can you do a video on ReWOO (Reasoning WithOut Observation)?
@samwitteveenai · 1 year ago
Yeah, it looks pretty cool. I will take a proper look.
@MichaelScharf · 1 year ago
Isn't this eating up a lot of tokens, especially the Pydantic case?
@samwitteveenai · 1 year ago
Yes, it does eat up some more tokens, but the Pydantic model really allows you to use the outputs in an API etc. much more easily. Regarding price, it all depends on how much you value the interaction. I see some customers are happy to pay a dollar or more for each conversation, which is a lot of tokens; usually that is still a lot cheaper than having a real human involved.
@ivanlee7450 · 10 months ago
Is it possible to use another LLM for the output parser?
@samwitteveenai · 9 months ago
Yeah, certainly; then it becomes a bit like an RCI chain, which I made a video about.
@ivanlee7450 · 9 months ago
How about a Hugging Face model?
@shivamkumar-qp1jm · 1 year ago
Can we extract code from the response?
@samwitteveenai · 1 year ago
Yes, take a look at the PAL chain; it does this kind of thing.
@orhandag540 · 2 months ago
But what if we want to do that with an open-source LLM (Hugging Face)?
@samwitteveenai · 2 months ago
You can certainly do the same with something like a Mistral fine-tune etc.
@orhandag540 · 2 months ago
@samwitteveenai But somehow the prompt template of Mistral is not compatible with LangChain models; I was trying to build exactly this with Mistral.
@NoidoDev · 1 year ago
I don't get it; maybe I missed something or don't know some important element. Why is the language model supposed to do the parsing as some form of formatting? Why isn't this just done in code with the response from the model?
@samwitteveenai · 1 year ago
Getting the model to output it like that is much easier than trying to write regex expressions for every possible case the model might output.
@gitmaxd · 9 months ago
I disagree! This is one of the more sexy parts! It's the hocus pocus of "Prompt Engineering". Great video!
@clray123 · 1 year ago
Honestly, the more I watch about LangChain, the less value I see in using it vs. just coding your own interactions with the model. It seems to do trivial things at a very high level of text processing while obscuring what it does, and you still have to learn the API and be limited by it.
@ashvathnarayananns6320 · 7 months ago
Can you post these videos using open-source LLMs rather than the OpenAI APIs? Thank you.
@samwitteveenai · 7 months ago
I have posted quite a few videos that use open-source models. One challenge is that, up till recently, the OSS models weren't good enough to do a lot of the tasks.
@ashvathnarayananns6320 · 7 months ago
@samwitteveenai Okay, and thanks a lot for your reply!
@hqcart1 · 1 year ago
I've managed to get GPT-3.5 to return JSON for 100k prompts, and it always returned JSON. It took me a few hours to get the right prompt though!
@MadhavanSureshRobos · 1 year ago
Practically speaking, isn't Guidance so much easier and better to use? For practical purposes, these don't seem to add much value.
@samwitteveenai · 1 year ago
I am planning to do a video on Guidance and Guardrails as well.
@MadhavanSureshRobos · 1 year ago
That'll be wonderful!
@jawadmansoor6064 · 11 months ago
What parser or other method do you use in chains? For example:

memory = ConversationBufferMemory(memory_key="chat_history")
tools = load_tools(["google-search", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.CONVERSATIONAL_REACT_DESCRIPTION, handle_parsing_errors=_handle_error, memory=memory, verbose=True)

I am getting output parsing errors:

Thought: Could not parse LLM output: `Do I need to use a calculator or google search for this conversation? Yes, it's about Leo DiCaprio girlfriend current age raised 0.43 power.` Action: google_search
Observation: Could not parse LLM output: `Do I need to use a ca
Thought: Could not parse LLM output: `Could not parse LLM output: `` Do you want me to look up more information about Leo DiCaprio girlfriend's current age raised 0.43 power?`` Action: google_search``
Observation: Could not parse LLM output: `Could not parse LLM o
Thought: Could not parse LLM output: `` Is there anything else you would like me to do for today?``
AI: Thank you!
> Finished chain.
'Thank you!'