I'd wanted to get to this for quite a while, but it was never a priority. I love that I can now simply follow your advice instead of working through it all on my own! Thanks for the great videos; they're an incredible resource!
@jamesbriggs · a year ago
haha glad to hear, it can be a little fiddly to get right
@3stdv93 · a year ago
I've been searching for this for a long time. Thank you! ❤
@Brkcln · a year ago
These recommendations were really helpful! I'm excited to watch all the series you shared. Thanks a lot for sharing them with us
@brandonreid797 · a year ago
Thanks for the awesome walkthrough! Have you had a chance to test this flow with OpenAI Function Calls? I have a similar implementation for streaming, but once the LLM triggers an OpenAI Function Call, it fails to stream - it seems like the iterator just doesn't return any tokens.
@jamesbriggs · a year ago
I haven't yet but will try it out
@brandonhresko6844 · a year ago
Having the same issue; have you solved it?
@tonymizunhi · 8 months ago
Great tutorial, super useful. Thanks for sharing!
@coreymiller6581 · 6 months ago
Why is it that you can create a fairly simple streaming generator to the frontend with just the LLM, but as soon as you add an agent it becomes so needlessly cumbersome? Anyway, amazing job with the tutorial!!! You're a godsend.
@maxlgemeinderat9202 · 9 months ago
It would be nice if you could do another example of streaming with FastAPI, LangChain, and a local model, for example when building a RAG app.
@domineia · 4 months ago
Really helpful content, thank you for sharing it.
@ABHISHEKSHARMA-fo4tf · a year ago
This is an amazing video with a lot of information. Could you make a video on how to connect streaming with an LLMChain and memory, and show the streaming on a webpage (HTML) or in Streamlit?
@Hyferionx · a year ago
Great video! Do you know why this doesn't work with a GET request? If I send a GET request instead of a POST (I adapted the API accordingly, of course), it loads the whole message first and then sends it to the client instead of streaming it asynchronously.
@henkhbit5748 · a year ago
Excellent video about streaming👏👏 Extending it with voice streaming once a complete sentence has been streamed would be cool…
@jamesbriggs · a year ago
that would be incredibly cool - will try it out
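As a rough sketch of that idea (all names here are illustrative, not from the video): buffer the streamed tokens and only release a chunk to the TTS engine once a full sentence has arrived.

```python
import re

# Matches sentence-ending punctuation followed by whitespace or end of text.
SENTENCE_END = re.compile(r'([.!?])(\s|$)')

def sentence_chunks(token_stream):
    """Yield complete sentences from an iterable of text tokens."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Emit every complete sentence currently sitting in the buffer.
        while (m := SENTENCE_END.search(buffer)):
            end = m.end(1)
            yield buffer[:end].strip()
            buffer = buffer[end:]
    if buffer.strip():  # flush any trailing partial sentence
        yield buffer.strip()

tokens = ["Hello", " there", ". How", " can I", " help?", " Let's see"]
print(list(sentence_chunks(tokens)))
# → ['Hello there.', 'How can I help?', "Let's see"]
```

Each yielded sentence could then be handed to a TTS call while the next one is still streaming in.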
@zacboyles1396 · a year ago
Over the weekend I was working on this exact problem. I couldn’t say which part I was missing, probably different parts at different times. What I can say is how epic it was to start the day with this video, thank you 🙏 legend! … I don’t suppose you feel like taking a crack at autogen next 😅 In any case thanks again!
@jamesbriggs · a year ago
autogen is interesting, so maybe haha - glad this video helped :)
@sanchaythalnerkar9736 · a year ago
That is so great; now I can integrate my agent into the chatbot.
@kamalkamals · 11 months ago
Good example, but the result is still very summarized, especially when using a search engine like Serper. So the question is: how can I control the size of the response text? Some cases need a response with more detail.
@JohanFire · 6 months ago
Great video! How could I do this with ConversationalRetrievalChain, to maintain memory and a vector DB?
@peterc.2301 · a year ago
Excellent video as always! Do you have any idea about how to use the "Professor Synapse" prompt with langchain?🙂
@tichon007 · 7 months ago
Really useful. You win a subscriber 😅
@meet_gondaliya · 8 months ago
Hey, when I use a tool, the streaming stops. When no tool is used, streaming works as expected, but when a tool is called, the final response (after taking the observation into consideration) doesn't come through the aiter() function.
@cihangirkoroglu · 8 months ago
Thank you, great video. How can I implement this streaming in a text-to-speech app, like pyttsx3?
@AnubhavGoel-cj3fq · 7 months ago
Great video! But whenever I try to run uvicorn main:app --reload in the terminal, I get this error: TypeError: multiple bases have instance lay-out conflict
@digitalchild · 9 months ago
It looks like, since the last LangChain update, all the current calls are deprecated. Do you have any details on how to update the code to work with all the changes?
@rithickguru1561 · a year ago
Really Useful content!!!
@yasminesmida2585 · a month ago
When I add streaming=True with callbacks, max_tokens is no longer taken into account. How can I solve this issue? Please!
@sowmyancs · 11 months ago
Great content. It works with an Agent; in my case I need to use AgentExecutor instead: agent_executor = AgentExecutor(agent=agent_chain, memory=memory, verbose=True, tools=tool_list, return_intermediate_steps=False). It looks like AgentExecutor doesn't stream with LCEL. Any ideas?
@abdullahhashmi654 · 10 months ago
I have the same issue. I have a few tools and an AgentExecutor that handles all the calls. I'm using a SQL database tool, and even though the tool itself gives me streamed results, the final answer is not streamed. Any idea how to tackle this?
@oliver1231033 · 8 months ago
@@abdullahhashmi654 I got the exact same issue
@mahdifalek252 · 5 months ago
Any updates? Did you find a solution for this with AgentExecutor?
@qochpeter9836 · a year ago
Thank you very much, James, for this tutorial! The FastAPI template works perfectly without adding tools. When I add the LLMMathChain tool to the agent, the application starts normally, but it just gets stuck at the step "Entering new LLMMathChain chain...". I only have access to AzureOpenAI, so I wonder whether it's an AzureOpenAI problem or a general one. Have you tried adding tools for the agent in the FastAPI code, and did it work? I also tested with a zero-shot agent; the result is the same. > Entering new AgentExecutor chain... ```json { "action": "Calculator", "action_input": "50 * 48" } ``` > Entering new LLMMathChain chain... Many thanks in advance!
@jamesbriggs · a year ago
No, it's just a case of modifying the async iteration logic to handle tools (if-else statements to decide when to stream and when not to). There may be some other LLMMathChain-specific behaviour that I haven't come across, though, as I have not used that specific tool with this.
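A minimal sketch of that if-else idea, assuming the agent emits the JSON action format shown in the question above (the function name here is illustrative, not from the video):

```python
import json

def should_stream(agent_output: str) -> bool:
    """Return True for a final answer (stream it to the client) and
    False for a tool call (run the tool without streaming)."""
    try:
        return json.loads(agent_output).get("action") == "Final Answer"
    except json.JSONDecodeError:
        return False  # partial or malformed JSON: keep buffering

tool_call = '{"action": "Calculator", "action_input": "50 * 48"}'
final = '{"action": "Final Answer", "action_input": "The result is 2400."}'
print(should_stream(tool_call), should_stream(final))  # → False True
```

In the streaming loop you would buffer tokens until this check can be made, then either forward tokens to the client or execute the tool silently.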
@CintraAI · a year ago
@@jamesbriggs Yeah, most tools should have an _arun method that works with the agent's acall function. If a tool can't run async, you'll have to modify _arun to just return self._run, and it should work. This means you won't be able to stream that tool's result right away, but you can stream the resulting response from the AI.
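A dependency-free sketch of that fallback; EchoTool is a made-up stand-in for a LangChain tool, not a real class:

```python
import asyncio

class EchoTool:
    """Stand-in for a tool whose real work is synchronous."""

    def _run(self, query: str) -> str:
        return f"result for {query!r}"

    async def _arun(self, query: str) -> str:
        # No genuine async implementation: delegate to the sync path,
        # as suggested in the comment above.
        return self._run(query)

print(asyncio.run(EchoTool()._arun("50 * 48")))
```

The tool's own result arrives all at once this way, but the agent's final answer built on top of it can still be streamed token by token.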
@schOOlsin · 11 months ago
There seems to be some kind of issue when calling get_stream("Hi there") a second time. The first time I correctly receive: { "action": "Final Answer", "action_input": "Hello! How can I assist you today?" } The second time I just receive: Hello! How can I assist you today? and that raises an exception.
@ersaaatmeh9273 · 7 months ago
Hi James, could I ask how I can do the same if I use an LLM from Hugging Face?
@caizenz · a year ago
Interesting. Do you know which other LLMs or platforms support streaming, e.g. ones from Replicate or Clarifai?
@shivarajgudaganatti8573 · 10 months ago
How can I extract only the LangChain agent's thoughts and stream just those?
@shaunpx1 · a year ago
So I've got a question: do we need to specify an input and output parser when using an agent for the APIs? I'm trying to wire one up to a Django REST API and get constant errors saying the LLM output can't be parsed, yet it works perfectly in Colab. Also, is there a difference between other agents and using initialize_agent? It's hard to get a clear answer on the data structures and usage of agent methods in LangChain; their docs are info overload and all over the place. Thanks - awesome content, by the way. Do you have any other videos clarifying agents?
@jamesbriggs Жыл бұрын
If not using streaming you can use the standard output parser classes in LangChain, with streaming you should try and parse live, which is harder but typically doable unless the model really messes up I have a YT course on LangChain, chapters 7-9 focus on agents :) kzbin.info/aero/PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F
@shaunpx1 · a year ago
Awesome, this worked, thank you so much! @@jamesbriggs
@ngontinhhaymely · a year ago
How do you set a timeout for the agent? The agent sometimes forgets the prompt and gets stuck there while streaming.
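One way to bound a stuck run is asyncio.wait_for, sketched here with a stand-in coroutine in place of the real agent call (slow_agent and all names are illustrative):

```python
import asyncio

async def slow_agent(query: str) -> str:
    await asyncio.sleep(10)  # stands in for an agent run that hangs
    return "answer"

async def run_with_timeout(query: str, seconds: float) -> str:
    try:
        # Cancel the agent call if it exceeds the deadline.
        return await asyncio.wait_for(slow_agent(query), timeout=seconds)
    except asyncio.TimeoutError:
        return "agent timed out"

print(asyncio.run(run_with_timeout("hi", 0.1)))  # → agent timed out
```

On timeout the wrapped task is cancelled, so the endpoint can return a fallback message instead of streaming nothing forever.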
@muazashraf409 · 11 months ago
I'm using Flask and HTML. I added this callback and get the streaming response in the terminal, but not on the frontend. I also tried WebSockets and an SSE client, but didn't succeed.
@izarniz · 2 months ago
How did you fix that issue?
@hamzakivideo · a year ago
I see what you did there 😂 awesome stuff
@sabalmalhan5447 · a year ago
Question: I followed along and made sure the code matches, but I can't reproduce this streaming behavior when calling the get_stream function. It still seems to wait for the chain to finish gathering the text and then prints it all at once. Any pointers on what might have gone wrong?
@jamesbriggs · a year ago
maybe you're missing an async function somewhere?
@zacboyles1396 · a year ago
Make sure websockets are installed.
@zacboyles1396 · a year ago
Or the uvicorn standard extras: “uvicorn[standard]”
@zacboyles1396 · a year ago
I pulled up the sample to check and ran into the issue too. In my working code, in the equivalent of the demo's AsyncCallbackHandler, I'm overriding __init__ and calling super().__init__(). Making that change got it working for me. I bet if you'd used the built-in handler and then switched to the custom one without a full shutdown, you wouldn't catch the issue.
@sabalmalhan5447 Жыл бұрын
Hmm thanks for the idea@@zacboyles1396 , however it still doesn't seem to work for me - I did confirm websockets and the uvicorn extras as also installed. The standard handler does seem to work better (although not perfect as seen in the video)
@ABHISHEKSHARMA-fo4tf · a year ago
The streaming prints in the terminal; how can we show the streaming on a webpage?
@jamesbriggs · a year ago
Check out the final part of the video, where I show how to stream from FastAPI - that's how you'd do it.
@ABHISHEKSHARMA-fo4tf · a year ago
@@jamesbriggs Can I use a conversational chain and memory with chain.predict and show it streaming like this?
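For reference, the shape of the async generator that FastAPI's StreamingResponse consumes can be sketched without FastAPI itself; in the real app you would return StreamingResponse(token_stream(...), media_type="text/plain") from the endpoint (names here are illustrative):

```python
import asyncio

async def token_stream(tokens):
    """Async generator yielding tokens one at a time, as a
    StreamingResponse body would."""
    for tok in tokens:
        await asyncio.sleep(0)  # yield control, as a real LLM call would
        yield tok

async def collect():
    # Consume the stream the way a client would, chunk by chunk.
    return [t async for t in token_stream(["Hel", "lo", "!"])]

print("".join(asyncio.run(collect())))  # → Hello!
```

On the browser side you'd then read the response body incrementally (e.g. with fetch and a ReadableStream) rather than waiting for the full text.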
@fredrik-ekelund · a year ago
👍👍👍 Thanks!!
@romeogomez9948 · a year ago
I need that, but with Pinecone.
@saibhaskerraju2513 · 6 months ago
Please update your notebook; the LangChain code has changed.
@vaibhavsaxena6482 · a year ago
How can we stream LLaMA on Gradio?
@gabrielvinicius6766 · a year ago
Has anyone gotten streaming working with an open-source model?
@setop123 · 9 months ago
LangChain makes me regret even using a framework for my LLMs... it's awful.
@jamesbriggs · 9 months ago
They do have a new streaming API now - apparently better; I'll look into it soon.
@engineeredmind · a year ago
LangChain really needs to get it together with its streaming structure. This is terrible.
@coreymiller6581 · 6 months ago
Seriously.
@Viellmo · a year ago
I have a question: what encoding should I use for Polish characters? line.decode("utf-8") raises errors for Polish letters. I tried other encodings, but it's not working. Any ideas, @jamesbriggs?
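A likely cause: a multi-byte UTF-8 character (like the Polish "ł") gets split across two streamed chunks, so decoding a single chunk with .decode("utf-8") sees only half of it. The stdlib's incremental decoder keeps the partial bytes until the rest arrives; a small sketch:

```python
import codecs

def decode_stream(chunks):
    """Decode byte chunks that may split multi-byte UTF-8 characters."""
    dec = codecs.getincrementaldecoder("utf-8")()
    text = "".join(dec.decode(c) for c in chunks)
    return text + dec.decode(b"", final=True)  # flush any remainder

# "żółw" (Polish for "turtle"), deliberately split mid-character:
chunks = [b"\xc5", b"\xbc\xc3\xb3\xc5", b"\x82w"]
print(decode_stream(chunks))  # → żółw
```

So the fix is usually not a different encoding but decoding incrementally (or only after joining all bytes) instead of per chunk.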