Thankful for channels like this that go above and beyond the standard tutorials 💪🏾
@codingcrashcourses8533 10 months ago
Thanks for your motivating comment :)
@saurabhjain507 10 months ago
Another helpful video! Please create more videos about LangChain in production.
@codingcrashcourses8533 10 months ago
Next Monday I will release one about monitoring with Langfuse.
@delgrave4786 2 months ago
So I have a doubt regarding the digest generation. The code generates a digest for every document uploaded to it, right? The advantage is that the same document will always generate the same digest, which can be checked against the existing digests and excluded. I guess a better way would be to generate a digest for each page of a PDF, in case a new PDF is uploaded with only a single page changed? Currently the code does not handle cases like this, right? It only generates the digest as part of the metadata and stores it without checking anything, if I haven't missed something.
@codingcrashcourses8533 2 months ago
I generally agree. That is a different approach, since you need to use some kind of observer pattern, which is not so easy, because you have to rely on your data provider to offer that.
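A per-page variant of that digest check can be sketched without any observer pattern, at least for the upload path: hash each page's content and skip pages whose digest is already stored. This is only an illustration; the function names (`page_digest`, `filter_new_pages`) and plain-string pages are hypothetical, not from the video's codebase.

```python
import hashlib


def page_digest(page_content: str) -> str:
    # Stable content hash: identical page text always maps to the same digest.
    return hashlib.sha256(page_content.encode("utf-8")).hexdigest()


def filter_new_pages(pages: list[str], known_digests: set[str]) -> list[str]:
    # Keep only pages whose digest is not already stored, so re-uploading a
    # PDF with one changed page re-embeds just that page.
    fresh = []
    for page in pages:
        d = page_digest(page)
        if d not in known_digests:
            known_digests.add(d)
            fresh.append(page)
    return fresh
```

With this, an unchanged page of a re-uploaded PDF is filtered out before any embedding work happens.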
@navaneeth44 2 months ago
Great content! But I have a question here: does the POST method accept large files as input?
@codingcrashcourses8533 2 months ago
@@navaneeth44 No, I used application/json. But FastAPI also has classes that allow you to accept files. What do you mean by large? The default is not large, a few MB.
@omaralhory8065 10 months ago
Hi, I am following your codebase, and I really like it. I am still unsure why we need to update the data via an API, if we can have an ETEL (Extract, Transform, Embed, Load) data pipeline that runs on a schedule when new data comes in. Why do we give such access to the client, and why is it an API that gives access to deleting records? What would you do differently here? Would you develop a CMS in order to maintain the relationship between the client and the DB?
@codingcrashcourses8533 10 months ago
You could also do it that way, but in this repo I don't have a pipeline or anything. There is more than one way to do it :-). I currently have no good solution for updating data without the additional API layer.
@omaralhory8065 10 months ago
@@codingcrashcourses8533 Thank you for being responsive! Your channel is a gem, btw. Usually RAG data sources aren't predictable; maybe a data lake (Delta Lake by Databricks) can be quite beneficial here. You can use PySpark to build the pipeline, and it works great when connected to Airflow, for example, for scheduling.
@awakenwithoutcoffee 5 months ago
Did you find a solution?
@pvrajanrk 9 months ago
Great video. Can you share your thoughts on state management for maintaining the chat window across different chat sessions? This is another area I see as a gap in LangChain production.
@codingcrashcourses8533 9 months ago
I did another video about this and cover it in my Udemy course. The answer for me is Redis, where you set key-value pairs: the key is the conversation ID and the value is the stringified conversation.
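That key-value pattern can be sketched with a plain dict standing in for Redis; in production you would replace `store` with a Redis client (e.g. `redis.Redis()`) and call `set`/`get` with the same keys. The helper names are hypothetical.

```python
import json

# Stand-in for Redis: key = conversation ID, value = JSON-stringified messages.
store: dict[str, str] = {}


def save_conversation(conversation_id: str, messages: list[dict]) -> None:
    # Serialize the whole conversation; with real Redis you might also set a TTL.
    store[conversation_id] = json.dumps(messages)


def load_conversation(conversation_id: str) -> list[dict]:
    raw = store.get(conversation_id)
    return json.loads(raw) if raw else []
```

Because the state lives in the store rather than in the API process, any server instance can load the history by conversation ID.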
@Pure_Science_and_Technology 10 months ago
When processing a file for RAG, I save its name, metadata, and a unique ID in a structured database. This unique ID is also assigned to each chunk in the vector database. If a file needs updating or deleting, the unique ID in the database is used to modify or remove the corresponding entries in the vector database.
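A toy version of that bookkeeping, with dicts standing in for the structured database and the vector store (all names here are illustrative, not from the video's codebase):

```python
import uuid

documents: dict[str, dict] = {}     # structured DB: file name + metadata per doc ID
vector_store: dict[str, dict] = {}  # chunk key -> chunk text + parent doc ID


def ingest(filename: str, chunks: list[str]) -> str:
    # Assign one unique ID to the file and stamp it onto every chunk.
    doc_id = str(uuid.uuid4())
    documents[doc_id] = {"filename": filename}
    for i, chunk in enumerate(chunks):
        vector_store[f"{doc_id}:{i}"] = {"text": chunk, "doc_id": doc_id}
    return doc_id


def delete_document(doc_id: str) -> None:
    # The shared ID lets us remove the DB row and all its chunks together.
    documents.pop(doc_id, None)
    for key in [k for k, v in vector_store.items() if v["doc_id"] == doc_id]:
        del vector_store[key]
```

Updating a file is then delete-and-reingest under a fresh ID, which keeps the two stores consistent.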
@codingcrashcourses8533 10 months ago
Yes, very robust solution :)
@RobertoFabrizi 10 months ago
Just to see if I understood you right: let's assume you have a file (product catalog, functional software specification, your pick) that is a doc with 100 pages. You use a document loader to load it, then split it with a recursive character text splitter with a chunk size of 1000 and an overlap of 100, then embed those chunks and store them in a vector DB, saving thousands of chunks all created from that one file. Then a single line near the start of the file changes, but that has repercussions for all later chunks: even though they are technically the same data, they are partitioned differently than before (assuming the change before them caused the splitter to create different chunks, e.g. the modified row is longer than before). How do you efficiently update your vector DB in this scenario? Thank you!
@codingcrashcourses8533 10 months ago
@@RobertoFabrizi You won't just read a whole catalog into memory at once. You should keep each page separate as raw data, then split each page into smaller chunks. I would even argue against a fixed chunk size, but that is something I will cover in my next (small) video.
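The per-page idea can be sketched with a simple fixed-size splitter. This is only to illustrate how chunk size and overlap interact per page; in LangChain you would typically run something like `RecursiveCharacterTextSplitter` on each page instead. Because splitting happens per page, an edit on page 1 only reshuffles page 1's chunks, not the thousands of chunks from later pages.

```python
def split_page(page: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    # Slide a window of chunk_size over the page text, stepping forward by
    # chunk_size - overlap so consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    return [page[i:i + chunk_size] for i in range(0, max(len(page) - overlap, 1), step)]
```

A 250-character page with `chunk_size=100, overlap=20` yields three chunks, each starting with the last 20 characters of the previous one.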
@nicolascr181 1 month ago
Hello, I don't understand how to upload the document, since the endpoint only receives JSON. Thanks for your content.
@codingcrashcourses8533 1 month ago
Document is just a class. It looks like this: Document(content="xxx", metadata={}). You can serialize it to { "content": "xxx", "metadata": {} }, and then you have the JSON format. The code looks like this (dummy implementation):

```python
class Document:
    def __init__(self, content, metadata):
        self.content = content
        self.metadata = metadata

    def to_dict(self):
        return {"content": self.content, "metadata": self.metadata}
```

Does this help? :)
@zendr0 10 months ago
Have you thought about a caching implementation in RAG-based systems? Curious.
@codingcrashcourses8533 10 months ago
Yes, but currently it is one of the lower-priority topics: since you use the whole conversation history, the cache cannot be hit very often. Have you worked with caching before?
@zendr0 10 months ago
I have used an in-memory cache. Could we do something like this: use the cache to store embeddings, then compute cosine similarity between the new input query's embedding and the ones in the cache. If the score is above a threshold, it is fairly obvious that the query has been asked before, so we just use the cache to answer it. What do you think? @@codingcrashcourses8533
@mcdaddy42069 10 months ago
Why do you have to put your vector store in a Docker container?
@codingcrashcourses8533 10 months ago
Containers are just the way to go. You don't have to, but it makes everything so much easier.
@alchemication 10 months ago
Very nice. Did you consider LangServe before building an in-house solution? Just curious.
@codingcrashcourses8533 10 months ago
LangServe is more about prototyping, in my opinion :)
@alchemication 10 months ago
Interesting take on it. I think they promote it as a production-ready API, but as usual, without actually trying it for real you won't know 😅 Best!
@codingcrashcourses8533 10 months ago
@@alchemication Well, I am quite good with FastAPI and have used it for a very long time, so in general I would prefer not to add an abstraction layer on top of it. My first glance at it was like: "ok, that's quick, but robust code is something different".
@daniel_avila 9 months ago
Hi, thanks for this! I have a question about the digest specifically. I understand it would be a great way to compare page_content for changes, but I'm not sure where to do this programmatically, or where to inspect whether it is happening already. As far as I can tell it is not, and more on this would be helpful to someone new to pgvector. Following how documents are added, it seems embeddings are created regardless.
@codingcrashcourses8533 9 months ago
There is the indexing API to do this. Or do you mean visually, like a git diff?
@daniel_avila 9 months ago
@@codingcrashcourses8533 I was unaware this would involve the indexing API, but that makes sense. However, there's no official async pgvector implementation for the indexing record manager: langchain-ai/langchain, issue #14836
@entrepreneurialyt 10 months ago
Thank you for the videos! Can you please make a video about tools that can be used for both performance measurement and accuracy tracking? Basically, how to build a test environment for a bot before releasing it to production.
@codingcrashcourses8533 10 months ago
RAG performance? Performance of the service with a load test? What would interest you?
@entrepreneurialyt 10 months ago
@@codingcrashcourses8533 Performance of the service with a load test would be super cool!
@entrepreneurialyt 10 months ago
@@codingcrashcourses8533 RAG performance would be cool!
@entrepreneurialyt 10 months ago
@@codingcrashcourses8533 RAG performance would be great!
@entrepreneurialyt 10 months ago
@@codingcrashcourses8533 Attempting to respond to your question for the 101st time: please make a video about RAG performance (please YouTube, don't ban my reply). I am developing a bot that can answer questions based on the transcript of a video lecture and other course materials, to speed up the learning process. If I am not mistaken, the first one will be more relevant? How can I tell that my bot is ready for production? Thank you :)
@DePhpBug 7 months ago
Still new to all the concepts here. I saw the video about having an API on top of the model's API; is this correct? So there is an abstraction layer on top of the model. Am I correct to say my model needs to sit in, let's say, server A, and then I need to create the API on server B to connect to A?
@codingcrashcourses8533 7 months ago
Exactly. Adding one layer is normally crucial; adding more layers can make sense for your use case, but doesn't have to.
@DePhpBug 7 months ago
@@codingcrashcourses8533 Thanks
@swiftmindai 10 months ago
As always, excellent content. I learned about the LangChain indexing API (SQLRecordManager) from your previous content. Now I've learned about using a hashing function (generate_digest). I believe both serve the same purpose. I'm wondering which one would be better, because I don't see a way to measure the performance of either methodology. I'd appreciate your suggestion.
@codingcrashcourses8533 10 months ago
Thank you! I think it's just important to understand WHY LangChain introduces something like that and to learn about the limitations. I found the indexing API hard to use when there is a large number of documents.
@swiftmindai 10 months ago
It took me literally a few days to understand and implement the indexing API concept. I even had to switch to PGVector from the other vector store provider I was using earlier, since the indexing API was only applicable to SQL-based vector stores. But now I love PGVector more than any other. Thank you a lot for your production implementation video; I literally use it as the basis of my latest project.
@sskohli79 10 months ago
Great video, thanks! Can you please also add a requirements.txt to your repo?
@codingcrashcourses8533 10 months ago
I can add that today, yes :)
@sskohli79 10 months ago
Thanks!
@sskohli79 10 months ago
@@codingcrashcourses8533 not there 😞
@omaralhory8065 10 months ago
Can you add it please? I checked and it's not there.
@codingcrashcourses8533 10 months ago
@@omaralhory8065 Really sorry, I forgot about it.
@picklenickil 8 months ago
Just came to comment that maintaining a backend for this will be hard!
@codingcrashcourses8533 8 months ago
What exactly do you mean?
@YerkoMuñoz-q7u 1 month ago
Isn't it quite the opposite? Having a backend like this allows you to have a maintainable infrastructure.
@yazanrisheh5127 10 months ago
Can you show us how to implement memory with LCEL and, if possible, caching responses? Thanks
@codingcrashcourses8533 10 months ago
The memory classes from LangChain are not a good fit for production; they are just for prototyping. In real-world apps you probably want to handle all of that in Redis.
@say.xy_ 10 months ago
Best best best!!!
@codingcrashcourses8533 10 months ago
Thank you :)
@Pure_Science_and_Technology 10 months ago
Will Gemini 1.5 and beyond kill RAG?
@codingcrashcourses8533 10 months ago
I highly doubt it with Gemini 1.5, but beyond that, hopefully. Currently, answers are still bad when your context is larger than 20 documents or so.
@xiscosmite6844 10 months ago
@@codingcrashcourses8533 Curious why you think answers are bad beyond that size and how Gemini could solve that in the future. Thanks for the great video!
@codingcrashcourses8533 10 months ago
@@xiscosmite6844 I don't trust Gemini after I tried it myself :)
@dswithanand 8 months ago
How do I integrate LangChain chat message history with FastAPI?
@codingcrashcourses8533 8 months ago
You don't. You normally want your API to be stateless.
@dswithanand 8 months ago
@@codingcrashcourses8533 I understand that. I am working on an SQL bot and using FastAPI along with it, but the bot is not able to retrieve the conversation context. Can you help with that? LangChain has the ChatMessageHistory class, which can be used for this.