GitHub Repository: github.com/svp... I teach a live, interactive program that'll help you build production-ready machine learning systems from the ground up. Check it out at www.ml.school. Twitter/X: / svpino
Comments: 215
@beastunleased0205048 • 6 months ago
DUDE!! This is the best resource I have found so far. The part I truly appreciate is that you focused on the conceptual parts and NOT just "here is the code."
@sgatea74 • 6 months ago
This is the best, crystal-clear RAG walkthrough I have seen so far! Congratulations, and thank you for putting effort into educational videos on GenAI.
@underfitted • 5 months ago
Thanks!
@Kim-kr6of • 6 months ago
Nice! Looking forward to diving into this. It would be nice to see a setup for an application with a local LLM for private docs. I think a lot of people are looking for such a video 🤞🏻💪🏻 Thank you for your nice work!
@AI-Reviews-and-Tutorials • 5 months ago
Fantastic tutorial! Thanks a lot! I have already seen other tutorials on LangChain and also purchased a Udemy course. This tutorial is the best of all. Everything is well explained and can easily be understood.
@lokeshsharma4177 • 4 months ago
"All you need is Attention" to the details this GREAT teacher is providing. He could have picked a simple context within the limits and been done, BUT no: his zeal to share and show every single roadblock we need to know is OUTSTANDING. God bless you. Could you please do a speech-to-speech application as well?
@ddre54 • 2 months ago
Great content! Extremely helpful as a hands-on introduction to LLMs and RAG. Thank you!
@UjjwalDevarapalli • 6 months ago
Thanks for the video. Very clear explanation of all the concepts involved. Looking forward to your next videos. If possible, could you cover the topic of production deployments and operations for an LLM project?
@seallyolme • 6 months ago
I already have experience with Langchain, but wanted to learn more about Pinecone. You are just exceptional at teaching. Thank you so much for this. :)
@RankMotion • 6 months ago
I have implemented your example in Node.js and it's working like a charm. Thanks!
@RankMotion • 6 months ago
But I really hate the LangChain API. I'm trying to implement it as plain old functions.
@Shubhampalzy • 2 months ago
I wanted to ask something related to 1:11:20. If I have understood correctly, the cell being run at this point shows the records in the Pinecone database that closely match the user prompt, and the answer is generated based on them. My database consists of a large number of code files (HTML, TypeScript, Python), and I wish to build a chatbot on these files so the user can get answers related to each file. What is the best approach to store the vector embeddings of many different files in Pinecone so that I can also provide the exact reference to the source file name? Also, if the answer is generated from multiple files, it should be able to reference all of them.
@AugustoBouzas • a month ago
👏 Congrats and thanks, bro, for this incredible video! What every student needs: knowledge and passion in one place. Very clear. 💪
@MrGirishbarhate • 2 months ago
Excellent tutorial explaining the theory as well as the code in a very beginner-friendly manner. Subscribed! I found you via my Twitter feed and started with this video.
@tee_iam78 • 4 months ago
A million stars for this video. The content is clear, precise, and well explained. Thank you.
@schmooks • 6 months ago
Great video. The Jupyter notebook was very helpful! Interesting that, by default and only given the context "Patricia likes white cars," the parser concluded that Patricia's car was white even though she might not actually own a white car. I added instructions to tell me when it was inferring an answer, but it makes me wonder what else it might be inferring without telling me.
@freerider6300 • 6 months ago
I don't know why, but I get the response "I don't know" for that question even though my code and prompts are identical and I'm also using the OpenAI API.
@schmooks • 6 months ago
@@freerider6300 Given that the prompt template says to respond with "I don't know," that seems appropriate. Are you by chance using gpt-4 vs. gpt-3.5-turbo? When I changed my model to gpt-4 I got "I don't know," whereas with gpt-3.5-turbo I get "White," just like in the video.
@Azam.Rajabov • 6 months ago
Love how you explain everything step by step!
@toddroloff93 • 3 months ago
Amazing video. You are such a wealth of knowledge. Thank you for these great video tutorials. Please keep them coming!! 😃
@sunitjoshi3573 • a month ago
Great work here! Thanks a lot. 😊
@lakraMundka • 4 months ago
Amazing video. Thanks for the work you do in putting all this together.
@causalinference4176 • 4 months ago
Such a great video; you always explain complex concepts very well!
@rakeshgohane8938 • 6 months ago
Hi Santiago, at 31:33 you mentioned that 1000 words is 750 Tokens. Isn't it the other way round? 1 word around 3/4 Tokens?
@underfitted • 6 months ago
Ha! Yeah, more tokens than words. I’m always getting this wrong (notice you also made the same mistake in your comment.) 1 word is about 1.3 tokens
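For anyone wanting to sanity-check the numbers, here is a rough back-of-the-envelope sketch in Python. The 1.3 tokens-per-word ratio is the approximation mentioned above, not an exact figure; for exact counts you would use the model's real tokenizer (e.g. tiktoken):

```python
def estimate_tokens(text: str, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate: English text averages about 1.3 tokens per word.
    This is a heuristic only; use the model's tokenizer for exact counts."""
    return round(len(text.split()) * tokens_per_word)

# 1,000 words is roughly 1,300 tokens: more tokens than words, not fewer.
print(estimate_tokens(" ".join(["word"] * 1000)))  # 1300
```

So a 1,000-word transcript chunk costs about 1,300 tokens of context, which matters when picking chunk sizes against a model's context limit.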
@rakeshgohane8938 • 6 months ago
@@underfitted 😅
@abhichintapalli7444 • 4 months ago
Could you also do a video on how to show the source documents that the context comes from?
@MauricioFagundesBSB • 6 months ago
Wonderful video! I made my postgraduate final project exactly like this.
@BoonTee • 5 months ago
This is a really great video, thanks a lot @svpino for sharing! Liked and subscribed!
@underfitted • 5 months ago
Thanks man!
@tinchoroman10 • 4 months ago
Love your video! If you allow me a small suggestion, I would have put your face in the top-left corner to avoid overlapping with the code. Great work!
@underfitted • 4 months ago
True
@DavidMwiti-v3i • 6 months ago
This will be a great resource! I'd love to see a dedicated video diving deeper into LangGraph and its applications. Keep up the great work!
@anonymoustechnopath1138 • 6 months ago
Such an amazing resource!! Thanks Santiago ❤
@gabrielomogbai256 • 6 months ago
Thank you for this; your explanation of embeddings was superb!!
@niklasfischer3146 • 6 months ago
Thanks for the video! Perfectly explained. However, I got an error when I tried to transcribe the YT video: URLError. Why is that? Thanks!
@breynerrojas1735 • 5 months ago
Thanks a lot Santiago, the best tutorial I have watched so far.
@peacefullmusic8374 • 4 months ago
I started to love machine learning because of you, thank you!!
@underfitted • 4 months ago
🙏🏻
@abhichintapalli7444 • 4 months ago
How would you scale to more than one document (in your case, one transcript)? Would you continue to use one vector store for all the documents, or are there other methods? Great video, btw!
@HarGog • 6 months ago
Great one! Even people with little knowledge about LLMs can understand this very well.
@temanangka3820 • a month ago
Very clear! Thank you, Sir!
@harshaldharpure9921 • 5 months ago
You have not used an open-source vector database or an open-source language model in this video. Please try to make a video on that.
@JavierTorres-st7gt • 3 months ago
How do you protect a company's information with this technology?
@myungkoo802 • 6 months ago
Thank you! This is a really great job and very helpful to me. It would be greatly appreciated if you could make a video about prompt engineering.
@srikanthdongala6922 • 3 months ago
I am building a RAG application and I'm looking for advice on a few issues:
1. How do I make it answer general questions like greetings and farewells directly, without trying to retrieve anything, since there is no need to? I know I can use an LLM to classify the question, but that requires an additional API call (if the model is OpenAI's).
2. I implemented RBAC. If a single question from a user contains two queries and the user only has access to the document that is sufficient to answer one of them, I retrieve the context closest to that query (so the context is sufficient to answer one of the two), but the LLM responds "I don't know". It is failing to use the context to answer at least what it can. How do I address this?
3. If I ask a specific question it answers, but if I ask in an overall sense, say "what is the summary of the XYZ doc", it fails to answer.
It would be great if someone could help me with possible approaches to fix this. Thanks 🙏
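For question (1), one way to avoid the extra classification call is a cheap lexical router that short-circuits retrieval for small talk. A minimal sketch (the phrase list is illustrative, not exhaustive; a tiny classifier or embedding similarity against canned phrases would be more robust):

```python
SMALL_TALK = {"hi", "hello", "hey", "thanks", "thank you", "bye", "goodbye", "good morning"}

def needs_retrieval(question: str) -> bool:
    """Return False for greetings/farewells so they can be answered directly
    instead of querying the vector store. Purely lexical, zero API calls."""
    normalized = question.lower().strip(" !?.")
    return normalized not in SMALL_TALK

print(needs_retrieval("Hello!"))                  # False -> answer directly
print(needs_retrieval("What does the doc say?"))  # True  -> run the RAG pipeline
```

For (3), summary-style questions fail because no single chunk is similar to "summarize X"; a common pattern is routing those to a separate map-reduce summarization path instead of similarity search.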
@haticetozoglu2752 • 4 months ago
Thanks for your valuable insights! This has been immensely helpful.
@babganz • 6 months ago
Damn... I learned a lot from this video! Thank you for this. I have subscribed to your channel and am looking forward to watching your future videos.
@AdandKidda • 6 months ago
Great content. I have a question: how can we add OpenAI's function calling to this for external API calls?
@orandolph8 • 5 months ago
This is really good. Thanks for posting this tutorial.
@sushanths.l4865 • 6 months ago
This video is very informative, thank you.
@LuisMiguelSerrano • 6 months ago
Super insightful! Very nice explanation and walkthrough! Thank you!
@Shubhampalzy • 2 months ago
I wanted to ask something related to 1:11:20. If I have understood correctly, the cell being run at this time is showing the records present in the pinecone database that match closely with the user prompt and the answer would be generated based on this. My database consists of a large number of code files (HTML,typescript,python) and I wish to make a chatbot on these files so the user is able to get answers related to each file. I wish to know the best approach to store the vector embeddings of many different files in pinecone so that I can also provide the exact reference to that code file name. Also if the answer is generated from multiple files, it should be able to refer that too.
@LuisMiguelSerrano • 2 months ago
@@Shubhampalzy, you should probably ask that question in the main comments thread for better visibility by the author and others. In any case, regardless of the content of your files, you typically want to pre-process them (divide them into chunks with relevant metadata attached), generate embeddings, and store them in Pinecone. Then, when you get a user query, you generate an embedding for it and use that to query for the closest matches in Pinecone.
@Shubhampalzy • 2 months ago
@@LuisMiguelSerrano Thanks mate. So basically I need to attach the name of the file to every chunk I store, am I right?
@LuisMiguelSerrano • 2 months ago
@@Shubhampalzy Yes, you should attach the filename (and any other relevant information, such as the chunk id) as metadata to each chunk that you store.
@Shubhampalzy • 2 months ago
@@LuisMiguelSerrano thanks a lot!
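The advice in this thread can be sketched in a few lines. This is a hypothetical illustration: the chunk size, field names, and record shape are assumptions, and the actual embedding and Pinecone upsert steps are omitted; the point is attaching the source filename and chunk id as metadata so answers can cite their source file:

```python
def chunk_file(filename: str, text: str, size: int = 200, overlap: int = 50):
    """Split one source file into overlapping chunks, attaching the
    filename and chunk id as metadata so retrieved chunks can be traced
    back to (and cited from) the file they came from."""
    records = []
    step = size - overlap
    for i, start in enumerate(range(0, max(len(text), 1), step)):
        chunk = text[start:start + size]
        if not chunk:
            break
        records.append({
            "id": f"{filename}#chunk-{i}",        # unique id per chunk
            "text": chunk,                         # this is what gets embedded
            "metadata": {"source": filename, "chunk": i},
        })
    return records

records = chunk_file("app.component.ts", "export class AppComponent {}" * 20)
print(records[0]["metadata"])  # {'source': 'app.component.ts', 'chunk': 0}
```

At answer time, the retriever returns these metadata dicts alongside the matched text, so citing multiple source files is just collecting the distinct "source" values of the top matches.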
@TechPuzzle_Haven • 6 months ago
Excellent video and excellent explanation. Thanks a lot.
@gautammandewalker8935 • 6 months ago
Very well-structured tutorial. If there are transcripts of hundreds of videos, how do we also return the video in the answer so the user can open and watch it after reading? Has anyone tried this?
@learningTESTING • 5 months ago
Hello, thank you for the GREAT tutorial! I am stuck at step 3: I get "AuthenticationError: Error code: 401", saying the API key is not correct, even though I am providing the correct key I got from OpenAI. Do you have any information about this issue? Thank you.
@callumwoodford • 5 months ago
I tried it with an API key from my work account and it worked!
@dtawins • 5 months ago
Great video and explanation, thanks m8!
@aishveerr • 6 months ago
Thank you for this video, mate :)
@kailashgopalan4352 • 6 months ago
Hi Santiago, thank you for this video. What is PINECONE_API_ENV? I could not find that info anywhere.
@valentind.5398 • 6 months ago
Hi, he answered that question in another comment: "We don't need that one any more, actually. I apologize. When I wrote the code, that variable was needed, but not anymore."
@Datastreamsandai • 6 months ago
Really great content. I'm happy to announce that you will be my virtual teacher for my generative AI journey. Thanks for such videos!
@danielsavage1784 • 6 months ago
Great tutorial, thanks!
@omkarrajmane9408 • 6 months ago
Wow this was great!
@martingeorge2222 • 6 months ago
Super. Well explained. Thanks!
@saschiiiiii • 6 months ago
Great video. Subscribed.
@contractorwolf • 6 months ago
Awesome, now subscribed to your YT.
@Noneofyourbusiness2000 • 6 months ago
I thought people hated LangChain?
@underfitted • 6 months ago
Who does? Certainly not me.
@mehershahzad-n5s • 25 days ago
Easy, understandable, well-elaborated tutorial about RAG.
@anatoliisyvaschenko1553 • 6 months ago
Looking forward to upcoming videos! It would also be interesting if you could cover the following topics:
1. LangChain: saving the context of a conversation and using it in subsequent questions.
2. Dialogflow intents, or whether the same can be built in LangChain.
3. Adding real-time data based on the conversation.
4. Streaming audio to the LLM, and implementing hold-to-talk conversations like ChatGPT's, with low latency.
@zhihaoguo103 • 4 months ago
Thank you for the great video! This is very insightful and helpful. I have one question: what if there are multiple input documents and the answer you want exists across different chunks? In that case, fetching only the most similar chunk will generate an incomplete answer. However, if we search the top N most similar chunks, how can we make sure they connect meaningfully? It could be that the top 3 most similar chunks are on pages 3, 80, and 100, and there is no meaning in connecting them. Thanks again for the effort you put into this video!
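One common mitigation for the multi-chunk case is to fetch the top-N matches and then re-order them by their position in the source before assembling the prompt, so the context at least reads in document order. A toy sketch with made-up 2-D vectors standing in for real embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def top_k_in_document_order(query_vec, chunks, k=2):
    """Rank chunks by similarity, keep the best k, then sort those by their
    original position so the assembled context stays readable."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return sorted(ranked[:k], key=lambda c: c["pos"])

chunks = [
    {"pos": 3, "vec": [0.9, 0.1], "text": "page 3"},
    {"pos": 80, "vec": [0.2, 0.9], "text": "page 80"},
    {"pos": 100, "vec": [0.8, 0.3], "text": "page 100"},
]
print([c["pos"] for c in top_k_in_document_order([1.0, 0.0], chunks)])  # [3, 100]
```

Overlapping chunks and prefixing each retrieved chunk with its source/page metadata in the prompt also help the model bridge unrelated passages instead of treating them as one continuous text.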
@liuyan8066 • 6 months ago
Thank you for teaching. I used Pinecone before, but not langchain_pinecone. I know how to access the API key, and the OpenAI key works. But at the last step of this tutorial, loading the transcription into Pinecone failed with this error:
UnauthorizedException: (401)
Reason: Unauthorized
HTTP response headers: HTTPHeaderDict({'X-Cloud-Trace-Context': '3xxxx', 'Date': 'Sun, 24 Mar 2024 08:02:38 GMT', 'Content-Type': 'text/html', 'Server': 'Google Frontend', 'Content-Length': '15', 'Via': '1.1 google', 'Alt-Svc': 'h3=":443"; ma=2xxxx0,h3-29=":443"; ma=2xxxx0'})
HTTP response body: Invalid API Key
@roshankhandelwal2281 • 3 months ago
Were you able to solve this? I am facing the same issue.
@m4rth1n13 • 6 months ago
In case anybody had the same problem activating the virtual environment as I did (the source command didn't work): it's just different based on the OS. source .venv/bin/activate works for Linux/macOS, and .venv\Scripts\activate for Windows.
@steveokay8810 • 6 months ago
So, LangChain chains are pipes for LLMs? Do they provide any of the output splitting/redirection that you get from classic stdio?
@marcinp3343 • 6 months ago
For me, a good mentor is someone who can explain difficult topics with elegant simplicity, and you do it superbly! Looking forward to Cohort 12 :)
@underfitted • 6 months ago
I’ll see you in class!
@mmkplm119 • 5 months ago
Someone give this man a six-figure salary.
@Jorge-ls9po • 6 months ago
What value do you set for PINECONE_API_ENV? Also, if I try to use text-embedding-3-small it won't work; for some reason it deploys ada. Do you know why? Btw, nice intro to RAG and its details.
@underfitted • 6 months ago
We don't need that one any more, actually. I apologize. When I wrote the code, that variable was needed, but not anymore.
@theanonymous921 • 15 days ago
Amazing, you were just amazing. Thank you so much for explaining with this level of detail. Subscribed! Btw, I really think this project should be used to build a RAG system with your videos as the knowledge base.
@saxegothaea_conspicua • 3 days ago
Impressive. What a great tutorial; it's the third one I've watched to learn RAG, and the concepts and the procedure are very well explained. Thank you very much!!
@sushanths.l4865 • 6 months ago
Keep uploading this kind of content.
@aashishjhaa • 5 months ago
Hello, I'm a music producer interested in AI, but I'm unfamiliar with machine learning due to my dislike of mathematics. As an entrepreneur-minded individual, I'm eager to leverage AI to create something impactful in the music industry or beyond. I'm willing to learn to code but prefer to avoid mathematical complexities. Are there any AI applications or tools that don't require extensive mathematical knowledge? I'm excited to explore the possibilities and create something meaningful. I want to build stuff and create a business. (Well, I rewrote this paragraph using Llama 3 on Groq.)
@TonyMM • 6 months ago
Excellent introduction to using Pinecone for RAG with LangChain. Looking forward to more... I had an error two-thirds of the way through: AttributeError: 'builtin_function_or_method' object has no attribute '__func__'. Eventually, I installed Python 3.11.8, re-cloned the repo, installed the same dependencies, and it worked again, no problem!
@batkibalint • 6 months ago
Great tutorial! Thank you Santiago! I actually tried it myself (first time using the OpenAI API). I was able to build a RAG application that answers questions based on a PDF document, and it works! However, I see there is more to think about; for example, storing the embedded pages in the vector store might not be the best approach. I am looking forward to new videos!
@AbdallahxGaber • a month ago
You're amazing!! Thank you for covering both the code and the conceptual aspects.
@ravindarmadishetty736 • a month ago
This video is excellent. How can we manage memory when splitting documents (loading in a tempfile) if we have 10 GB/100 GB or more of PDFs or documents in other formats? Please advise, as the main problem here is memory. Also, please make a video on the different types of RAG (like DR RAG...etc) and on the RAG + fine-tuning combination, along with how to evaluate whether a response is reliable.
@Kidzwish • 2 months ago
You have sorted out most of the parts of RAG I was missing here and there. This was the zero-shot prompt for me to understand RAG basics. Thank you!!
@soutulhira • 22 days ago
I can't explain how much I have learned from this GREAT video. The BEST so far!!! Thanks for such an amazing explanation 👍👍👍
@maarijahmed9031 • a month ago
You are a truly amazing instructor of machine learning and LLMs. Truly inspired!
@seallyolme • 6 months ago
Hello again! I am trying to build upon what I've learned from your video, extending it to sequential chains with multiple input variables. Everything seems fine until I try to invoke it alongside Pinecone. I already have my index ready, and I get the correct data when doing the similarity_search. I tried this:

    setup = RunnableParallel(context=pinecone.as_retriever(), input=RunnablePassthrough(), topic=RunnablePassthrough())
    chain = setup | prompt | model | StrOutputParser()
    res = chain.invoke({"input": "What are LLMs?", "topic": "Artificial Intelligence"})

and also this setup for the chain:

    chain = (
        {
            "context": pinecone.as_retriever(),
            "topic": itemgetter("topic"),
            "input": itemgetter("input"),
        }
        | prompt
        | model
        | StrOutputParser()
    )
    res = chain.invoke({"input": "What are LLMs?", "topic": "Artificial Intelligence"})

but I keep getting a TypeError: expected string or buffer. Does this mean only the input should be included in the invoke call? I was able to do chain.invoke() with multiple input variables in your earlier examples without the vector stores, so I am not sure why it now only accepts a string. Would love to get your insights on this. Thank you so much!
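The error above usually means the retriever received the whole input dict rather than a plain query string; in LangChain's LCEL the usual fix is to pipe the field into the retriever, e.g. "context": itemgetter("input") | pinecone.as_retriever(). Here is the shape of the problem and the fix, emulated with a stub retriever so no LangChain is required (the stub and its error message are illustrative):

```python
from operator import itemgetter

def fake_retriever(query: str):
    """Stands in for pinecone.as_retriever(); a real retriever similarly
    fails when handed a dict instead of a query string."""
    if not isinstance(query, str):
        raise TypeError("expected string or buffer")
    return [f"docs similar to: {query}"]

inputs = {"input": "What are LLMs?", "topic": "Artificial Intelligence"}

# Broken: the retriever sees the whole dict, just like "context": retriever does.
try:
    fake_retriever(inputs)
except TypeError as e:
    print(e)  # expected string or buffer

# Fixed: extract the query string first, which is what
# itemgetter("input") | retriever does inside the chain.
context = fake_retriever(itemgetter("input")(inputs))
print(context)  # ['docs similar to: What are LLMs?']
```

So the invoke call can keep both keys; only the retriever's branch of the parallel step needs the string plucked out first.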
@serhiua • 4 months ago
Thank you, this is top-notch content! Many LLM/RAG concepts that were new to me are clearly explained, and most importantly, it shows how they are interconnected and reinforce each other.
@wazir3974 • 7 days ago
Such an amazing, in-depth video! Thank you so much, brother.
@mohamedadel1734 • 4 months ago
Hello Santiago! Thanks for this amazing video and the explanation. Is it possible to provide an updated version of this RAG system as a chatbot with conversational behaviour, not only a Q&A system? That is, one that takes previous questions and answers into consideration as part of the context, so it can answer questions that refer to something already mentioned.
@TomasTrenor • 4 months ago
Thanks Santiago! As always, high-value content! Just one quick question: why not match the chunk size to the LLM's max context size?
@AbdelilahBo • 6 months ago
Hello, thank you so much for this video; it helped me clarify a lot of concepts. I have a question about aggregation queries over LLM documents. For example, the vector database has thousands of documents with a date property, and I want to ask the model: how many documents did I receive in the last week? What is the best practice to handle this kind of use case?
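For aggregation questions like this, similarity search is the wrong tool: a common practice is to keep the date in each record's metadata and answer counts with a structured query or metadata filter (in Pinecone, a metadata filter; or a plain database), optionally letting the LLM decide via function calling when to route to it. A minimal in-memory sketch (field names are illustrative):

```python
from datetime import date, timedelta

def count_received_since(docs, days=7, today=None):
    """Count documents whose 'received' metadata date falls within the last
    `days` days. This is a metadata/SQL-style query, not a similarity search."""
    today = today or date.today()
    cutoff = today - timedelta(days=days)
    return sum(1 for d in docs if d["received"] >= cutoff)

docs = [
    {"id": "a", "received": date(2024, 3, 20)},
    {"id": "b", "received": date(2024, 3, 28)},
    {"id": "c", "received": date(2024, 3, 29)},
]
print(count_received_since(docs, days=7, today=date(2024, 3, 30)))  # 2
```

The answer ("2 documents") can then be handed to the LLM as context, or returned directly, instead of hoping that embeddings of thousands of documents somehow encode a count.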
@arumypele1919 • 6 months ago
Why does no one ever use open source? Everyone just goes with OpenAI.
@eddymison3527 • a month ago
1. The GPT-3/4 models are very good. 2. The price is actually not bad. So yeah, why not?
@souravbarua3991 • 3 months ago
Please make a video on RAG using the knowledge-graph database Neo4j Aura DB; your teaching technique is awesome.
@ayushidubey8734 • 2 months ago
Superb explanation. Thank you for such great, informative videos!!!
@duanxn • 6 months ago
Thanks for the detailed, clearly explained tutorial. Nice!
@aitut-dc3dd • 5 months ago
Man, this is great; keep it up. It would also be appreciated if you made a video on a full RAG system, including:
1. Adding memory in the most suitable way.
2. How to embed and deploy these applications for production (web, mobile, SaaS, etc.).
@tehamill1 • 12 days ago
This is an amazingly thorough introduction, thank you!
@MahimaChhagani • 6 months ago
Love your method of teaching!!!
@anatoliisyvaschenko1553 • 6 months ago
Nice video! Very helpful for beginners! Just one question: in terms of performance, isn't it better to create one prompt and one call to the model with the context, answer, and translation? It would be just one call to the OpenAI server instead of two.
@underfitted • 6 months ago
For that particular example, yes. One call would be better. But think beyond that. You may have two separate chains using different models and processes. One call might not be possible, and that’s where chaining different chains might be helpful.
@anatoliisyvaschenko1553 • 6 months ago
Oh, I see... valid argument! Thanks; it's a really great explanation of the concept itself!
@leoshlomo • a month ago
Thank you very much! Excellent tutorial! Time to get to work!
@CodeZen • 3 months ago
Thanks a lot for such a great video 🙏🧿
@atulbisht9019 • 3 months ago
Thanks for the video. The content and the explanation are top-notch.
@joekumar9355 • 4 months ago
How would Elasticsearch fit into this architecture?
@Sam-oi3hw • 4 months ago
Thank you very much; this video is an awesome way to begin in AI.
@bnshanks • 4 months ago
Great, thanks, Santiago. What about using GPT-4?
@satishsvap • 19 days ago
I have a basic question, and I apologize if I'm wrong about the RAG concept. You build a RAG using a vector store, and every time you ask a question, it retrieves the best cosine match from the vector store and passes it to the model. That seems similar to a SQL database, so what is the role of the model here?
@ShivamPradhan-c1x • 15 days ago
SQL database: relies on structured queries (e.g., SELECT * FROM ...) and retrieves exact matches based on a predefined schema (tables, rows, columns). Vector store: retrieves data based on semantic similarity over unstructured content (e.g., text documents). Vector databases are optimized for similarity search.
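The difference shows up in a couple of lines: a SQL-style exact lookup for an unseen term finds nothing, while vector search still surfaces the semantically closest record. The model's role is then to turn that retrieved context into a fluent, grounded answer rather than returning the raw chunk. A toy sketch with made-up 2-D "embeddings":

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy store: an exact-match lookup for "automobile" finds nothing,
# but vector search still returns the semantically closest record.
store = {"car": [0.9, 0.1], "banana": [0.1, 0.9]}
query_term, query_vec = "automobile", [0.85, 0.2]

exact = store.get(query_term)  # None: SQL-style exact match fails
best = max(store, key=lambda k: cosine(store[k], query_vec))
print(exact, best)  # None car
```

In a full RAG pipeline, the retrieved chunk(s) then go into the prompt, and the LLM synthesizes an answer in natural language, combining multiple chunks, resolving phrasing mismatches, and citing context, which a plain lookup cannot do.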
@josedacruz4058 • 4 months ago
Amazing! Very well explained!!
@adityaazizi • 4 months ago
Thanks sir, it helps me a lot as a newbie -/\-
@ResIpsa-pk4ih • 6 months ago
It’s not from scratch if you’re using LangChain. The things that LangChain helps with aren’t hard or complicated to write yourself. What is hard is implementing literally anything that isn’t already supported by LangChain once you’ve integrated it into your project.
@RahulKumar-lv9yz • 6 months ago
I believe you aren't a supporter of library functions either. I would hate to be interviewed by you. 😢
@wdonno • 6 months ago
There is a big step between not using any library functions and commenting on the use of LangChain in a video claiming to build from scratch. The contradiction was the first thing that came to my mind too!