Thank you for this video. I have spent two weeks going through the video and coding along. It's really amazing.
@jplkid146 ай бұрын
The complete happenstance of the phrase "do rag" sounding like "durag" coming from this video was awesome. Sorry, totally unrelated...but it made me chuckle.
@bhavyajain74035 ай бұрын
durag activity
@mr.daniish7 ай бұрын
Lance is the man! Love his content
@BlakeGallagher-i8h3 ай бұрын
🎯 Key points for quick navigation:
00:00 📚 Introduction to RAG by Lance Martin, a LangChain engineer.
00:14 💡 Explanation of how RAG combines custom data with LLMs.
00:28 🔍 Motivation: Most data is private, but LLMs are trained on public data.
01:08 🗃️ Context windows in LLMs are growing, allowing more private data to be input.
01:48 ⚙️ Overview of RAG: Indexing, retrieval, and generation stages.
02:54 📊 RAG unites LLMs' processing with large-scale private data.
03:24 🧠 Breakdown of RAG components: Query translation, routing, construction, and more.
04:46 ⭐ Methods for document retrieval and reranking in RAG.
05:55 💾 Indexing external documents and converting them to numerical representations.
08:25 🧩 Splitting documents for embedding due to context window limits.
10:00 🖥️ Computing fixed-length vectors for documents using embeddings.
12:45 🔍 Using k-nearest neighbors to find similar documents.
15:59 📝 Generating answers based on retrieved documents in RAG.
17:07 📝 Prompt templates for generating answers in LLMs.
19:02 🔗 Combining prompts, LLMs, and retrievers into chains.
22:14 🚀 Introduction to advanced query translation in RAG.
23:07 ✔️ Importance of rewriting queries for effective retrieval.
24:05 🌐 Multi-query approach: Rewriting questions from different perspectives.
25:38 🚀 Indexed a blog post on agents in a vector store.
26:19 🔍 Split question into sub-questions and retrieve relevant documents.
28:08 🔧 Used LangSmith to trace intermediate and final steps.
30:42 🗂️ Built a consolidated list from multiple retrievals.
35:02 🧩 Discussed sub-question decomposition retrieval.
36:23 🔄 Combined answers to iterative sub-questions for final answer.
38:18 🔗 Connected question-answer pairs sequentially in prompts.
41:02 📚 Step-back prompting for generating more abstract questions.
43:02 🪜 Generated more generic questions to enhance context for retrieval.
44:45 🔄 Retrieval performed on both original and step-back questions.
48:50 🌐 HyDE involves converting questions into hypothetical documents for better alignment with document embeddings.
49:43 🔎 Generated hypothetical documents based on questions for more effective retrieval.
51:15 📝 Hypothetical Document: Demonstrated hypothetical document generation and retrieval process.
51:44 🌟 Performance: Using hypothetical document generation can improve retrieval performance.
52:13 🚦 Routing: Involves translating a query and routing it to appropriate data sources.
53:48 🔍 Semantic Routing: Embeds and compares questions to prompts for routing.
56:08 🔗 Routing Mechanism: Connects the intended data source to specific retriever chains.
58:11 🚀 Semantic Routing Example: Demonstrates choosing a prompt based on semantic similarity.
59:47 💬 Query Construction: Transforms natural language queries to structured queries for metadata filters.
01:00:15 🗓️ Example Query: Converts natural questions into structured queries with date filters and metadata.
01:04:26 📚 Query Optimization: Optimizes retrieval by translating natural language into data-querying domain-specific languages.
01:11:48 🗄️ Hierarchical Indexing: The RAPTOR technique deals with questions needing detailed and broader information.
01:12:57 🧩 Hierarchical indexing helps in retrieving more relevant document chunks by clustering and summarizing documents recursively.
01:14:08 🤏 Summaries provide high-level semantic representations, while raw chunks offer detailed, document-specific insights.
01:15:04 🧪 Comprehensive studies indicate that hierarchical indexing enhances semantic search by offering better coverage across different question types.
01:17:19 📇 Process involved embedding, clustering, and recursive summarization to build a tree structure of document information.
01:20:09 🛠️ Code demonstration included creating a vector store, embedding documents, clustering, summarizing, and managing tokens.
01:22:22 🔍 The ColBERT method enhances semantic search by generating embeddings for every token and computing maximum similarities between questions and documents.
01:24:57 🧑‍💻 The RAGatouille library facilitates playing with ColBERT, which shows good performance but requires evaluating production readiness due to possible latency issues.
01:26:40 🌐 ColBERT demonstrated through LangChain retriever integration, offering an efficient and unique indexing approach.
01:28:10 🗺️ LangGraph released for building more complex state machines and diverse logical flows in RAG applications.
01:33:05 🔍 Corrective RAG workflow improved retrieval by re-assessing document relevance and performing web searches for ambiguous results.
01:35:06 🧩 Functions for state modification in LangGraph illustrated how each state (node) in the flow modifies the document retrieval process.
01:37:08 🔍 Logical filtering: Use a grading chain to mark documents as relevant or not and perform actions based on the results.
01:37:32 🚦 Conditional routing: Based on the 'search' value, route the workflow to either transform the query for a web search or proceed to generate a response.
01:39:13 📑 Document relevance check: Filter documents for relevance before transforming the query and performing a web search.
01:39:55 🔄 Query transformation: Adjust the query based on information retrieved from a web search to improve relevance.
01:40:52 📊 Detailed node inspection: Use tools like LangSmith to inspect each node's output to ensure the logical flow is correct.
01:42:26 🚀 Moving from chains to flows: Transitioning from simple chains to complex flows offers cleaner and more sophisticated workflows.
01:44:06 🔧 Flow engineering: Flow engineering with LangGraph is intuitive and allows for sophisticated logical reasoning workflows.
01:45:03 🧩 Integrating ideas: Combining query analysis and adaptive flow engineering improves your RAG pipeline's efficiency.
01:46:14 📚 Corrective workflows: Use unit tests to ensure smooth corrective workflows during model inference.
01:48:34 💡 Command R: Uses the Command R model with structured output, enabling binary yes/no responses for easier logical flow control.
01:56:21 ⚙️ Binding functions to nodes: Bind each node in your graph to a specific function to handle different logical decisions and flows.
01:58:24 🔄 If tool calls are not in the response, a fallback mechanism is triggered to choose the next data source.
01:59:18 🔍 Different data sources (web search vs. vector store) are used, and their outputs determine the subsequent nodes in the graph.
02:00:25 🧾 Conditional edges in the graph handle logic such as document relevance and hallucination checks.
02:01:05 📊 Functions are defined as nodes and edges in the graph, following a flow that matches a predefined diagram for logic.
02:03:18 🗂️ The flow diagram for the graph aligns with the logic drawn out earlier, ensuring consistent data routing and processing.
02:05:10 ⏱️ The implemented RAG system processes questions quickly, demonstrating efficient retrieval and generation handling.
02:07:15 ⚡ The Command R model shows rapid performance and effective handling of relevance, hallucination, and answer usefulness checks within the RAG system.
02:08:55 🧠 LangGraph provides a reliable but less flexible solution compared to agents, suitable for defined flows and faster implementation.
02:10:51 🧩 Agents offer more flexibility for open-ended workflows at the cost of reliability, especially when working with smaller LLMs.
02:11:46 💻 Open-source models like Command R can be run locally, enabling fast inference and practical use for online applications.
02:12:46 🔧 Practical implementation of RAG systems combines LangGraph with Command R for a fast, reliable solution adaptable for various workflows.
02:17:09 📉 Tested GPT-4's ability to retrieve and reason over multiple facts within a large context window, showing degradation in performance as complexity and context length increase.
02:18:27 🧩 Observations included the difficulty of retrieving facts placed at the beginning of a large context window, potentially due to a recency bias.
02:19:10 🔄 Confirmed that adding reasoning tasks exacerbates retrieval difficulty, highlighting limits within LLMs without a retrieval augmentation system.
02:19:52 🚩 Be skeptical of single-needle retrievals as they often oversimplify the retrieval problem.
02:21:00 🎯 Focus on the retrieval of precise document chunks, but be cautious of over-engineering.
02:22:48 🏗️ Consider document-centric RAG over chunking to simplify retrieval and reduce complexity.
02:26:30 🧩 Clustering documents and summarizing clusters help to handle queries requiring multiple pieces of information.
02:28:07 🔍 Use long-context embedding models to embed full documents effectively.
02:31:33 🖥️ Using open-source models can make RAG systems more accessible and efficient, even on local machines.
Made with HARPA AI
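The indexing → retrieval → generation loop these key points describe can be sketched end to end in plain Python. This is a toy sketch, not the video's code: a bag-of-words counter stands in for a real embedding model, and the "generation" step just formats the prompt a real system would send to an LLM.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system calls an embedding model."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return Counter(cleaned.split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# 1) Indexing: embed each document into a tiny in-memory "vector store".
docs = {
    "d1": "agents use planning and task decomposition",
    "d2": "context windows limit how much text an llm can read",
    "d3": "embeddings map text to fixed length vectors",
}
index = {doc_id: embed(text) for doc_id, text in docs.items()}

# 2) Retrieval: k-nearest neighbors by cosine similarity.
def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)
    return [docs[d] for d in ranked[:k]]

# 3) Generation: stuff the retrieved context into a prompt template.
def generate(question):
    context = "\n".join(retrieve(question))
    # A real system would send this prompt to an LLM.
    return f"Answer '{question}' using this context:\n{context}"
```

Everything after this skeleton (query translation, routing, RAPTOR, ColBERT, LangGraph flows) is a refinement of one of these three stages.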
@letsplay0711Ай бұрын
🎉
@bhavyajain74035 ай бұрын
This felt like a semester condensed into a few hours. This dude reads a lot. I learned about so many interesting things.
@lakshay11685 ай бұрын
Is there a line-by-line explanation of the full code?
@noname133454 ай бұрын
@@lakshay1168 Ask an LLM to do it 😉
@sagarkeshave53577 ай бұрын
Include more of langchain, llms, industry level based tutorials
@AD-np2sh6 ай бұрын
I rarely say that a tutorial is good - but this is an amazing tutorial, extremely underrated!!!
@wanderingtravellerAB995 ай бұрын
Underrated by who?
@lakshay11685 ай бұрын
Is there a line-by-line explanation of the full code?
@AD-np2sh4 ай бұрын
@@wanderingtravellerAB99 Relative engagement
@engenglish6107 күн бұрын
Is it easier to do this with n8n?
@faisalmushtaq22877 ай бұрын
I was waiting for this particular course. Thanks
@iqtech60657 ай бұрын
Assalamu alaikum, brother
@simrangupta76895 ай бұрын
🎯 Key points for quick navigation:
02:21:13 🔄 RAG Evolution
02:22:20 ❓ Reconsider Chunking
02:23:42 📑 Document-Centric RAG
02:25:20 🔄 Multi-rep Indexing
02:26:30 📊 Utilize RAPTOR
02:28:34 🔄 Beyond Single-Shot
02:30:23 🧠 Enhance with Reasoning
02:30:38 🎯 Out-of-Scope Queries
Made with HARPA AI
@arthursiriaco677129 күн бұрын
Thank you for the video. This is way better than the vast majority of paid courses.
@n.h.son1902Ай бұрын
Hey, thanks for the helpful video. One minor thing that I think would make the tutorial more useful is using a laser pointer or cursor to point to what you're talking about on the slide. I sometimes couldn't keep track of what you were referring to while watching the video.
@jasonmuscat5346 ай бұрын
Lance, thank you for sharing your deep insights on RAG and taking the time to share this with the community. Just a question about Query Construction, at 1:04:00 in the overall video. For the question "videos that are focused on the topic of chat langchain that are published before 2024", should the result have been latest_publish_date: 2024-01-01, as opposed to earliest_publish_date: 2024-01-01? That would be more in line with the question "videos on chat langchain published in 2023", where the results were earliest_publish_date: 2023-01-01 and latest_publish_date: 2024-01-01. Thank you
@claudiodisalvo99256 ай бұрын
This is great content. Speaking of that 95% of private data, I guess a lot of practitioners are finding it hard to convince business people to share their data with an LLM provider, and of course the concerns are very understandable. I guess people would feel more comfortable if a RAG application could clearly define a partition of data that it can work on for the benefit of the tool, and a partition that is either used only in obfuscated form or simply never shared, not even by chance.
@Kalmaos6 ай бұрын
Maybe the solution would be running the model locally?
@juanpablopenaloza50936 ай бұрын
NVIDIA ChatRTX might just do the job
@Digitalspecie-n8i2 ай бұрын
"Never stop improving" is enough for a successful life 🙏
@ArunKumar-bp5lo3 ай бұрын
So much detail. I had to watch it twice to understand it. Just wow.
@CookingWithGunnar7 ай бұрын
Love the teaching style! at 9:00 you mention that you've walked through the code previously. Is there another video to go with this one or did I miss something?
@KOTAGIRISIVAKUMAR6 ай бұрын
Those are short videos that were combined into a single long video. When Lance refers to a previous video, he doesn't mean another video.
@shraeychikker6946 ай бұрын
I think this is the playlist the videos are taken from: kzbin.info/aero/PLfaIDFEXuae2LXbO1_PKyVJiQ23ZztA0x
@CookingWithGunnar6 ай бұрын
@@shraeychikker694 Nice one - many thanks :)
@NajibElkatebАй бұрын
Thanks for the amazing content. Is there any video where he discusses memory in RAG? I try to give a summary of the previous query and response along with the new query to limit token usage, but as the chat continues, the LLM hallucinates. I believe there might be a better way to manage old user queries, and a way to "forget" by analyzing whether a new topic has started. Please let me know :)
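One common pattern for the problem described in this comment is to keep the last few turns verbatim and collapse everything older into a rolling summary. A sketch of that split, with a stand-in summarizer (in a real chain the `summarize` callable would be an LLM call, e.g. a "summarize this conversation" prompt):

```python
def compact_history(turns, keep_recent=4, summarize=None):
    """Return (summary_of_old_turns, recent_turns) to rebuild a bounded prompt.

    turns: list of strings like "user: ..." / "assistant: ...".
    summarize: stand-in for an LLM summarization call; the default just
    concatenates, which is NOT a real summary.
    """
    if summarize is None:
        summarize = lambda ts: " | ".join(ts)
    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    summary = summarize(old) if old else ""
    return summary, recent

summary, recent = compact_history(
    ["user: hi", "assistant: hello", "user: what is RAG?",
     "assistant: retrieval-augmented generation", "user: and HyDE?"],
    keep_recent=2,
)
```

For the "forget on topic change" idea: one option is to embed the new query and the running summary and drop the summary when their similarity falls below a threshold, so a fresh topic starts with a clean context.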
@bhuvanbharath033 ай бұрын
love from India, keep doing the great work, Lance
@nawaz_haider7 ай бұрын
Udemy created 50 accounts to dislike this video
@vishwanathnb1286 ай бұрын
😂😂😂
@zaidnadeem49186 ай бұрын
I will create 50 accounts to like your comment 😂
@GratefulJoss3 ай бұрын
Yeah, as the Good Book says the love of money is the root of all evil 😂😂😂😂
@selfhosted-lover19 күн бұрын
Thanks for your tutorial.
@Sss-zl8qj3 ай бұрын
Thanks a lot! Really helpful!
@mrchongnoi6 ай бұрын
I watched this twice. Very good.
@cyborg694206 ай бұрын
I recommend this vid to everyone.
@yashtiwari35656 ай бұрын
Please let us know when the blog post on adaptive RAG will be published; Lance mentioned he would be uploading it in a day or so. Also, a question for everyone: which is better for creating complex flows using LLMs, state machines or guardrails?
@harshdeepsingh3872Ай бұрын
This helped a lot.
@cristian_palau6 ай бұрын
Great video! What software is used to create these nice diagrams?
@joaquinduranpoker27344 ай бұрын
Would also like to know
@geekyprogrammer48317 ай бұрын
This man is amazing!
@Meda-e9j4 ай бұрын
Kudos, that is the best so far.
@ser1ification7 ай бұрын
This is great! Thank you so much!
@utk10006 ай бұрын
VERY WELL EXPLAINED. THANK YOU
@ai-touch94 ай бұрын
amazing content on RAG
@ComputingAndCoding5 ай бұрын
Awesome video! It helped a great deal to explain the concept.
@andrebarsotti5 ай бұрын
Hello there! This material is amazing. I would love to read the articles cited in the video. Could someone please list them for me? Thank you!
@ImmediatelyLeaveYT4 ай бұрын
Thank you
@ZhannaOrlova-t8cАй бұрын
Where is the video in which he explains how to layout those basic RAG pipelines?
@sougaaat2 ай бұрын
RAG-Fusion uses RRF to calculate a fused score for retrieved documents that were previously ranked by their respective retrievers. What I don't understand is that here only one retriever is being used, and it's responsible for retrieving a list of documents per query; so query1 might fetch doc1, doc2, doc3, and so on. Yes, it's quite possible that doc1 is ranked 1 for query1 and ranked 2 for query2, but even then it's only based on a single retriever's ranking. So isn't it redundant to apply RRF on top of it?
@bdurgaramprasad41652 ай бұрын
Correct me if I am wrong here. We are using RRF here to get a ranking across all the queries as a whole, as opposed to a single query, right? So it makes sense to compute the ranking over all the queries. To continue your example: for {q1: [doc1, doc3, doc2], q2: [doc2, doc3, doc1], ...}, RRF helps merge these rankings to reflect how well a document performs across multiple queries, not just one. Of course, your statement is true when we are using multiple retrievers with a single query; then I suppose we could go with some other approach, or use RRF at the single-query level and after that extract the ranking at a global level.
@sougaaat2 ай бұрын
@@bdurgaramprasad4165 Thanks a lot man. I actually did not think about this, ".. after that to extract the ranking at a global level." I kept thinking that if multiple retrievers aren't there, the RRF becomes redundant.
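The global-level fusion discussed in this thread can be sketched in a few lines. Doc IDs are hypothetical; `k=60` is the smoothing constant conventionally used with reciprocal rank fusion:

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse per-query ranked lists of doc IDs into one global ranking.

    rankings: one ranked list of doc IDs per (rewritten) query.
    k: smoothing constant; 60 is the commonly used default.
    """
    scores = defaultdict(float)
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            # Only the position matters, not the retriever's raw score.
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fused = reciprocal_rank_fusion([
    ["doc1", "doc3", "doc2"],  # retriever's ranking for query rewrite 1
    ["doc2", "doc3", "doc1"],  # retriever's ranking for query rewrite 2
])
```

Even with a single retriever the fusion does real work: it converts positions across the query rewrites into one global ordering, rewarding documents that rank highly for many rewrites, without needing the retriever's raw scores to be comparable.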
@kevinkawchak6 ай бұрын
Thank you for the talk.
@tomasohara5 ай бұрын
Very, very good! Can you post the links in the comments? Not all are in the notebooks (e.g., 2 of 4 at the 1:21:38 mark).
@cboyley5 ай бұрын
Isn't this process cost-prohibitive? It seems like it's taking an input that would cost 10 tokens and generating an input that costs hundreds if not thousands of tokens. Perhaps I'm misunderstanding the costs and inner workings of these LLMs.
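The asymmetry this comment describes is real, and it shows up mostly in the prompt: the short question gets padded with retrieved chunks before the final generation call. A rough back-of-envelope sketch (chunk counts and sizes are illustrative, not from the video):

```python
def prompt_tokens(question_tokens, n_chunks, tokens_per_chunk, overhead=50):
    """Estimate the input size of the final generation call in a RAG chain.

    overhead covers the prompt template and instructions (assumed value).
    """
    return question_tokens + n_chunks * tokens_per_chunk + overhead

# A 10-token question with 4 retrieved ~250-token chunks:
print(prompt_tokens(10, 4, 250))  # 1060
```

Input tokens are typically billed much cheaper than output tokens, and retrieval often lets you answer with a smaller model than you would otherwise need, so the trade can still be favorable.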
@devloper_hs6 ай бұрын
Awesome as always
@souhaildahmeni99616 ай бұрын
Thanks for the content !
@mukilloganathan14427 ай бұрын
Always a fan of a Lance video
@vaioslaschos5 ай бұрын
Are there any evaluation sets for RAG? If one gets a new RAG method, how can they compare it to other methods out there?
@ChristianBernhardt-tp1tn6 ай бұрын
I have a question. In the RAG-Fusion part, in the fusion_rank function: why do you use the index (rank) to update your scores? Isn't it better to use the variable "previous_score"? The rank variable is just an index describing the order in which the chunks were read. By the way, thanks for the video, you're a lifesaver.
@therealtuyen5 ай бұрын
Thanks for sharing
@iuseh6 ай бұрын
Llama 3 was trained on 15T tokens; the chart would be different if you'd released the video 3 days later :)
@muhammadteguhpranantoАй бұрын
I want to ask: sometimes you pass the docs as the context, but sometimes you pass the retriever to the context. Is there any reason for this, or when should I pass the retriever instead of the docs themselves? Thanks!
@riccardomanta3725 ай бұрын
Is it just me, or are there always errors caused by changed libraries, even when you simply try to execute their own code? It's really frustrating working with LangChain at this point.
@xxxxx82005 ай бұрын
Amen.
@harshgupta-vn1tl4 ай бұрын
Great Video
@izzatullobaltabayev86196 ай бұрын
Thank you !
@siddhantverma5324 ай бұрын
Shouldn't we just pass the text from the retrieved docs instead of the string representation of the Document class?
@tubege6 ай бұрын
Question: Is it possible to do RAG across different vector stores that use different embedding strategies?
@eduardoconcepcion48996 ай бұрын
What is the best way to manage the chunk size?
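There is no single best answer to this question; a common baseline is fixed-size chunks with overlap, so text cut at a boundary still appears intact in at least one chunk. A plain-Python sketch (sizes are illustrative; production splitters usually work on tokens or semantic separators rather than raw characters):

```python
def chunk_text(text, size=200, overlap=50):
    """Split text into fixed-size character chunks with overlap.

    Requires overlap < size so the window always advances.
    """
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end
    return chunks
```

A frequent starting point is an overlap of roughly 10 to 20 percent of the chunk size, then tuning both against retrieval quality on your own queries; the hard cap is the embedding model's context limit discussed in the video.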
@pr00f5 ай бұрын
Where can we find this video about Chunking?
@Thomas_Leo7 ай бұрын
Amazing videos but how does this translate into careers or jobs? What positions are employers looking for? Would they even hire anyone without experience? How do you even get started? I'm aware this channel mostly focuses on the coding and hands-on experience but I wish there was an actual channel focused on employment. I'm pretty sure there are channels out there and if anyone has recommendations, I'll be grateful.
@samtx6 ай бұрын
Look at LinkedIn job titles and description keywords, and see which mention LLMs or AI.
@vcool6 ай бұрын
Start working on some AI projects first, on your own, in your spare time. Show some results. Once you have two to show, getting a job should be easier. You don't actually need this LangChain thing. As for how to get started: if you've already used GPT-4 etc., think about larger workflows that chain inputs and outputs in creative ways to solve problems. Also think about when you need to use embeddings for distance computation. You can use the LLM and embedding APIs directly or via an SDK, optionally with a local vector database. You don't need to go fancy.
@samtx6 ай бұрын
@@vcool What results? What have you done?
@tannerhillison44303 ай бұрын
Does anyone know why I get a RateLimitError when I try using my OpenAI API Key with my paid ChatGPT account? It says: "You exceeded your current quota, please check your plan and billing details" but I have never done anything with this API Key before. Does anyone know how to fix this? Alternatively, does anyone know how to perform RAG without this? Please help!
@vincent43842 ай бұрын
Dang, there still isn't a way to connect Ollama instead of an OpenAI key at the time of writing. If someone has figured it out before this becomes widely available and you see this, please enlighten us!
@ggg9gg2 ай бұрын
What do you mean? There is ChatOllama() for LangChain, no?
@encianhoratiu53012 ай бұрын
The hallucination testing part doesn't work very well
@FranciscoJPolo7 ай бұрын
Great!!
@sanjaybhatikar6 ай бұрын
Thanks for the excellent video! If your goal is to democratize gen AI to as diverse an audience as possible, I suggest you stop using OpenAI in these tutorials. In many parts of the world, having a credit card is not an option, and OpenAI quickly backs you into that corner. Use, promote, and support open-source alternatives instead. Thank you.
@84y872 ай бұрын
1:27:00
@janeslt7 ай бұрын
Thank you!!!
@drlordbasil5 ай бұрын
Lance uploaded this himself; this seems copy-pasted, no?
@namwkim85Ай бұрын
What's the diagram tool used? Does anyone know?
@ivant_true7 ай бұрын
Thanks
@Yomi4D6 ай бұрын
Amazing.
@karimelfa13947 ай бұрын
thank you
@teddysalas35906 ай бұрын
Is it possible to do RAG and combine data with Hugging Face models?
@afrazhussain37787 ай бұрын
great content
@VipinAp-iy9tt6 ай бұрын
How do you add conversational memory to it?
@Mopark253 ай бұрын
Can this be done without LangChain?
@Andrew-Tsegaye7 ай бұрын
LLM Agents plzzz... ❤
@willcheng82577 ай бұрын
Like first and then watch
@GeandersonLenz6 ай бұрын
What's the name of the screen recorder used by Lance?
@devanshoo4 ай бұрын
Loom, most probably.
@vannyelsyavitrianibowo82544 ай бұрын
How do you get a LangChain API key?
@Josholsan6 ай бұрын
Hello, at 27:13 why is he using itemgetter to pass the question? What's the difference between doing that and setting a RunnablePassthrough() in there?
@flaviobrienza60816 ай бұрын
No difference, just that with RunnablePassThrough() you don't need a dictionary in the invoke
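For reference, `itemgetter` is plain Python from the standard `operator` module: it selects one key from the dict you pass to `.invoke()`, whereas a passthrough forwards the whole input unchanged. A stdlib-only illustration (the `RunnablePassthrough` behavior is mimicked here with a plain lambda):

```python
from operator import itemgetter

# The chain input in the video is a dict, e.g. {"question": "..."}.
payload = {"question": "What is task decomposition?", "chat_history": []}

# itemgetter("question") behaves like lambda d: d["question"]:
get_question = itemgetter("question")
print(get_question(payload))  # What is task decomposition?

# A passthrough, by contrast, returns its input as-is:
passthrough = lambda x: x
print(passthrough(payload) is payload)  # True
```

So when the chain is invoked with a bare string, a passthrough suffices; when it is invoked with a dict carrying several keys, `itemgetter` picks out the one a given prompt slot needs, which matches the reply above.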
@andyhall70327 ай бұрын
And there was me thinking "how can it take over 2 hours to talk about applying RAG status to your project plans"
@utkarshkapil7 ай бұрын
GOLD
@iCeTainment7 ай бұрын
❤❤❤
@roberth87377 ай бұрын
This is the way...
@ajaykumarreddy88414 ай бұрын
Tbh, this is not from scratch if you are using a heavily abstracted framework (LangChain). It's misleading.
@MichaelWoodrum3 ай бұрын
The second I saw LangChain involved, I was out. I'll abstract my own from scratch for real. Embeddings are easy, and PostgreSQL supports them through extensions. The rest is similarity search.
@bigdreams5554Ай бұрын
I agree. I manufacture my own CPU and GPU chips myself, solder the circuit boards, then move on to SSD cards before building the PC. This really is the first step.. anything else is not really from scratch.
@MichaelWoodrumАй бұрын
@@bigdreams5554 Obviously you know what I mean, and if not, that's on you. LangChain makes it harder to create custom setups; you hit a wall eventually. Build using what you want. Model updates break compatibility, so build on top of models; don't build on top of multiple layers that build on models. The Visual Studio Code channel put out a great RAG video a few days after this one. It didn't claim to be a scratch build, and it was as close to scratch-built as you can get. RAG is easier than these people make it.
@eshanchawla349818 күн бұрын
@@MichaelWoodrum Studio code? Can you provide an accurate description of the content you are referring to, please?
@MichaelWoodrum18 күн бұрын
@@eshanchawla3498 The Microsoft Visual Studio Code channel.
@imaspacecreature6 ай бұрын
Good job, but ' dict ' never again 😂
@vcool6 ай бұрын
Does this video have everyone brainwashed? If you know basic programming, you don't need langchain at all. I don't like unnecessary abstractions.
@Haz22886 ай бұрын
Yeah but the building blocks are useful. Do you write your own sorting functions?
@Lioncodes6 ай бұрын
@vcool would you mind expanding on what the alternative is to langchain? Genuinely curious on learning, not attacking
@vcool6 ай бұрын
This seems like it's going to pigeonhole me and tie my hands into a small dogmatic set of patterns, when what I need is broader freedom that I can accomplish without it.
@JugglingJuggalos7 ай бұрын
This video getting massive viewership🇮🇱
@blacklight83187 ай бұрын
You mean 🇵🇸 ?
@JugglingJuggalos7 ай бұрын
@@blacklight8318 No
@farhanlabib28337 ай бұрын
Get and eat dudu
@junaidiqbal41047 ай бұрын
Palestine
@JugglingJuggalos7 ай бұрын
Thanks, but that was on the menu yesterday... today is rice and beans
@memezone32565 ай бұрын
bad
@anjanikumar82985 ай бұрын
Maybe not the best explanation, and not easily understandable.