00:00 - Introduction by Jerry, co-founder of LlamaIndex
07:32 - Overview of building a knowledge assistant using RAG
16:45 - Components of RAG: data parsing and ingestion, data querying
27:18 - Challenges with naive RAG systems
35:40 - Importance of data quality in RAG systems
43:12 - Improving data quality through data processing
51:30 - Improving query complexity in RAG systems
58:45 - Goal of building a more advanced research assistant
01:05:20 - Conclusion and future directions of RAG development
@SaeedVayghani · 20 days ago
This is a well-structured intro to advanced RAG; I highly recommend watching this video.
@monostechhelp1143 · 2 months ago
This is just what I needed.
@andydataguy · a month ago
I almost didn't click on this video. If you had put the name of the speaker in the title, I would have clicked instantly and watched it months ago.
@YanasChanell · 4 months ago
20:50 I'm not sure I understand how to do this. Can someone explain, please?
@SimonBransfieldGarth · 4 months ago
I believe what he is suggesting is that you extract key bits of information from the document as specific items (for example the title, or even individual sentences) and then feed these to the LLM alongside the raw text. I assume this gives the LLM some additional clues about the structure of the data. It's almost like adding a kind of markup to the document alongside the raw data. A rough sketch of that idea is below.
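A minimal sketch of what that could look like, assuming a generic RAG pipeline (the function name and metadata fields here are hypothetical illustrations, not from the talk): prefix each chunk with extracted metadata so both the embedding step and the LLM see structural clues alongside the raw text.

```python
# Hypothetical sketch: attach extracted metadata (e.g. document title,
# page number) to each raw-text chunk as lightweight "markup".

def build_augmented_chunks(doc_title: str, page_texts: list[str]) -> list[str]:
    """Prefix every chunk with metadata clues about document structure."""
    chunks = []
    for page_num, text in enumerate(page_texts, start=1):
        chunks.append(f"[title: {doc_title}] [page: {page_num}]\n{text}")
    return chunks

# Embed these augmented strings instead of the bare text, and pass the
# same augmented form to the LLM at answer time.
chunks = build_augmented_chunks("Annual Report 2023", ["Revenue grew 12%.", "Costs fell."])
```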
@artmusic6937 · a month ago
What he means by separating embedding creation from retrieval is that you create one embedding vector per sentence of each page. When you do similarity search, you retrieve a lot of similar sentences, but you don't put those sentences themselves into the LLM's context. Instead, you check which page each chunk references. Many chunks will reference the same page, and that's fine: say you retrieve 10 sentences (chunks), 4 of which reference one page and the rest another; you still retrieve the whole page (or both pages) as context. The idea is to search at a fine granularity but always hand the model a whole page, so the document's context is preserved. It's a combination of small-size chunking, to find information more precisely, and large-size retrieval, to give better context. See the sketch below. I hope someone can validate that :D
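A rough sketch of that "small-to-big" pattern in plain Python (the `embed` function is a random stand-in for a real embedding model, and the sentence splitting is deliberately crude; this is one way to implement what the comment describes, not necessarily what the talk uses):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Stand-in embedding: replace with a real model in practice."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(64)

def index_pages(pages: list[str]):
    """Embed every sentence, remembering which page it came from."""
    sentence_vecs, parent_page = [], []
    for page_id, page in enumerate(pages):
        for sentence in page.split(". "):  # crude sentence splitter
            sentence_vecs.append(embed(sentence))
            parent_page.append(page_id)
    return np.stack(sentence_vecs), parent_page

def retrieve(query: str, pages: list[str], vecs, parents, k: int = 10):
    """Similarity search over sentences, but return whole pages as context."""
    q = embed(query)
    sims = vecs @ q / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
    top = np.argsort(-sims)[:k]
    # Deduplicate: several top sentences may point at the same parent page.
    page_ids = list(dict.fromkeys(parents[i] for i in top))
    return [pages[i] for i in page_ids]

pages = ["Revenue grew 12%. Margins improved.", "Headcount rose. Costs fell."]
vecs, parents = index_pages(pages)
print(retrieve("How did revenue change?", pages, vecs, parents))
```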
@leoxiao2751Ай бұрын
I also have a question about this part. When he said "it allows optimizing retrieval separately from optimizing synthesis," what does "optimizing synthesis" refer to?
@artmusic6937 · a month ago
@leoxiao2751 I've never heard "synthesis" in this context, but I can only imagine it means the same thing I already mentioned: you optimize retrieval with small chunks, and you optimize the model's understanding with more context. Maybe the process of feeding the LLM is what he's referring to?
@artmusic6937 · a month ago
I asked ChatGPT :) In the context of large language models (LLMs), synthesis typically refers to the model's ability to generate coherent and meaningful responses by combining learned information and patterns from vast amounts of training data. Synthesis in this sense involves creating new, original text outputs that are contextually appropriate and informative. So he's referring to the model's ability to generate responses based on the given chunks.
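If it helps, in a RAG pipeline the "synthesis" step is usually just the prompt-plus-generation stage, which you can tune independently of retrieval. A hypothetical sketch (`llm_complete` is a stand-in for whatever completion API you use, not a real library call):

```python
def synthesize(question: str, context_pages: list[str]) -> str:
    """Build a prompt from the retrieved pages and let the LLM answer.
    Tuning this step (prompt wording, how much context to include) is
    'optimizing synthesis', separate from tuning the retrieval step."""
    context = "\n\n---\n\n".join(context_pages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm_complete(prompt)  # hypothetical stand-in for an LLM API call
```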