00:00 - Introduction by Jerry, co-founder of LlamaIndex
07:32 - Overview of building a knowledge assistant using RAG
16:45 - Components of RAG: data parsing and ingestion, data querying
27:18 - Challenges with naive RAG systems
35:40 - Importance of data quality in RAG systems
43:12 - Improving data quality through data processing
51:30 - Improving query complexity in RAG systems
58:45 - Goal of building a more advanced research assistant
01:05:20 - Conclusion and future directions of RAG development
@monostechhelp1143 · a month ago
This is just what I needed.
@andydataguy · 11 days ago
I almost didn't click on this video. If you had put the speaker's name in the title, I would have clicked instantly and watched it months ago.
@YanasChanell · 3 months ago
20:50 I'm not sure I understand how to do this. Can someone explain, please?
@SimonBransfieldGarth · 3 months ago
I believe what he is suggesting is that you extract key bits of information from the document as specific items (for example the title, or even individual sentences) and then feed these to the LLM alongside the raw text. I assume this is to give the LLM some additional clues about the structure of the data. It's almost like adding a kind of markup to the document alongside the raw content.
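A minimal sketch of that idea in Python; the field names and helper function here are illustrative, not any particular library's API. The point is just that extracted structure gets prepended to each chunk so the model sees it next to the raw text:

```python
# Sketch of the "markup alongside raw text" idea: extract structural
# items (title, section) and feed them to the LLM with each chunk.
# All names are hypothetical, not a specific library's API.

def annotate_chunk(raw_text: str, title: str, section: str) -> str:
    """Prepend extracted structure so the LLM sees it next to the raw text."""
    return (
        f"[DOCUMENT TITLE]: {title}\n"
        f"[SECTION]: {section}\n"
        f"[CONTENT]: {raw_text}"
    )

chunk = annotate_chunk(
    raw_text="Revenue grew 12% year over year...",
    title="Acme Corp 2023 Annual Report",
    section="Financial Highlights",
)
# `chunk` is what gets embedded and/or placed in the prompt, giving the
# model clues about where the text came from in the source document.
```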
@artmusic6937 · 26 days ago
What he means by separating embedding creation from retrieval is that you create one embedding vector for every sentence of your page. When you do similarity search, you therefore retrieve a lot of similar sentences, but you don't include those sentences themselves in the LLM context. Instead, you check which page each retrieved chunk references. Many chunks will end up referencing the same page, and that's good. Say you retrieve 10 sentences (chunks), 4 of which reference one page and the others a second page: you still retrieve the whole page for context, and you could also include both pages. The idea is that you search at a much more detailed level, but always give the model a whole page to preserve the context of your document. It's a combination of small-size chunking, to find better information, and big-size retrieval, to provide better context. I hope someone can validate that :D
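A rough sketch of that small-to-big pattern, assuming a plain-Python index where every sentence embedding remembers its parent page (all names here are illustrative, not a real vector-store API):

```python
import math
from collections import OrderedDict

def cosine_similarity(a, b):
    """Standard cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_pages(query_embedding, sentence_index, pages, top_k=10, max_pages=2):
    """sentence_index: dicts like {"sentence", "page_id", "embedding"};
    pages: mapping from page_id to the full page text."""
    # 1. Similarity search over the small chunks (individual sentences).
    scored = sorted(
        sentence_index,
        key=lambda s: cosine_similarity(query_embedding, s["embedding"]),
        reverse=True,
    )[:top_k]

    # 2. Map hits back to their parent pages, deduplicating: many
    #    sentences pointing at the same page is expected, and good.
    page_ids = OrderedDict((s["page_id"], None) for s in scored)

    # 3. Return whole pages as LLM context, capped to fit the window.
    return [pages[pid] for pid in list(page_ids)[:max_pages]]
```

The deduplication step is what makes many hits on the same page harmless: the page is only included once, however many of its sentences matched.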
@leoxiao2751 · 15 days ago
I also have a question about this part. When he said "it allows optimizing retrieval separately from optimizing synthesis," what does "optimizing synthesis" refer to?
@artmusic6937 · 15 days ago
@leoxiao2751 I've never heard "synthesis" in this context, but I can only imagine it's the same as what I already mentioned: you optimize retrieval with small chunks, but you also optimize the model's understanding with more context. Maybe the process of feeding the LLM is what he is referring to?
@artmusic6937 · 15 days ago
I asked ChatGPT :) In the context of large language models (LLMs), synthesis typically refers to the model's ability to generate coherent and meaningful responses by combining learned information and patterns from its training data with the material it is given. Synthesis in this sense means producing new, original text that is contextually appropriate and informative. So he is referring to the model's ability to generate a response from the given chunks.
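To make the retrieval/synthesis split concrete, here is a hypothetical sketch in which each stage is its own callable, so they can be tuned independently; the two callables are placeholders for your own retriever and LLM, not any library's API:

```python
# Retrieval decides *what* context to fetch (chunking, embeddings, top-k);
# synthesis decides *how* the LLM turns that context into an answer
# (prompt template, context formatting, model choice). Keeping them
# separate lets you tune one without touching the other.

def answer(query, retrieve, complete):
    # Retrieval stage: swap embedding models, chunk sizes, or top-k
    # here without changing the prompt below.
    chunks = retrieve(query)

    # Synthesis stage: swap prompt templates or LLMs here without
    # re-indexing anything.
    context = "\n---\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
    return complete(prompt)
```

Under this reading, "optimizing synthesis" means improving the second stage: how the model uses the retrieved chunks, not which chunks get retrieved.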