Thanks! Can you please create a video on hybrid RAG: vector + graph-based retrieval?
@lolwhatsmeta 5 days ago
I love this kind of walkthrough of code from the LLM provider 🥰 thank you so much.
@gramnegrod 5 days ago
Thanks for explaining how to generalize the Anthropic trick! It's very germane. And thanks for all the different RAG approaches to consider. Could you do a video on how to evaluate these different methods with metric-driven analysis? Right now I just eyeball the results, which is time-consuming, and I'm not that good at distinguishing small improvements between RAG models.
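Metric-driven evaluation of retrievers can be sketched with standard retrieval metrics such as hit-rate@k and MRR (mean reciprocal rank). The snippet below is a minimal, illustrative example, assuming you have a set of queries with known relevant chunk IDs (the `results`/`relevant` data structures are hypothetical, not from the video):

```python
def evaluate_retrieval(results, relevant, k=5):
    """Compute hit-rate@k and MRR for a batch of queries.

    results:  {query: [chunk_id, ...]}  ranked chunk IDs returned by the retriever
    relevant: {query: {chunk_id, ...}}  ground-truth relevant chunk IDs
    """
    hits, rr_sum = 0, 0.0
    for query, ranked in results.items():
        gold = relevant[query]
        # hit-rate@k: did any relevant chunk appear in the top k?
        if any(cid in gold for cid in ranked[:k]):
            hits += 1
        # reciprocal rank of the first relevant chunk (0 if none found)
        for rank, cid in enumerate(ranked, start=1):
            if cid in gold:
                rr_sum += 1.0 / rank
                break
    n = len(results)
    return {"hit_rate": hits / n, "mrr": rr_sum / n}

# Toy comparison data for one retriever variant
results = {"q1": ["c3", "c1", "c7"], "q2": ["c9", "c2"]}
relevant = {"q1": {"c1"}, "q2": {"c5"}}
print(evaluate_retrieval(results, relevant, k=3))  # → {'hit_rate': 0.5, 'mrr': 0.25}
```

Running the same function over the outputs of each RAG variant (baseline chunks vs. contextualized chunks) gives numbers to compare instead of eyeballing.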
@buanadaruokta8766 5 days ago
Thank you for creating this content. It's very useful for completing the bachelor's thesis I'm currently working on. I'd like to ask a question: When the chat history reaches thousands of entries, and this chatbot is potentially used for a mobile app, is a vector database needed (for storing the data)? If so, should each data query (session ID, query, and answers) be stored? Or is there something else to consider? In this case, I want the vector database to address the limitation of LLMs, which is the context window constraint.
@DanielBowne 5 days ago
Would have loved to see this done with Anthropic, mostly because if you wanted to do this on larger documents, context caching from Anthropic would be ideal.
@NLPprompter 4 days ago
Thank You.
@astronosmage3722 5 days ago
The whole point of RAG, at least for me, was not having to feed the LLM the whole document. Now this needs to be done for every chunk? Doesn't seem very efficient to me.
@q0x 5 days ago
Prompt caching helps in this case, but I'm also not a great fan of putting whole documents into the LLM, especially since they may still blow up the context size and processing may take a long time.
@moin_uddin 5 days ago
I'm facing a similar situation: if I could feed in the whole document, why would I need RAG?
@NierAutomata2B 5 days ago
Not to mention the additional time it needs for indexing; that is the dealbreaker for me.
@rikhendrix261 5 days ago
@moin_uddin You can fit one document, or some part of the entire document set, but RAG was made for scanning through thousands of documents, right? I do agree that sending the extra text to add context is a tough problem. I'm wondering whether the contextualizing wouldn't already work by adding two chunks before and two chunks after, or one chunk before and one chunk after?
@moin_uddin 5 days ago
@rikhendrix261 Right, we could even just add the first few pages of a document; even that can help in adding context.
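The neighbor-window idea from this sub-thread (attach the chunks immediately before and after a retrieved chunk) can be sketched as a small helper. This is an illustrative sketch, not code from the video; it assumes `chunks` is the ordered list of chunks for one document:

```python
def expand_with_neighbors(chunks, hit_index, window=1):
    """Return the retrieved chunk joined with `window` neighboring
    chunks on each side, clamped to the document boundaries."""
    start = max(0, hit_index - window)
    end = min(len(chunks), hit_index + window + 1)
    return " ".join(chunks[start:end])

chunks = ["A", "B", "C", "D", "E"]
print(expand_with_neighbors(chunks, 2, window=1))  # → B C D
print(expand_with_neighbors(chunks, 0, window=2))  # → A B C
```

This "sentence/window retrieval" style of expansion adds local context cheaply, but unlike LLM-generated context it cannot pull in information from distant parts of the document.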
@konstantinlozev2272 5 days ago
Shouldn't you join a summary (or summaries) to the prompt, instead of the whole document/section? Wouldn't that be even better at providing the essence of the context?
@msondkar 1 day ago
Will this work if I have JSON data instead of text documents? How would I work out contextual embeddings for JSON chunks?
@alx8439 4 days ago
How costly will it be to send the same big document to some paid API many, many times, asking it to situate the next small snippet and add some context? It would be ridiculously expensive.
@tomaszzielinski4521 4 days ago
Yeah, it sounds very inefficient, with quadratic complexity as a function of document size. That gave me an idea, though: what if we first generate a summary of the PDF and then use it as the context, instead of the full document?
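The summary-instead-of-full-document idea amounts to changing what gets packed into the contextualization prompt: the summary is generated once per document, and each per-chunk call then includes only the summary. A minimal sketch of the prompt assembly (the wording and tag names are illustrative assumptions, not the prompt from the video):

```python
def build_context_prompt(summary, chunk):
    """Assemble a contextualization prompt that sends a short,
    precomputed document summary instead of the full document."""
    return (
        "Here is a summary of the document:\n"
        f"<summary>{summary}</summary>\n\n"
        "Here is a chunk from the same document:\n"
        f"<chunk>{chunk}</chunk>\n\n"
        "Write a short context sentence situating this chunk "
        "within the overall document."
    )

prompt = build_context_prompt(
    "A report on hybrid RAG retrieval methods.",
    "Graph traversal expands the candidate set...",
)
```

This trades cost for fidelity: per-chunk token usage drops from O(document) to O(summary), but details absent from the summary cannot inform the generated context.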
@engineerprompt 2 days ago
Anthropic, Gemini, and OpenAI all support prompt/context caching. This can substantially reduce the cost by caching your document.
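With Anthropic's prompt caching, the document goes into a system block marked with `cache_control`, so repeated per-chunk calls can reuse the cached document prefix. Below is a sketch that only builds the request kwargs for `client.messages.create(...)` rather than calling the API; the model name is illustrative and the prompt wording is an assumption:

```python
def cached_document_request(document_text, chunk_text,
                            model="claude-3-5-sonnet-20241022"):
    """Build kwargs for an Anthropic messages.create call that marks
    the full document as cacheable, so repeated per-chunk calls can
    reuse it instead of re-paying for the whole document each time."""
    return {
        "model": model,
        "max_tokens": 200,
        "system": [
            {
                "type": "text",
                "text": f"<document>{document_text}</document>",
                # cache_control flags this block for prompt caching
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": f"Situate this chunk within the document:\n{chunk_text}",
            }
        ],
    }

request = cached_document_request("...full document text...", "...one chunk...")
```

Each per-chunk call reuses the same system block verbatim; only the user message changes, which is what lets the provider serve the document portion from cache.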
@micbab-vg2mu 5 days ago
Thanks :)
@JNET_Reloaded 5 days ago
Have you got a GitHub link to the code used?
@dgoodall6468 4 days ago
Place into instructions and thank me later --- You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside `` tags, and then provide your final response inside `` tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside `` tags. Self-reflection is mandatory in every reply unless specifically stated otherwise by the user.