Thanks! Can you please create a video on hybrid RAG: vector + graph-based retrieval?
@lolwhatsmeta 5 days ago
I love this kind of walkthrough of code from the LLM provider 🥰 thank you so much.
@gramnegrod 5 days ago
Thanks for explaining how to generalize the Anthropic trick! It's very germane. And thanks for all the different RAG approaches to consider. Could you do a video on how to evaluate these different methods with metric-driven analysis? Right now I just eyeball the results, which is time-consuming, and I'm not that good at distinguishing small improvements between RAG models.
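Metric-driven evaluation of retrievers can be sketched with standard retrieval metrics such as hit-rate@k and MRR (mean reciprocal rank). The snippet below is a minimal, illustrative example, assuming you have a set of queries with known relevant chunk IDs (the `results`/`relevant` data structures are hypothetical, not from the video):

```python
def evaluate_retrieval(results, relevant, k=5):
    """Compute hit-rate@k and MRR for a batch of queries.

    results:  {query: [chunk_id, ...]}  ranked chunk IDs returned by the retriever
    relevant: {query: {chunk_id, ...}}  ground-truth relevant chunk IDs
    """
    hits, rr_sum = 0, 0.0
    for query, ranked in results.items():
        gold = relevant[query]
        # hit-rate@k: did any relevant chunk appear in the top k?
        if any(cid in gold for cid in ranked[:k]):
            hits += 1
        # reciprocal rank of the first relevant chunk (0 if none found)
        for rank, cid in enumerate(ranked, start=1):
            if cid in gold:
                rr_sum += 1.0 / rank
                break
    n = len(results)
    return {"hit_rate": hits / n, "mrr": rr_sum / n}

# Toy comparison data for one retriever variant
results = {"q1": ["c3", "c1", "c7"], "q2": ["c9", "c2"]}
relevant = {"q1": {"c1"}, "q2": {"c5"}}
print(evaluate_retrieval(results, relevant, k=3))  # → {'hit_rate': 0.5, 'mrr': 0.25}
```

Running the same function over the outputs of each RAG variant (baseline chunks vs. contextualized chunks) gives numbers to compare instead of eyeballing.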
@buanadaruokta8766 5 days ago
Thank you for creating this content. It's very useful for completing the bachelor's thesis I'm currently working on. I'd like to ask a question: When the chat history reaches thousands of entries, and this chatbot is potentially used for a mobile app, is a vector database needed (for storing the data)? If so, should each data query (session ID, query, and answers) be stored? Or is there something else to consider? In this case, I want the vector database to address the limitation of LLMs, which is the context window constraint.
@DanielBowne 5 days ago
Would have loved to see this done with Anthropic, mostly because if you wanted to do this on larger documents, context caching from Anthropic would be ideal.
@NLPprompter 4 days ago
Thank You.
@astronosmage3722 5 days ago
The whole point of RAG, at least for me, was not having to feed the LLM the whole document. Now this needs to be done for every chunk? Doesn't seem very efficient to me.
@q0x 5 days ago
Prompt caching helps in this case, but I'm also not a great fan of putting whole documents into the LLM, especially since they may still blow up the context size and processing may take a long time.
@moin_uddin 5 days ago
I'm facing a similar situation: if I could feed in the whole document, why would I need RAG?
@NierAutomata2B 5 days ago
Not to mention the additional time it needs for indexing; that is the dealbreaker for me.
@rikhendrix261 5 days ago
@moin_uddin You can fit one document, or some part of the entire document set, but RAG was made for scanning through thousands of documents, right? I do agree that sending the extra text to add context is a tough problem. I'm wondering whether the contextualizing wouldn't already work by adding two chunks before and two chunks after, or one chunk before and one chunk after?
@moin_uddin 5 days ago
@rikhendrix261 Right, we could even just add the first few pages of a document; even that can help in adding context.
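The neighbor-window idea from this sub-thread (attach the chunks immediately before and after a retrieved chunk) can be sketched as a small helper. This is an illustrative sketch, not code from the video; it assumes `chunks` is the ordered list of chunks for one document:

```python
def expand_with_neighbors(chunks, hit_index, window=1):
    """Return the retrieved chunk joined with `window` neighboring
    chunks on each side, clamped to the document boundaries."""
    start = max(0, hit_index - window)
    end = min(len(chunks), hit_index + window + 1)
    return " ".join(chunks[start:end])

chunks = ["A", "B", "C", "D", "E"]
print(expand_with_neighbors(chunks, 2, window=1))  # → B C D
print(expand_with_neighbors(chunks, 0, window=2))  # → A B C
```

This "sentence/window retrieval" style of expansion adds local context cheaply, but unlike LLM-generated context it cannot pull in information from distant parts of the document.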
@konstantinlozev2272 5 days ago
Shouldn't you join a summary (or summaries) to the prompt, instead of the whole document/section? Wouldn't that be even better at providing the essence of the context?
@msondkar 1 day ago
Will this work if I have JSON data instead of text documents? How would I work out contextual embeddings for JSON chunks?
@alx8439 4 days ago
How costly will it be to send the same big document to some paid API many, many times, asking it to situate the next small snippet and add some context? It would be ridiculously expensive.
@tomaszzielinski4521 4 days ago
Yeah, it sounds very inefficient, with quadratic complexity as a function of document size. That gave me an idea, though: what if we first generate a summary of the PDF and then use it as the context, instead of the full document?
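The summary-instead-of-full-document idea amounts to changing what gets packed into the contextualization prompt: the summary is generated once per document, and each per-chunk call then includes only the summary. A minimal sketch of the prompt assembly (the wording and tag names are illustrative assumptions, not the prompt from the video):

```python
def build_context_prompt(summary, chunk):
    """Assemble a contextualization prompt that sends a short,
    precomputed document summary instead of the full document."""
    return (
        "Here is a summary of the document:\n"
        f"<summary>{summary}</summary>\n\n"
        "Here is a chunk from the same document:\n"
        f"<chunk>{chunk}</chunk>\n\n"
        "Write a short context sentence situating this chunk "
        "within the overall document."
    )

prompt = build_context_prompt(
    "A report on hybrid RAG retrieval methods.",
    "Graph traversal expands the candidate set...",
)
```

This trades cost for fidelity: per-chunk token usage drops from O(document) to O(summary), but details absent from the summary cannot inform the generated context.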
@engineerprompt 2 days ago
Anthropic, Gemini, and OpenAI all support prompt/context caching. This can substantially reduce the cost by caching your document.
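With Anthropic's prompt caching, the document goes into a system block marked with `cache_control`, so repeated per-chunk calls can reuse the cached document prefix. Below is a sketch that only builds the request kwargs for `client.messages.create(...)` rather than calling the API; the model name is illustrative and the prompt wording is an assumption:

```python
def cached_document_request(document_text, chunk_text,
                            model="claude-3-5-sonnet-20241022"):
    """Build kwargs for an Anthropic messages.create call that marks
    the full document as cacheable, so repeated per-chunk calls can
    reuse it instead of re-paying for the whole document each time."""
    return {
        "model": model,
        "max_tokens": 200,
        "system": [
            {
                "type": "text",
                "text": f"<document>{document_text}</document>",
                # cache_control flags this block for prompt caching
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [
            {
                "role": "user",
                "content": f"Situate this chunk within the document:\n{chunk_text}",
            }
        ],
    }

request = cached_document_request("...full document text...", "...one chunk...")
```

Each per-chunk call reuses the same system block verbatim; only the user message changes, which is what lets the provider serve the document portion from cache.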
@micbab-vg2mu 5 days ago
Thanks :)
@JNET_Reloaded 5 days ago
Have you got a GitHub link to the code used?
@dgoodall6468 4 days ago
Place into instructions and thank me later --- You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside `` tags, and then provide your final response inside `` tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside `` tags. Self-reflection is mandatory in every reply unless specifically stated otherwise by the user.