Your videos are always clear and super explanatory. Thanks. Keep going like this! 😇
@harshilpatel4989 2 months ago
Great video. I am a college student, and your videos are helping me do my projects. Thank you for such content.
@engineerprompt 2 months ago
Thank you and glad you are finding it useful.
@Retburstjk 2 months ago
Late chunking seems very cost-efficient compared to other approaches, thanks for sharing!
@ypsehlig 2 months ago
Super interesting once again 👍
@durand101 2 months ago
This seems like a great technique to also help with entity confusion during retrieval. Sometimes I've noticed that embedding models don't really capture nuanced but important differences between chunks that talk about one company vs. another, and that ends up confusing the LLM as well.
@niallmoore8240 19 days ago
Awesome, I’m subscribed
@GeobotPY 2 months ago
Interesting! But I still think naive RAG is a bit underrated. To properly build contextual retrieval, or any RAG system for that matter, a naive approach is the foundation. It is also cheap and fast, and if done correctly it works very well. The only catch is that naive RAG works quite badly for tables, but for text it can work very well.
@awakenwithoutcoffee 2 months ago
use a specific Agent for SQL and use a Router :)
@sahil0094 2 months ago
Most documents for RAG would be more than 50 pages, so I don't think there's any embedding model with that large a context. Please correct me if I'm wrong. I don't see this approach being effective for RAG systems.
@engineerprompt 2 months ago
I think this needs some clarification. The 8k max-token limit doesn't mean you can only embed a document when it's shorter than that many tokens. If a document is longer than, say, 8k tokens, you can divide it into batches and process each one the way you would for chunking. There might be some discontinuity, but overlap is again your friend here. Hope this clarifies how you would use it.
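For anyone who wants to try that batching idea, here is a minimal sketch. It is only an illustration: the 8k limit, the 512-token overlap, the Jina model name, and the batch_long_document helper are all assumptions, not something shown in the video.

```python
# Minimal sketch: split a long document into overlapping token windows
# so each window fits an embedding model's context limit. The numbers
# and the model name are assumptions for illustration only.
from transformers import AutoTokenizer

MAX_TOKENS = 8192  # assumed context limit of the embedding model
OVERLAP = 512      # assumed number of tokens shared between windows

def batch_long_document(text: str, tokenizer) -> list[str]:
    """Return overlapping text windows, each within MAX_TOKENS tokens.
    Each window can then be late-chunked independently."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    step = MAX_TOKENS - OVERLAP
    batches = []
    for start in range(0, len(token_ids), step):
        window = token_ids[start:start + MAX_TOKENS]
        batches.append(tokenizer.decode(window))
        if start + MAX_TOKENS >= len(token_ids):
            break  # last window already covers the tail
    return batches

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-en")
# "long_doc.txt" is a hypothetical input file
batches = batch_long_document(open("long_doc.txt").read(), tokenizer)
```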
@revanthphanisaimedukonduru1177 2 months ago
If we chunk into batches, even with overlap, we lose the main property of late chunking, which is holding the semantic meaning of the whole context. It's an intermediate solution at best, because the embedding context limit remains a challenge.
@zyzhang1130 2 months ago
@engineerprompt That defeats the very purpose of late chunking, doesn't it?
@HugoCortijo-q4q 2 months ago
Does the embedding dimension refer to the output length of the response?
@davidperedahernandez4190 2 months ago
Yes, I have the same question.
@GeobotPY 2 months ago
It's the length of the vector. So a dimension of 3 would be a vector like [2, 3, 5].
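To make that concrete, here is a tiny sketch; the model name is just an illustrative assumption:

```python
# The embedding dimension is the length of the output vector,
# not the length of the input text. Model name is illustrative.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # a 384-dimensional model
vec = model.encode("late chunking")
print(len(vec))  # 384, no matter how long the input text is
```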
@richardkuhne5054 2 months ago
So how do you combine that with VRAG and context extension in localVisionGPT?
@remusomega 2 months ago
Is there an application here where this can enhance knowledge graph generation?
@barackobama4552 2 months ago
Thanks!
@doaamostafa4759 2 months ago
Could we use hybrid search with late chunking? Or is late chunking enough?
@doaamostafa4759 2 months ago
@Prompt Engineering
@engineerprompt 2 months ago
Hybrid will always help. It's hard to beat BM25 :) It's usually really helpful when you have a lot of keywords in your dataset.
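For readers who want to see what "hybrid" can look like in practice, here is a minimal sketch that fuses a BM25 ranking with a dense-retrieval ranking via reciprocal rank fusion. The rank_bm25 package, the toy documents, and the placeholder dense ranking are all assumptions for illustration, not the exact setup from the video.

```python
# Minimal hybrid-retrieval sketch: merge a BM25 ranking and a dense
# (vector) ranking with reciprocal rank fusion (RRF). Everything here
# is illustrative; the dense ranking is a hard-coded placeholder.
from rank_bm25 import BM25Okapi

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids into a single ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "d1": "late chunking keeps the semantic context of the whole document",
    "d2": "bm25 rewards exact keyword matches in the query",
}
bm25 = BM25Okapi([text.split() for text in docs.values()])
bm25_scores = bm25.get_scores("exact keyword matches".split())
bm25_ranking = [doc_id for _, doc_id in
                sorted(zip(bm25_scores, docs), key=lambda p: p[0], reverse=True)]
dense_ranking = ["d1", "d2"]  # placeholder for a vector-store result
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
```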
@tecnom7133 2 months ago
Thanks
@stunspot 2 months ago
Ah, chunking. I love the late chunking idea, but personally I've found optimizing my document formatting for a specific RAG setup to be the best approach: making sure it gets chunked sensibly. A pain in the ass, frankly, and it can be largely avoided with fractal structuring. But you usually can't do that. Sigh.
@engineerprompt 2 months ago
I totally agree with this approach and have been advocating it for a while now with the clients I work with. None of this is magic. You have to spend time with your data to understand it, and then build on top of it. The unfortunate part is that, for the most part, people don't want to do the dirty work.