Your videos are always clear and super explanatory. Thanks. Keep going like this! 😇
@harshilpatel4989 2 months ago
Great video. I am a college student, and your videos are helping me do my projects. Thank you for such content.
@engineerprompt 2 months ago
Thank you and glad you are finding it useful.
@Retburstjk 2 months ago
Late chunking seems very cost-efficient compared to other approaches, thanks for sharing!
@ypsehlig 2 months ago
Super interesting once again 👍
@durand101 2 months ago
This seems like a great technique to also help with entity confusion during retrieval. Sometimes I've noticed that embedding models don't really capture nuanced but important differences between chunks that talk about one company vs. another, and that ends up confusing the LLM as well.
@niallmoore8240 19 days ago
Awesome, I’m subscribed
@GeobotPY 2 months ago
Interesting! But I still think naive RAG is a bit underrated. To properly build contextual retrieval, or any RAG system for that matter, a naive approach is the foundation. It is also cheap and fast, and if done correctly it works very well. The only catch is that naive RAG works quite badly for tables, but for text it can work very well.
@awakenwithoutcoffee 2 months ago
use a specific Agent for SQL and use a Router :)
@sahil0094 2 months ago
Most documents for RAG would be more than 50 pages, so I don't think there's any embedding model with that large a context. Please correct me if I'm wrong. I don't see this approach being effective for RAG systems.
@engineerprompt 2 months ago
I think this needs some clarification. The 8k max-token limit doesn't mean you can only embed a document when it's shorter than that many tokens. If a document is longer than, say, 8k tokens, you can divide it into batches and process each one the way you would for chunking. There might be some discontinuity, but overlap is again your friend here. Hope this clarifies how you would use it.
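For anyone who wants to try that batching idea, here is a minimal sketch. It is only an illustration: the 8k limit, the 512-token overlap, the Jina model name, and the batch_long_document helper are all assumptions, not something shown in the video.

```python
# Minimal sketch: split a long document into overlapping token windows
# so each window fits an embedding model's context limit. The numbers
# and the model name are assumptions for illustration only.
from transformers import AutoTokenizer

MAX_TOKENS = 8192  # assumed context limit of the embedding model
OVERLAP = 512      # assumed number of tokens shared between windows

def batch_long_document(text: str, tokenizer) -> list[str]:
    """Return overlapping text windows, each within MAX_TOKENS tokens.
    Each window can then be late-chunked independently."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    step = MAX_TOKENS - OVERLAP
    batches = []
    for start in range(0, len(token_ids), step):
        window = token_ids[start:start + MAX_TOKENS]
        batches.append(tokenizer.decode(window))
        if start + MAX_TOKENS >= len(token_ids):
            break  # last window already covers the tail
    return batches

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-en")
# "long_doc.txt" is a hypothetical input file
batches = batch_long_document(open("long_doc.txt").read(), tokenizer)
```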
@revanthphanisaimedukonduru1177 2 months ago
If we chunk into batches, even with overlap, we lose the main property of late chunking, which is holding the semantic meaning of the whole context. It's an intermediate solution at best, because the embedding context limit remains a challenge.
@zyzhang1130 2 months ago
@engineerprompt That defeats the very purpose of late chunking, doesn't it?
@HugoCortijo-q4q 2 months ago
Does the embedding dimension refer to the output length of the response?
@davidperedahernandez4190 2 months ago
Yes, I have the same question.
@GeobotPY 2 months ago
It's the length of the vector. So a dimension of 3 would be a vector like [2, 3, 5].
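To make that concrete, here is a tiny sketch; the model name is just an illustrative assumption:

```python
# The embedding dimension is the length of the output vector,
# not the length of the input text. Model name is illustrative.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # a 384-dimensional model
vec = model.encode("late chunking")
print(len(vec))  # 384, no matter how long the input text is
```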
@richardkuhne5054 2 months ago
So how do you combine that with VRAG and context extension in localVisionGPT?
@remusomega 2 months ago
Is there an application here where this can enhance knowledge graph generation?
@barackobama4552 2 months ago
Thanks!
@doaamostafa4759 2 months ago
Could we use hybrid search with late chunking? Or is late chunking enough?
@doaamostafa4759 2 months ago
@Prompt Engineering
@engineerprompt 2 months ago
Hybrid will always help. It's hard to beat BM25 :) It's usually really helpful when you have a lot of keywords in your dataset.
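For readers who want to see what "hybrid" can look like in practice, here is a minimal sketch that fuses a BM25 ranking with a dense-retrieval ranking via reciprocal rank fusion. The rank_bm25 package, the toy documents, and the placeholder dense ranking are all assumptions for illustration, not the exact setup from the video.

```python
# Minimal hybrid-retrieval sketch: merge a BM25 ranking and a dense
# (vector) ranking with reciprocal rank fusion (RRF). Everything here
# is illustrative; the dense ranking is a hard-coded placeholder.
from rank_bm25 import BM25Okapi

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of doc ids into a single ranking."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "d1": "late chunking keeps the semantic context of the whole document",
    "d2": "bm25 rewards exact keyword matches in the query",
}
bm25 = BM25Okapi([text.split() for text in docs.values()])
bm25_scores = bm25.get_scores("exact keyword matches".split())
bm25_ranking = [doc_id for _, doc_id in
                sorted(zip(bm25_scores, docs), key=lambda p: p[0], reverse=True)]
dense_ranking = ["d1", "d2"]  # placeholder for a vector-store result
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
```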
@tecnom7133 2 months ago
Thanks
@stunspot 2 months ago
Ah, chunking. I love the late chunking idea, but personally I've found optimizing my document formatting for a specific RAG setup to be the best approach: making sure it gets chunked sensibly. A pain in the ass, frankly, and it can be largely avoided with fractal structuring. But you usually can't do that. Sigh.
@engineerprompt 2 months ago
I totally agree with this approach and have been advocating it for a while now with the clients I work with. None of this is magic. You have to spend time with your data to understand it, and then build on top of it. The unfortunate part is that, for the most part, people don't want to do the dirty work.