Semantic Chunking for RAG

15,655 views

James Briggs

1 day ago

Semantic chunking for RAG allows us to build more concise chunks for our RAG pipelines, chatbots, and AI agents. We can pair this with various LLMs and embedding models from OpenAI, Cohere, Anthropic, etc., and libraries like LangChain or CrewAI, to build potentially improved Retrieval Augmented Generation (RAG) pipelines.
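The core idea from the video can be sketched in a few lines: embed consecutive sentences and start a new chunk wherever similarity drops. This is a minimal toy sketch, not the code from the linked repo; the bag-of-words `embed` function is a stand-in for a real embedding model (e.g. one from OpenAI or Cohere), and the `threshold` value is an arbitrary assumption.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would call an
    # embedding model here instead.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, threshold=0.15):
    # Start a new chunk whenever similarity between consecutive
    # sentence embeddings drops below the threshold.
    chunks, current = [], [sentences[0]]
    prev = embed(sentences[0])
    for sent in sentences[1:]:
        vec = embed(sent)
        if cosine(prev, vec) < threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
        prev = vec
    chunks.append(" ".join(current))
    return chunks

sentences = [
    "Retrieval augmented generation combines retrieval with generation.",
    "Retrieval quality depends heavily on how documents are chunked.",
    "Cats are popular pets that enjoy sleeping all day.",
    "Many cats sleep sixteen hours a day.",
]
print(semantic_chunks(sentences))  # two chunks, split at the topic change
```

With a real embedding model the same loop works unchanged; only `embed` and the threshold tuning differ.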
📌 Code:
github.com/pinecone-io/exampl...
🚩 Intro to Semantic Chunking:
www.aurelio.ai/learn/semantic...
🌲 Subscribe for Latest Articles and Videos:
www.pinecone.io/newsletter-si...
👋🏼 AI Consulting:
aurelio.ai
👾 Discord:
/ discord
Twitter: / jamescalam
LinkedIn: / jamescalam
00:00 Semantic Chunking for RAG
00:45 What is Semantic Chunking
03:31 Semantic Chunking in Python
12:17 Adding Context to Chunks
13:41 Providing LLMs with More Context
18:11 Indexing our Chunks
20:27 Creating Chunks for the LLM
27:18 Querying for Chunks
#artificialintelligence #ai #nlp #chatbot #openai

Comments: 51
@aaronsmyth7943 25 days ago
At this point, you are practically Captain Chunk.
@lalamax3d 1 month ago
best I have seen so far for understanding the core concept of chunking, thanks
@jamesbriggs 1 month ago
glad it was helpful :)
@gullyburns1280 1 month ago
Another killer video. Great work!
@AaronJOlson 28 days ago
Thank you! I’ve been doing this for a while, but did not have a good name for it.
@xuantungnguyen9719 1 month ago
Need a video on cross-chunk attention. Wasn't attention all about key, query, and value anyway?
@MrMoonsilver 1 month ago
Amazing video, thank you so much!!
@rodgerb2645 23 days ago
Love all your content sir!
@shameekm2146 1 month ago
Thank you so much for this. Will test it out on the RAG flow in the company.
@jamesbriggs 1 month ago
welcome, would love to hear how it goes
@naromsky 1 month ago
King of Chunk
@jamesbriggs 1 month ago
a title I have always wanted
@AdrienSales 1 month ago
Excellent content and explanation, especially the chunking core concepts and challenges. Keep up your work, it's so precious for learning 👍
@jamesbriggs 1 month ago
Glad to hear it helps
@baskarjayaraman5821 24 days ago
Great video. Thanks for posting. I have been thinking of doing document chunking using the LLM itself via prompting + k-shot. The approach you show will be cheaper of course, but I'm curious how these two approaches would compare in terms of relevant non-cost metrics.
@trn450 1 month ago
Great material. 🙏
@klik24 1 month ago
Just what I was trying to learn... awesome mate, thanks
@jamesbriggs 1 month ago
Nice np
@scottmiller2591 1 month ago
"Grab complete thoughts" is an obvious good and expensive thing. Except for tables, for instance.
@jamesbriggs 1 month ago
yeah tables need to be handled differently - doable if you are identifying text vs. table elements in your processing pipeline
@NhatNguyen-bq6jj 13 days ago
Can you introduce some articles related to this topic? Thanks!
@MrMoonsilver 1 month ago
Can this be used to create chunks for creating a training dataset as well? It would be great to chunk a document into 'statements' and use those statements for a dataset. In essence have a LLM create questions for each of those statements and use those pairs for training. Could you make a video to show how that works?
@AGI-Bingo 1 month ago
Hi James, would you please tell me how you would tackle this one: how would you design a real-time updating RAG system? For example, let's say our clients updated some details in some watched doc; I want the old chunks to be removed and the doc rechunked automatically. Have you seen such a pipeline already? No one seems to cover this, and I think it's what sets apart fun projects and actual production systems. Thanks and all the best! Love your channel ❤
@shameekm2146 1 month ago
I have achieved this for one of the sources in my RAG bot. It has an API provided to access the data, so I run the embedding script on the delta changes.
@AGI-Bingo 1 month ago
@@shameekm2146 amazing, would you please opensource it so we can all improve the pipeline as a community? 🌈
@luciolrv 1 month ago
How does Parent Document RAG fit in with your new techniques?
@talesfromthetrailz 1 month ago
Dude, I already embedded whole documents of text into PC haha, this would've helped a month ago. But awesome, thanks for this! 🤘🏾
@jamesbriggs 1 month ago
Maybe for the next project 😅
@talesfromthetrailz 1 month ago
@@jamesbriggs quick question man. Is the objective of semantic chunking to achieve broader search results? Or to decrease query times? I'm thinking of it in terms of medium-sized text docs, for example movie summaries and such. Thanks!
@nikhilmaddirala 4 days ago
What's a good way to use the metadata for retrieval and ranking of the chunks?
@MrDespik 29 days ago
Hi James. Excuse me, maybe I missed it, but how do you handle the fact that when we use semantic chunking we lose page numbers for chunks? Is it possible to get them using this package?
@botondvasvari5758 21 days ago
And how can I use big models from Hugging Face? I can't load them into memory because many of them are bigger than 15 GB, some of them 130 GB+. Any thoughts?
@GeertBaeke 1 month ago
We use a simple combination of Microsoft's Document Intelligence with markdown output and a simple markdown splitter. The improvement is noticeable although the Document Intelligence models do come at an additional cost.
@jamesbriggs 1 month ago
yeah it depends on what you need of course. I'm mostly interested in further abstraction and more analytical methods for chunking, not for where it is now, but for where this type of experimentation might lead in the future - I could see a few more iterations and improvements to more intelligent doc parsing and chunking becoming increasingly performant - but we'll see
@alivecoding4995 1 month ago
Do you have a link for this markdown processing? :) We are using Document Intelligence as well, but not for layout analysis, yet.
@user-os6uo8xq9g 1 month ago
@@alivecoding4995 you can also use LayoutPDFReader from llmsherpa
@dinoscheidt 1 month ago
People since GPT-2: simply ask an LLM recursively to please insert "{split}" where a topic change etc. happens, according to a summary of the prior text. Get embeddings. Use them to separate and group. 2024: we would like to introduce a novel concept called Semantic Chunking with a sliding context... Beginners must be truly lost 😮‍💨
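The marker-based approach this comment describes reduces to a one-line split once the LLM has done its part. A minimal sketch, assuming `marked` is the text an LLM returned with "{split}" inserted at topic boundaries (the LLM call itself is not shown; the sample text is made up):

```python
def chunk_by_marker(marked_text, marker="{split}"):
    # Split wherever the LLM inserted the marker, dropping empty pieces.
    return [part.strip() for part in marked_text.split(marker) if part.strip()]

# Pretend an LLM returned this marked-up text for a two-topic document.
marked = (
    "RAG pipelines need good chunks. Chunk quality drives retrieval. "
    "{split} Cats are great pets. They sleep most of the day."
)
print(chunk_by_marker(marked))  # two chunks, one per topic
```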
@brianferrell9454 19 days ago
Do you think this causes the results to be biased towards smaller chunks? The user will probably query with no more than 10 words, so the most semantically similar results may also be only 10 words, and chunks that are 400 tokens wouldn't score as high unless you provide more context to the query.
@amantandon-ln9xx 23 days ago
I see the #abstract is also with the #title; ideally both should be in different chunks so that the LLM can understand the semantics better.
@bastabey2652 18 days ago
Using a high-end LLM like GPT-4 or Opus or Gemini Ultra or Pro might be effective at performing semantic chunking. Google's large context window seems suitable for chunking large files. We need to introduce LLMs into automating the RAG stack.
@jamesbriggs 16 days ago
Yeah I’d like to introduce an LLM chunker and see how they compare
@bastabey2652 16 days ago
@@jamesbriggs better than any non-LLM chunker. If we aim to empower users with AI, why not empower the developer? Chunking is not easy.
@FatherNovelty 1 month ago
At ~4:40, you mention that you should use the same encoder for the chunking and the encoding. Why? A chunk captures a "single meaning", so why would it matter that the same encoder is used? If you look at the chunking as a clustering algorithm that creates meaningful chunks, then what does it matter that the encoders match? What am I missing?
@jamesbriggs 1 month ago
good point - yes they are capturing the "single meaning", and that single meaning will (hopefully) overlap a lot, but embedding models are not perfect and so they will not align between themselves. Similar to if someone asked you and me to chunk an article: we'd likely overlap for the majority of the article, but I'm sure there would be differences
@mrchongnoi 28 days ago
Why not chunk based on paragraphs, lists, and tables?
@jimmc448 1 month ago
My son just asked if you were the Rock
@jamesbriggs 1 month ago
I hope you said yes
@saqqara6361 1 month ago
"What is the title of the document?" -> 99% of RAG pipelines fail, because there is no answer in the document as it is embedded.
@jamesbriggs 1 month ago
in that case we can try including the title in our chunk, and possibly consider different routing logic for this type of query - something that triggers when a user asks for metadata about a retrieved document, identifies the document ID in previously retrieved contexts, and uses that to pull in the document metadata for the LLM to generate the answer
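The routing idea in this reply can be sketched roughly as follows. This is a hypothetical illustration, not code from the video: `doc_metadata`, `is_metadata_query`, and the example title are all made-up names, and the keyword check stands in for a real router (e.g. an LLM classifier).

```python
# Metadata for previously indexed documents, keyed by document ID.
doc_metadata = {"doc-1": {"title": "Attention Is All You Need"}}

def is_metadata_query(query):
    # Naive trigger; a production router would use an LLM or a
    # semantic router for this classification step.
    return any(k in query.lower() for k in ("title", "author", "date"))

def answer(query, retrieved_doc_ids):
    if is_metadata_query(query) and retrieved_doc_ids:
        # Doc ID comes from previously retrieved contexts; answer
        # from stored metadata instead of embedded chunk text.
        doc_id = retrieved_doc_ids[0]
        return doc_metadata[doc_id]["title"]
    return "normal RAG path: retrieve chunks and generate"

print(answer("What is the title of the document?", ["doc-1"]))
```

The point is that a metadata question never needs to match embedded text; it short-circuits to a lookup.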
@itzuditsharma 28 days ago
I am facing this problem in my Jupyter notebook, please help:
2024-05-10 10:59:50 WARNING semantic_router.utils.logger Retrying in 2 seconds...