NEW Transformer for RAG: ModernBERT

3,919 views

Discover AI

1 day ago

Comments
@code4AI 4 days ago
With the automatic audio dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@CharlotteLopez-n3i 9 days ago
ModernBERT = game-changer! Flash Attention and savvy design make it a hot topic for info retrieval and rec systems. Kudos to the team for this awesome open-source release.
@malikrumi1206 7 days ago
3 burning 🔥 questions: 1. What about DSPy? I haven't seen a lot of conversation or adoption of that idea. How does one prompt ModernBERT? 2. With NVIDIA in the mix, what are the chances we can get this on Apple Silicon/MLX anytime soon, if ever? By the way, I can use some small open-source models on my Silicon Mac without MLX… 3. Did you say ModernBERT doesn't need or use a tokenizer?!?
@andrewandreas5795 9 days ago
Nice video! Could you please explain where exactly in a RAG pipeline this new model could be employed? Not in the generative part, right?
@chegouparalutar 8 days ago
You use it in the R of RAG :) Explicitly: 1) to embed your documents into a vector store, and then again 2) to embed your query and calculate its similarity to the documents in the vector store. Now you can RETRIEVE the most "similar" documents to AUGMENT the context of your GENERATIVE model.
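A minimal sketch of that retrieval step, assuming the sentence-transformers library and a ModernBERT-based embedding checkpoint such as nomic-ai/modernbert-embed-base (the model id is an assumption; swap in whichever embedding model you actually use):

```python
# Minimal sketch of the "R" in RAG: embed documents, embed the query,
# rank by cosine similarity. Assumes sentence-transformers is installed
# and that "nomic-ai/modernbert-embed-base" is the embedding checkpoint
# you want; some checkpoints expect query/document prefixes, so check
# the model card.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("nomic-ai/modernbert-embed-base")

# 1) Embed your documents into a (toy, in-memory) vector store.
documents = [
    "ModernBERT is an encoder-only transformer with Flash Attention.",
    "Decoder-only models generate text auto-regressively.",
    "Kazakh is a Turkic language spoken in Central Asia.",
]
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# 2) Embed the query and score it against every stored document.
query = "Which model architecture suits retrieval?"
query_embedding = model.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, doc_embeddings)[0]

# RETRIEVE the most similar documents to AUGMENT the generator's context.
for idx in scores.argsort(descending=True)[:2]:
    print(f"{scores[idx].item():.3f}  {documents[idx]}")
```

An exhaustive cosine-similarity scan like this is fine for a toy corpus; a real pipeline would index the document embeddings in a vector store (FAISS, Chroma, etc.) instead.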
@andrewandreas5795 7 days ago
@chegouparalutar Thanks for your answer. I am a bit confused, so is this an embedding model?
@chegouparalutar 7 days ago
@andrewandreas5795 Short answer: yes, for a typical RAG chat model. Longer answer: typically one would use models with causal attention (tokens attend in only one direction, backwards) for **auto-regressive** generative tasks like chatting, since it is far more efficient than bi-directional attention (check out KV caching if interested), but in theory you could also use a bi-directional model there. Models with bi-directional attention (both backwards and forwards) are much more powerful at understanding text, since the context of a word depends not only on the words before it but also on the words after it. Therefore many generative tasks that do not rely on an auto-regressive pattern, like translation and summarization, are popularly done with bi-directional models, as are text embeddings. So you could use ModernBERT not only as your embedding model but also as a generative model in your RAG pipeline, whether to summarize text or answer questions. My take: as far as I understand, the current ModernBERT release is not fine-tuned for any of those tasks. Looking at the models it is compared against (like Nomic and GTE), they are aiming for an embedding model, and the Hugging Face team is releasing notebooks to use it as an embedding model. IMO, it is time for a bi-directional model that generates short and structured answers, and I am kind of sad that they are not going in that direction.
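To illustrate that last point: the base ModernBERT release is a masked-language model, so out of the box it fills in [MASK] tokens rather than generating free-form answers. A quick sketch, assuming the answerdotai/ModernBERT-base checkpoint and the Hugging Face transformers fill-mask pipeline (a transformers version recent enough to include ModernBERT support is assumed):

```python
# Sketch: the base ModernBERT release is trained for masked-language
# modeling, not chat-style generation. Assumes the
# "answerdotai/ModernBERT-base" checkpoint and a recent transformers
# version with ModernBERT support.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

# The model scores candidate tokens for the [MASK] slot instead of
# auto-regressively generating a continuation.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(f"{prediction['score']:.3f}  {prediction['token_str']}")
```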
@carlhealy 9 days ago
Oh wow, this is really exciting. Thank you for sharing!
@wdonno 9 days ago
Thanks for covering this!
@IvarDaigon 8 days ago
I'm waiting for the Quantum Enabled Version called Q*Bert.
@aliettienne2907 9 days ago
Building a reliable LLM architecture that can retrieve every unit of information faster is the ideal condition to aim for. It's like having an interpreter who can translate a foreign language between you and a foreign person much faster. And if you can also take in the interpreted language quickly as a listener, that's a further plus for you. 😎💯💪🏾👍🏾
@irbsurfer1585 9 days ago
YES! Sweet! Beautiful!
@asimabusallam3147 10 days ago
❤ thanx
@john23232 9 days ago
English only for RAG… Same problem as the original BERT. Good for showing off the techniques, but useless in practice for most of us. Llama 3.3 70B is a better candidate, to be honest, speaking only of open-source models.
@code4AI 9 days ago
You are mixing up encoder-only and decoder-only transformers. And you do not speak for "most of us".
@raymond_luxury_yacht 9 days ago
What about Dogbert?
@raymond_luxury_yacht 9 days ago
Bruh you're not helping by not linking!
@NE0123456789 9 days ago
Hi @code4AI, is there a way to contact you, please? 🙏