Fine-Tuning Text Embeddings For Domain-specific Search (w/ Python)

2,415 views

Shaw Talebi

1 day ago

Comments: 20
@ShawhinTalebi 11 days ago
Excited to share another fine-tuning video! Check out links to the code, dataset, and model in the description :)
@TonyCerone 7 days ago
Thank you @Shaw! Very good material. Your pedagogy is powerful 🙂
@ShawhinTalebi 7 days ago
Thanks Tony! Glad it was clear :)
@ifycadeau 11 days ago
Love this video Shaw!
@pauliusztin 11 days ago
Amazing video, Shaw 🤟
@ShawhinTalebi 10 days ago
Thanks Paul 😁
@sndrstpnv8419 11 days ago
Very good material, thanks for sharing.
@ShawhinTalebi 11 days ago
Thanks! Glad it was helpful :)
@gustavojuantorena 11 days ago
Great!
@pasan-i5e 11 days ago
Can you please explain what exactly you're suggesting here? Use a fine-tuned LLM instead of just using an LLM for output generation?
@ShawhinTalebi 9 days ago
It's always worth exploring non-fine-tuning improvements first, since these are relatively quick to iterate on, e.g. improving prompts, chunking strategy, or retrieval strategy. However, if further optimizations are needed, then fine-tuning can make sense.
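For concreteness, here is a minimal, hypothetical sketch (not from the video) of one such lever: tuning chunk size and overlap at indexing time before reaching for fine-tuning. The values and sample text below are placeholders.

```python
# Hypothetical sketch: chunking is one cheap lever to tune before fine-tuning.
# Chunk size and overlap below are placeholder values to experiment with.

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows for retrieval."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
    return chunks

# Compare a few settings on a toy document to see how index granularity changes
sample = "Fine-tuning embeddings can improve domain-specific search. " * 50
for size, ov in [(300, 50), (500, 100), (800, 150)]:
    print(f"chunk_size={size}, overlap={ov} -> {len(chunk_text(sample, size, ov))} chunks")
```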
@sndrstpnv8419 11 days ago
Can you share code and a video for fine-tuning text embeddings for domain-specific search on OpenAI? Or another good-quality cloud-based LLM (Google, AWS)? Or a recent BERT model like ModernBERT?
@ShawhinTalebi 11 days ago
Unfortunately, OpenAI doesn't make their embedding models available for fine-tuning. But I'll take a look at those other options you mentioned 😁
@sndrstpnv8419 11 days ago
@ShawhinTalebi You're the best. But will you do a video on the recent BERT model, ModernBERT?
@cynorsense 11 days ago
Why only BERT?
@ShawhinTalebi 9 days ago
There is a rich ecosystem of BERT-based embedding models and tools to develop them (e.g. the Sentence Transformers library). One benefit of BERT is that it's relatively small, so it's easy to experiment with. In principle, however, you can do exactly the same thing with latent representations from more modern LLMs. Happy to do a video on that if there is interest :)
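For anyone curious what that workflow looks like, here is a minimal sketch using the Sentence Transformers library. The model name, toy training pairs, and hyperparameters are placeholders, not the exact setup from the video or repo.

```python
# Minimal sketch (placeholder data and model, not the video's exact setup):
# fine-tuning a small BERT-based embedding model with Sentence Transformers.
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Small BERT-based model that is quick to experiment with
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy (query, relevant passage) pairs; in practice these come from your domain data
train_examples = [
    InputExample(texts=["how do I fine-tune embeddings?",
                        "Fine-tuning adapts a pretrained embedding model to a domain."]),
    InputExample(texts=["what is semantic search?",
                        "Semantic search retrieves documents by embedding similarity."]),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
# Contrastive objective that treats other in-batch passages as negatives
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=10,
)

# Embed new text with the fine-tuned model
embeddings = model.encode(["domain-specific search query"])
print(embeddings.shape)
```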
@sndrstpnv8419 11 days ago
You wrote "w/ Python", but where is the link to the Python code?
@ShawhinTalebi 11 days ago
GitHub repo link (and other resources) in the description! Repo link: github.com/ShawhinT/KZbin-Blog/tree/main/LLMs/fine-tuning-embeddings
@believer8754 10 days ago
Do you put the same content in a blog?
@ShawhinTalebi 10 days ago
Yes! Coming soon :)