Deep dive: model merging

  8,310 views

Julien Simon

A day ago

*** Part 2 is now available at • Deep dive: model mergi... : Model Breadcrumbs, Model Stock, DELLA
Model merging is an increasingly popular technique that makes it possible to add capabilities to transformer models, or remove capabilities from them, without any additional training.
In this video, we first explain what model merging is. Then, we discuss the different merging algorithms implemented in the mergekit library (github.com/arcee-ai/mergekit): model soups, SLERP, Task Arithmetic, TIES, DARE, and Franken-merging.
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at / julsimon or Substack at julsimon.substack.com. ⭐️⭐️⭐️
01:16 What is model merging?
07:10 Model soups
14:00 Spherical Linear Interpolation (SLERP)
20:35 Task Arithmetic
27:15 Trim, Extract Sign and Merge (TIES)
36:20 Drop and Rescale (DARE)
43:40 Franken-merging
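
For readers who want a feel for the first two methods in the chapter list before watching, here is a minimal sketch in plain Python. It is not mergekit's code: the function names `uniform_soup` and `slerp` are hypothetical, each model is reduced to a flat list of floats, and the vectors passed to `slerp` are assumed to be non-zero.

```python
import math


def uniform_soup(models):
    """Average several weight vectors element-wise (a uniform 'model soup')."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two non-zero weight vectors.

    t=0 returns a, t=1 returns b; values in between follow the arc
    between the two vectors instead of the straight line.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    # Angle between the two vectors, from their normalized dot product.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    if math.sin(omega) < eps:
        # Nearly colinear vectors: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]


if __name__ == "__main__":
    print(uniform_soup([[1.0, 2.0], [3.0, 4.0]]))
    print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

In a real merge, the same operations would be applied tensor by tensor across models that share an architecture; the flat lists here only keep the sketch short.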

Comments: 16
@kenchang3456 4 months ago
Thank you for this video. I gotta give this a try 🙂
@juliensimonfr 4 months ago
You're welcome, and yes, you should :)
@melikanobakhtian6018 9 days ago
That was great and it helped me so much! Would it be possible to get the presentation slides?
@gnibu42 4 months ago
Super interesting, Julien, thanks a lot for sharing
@juliensimonfr 4 months ago
Glad you enjoyed it
@SrikanthIyer 4 months ago
Thanks for the fantastic video. Loved how you simplified almost all the methods to merge the models!
@juliensimonfr 4 months ago
Glad it was helpful!
@subhamkundu5043 4 months ago
Hey @Julien, great video. I have a question regarding the scale factor in the TIES method. How do we determine the scale factor?
@juliensimonfr 4 months ago
Thank you. It's up to you, depending on how much you want to "influence" the base model. mergekit has a parameter called 'density': fraction of weights in differences from the base model to retain. Example at github.com/arcee-ai/mergekit/blob/edd3817e4a470c7a959ef4c505f52a650a46ff07/examples/ties.yml
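
To make the two knobs in this exchange concrete, here is a rough sketch of the TIES steps (trim, elect sign, merge) in plain Python. It is not mergekit's implementation: `trim`, `ties_merge`, and the `scale` argument are hypothetical names, models are flat lists of floats, and only `density` is taken from the reply above, with the meaning given there (the fraction of differences from the base model to retain).

```python
def trim(task_vector, density):
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = int(round(density * len(task_vector)))
    if k <= 0:
        return [0.0] * len(task_vector)
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned, density=0.5, scale=1.0):
    """Merge fine-tuned weight vectors into `base` following the TIES steps."""
    # Task vectors: what each fine-tuned model changed relative to the base.
    task_vectors = [[f - b for f, b in zip(ft, base)] for ft in finetuned]
    trimmed = [trim(tv, density) for tv in task_vectors]
    merged = []
    for i, b in enumerate(base):
        column = [tv[i] for tv in trimmed]
        # Elect the sign carrying the larger total magnitude for this parameter.
        sign = 1.0 if sum(column) >= 0 else -1.0
        # Average only the surviving values that agree with the elected sign.
        agreeing = [v for v in column if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        # `scale` (hypothetical name) controls how strongly the merged
        # task vector moves the base model.
        merged.append(b + scale * delta)
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0, 0.0]
    math_model = [1.0, 0.1, -2.0, 0.0]
    code_model = [3.0, -0.2, 1.0, 0.05]
    print(ties_merge(base, [math_model, code_model], density=0.5, scale=1.0))
```

A lower `density` discards more of the small changes before merging, while a lower `scale` keeps the result closer to the base model. Both are tuning choices, which is consistent with the "it's up to you" answer above.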
@uygarkurtai 3 months ago
Great video, thank you! What I didn't quite grasp is this: let's say I'm merging two models, one trained on maths and the other on coding. Do we expect the merged model to perform at a high level on both tasks?
@juliensimonfr 3 months ago
Yes, that's the expectation :)
@abse-mj8pw 2 months ago
I can't help wondering whether there is an experiment that fully explores these techniques, for example by applying them to all kinds of models or by combining different methods.
@juliensimonfr 2 months ago
Check out arcee.ai, their platform is definitely going that way.
@abse-mj8pw 2 months ago
@juliensimonfr Thanks for your answer!! I've found some interesting blogs about it!
@AbdennacerAyeb 4 months ago
This is a random comment to boost your channel. Thank you.
@juliensimonfr 4 months ago
LOL, thank you.