Deep Dive: Quantizing Large Language Models, part 1

8,584 views

Julien Simon

5 months ago

Quantization is an excellent technique to compress Large Language Models (LLMs) and accelerate their inference.
In this video, we discuss model quantization. We first explain what it is, then build an intuition for rescaling and the problems it creates. Next, we introduce the different types of quantization: dynamic post-training quantization, static post-training quantization, and quantization-aware training. Finally, we start comparing actual quantization techniques: PyTorch, ZeroQuant, and bitsandbytes.
In part 2 • Deep Dive: Quantizing ... , we look at and compare more advanced quantization techniques: SmoothQuant, GPTQ, AWQ, HQQ, and the Hugging Face Optimum Intel library based on Intel Neural Compressor and Intel OpenVINO.
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at / julsimon or Substack at julsimon.substack.com. ⭐️⭐️⭐️
02:05 What is quantization?
06:50 Rescaling weights and activations
08:17 The mapping function
12:38 Picking the input range
16:15 Getting rid of outliers
19:50 When can we apply quantization?
26:00 Dynamic post-training quantization with PyTorch
28:42 ZeroQuant
34:50 bitsandbytes
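
To make the chapters on the mapping function, the input range, and outliers more concrete, here is a minimal sketch in plain Python. It assumes an asymmetric 8-bit unsigned scheme (scale plus zero point), which is one common choice and not necessarily the exact formulation used in the video. The helper names and the sample weights are hypothetical and only serve as illustration.

```python
def quantize_params(x_min, x_max, n_bits=8):
    """Compute (scale, zero_point) mapping [x_min, x_max] to unsigned n-bit integers."""
    q_max = 2 ** n_bits - 1
    # Extend the range to include 0.0 so that zero is represented exactly.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / q_max
    if scale == 0.0:  # degenerate range: every value is 0.0
        scale = 1.0
    zero_point = round(-x_min / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, n_bits=8):
    """Rescale, round, shift by the zero point, and clip to the integer range."""
    q_max = 2 ** n_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), q_max) for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in q_values]


def max_error(values, x_min, x_max):
    """Largest round-trip error on `values` for a given input range."""
    scale, zero_point = quantize_params(x_min, x_max)
    restored = dequantize(quantize(values, scale, zero_point), scale, zero_point)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical weights, made up for this example.
weights = [-0.9, -0.31, 0.0, 0.12, 0.47, 0.8]
with_outlier = weights + [40.0]

# Range taken from the regular weights only.
err_clean = max_error(weights, min(weights), max(weights))
# Range stretched by a single outlier: the regular weights lose precision.
err_full = max_error(weights, min(with_outlier), max(with_outlier))
# Range clipped to [-1, 1]: the outlier saturates, the regular weights stay precise.
err_clipped = max_error(weights, -1.0, 1.0)

print(f"range from regular weights: max error {err_clean:.4f}")
print(f"range including outlier:    max error {err_full:.4f}")
print(f"range clipped to [-1, 1]:   max error {err_clipped:.4f}")
```

The three printed errors show the trade-off behind picking the input range: a single outlier stretches the scale and coarsens every other value, while clipping the range sacrifices the outlier to keep precision for the bulk of the weights.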

Comments: 17
@itayatelis2898 2 months ago
Love your content! Thank you!
@juliensimonfr 2 months ago
Glad you enjoy it!
@bibiworm 2 months ago
I have been wanting to understand quantization for a very long time. Thank you! Would you mind sharing the slides, please?
@matthewrice7590 5 months ago
Thanks for this, Julien
@juliensimonfr 5 months ago
You're welcome :)
@road2nohand 4 months ago
Glorious Content :D
@juliensimonfr 4 months ago
Glad you like it!
@Joe-nh9fy 4 months ago
Great explanation! I have one question: is it common practice to regularize the LLM cost function, for example with L2, to reduce the weight "outliers" while training?
@juliensimonfr 4 months ago
I don't think there is a strong consensus. It looks like regularization during fine-tuning can help with generalization. There are new ideas too, like noisy embeddings: wandb.ai/byyoung3/ml-news/reports/A-New-Method-For-LLM-Regularization--Vmlldzo1ODIyMzIw
@jacehua7334 5 months ago
🔥 🔥 🔥
@juliensimonfr 5 months ago
:)
@monishostwal8255 4 months ago
What is meant by a calibration dataset? Is it equivalent to an evaluation set?
@juliensimonfr 4 months ago
Pretty much, yes. It's used to figure out the "best" hyperparameter values.
@monishostwal8255 4 months ago
Okay, got it, thanks.
@caiyu538 4 months ago
👍
@juliensimonfr 4 months ago
😃
@joaogalego7229 5 months ago
Watching this at 1.25x speed. High-quality content as usual. Keep it up, Julien 💪