Democratizing Foundation Models via k-bit Quantization - Tim Dettmers | Stanford MLSys #82

  3,858 views

Stanford MLSys Seminars


1 day ago

Episode 82 of the Stanford MLSys Seminar Series!
Democratizing Foundation Models via k-bit Quantization
Speaker: Tim Dettmers
Abstract:
Foundation models are effective tools for many tasks but are challenging to fine-tune and run inference on due to their GPU memory requirements. Compressing foundation models with k-bit quantization makes them accessible with minimal resources, but k-bit quantization can degrade model quality. In this lecture, I will present fundamental insights into how to compress foundation models with quantization while maintaining their predictive performance. We will learn about emergent outliers in large language models (LLMs) and how they affect performance during 8-bit quantization. We will learn how to do effective k-bit compression of pretrained large language models so as to maximize their density of predictive performance per bit. We will also cover how to do efficient fine-tuning of quantized 4-bit LLMs (QLoRA) and how this helps to build state-of-the-art chatbots.
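As a rough illustration of the kind of k-bit compression discussed in the talk, here is a minimal NumPy sketch of absmax 8-bit quantization, the basic building block behind schemes like LLM.int8(). This is not the bitsandbytes implementation; the function names and the per-tensor (rather than blockwise) scaling are simplifying assumptions for illustration only.

```python
import numpy as np

def absmax_quantize(x, bits=8):
    """Quantize a float tensor to signed integers by absmax scaling."""
    qmax = 2 ** (bits - 1) - 1          # 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax    # map the largest magnitude to qmax
    q = np.round(x / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the quantized values."""
    return q.astype(np.float32) * scale

# Example: quantize, then check the round-trip error.
x = np.array([0.1, -0.5, 0.25, 0.9], dtype=np.float32)
q, s = absmax_quantize(x)
x_hat = dequantize(q, s)
# Per-element error is bounded by half the scale (here ~0.0035).
```

A single large outlier inflates `scale` and crushes the precision of all other values, which is why emergent outliers in LLMs make naive 8-bit quantization lossy; production schemes quantize in small blocks or handle outlier dimensions separately.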
Bio:
Tim Dettmers is a graduating PhD student advised by Luke Zettlemoyer at the University of Washington in Seattle. He holds degrees in applied math and computer science and has a background in industrial automation. His primary research goal is to democratize foundation models by making them more efficient and accessible through quantization, sparsification, and building machine learning systems that use consumer-grade hardware. He is the creator of the bitsandbytes library. Tim runs a blog about deep learning, GPUs, and PhD life at timdettmers.com.
--
Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
Twitter:
/ simran_s_arora
/ realdanfu
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google....
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford

Comments: 5
@JurekOK
@JurekOK 10 months ago
This guy is the current-day international folk hero! Thank you Tim!
@BenViggiano
@BenViggiano 10 months ago
Super interesting! Thanks so much!
@joshuaahn584
@joshuaahn584 A month ago
Are the slides available somewhere? These were extremely insightful and would be helpful to look at!
@amelieschreiber6502
@amelieschreiber6502 10 months ago
Amazing, as usual 😊Thank you!
@JurekOK
@JurekOK 10 months ago
@35:16 What if one used Guanaco to judge the answers? Or any other model, e.g. Vicuna, as the judge? The results would be different, but by how much?