Episode 82 of the Stanford MLSys Seminar Series!
Democratizing Foundation Models via k-bit Quantization
Speaker: Tim Dettmers
Abstract:
Foundation models are effective tools for many tasks but are challenging to finetune and run inference on due to their GPU memory requirements. Compressing foundation models with k-bit quantization makes them accessible with minimal resources, but k-bit quantization can degrade model quality. In this lecture, I will talk about fundamental insights into how to compress foundation models with quantization while maintaining their predictive performance. We will learn about emergent outliers in large language models (LLMs) and how they affect performance during 8-bit quantization. We will learn how to do effective k-bit compression of pretrained large language models such that we maximize their density of predictive performance per bit. We will also talk about how to do efficient fine-tuning of quantized 4-bit LLMs (QLoRA) and how this helps to build state-of-the-art chatbots.
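As background for the talk, the core idea of k-bit quantization can be sketched with symmetric absmax quantization, one of the basic schemes used in this line of work: scale each tensor (or block) by its largest absolute value, round to a small signed-integer grid, and keep the float scale for dequantization. This is a minimal illustrative sketch, not the exact scheme from the talk or the bitsandbytes library:

```python
def absmax_quantize(xs, bits=8):
    # Symmetric absmax quantization: one float scale per tensor/block,
    # values rounded to the signed integer grid [-qmax, qmax].
    qmax = 2 ** (bits - 1) - 1  # e.g. 127 for 8-bit
    scale = max(abs(x) for x in xs) / qmax
    q = [round(x / scale) for x in xs]
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floats.
    return [v * scale for v in q]

weights = [0.1, -0.5, 0.25, 1.0]
q, s = absmax_quantize(weights)
approx = dequantize(q, s)
```

Because a single large outlier inflates the scale and coarsens the grid for every other value, emergent outlier features in LLMs are exactly what makes naive 8-bit quantization lossy, which motivates the techniques discussed in the talk.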
Bio:
Tim Dettmers is a graduating PhD student advised by Luke Zettlemoyer at the University of Washington in Seattle. He holds degrees in applied math and computer science and has a background in industrial automation. His primary research goal is to democratize foundation models by making them more efficient and accessible through quantization, sparsification, and building machine learning systems that use consumer-grade hardware. He is the creator of the bitsandbytes library. Tim runs a blog about deep learning, GPUs, and PhD life at timdettmers.com.
--
Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
Twitter: @simran_s_arora, @realdanfu
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google....
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford