Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83

Stanford MLSys Seminars

Episode 83 of the Stanford MLSys Seminar Series!
Training Large Language Models at Scale
Speaker: Deepak Narayanan
Abstract:
Training LLMs efficiently is challenging for a few reasons: training can require yottaFLOPs of compute, and accelerators have limited memory capacity, making it impossible to fit large models on even a multi-GPU server. Consequently, new methods of model parallelism such as tensor and pipeline parallelism have been proposed. Unfortunately, naïve usage of these methods leads to scaling issues at thousands of GPUs. In this talk, I describe various systems innovations incorporated into Megatron-LM (github.com/nvidia/megatron-lm) that allow us to run training iterations for models with up to a trillion parameters on thousands of GPUs.
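As a rough illustration of the scale the abstract describes, the back-of-envelope sketch below estimates training compute and model-state memory. It is not taken from the talk: the 6 * N * D FLOPs rule of thumb, the 16 bytes per parameter for mixed-precision training with Adam, the 80 GB GPU, the 8-GPU server, and the one-trillion-token dataset are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# All constants are illustrative assumptions, not figures from the talk.
BYTES_PER_PARAM = 16.0   # assumed: weights + gradients + Adam state, mixed precision
GPU_MEM_BYTES = 80e9     # assumed: one 80 GB accelerator
GPUS_PER_SERVER = 8      # assumed: a typical multi-GPU server


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs using the 6 * N * D rule of thumb."""
    return 6.0 * params * tokens


def model_state_bytes(params: float) -> float:
    """Approximate bytes needed for model state (activations are ignored)."""
    return params * BYTES_PER_PARAM


def min_gpus_for_state(params: float) -> int:
    """Lower bound on GPUs needed just to hold the model state."""
    return math.ceil(model_state_bytes(params) / GPU_MEM_BYTES)


if __name__ == "__main__":
    n_params = 1e12   # a trillion parameters, as mentioned in the abstract
    n_tokens = 1e12   # hypothetical dataset size, chosen only for illustration
    print(f"training FLOPs : {training_flops(n_params, n_tokens):.1e}")
    print(f"model state    : {model_state_bytes(n_params) / 1e12:.1f} TB")
    print(f"one server has : {GPUS_PER_SERVER * GPU_MEM_BYTES / 1e12:.2f} TB")
    print(f"min GPUs       : {min_gpus_for_state(n_params)}")
```

Under these assumptions, a trillion-parameter model needs about 16 TB for model state alone, against 640 GB on the assumed 8-GPU server, which is why the model has to be partitioned across many servers with tensor and pipeline parallelism.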
Bio:
Deepak is a Senior Applied Deep Learning Research Scientist in the ADLR group at NVIDIA, where he builds software systems to more efficiently train and serve LLMs. He received his Ph.D. in Computer Science from Stanford in September 2021, where he was advised by Prof. Matei Zaharia.
--
Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
Twitter:
/ simran_s_arora
/ realdanfu
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford

Comments:
@bread7393 (8 months ago)
Good to see Dr. Narayanan at this seminar.
@smsubham342 (3 months ago)
Can we also have the slides?