Training LLMs at Scale - Deepak Narayanan | Stanford MLSys #83

Views: 7,012

Stanford MLSys Seminars

8 months ago

Episode 83 of the Stanford MLSys Seminar Series!
Training Large Language Models at Scale
Speaker: Deepak Narayanan
Abstract:
Training LLMs efficiently is challenging for a few reasons: training can require yottaFLOPs of compute, and accelerators have limited memory capacity making it impossible to fit large models on even a multi-GPU server. Consequently, new methods of model parallelism such as tensor and pipeline parallelism have been proposed. Unfortunately, naïve usage of these methods leads to scaling issues at thousands of GPUs. In this talk, I describe various systems innovations incorporated into Megatron-LM (github.com/nvidia/megatron-lm) that allow us to run training iterations for models with up to a trillion parameters on thousands of GPUs.
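The scale described in the abstract can be made concrete with some rough arithmetic. The sketch below is only a back-of-envelope estimate and is not taken from the talk: the 16-bit weight size, the 80 GB per accelerator, the 8 accelerators per server, and the one-trillion-token dataset are all illustrative assumptions, and the "6 FLOPs per parameter per token" figure is a commonly used rule of thumb rather than an exact count.

```python
# Back-of-envelope numbers; every constant marked "assumption" is
# illustrative and does not come from the talk.

PARAMS = 1e12          # "up to a trillion parameters" (from the abstract)
BYTES_PER_PARAM = 2    # assumption: 16-bit weights
GPU_MEMORY_GB = 80     # assumption: 80 GB of memory per accelerator
GPUS_PER_SERVER = 8    # assumption: 8 accelerators in one server
TOKENS = 1e12          # assumption: hypothetical training-set size in tokens


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def training_flops(params: float, tokens: float) -> float:
    """Rule-of-thumb estimate: roughly 6 FLOPs per parameter per token."""
    return 6 * params * tokens


weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)
server_gb = GPU_MEMORY_GB * GPUS_PER_SERVER
flops = training_flops(PARAMS, TOKENS)

print(f"weights alone: {weights_gb:.0f} GB; one server: {server_gb} GB")
print(f"fits on one server: {weights_gb <= server_gb}")
print(f"training compute: {flops:.1e} FLOPs ({flops / 1e24:.1f} yottaFLOPs)")
```

Under these assumptions the weights alone exceed the memory of a single server before optimizer state, gradients, or activations are counted, which is the motivation the abstract gives for splitting a model across devices with tensor and pipeline parallelism.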
Bio:
Deepak is a Senior Applied Deep Learning Research Scientist in the ADLR group at NVIDIA, where he builds software systems to more efficiently train and serve LLMs. He graduated from Stanford with a Ph.D. in Computer Science in September 2021, where he was advised by Prof. Matei Zaharia.
--
Stanford MLSys Seminar hosts: Simran Arora, Dan Fu
Twitter: @simran_s_arora, @realdanfu
--
Check out our website for the schedule: mlsys.stanford.edu
Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford

Comments: 4

@bread7393 8 months ago

Good to see Dr. Narayanan at this seminar.

@smsubham342 3 months ago

Can we also have the slides?

@user-dg4oe2dc6x 8 months ago

ح

@_s.i.s.u. 8 months ago

ح
