Deep Dive: Advanced distributed training with Hugging Face LLMs and AWS Trainium

  1,019 views

Julien Simon

6 months ago

Following up on the "Hugging Face on AWS accelerators" deep dive ( • Deep Dive: Hugging Fac... ), this video zooms in on distributed training with NeuronX Distributed, Optimum Neuron, and AWS Trainium.
First, we explain the basics and benefits of advanced distributed training techniques such as tensor parallelism, pipeline parallelism, sequence parallelism, and DeepSpeed ZeRO. Then, we discuss how these techniques are implemented in NeuronX Distributed and Optimum Neuron. Finally, we launch an Amazon EC2 Trainium-powered instance and demonstrate these techniques with distributed training runs on the TinyLlama and Llama 2 7B models. Of course, we share results on training time and cost, which will probably surprise you!
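To give a feel for what ZeRO stage 1 does before watching the video: each data-parallel rank keeps only its own shard of the optimizer state instead of a full replica, cutting memory roughly by the number of ranks. The sketch below is a toy, framework-free illustration of that idea, not the NeuronX Distributed or DeepSpeed implementation; all function names are made up.

```python
# Toy illustration of ZeRO stage 1: each data-parallel rank keeps only
# its shard of the optimizer state and updates only its own slice of
# the parameters. All names here are hypothetical.

def shard_bounds(num_params: int, rank: int, world_size: int):
    """Return the [start, end) slice of parameters owned by `rank`."""
    per_rank = (num_params + world_size - 1) // world_size
    start = rank * per_rank
    return start, min(start + per_rank, num_params)

def zero1_sgd_step(params, grads, rank, world_size, lr=0.5):
    """Each rank applies the update (plain SGD for simplicity) only to
    its own shard; a real implementation would then all-gather the
    updated shards so every rank sees the full parameter vector."""
    start, end = shard_bounds(len(params), rank, world_size)
    updated = list(params)
    for i in range(start, end):
        updated[i] = params[i] - lr * grads[i]
    return updated

# Simulate 2 ranks, then merge their shards as an all-gather would.
params = [1.0, 2.0, 3.0, 4.0]
grads = [2.0, 2.0, 2.0, 2.0]
rank0 = zero1_sgd_step(params, grads, rank=0, world_size=2)
rank1 = zero1_sgd_step(params, grads, rank=1, world_size=2)
merged = rank0[:2] + rank1[2:]
print(merged)  # [0.0, 1.0, 2.0, 3.0]
```

The memory saving comes from the fact that stateful optimizers like Adam keep several extra values per parameter; sharding those values is usually a much bigger win than sharding the parameters themselves.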
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at / julsimon or Substack at julsimon.substack.com. ⭐️⭐️⭐️
This video focuses on the software details essential for getting peak performance. It points you to relevant code snippets and developer resources, suitable for both newcomers and experienced practitioners. Whether you're already familiar with AWS Trainium or approaching it for the first time, this technical walkthrough prepares you to train Hugging Face models on AWS.
01:20 NeuronX Distributed
05:20 Tensor Parallelism
11:10 Pipeline Parallelism
16:55 Sequence Parallelism
20:32 Optimum Neuron
21:40 Optimum Neuron with Zero-1
25:55 Optimum Neuron with Tensor Parallelism and Sequence Parallelism
29:20 Amazon Machine Images (AMI) for Neuron devices
30:10 Launching an Amazon EC2 trn1n.32xlarge instance with the Hugging Face Neuron AMI
33:10 Fine-tuning TinyLlama with Optimum Neuron
41:15 Fine-tuning Llama 2 7B with Optimum Neuron
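As a rough flavor of the tensor parallelism chapter above: a linear layer's weight matrix is split column-wise across workers, each worker computes only its slice of the output, and the slices are then gathered back together. The toy below simulates this serially in plain Python; it is an illustration of the idea only, not Neuron or NeuronX Distributed code.

```python
# Toy column-wise tensor parallelism for y = x @ W (no bias).
# Each simulated "rank" owns a contiguous slice of W's columns and
# computes only its slice of the output vector.

def matvec(x, w_cols):
    """x: vector of length d; w_cols: list of columns, each of length d."""
    return [sum(xi * ci for xi, ci in zip(x, col)) for col in w_cols]

def split_columns(w_cols, world_size):
    """Split the list of columns into `world_size` contiguous shards."""
    per_rank = (len(w_cols) + world_size - 1) // world_size
    return [w_cols[r * per_rank:(r + 1) * per_rank] for r in range(world_size)]

x = [1.0, 2.0]
# W has 2 rows and 4 columns, stored column-by-column for clarity.
w_cols = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.5]]

# Serial reference result.
reference = matvec(x, w_cols)

# "Tensor parallel" result: each rank computes its output slice, then
# the slices are concatenated (the role an all-gather plays on real hardware).
shards = split_columns(w_cols, world_size=2)
parallel = [y for shard in shards for y in matvec(x, shard)]
print(parallel == reference)  # True
```

Because each rank stores only a fraction of the weights and activations for that layer, this is what lets models too large for one accelerator's memory train across several NeuronCores.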
Links:
- Hugging Face Optimum Neuron: huggingface.co/docs/optimum-n...
- Source code for supported models: github.com/huggingface/optimum-neuron/tree/main/optimum/neuron/distributed
- Release notes: github.com/huggingface/optimum-neuron/releases
- Distributed training docs: huggingface.co/docs/optimum-n...
- TinyLlama: huggingface.co/TinyLlama/Tiny...
- Llama 2 7B: huggingface.co/meta-llama/Lla...

Comments: 4
@yacinezahidi7206 6 months ago
Thanks. That's great content!
@juliensimonfr 6 months ago
Glad you liked it!
@andriimelashchenko2201 6 months ago
Can you please share the link to supported models?
@juliensimonfr 6 months ago
- Source code for supported models: github.com/huggingface/optimum-neuron/tree/main/optimum/neuron/distributed
- Release notes: github.com/huggingface/optimum-neuron/releases