Accelerating Transformers with Optimum Neuron, AWS Trainium and AWS Inferentia2

Views: 1,924

Julien Simon

A day ago

In this video, I show you how to accelerate Transformer training and inference with Hugging Face Optimum Neuron, a hardware acceleration library dedicated to AWS Trainium and AWS Inferentia2, two custom AI chips designed by AWS.
First, I show you how to train a Vision Transformer model on the Food-101 dataset (75K training images) by changing a single line of code. On a trn1.32xlarge instance, the model trains in under a minute per epoch.
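To put "under a minute per epoch" in perspective, here is a back-of-the-envelope sketch of the throughput those numbers imply. The image count and epoch time come from the description above. The assumption that training is spread over all 32 NeuronCores of a trn1.32xlarge is mine, so treat the per-core figure as a rough lower bound rather than a measured result.

```python
# Rough lower bound on training throughput implied by the description.
TRAIN_IMAGES = 75_000   # "75K training images" (from the description)
EPOCH_SECONDS = 60      # upper bound: "under a minute per epoch"
NEURON_CORES = 32       # assumption: all 32 NeuronCores of a trn1.32xlarge are used


def min_throughput(images: int, seconds: float) -> float:
    """Images per second needed to finish `images` within `seconds`."""
    return images / seconds


total = min_throughput(TRAIN_IMAGES, EPOCH_SECONDS)
per_core = total / NEURON_CORES

print(f"at least {total:,.0f} images/s overall")
print(f"at least {per_core:.1f} images/s per NeuronCore")
```

Since an epoch takes less than 60 seconds, the real throughput is higher than both figures.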
Then, I show you how to export a DistilBERT model from the Hugging Face Hub to Inferentia2. Running a benchmark on an inf2.xlarge instance, we get over 2,000 predictions per second with a P99 latency of 1 millisecond!
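For readers who want to reproduce this kind of measurement, here is a minimal sketch of how predictions per second and P99 latency can be computed. It is not the benchmark code from the video: `benchmark`, `percentile` and `fake_predict` are hypothetical names, and `fake_predict` is a stand-in that you would replace with a call to your compiled model.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_values: List[float], pct: float) -> float:
    """Nearest-rank percentile of an already sorted list (0 < pct <= 100)."""
    rank = math.ceil(pct * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def benchmark(predict: Callable[[], object],
              iterations: int = 1000,
              warmup: int = 10) -> Dict[str, float]:
    """Call `predict` sequentially and report throughput and latency stats."""
    for _ in range(warmup):          # warm-up calls are not measured
        predict()

    latencies = []
    start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        predict()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    latencies.sort()
    return {
        "predictions_per_second": iterations / elapsed,
        "p50_ms": percentile(latencies, 50) * 1000,
        "p99_ms": percentile(latencies, 99) * 1000,
    }


def fake_predict() -> int:
    # Hypothetical stand-in for a model call; replace with real inference.
    return sum(range(100))


stats = benchmark(fake_predict, iterations=200)
print(stats)
```

This sketch uses a single sequential client. A benchmark that sends requests from several threads or processes would report higher throughput for the same per-request latency, so the two numbers should always be read together.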
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️
Amazon EC2 Trn1: aws.amazon.com...
Amazon EC2 Inf2: aws.amazon.com...
Hugging Face Neuron AMI: aws.amazon.com...
Optimum Neuron documentation: huggingface.co...
Optimum Neuron GitHub: github.com/hug...
Code: gitlab.com/jul...

Comments: 4
@caiyu538 (a year ago)
Great lectures. Great teacher.
@smartin2minutes (a year ago)
Hi @Julien Simon, thanks a lot for the video. I have checked your code, and it looks like something I can use. One question that I could not find answered anywhere: is there a way to use `pipelines` with an Optimum Neuron model? I have a token classification task, and a pipeline just makes things easier to maintain. It would be very helpful if you have any examples. Thanks.
@FalahgsGate (a year ago)
thanks for best video ❤
@juliensimonfr (a year ago)
Most welcome