Accelerate Transformer inference on CPU with Optimum and Intel OpenVINO

Views: 2,373

Julien Simon

In this video, I show you how to accelerate Transformer inference with Optimum, an open source library by Hugging Face, and Intel OpenVINO.
I start from a Vision Transformer model fine-tuned for image classification and quantize it with OpenVINO. In benchmarks on an AWS c6i instance (Intel Ice Lake architecture), the quantized model runs more than 20% faster than the original and is almost 4 times smaller, with just a few lines of simple Python code and only a tiny accuracy drop!
- Optimum: github.com/huggingface/optimum
- Optimum docs: huggingface.co/docs/optimum/o...
- Intel OpenVINO: docs.openvino.ai/latest/index...
- Original model: huggingface.co/juliensimon/au...
- Code: gitlab.com/juliensimon/huggin...
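
The two headline numbers above (speedup and size reduction) can be sanity-checked with a few lines of plain Python. This is a minimal sketch, not the code from the video: the helper names are hypothetical, the speedup is assumed to be measured as relative throughput gain from median latency, and the size estimate assumes every weight goes from 32-bit floats to 8-bit integers, which the description does not state explicitly.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(predict: Callable[[], object], warmup: int = 3, runs: int = 20) -> float:
    """Return the median wall-clock latency of predict() in milliseconds."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        predict()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup_percent(baseline_ms: float, optimized_ms: float) -> float:
    """Throughput gain of the optimized model over the baseline, in percent.

    Assumption: "speedup" means baseline latency divided by optimized latency.
    """
    return (baseline_ms / optimized_ms - 1.0) * 100.0


def ideal_size_ratio(num_params: int, bytes_before: int = 4, bytes_after: int = 1) -> float:
    """Idealized size ratio if every weight shrinks from 4 bytes (FP32) to 1 byte (INT8)."""
    return (num_params * bytes_before) / (num_params * bytes_after)


if __name__ == "__main__":
    # Hypothetical stand-ins for the original and quantized model calls.
    def original_model() -> int:
        return sum(range(200_000))

    def quantized_model() -> int:
        return sum(range(150_000))

    base = median_latency_ms(original_model)
    fast = median_latency_ms(quantized_model)
    print(f"baseline: {base:.2f} ms, optimized: {fast:.2f} ms")
    print(f"speedup: {speedup_percent(base, fast):.1f}%")
    print(f"ideal size ratio: {ideal_size_ratio(1_000_000):.1f}x")
```

In practice the measured ratio tends to land a little under the ideal 4x, since a quantized model file usually keeps some tensors and metadata in higher precision. That would be consistent with the "almost 4" reported above.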

Comments: 4
@douglasdbs7139, 4 months ago
So here I am searching YouTube for some short tutorials on how to use OpenVINO for machine learning, and then I found this video of this dude with a huge MAIDEN'S BEST OF THE BEAST album poster hanging on the wall/ceiling hahahaha. This album changed my life decades ago and made me a metal head for life. Oh, and thanks for the tutorial, by the way. :)
@juliensimonfr, 4 months ago
A double win, then. Thank you 🤘🤘🤘
@toufiqmusah6480, 2 months ago
Thanks for the tutorial. Greatly appreciated.
@juliensimonfr, 2 months ago
You are welcome!