HC2023-T1.1: ML Inference

  2,237 views

hotchipsvideos

10 months ago

Tutorial 1, part 1, Hot Chips 2023, Sunday, August 27, 2023.
Organizers: Nathan Kalyanasundharam, CXL Board & AMD
This tutorial gives a brief introduction to the basic concepts underlying ML inference and then surveys several hot areas where current research is improving the performance and capabilities of ML inference. After the introduction, the hot areas covered in this part of the tutorial are how quantization of weights and parameters can be used to increase inference efficiency, and techniques used to optimize inference on small mobile devices.
Talks:
ML Inference Overview (Micah Villmow, NVIDIA)
Quantization Methods for Efficient ML Inference (Amir Gholami, UC Berkeley)
ML Inference at the Edge (Felix Baum, Qualcomm)
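
For readers unfamiliar with the quantization topic covered in the second talk, the following is a minimal, generic sketch (not taken from the tutorial slides) of symmetric per-tensor int8 post-training weight quantization, the simplest form of the techniques the talk surveys. The function names and the use of NumPy are purely illustrative assumptions.

import numpy as np

def quantize_int8(weights):
    # Symmetric per-tensor quantization: map float weights onto [-127, 127].
    # The small floor on the scale guards against an all-zero tensor.
    scale = max(float(np.max(np.abs(weights))) / 127.0, 1e-12)
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    # Recover a float approximation of the original weights.
    return q.astype(np.float32) * scale

# Example: quantize a small random weight matrix and measure the round-trip error.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print("max abs quantization error:", np.max(np.abs(w - w_hat)))

Storing int8 weights cuts weight memory and bandwidth roughly 4x versus float32, which is the basic efficiency gain the tutorial's quantization discussion builds on; production methods (per-channel scales, activation quantization, calibration) refine this idea.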
