Tutorial 1, part 1, Hot Chips 2023, Sunday, August 27, 2023.
Organizers: Nathan Kalyanasundharam, CXL Board & AMD
This tutorial gives a brief introduction to the basic concepts underlying ML inference, then surveys several active research areas that are improving its performance and capabilities. After the introduction, this part of the tutorial covers two of those areas: how quantization of weights and parameters can increase inference efficiency, and techniques for optimizing inference on small mobile devices.
ML Inference Overview
Micah Villmow, NVIDIA
Quantization Methods for Efficient ML Inference
Amir Gholami, UC Berkeley
ML Inference at the Edge
Felix Baum, Qualcomm