Introduction to Mixture-of-Experts (MoE)

AI Papers Academy

In this video we go back to the highly influential Google paper that introduced the sparsely-gated Mixture-of-Experts (MoE) layer, with authors including Geoffrey Hinton.
The paper is titled "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer." MoE is widely used today in various top Large Language Models, and interestingly, the paper was published at the beginning of 2017, while the "Attention Is All You Need" paper that introduced Transformers was published later that year, also by Google. The purpose of this video is to understand why the Mixture-of-Experts method is important and how it works.
Paper page - arxiv.org/abs/1701.06538
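For a concrete picture of the sparsely-gated MoE layer the video covers, here is a minimal PyTorch sketch of top-k routing over a set of expert networks: a gating network scores the experts per token, only the k highest-scoring experts run, and their outputs are combined with softmax weights (the paper applies the softmax after the top-k selection). All names and sizes here (SparseMoE, n_experts, d_model, etc.) are illustrative assumptions, not taken from the paper's code, and the sketch omits the paper's noisy gating and load-balancing loss.

```python
# Minimal sketch of a sparsely-gated MoE layer with top-k routing.
# Hypothetical names and sizes; not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # The gating network produces one score per expert for each token.
        self.gate = nn.Linear(d_model, n_experts)

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.gate(x)                  # (tokens, n_experts)
        # Keep only the top-k experts per token, then softmax their scores.
        topv, topi = logits.topk(self.k, dim=-1)
        weights = F.softmax(topv, dim=-1)      # (tokens, k)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            idx = topi[:, slot]                # chosen expert per token
            w = weights[:, slot].unsqueeze(-1)
            for e, expert in enumerate(self.experts):
                mask = idx == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += w[mask] * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(SparseMoE()(x).shape)  # torch.Size([10, 64])
```

Because only k of the experts run per token, total parameters can grow with the number of experts while the compute per token stays roughly constant; this is the key idea behind the "outrageously large" networks in the title.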
-----------------------------------------------------------------------------------------------
✉️ Join the newsletter - aipapersacademy.com/newsletter/
👍 Please like & subscribe if you enjoy this content
-----------------------------------------------------------------------------------------------
Chapters:
0:00 Why is MoE needed?
1:33 Sparse MoE Layer
3:41 MoE Paper's Figure
