Mixture of Experts LLM - MoE explained in simple terms

15,041 views

Discover AI

1 day ago

Comments: 21
@HugoCatarino 1 year ago
What a great class! Very much appreciated 🙌👏👏🙏
@javiergimenezmoya86 1 year ago
A video implementation of MoE training with several switching LoRA layers would be great!
@patxigonzalez4206 1 year ago
Woah... thanks a lot for this clean and powerful explanation of these dense topics. As a representative of average people, I appreciate it so much.
@TylerLali 1 year ago
Hopefully this doesn't sound entitled, but rather expresses my gratitude for your excellent work: yesterday I did a YouTube search for MoE on this topic and saw several videos, but decided not to watch the others and instead wait for your analysis. And here I am today, and this video enters my feed automatically :) Thanks for all you do for your community!
@suleimanshehu5839 1 year ago
Please create a video on fine-tuning a MoE LLM using LoRA adapters. Can one train an individual expert LLM within a MoE such as Mixtral 8x7B?
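A minimal sketch of the mechanics behind this question: attaching a LoRA adapter to a single expert's linear projection while everything else stays frozen. The class name LoRALinear, the rank/alpha values, and the toy 4-expert block are illustrative assumptions, not the video's or Mixtral's actual modules; in practice one would usually select the expert projection modules by name through a library such as Hugging Face PEFT.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update (hypothetical helper)."""
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)          # keep the pretrained weight frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Frozen base projection plus the trainable low-rank update B @ A.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Toy "MoE block": 4 experts, adapt only expert 2.
experts = nn.ModuleList(nn.Linear(64, 64) for _ in range(4))
for p in experts.parameters():
    p.requires_grad_(False)                             # freeze every expert
experts[2] = LoRALinear(experts[2])                     # only expert 2 gets trainable params

trainable = [n for n, p in experts.named_parameters() if p.requires_grad]
print(trainable)                                        # ['2.A', '2.B']
```

Whether fine-tuning a single expert in isolation is actually useful is exactly the open question in this comment; the sketch only shows how the adapter would be attached.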
@hoangvanhao7092 1 year ago
00:02 Mixture of Experts LLM enables efficient computation and resource allocation for AI models.
02:46 Mixture of Experts LLM uses different gating functions to assign tokens to specific expert systems.
05:24 MegaBlocks addressed limitations of the classical MoE system and optimized block-sparse computations.
08:12 Mixture of Experts selects the top-k experts based on scores.
10:59 Mixture of Experts LLM increases model parameters without added computational expense.
13:33 Mixture of Experts LLM - MoE efficiently organizes the student-teacher distribution.
16:07 The block-sparse formulation ensures no token is left behind.
18:35 The Mixture of Experts system dynamically adjusts block sizes for more efficient matrix multiplication.
20:57 A mixture-of-experts layer consists of independent feed-forward experts with an intelligent gating function.
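To make the gating and top-k selection in this summary concrete, here is a minimal, illustrative PyTorch sketch of a mixture-of-experts feed-forward layer. Module names, sizes, and the number of experts are assumptions, and the per-expert Python loop is a stand-in for the block-sparse matrix multiplications that MegaBlocks uses so that no token is dropped.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEFeedForward(nn.Module):
    """Hypothetical top-k gated MoE layer: a gate scores experts, each token is
    routed to its k best experts, and their outputs are blended by the gate weights."""
    def __init__(self, d_model=512, d_hidden=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Gating network: one score per expert for every token.
        self.gate = nn.Linear(d_model, n_experts, bias=False)
        # Independent feed-forward experts.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (n_tokens, d_model)
        scores = self.gate(x)                  # (n_tokens, n_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)
        top_w = F.softmax(top_w, dim=-1)       # renormalize over the chosen experts
        out = torch.zeros_like(x)
        # Route each token only through its top-k experts and blend the results.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(10, 512)                  # 10 tokens, d_model = 512
print(MoEFeedForward()(tokens).shape)          # torch.Size([10, 512])
```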
@yinghaohu8784 10 months ago
In an autoregressive model, tokens are generated progressively. But when does the router work? Does it run on each forward pass, or is the routing decided at the very beginning?
@TheDoomerBlox 7 months ago
Is this where I raise the obvious question of "wouldn't a Grokked(tm) model be the perfect fit for an Expert-Picking mechanism?"
@darknessbelowth1409 1 year ago
Very nice, thank you for a great video.
@ricardocosta9336 1 year ago
Yay! 🎉🎉🎉🎉🎉 Thank you so much once again.
@YashNimavat-b3s 1 year ago
Which PDF reader are you using to read the research paper?
@LNJP13579 10 months ago
Can you please share a link to your presentation? I need the content to make my own abridged notes.
@Jason-ju7df 1 year ago
I wonder if I can get them to do RPA.
@krishanSharma.69.69f 1 year ago
I made them do SEX. It was tough but I managed.
@davidamberweatherspoon6131 1 year ago
Can you explain how to mix MoE with LoRA adapters?
@densonsmith2 1 year ago
Do you have a Patreon or other paid subscription?
@cecilsalas8721 1 year ago
🤩🤩🤩🥳🥳🥳👍
@matten_zero 1 year ago
Hello!
@PaulSchwarzer-ou9sw 1 year ago
@omaribrahim5519 1 year ago
Cool, but MoE is so foolish.
@EssentiallyAI 1 year ago
You're not Indian! 😁
Mistral 8x7B Part 1- So What is a Mixture of Experts Model?
12:33
Sam Witteveen
43K views
Mind Evolution: Deeper Thinking at Inference (by Google)
19:54
NEW Transformer2: Self Adaptive PEFT Expert LLMs in TTA
36:52
Discover AI
3.7K views
Finally: Grokking Solved - It's Not What You Think
27:02
Discover AI
16K views
Understanding Mixture of Experts
28:01
Trelis Research
10K views
Attention in transformers, step-by-step | DL6
26:10
3Blue1Brown
2.1M views
host ALL your AI locally
24:20
NetworkChuck
1.6M views
What is Mixture of Experts?
7:58
IBM Technology
12K views
Transformers (how LLMs work) explained visually | DL5
27:14
3Blue1Brown
4.4M views
DeepSeek R1 vs o1: AI EXPLAINS Autonomy of Experts (a better MoE)
35:51