Speech LLMs: Models that listen and talk back

1,447 views

Efficient NLP

days ago

Comments: 10
@escesc1 29 days ago
Very interesting video, as usual!
@isaakcarteraugustus1819 16 days ago
Can you also make a video about Moshi or Mimi and how they have been trained? Edit: maybe also mini-omni2?
@EfficientNLP 15 days ago
Thanks for the suggestion; I will keep it in mind for the next video!
@NLPprompter 28 days ago
Wow, this is exactly what I've been looking for; subscribed instantly. Are you interested in covering more models, such as Kyutai Moshi or hertz-dev? They seem to use a different architecture.
@EfficientNLP 28 days ago
Great suggestions! I haven't looked at these two, but they are certainly relevant.
@NLPprompter 28 days ago
@EfficientNLP Awesome, can't wait for the next video. And... well, they are pretty similar, but I think the architecture inside is different; however, they aren't as smart as the OpenAI Realtime API. Oh, and there's also llama-omni, which is based on Llama 3 and offers similar real-time AI conversation.
@lounes9777 21 days ago
Didn't you check out Moshi from Kyutai?
@EfficientNLP 20 days ago
You are correct; this is a relevant model, and the field is evolving rapidly. However, the principles in this video should still apply.
@weizhou6544 28 days ago
Can it support RAG?
@EfficientNLP 27 days ago
Neither of the two models in this video has RAG, but it is possible to add a retrieval system prior to generation, since text tokens can be interleaved into speech LLMs.
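A minimal sketch of what "a retrieval system prior to generation" might look like, assuming the user's spoken query has already been turned into text. Everything here is hypothetical and for illustration only: the function names (`tokenize`, `retrieve`, `build_prompt`), the bag-of-words cosine scoring, and the prompt layout are not taken from the video or from any particular speech LLM. How the resulting text is tokenized and interleaved with speech tokens depends on the model and is not shown.

```python
from collections import Counter
import math


def tokenize(text):
    # Lowercase and strip basic punctuation; a real system would use a proper tokenizer.
    return [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    # Rank passages by similarity to the query and keep the top k.
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(transcribed_query, passages, k=1):
    # Place the retrieved text ahead of the user's query, before generation starts.
    # Assumption: the speech LLM accepts this text as ordinary text tokens.
    context = retrieve(transcribed_query, passages, k)
    return "Context:\n" + "\n".join(context) + "\n\nUser: " + transcribed_query


if __name__ == "__main__":
    docs = [
        "Moshi is a speech model from Kyutai.",
        "Whisper is an ASR model.",
    ]
    print(build_prompt("Who made Moshi?", docs))
```

In practice the toy scorer would likely be swapped for an embedding-based retriever; the point is only that retrieval runs first and its output enters the model as text.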