Speech LLMs: Models that listen and talk back

1,447 views

Efficient NLP

days ago

Comments: 10

@escesc1 29 days ago

Very interesting video, as usual!

@isaakcarteraugustus1819 16 days ago

Can you also make a video about Moshi or Mimi and how they have been trained? Edit: maybe also mini-omni2?

@EfficientNLP 15 days ago

Thanks for the suggestion; I will keep it in mind for the next video!

@NLPprompter 28 days ago

Wow, this is exactly what I've been looking for; subscribed instantly. Are you interested in covering more models, such as Kyutai Moshi or hertz-dev? They seem to use a different architecture.

@EfficientNLP 28 days ago

Great suggestions! I haven't looked at these two, but they are certainly relevant.

@NLPprompter 28 days ago

@EfficientNLP Awesome, can't wait for the next video. Those two are pretty similar, but I think the architecture inside is different; however, they aren't as smart as the OpenAI Realtime API. Oh, and there is also LLaMA-Omni, which is based on Llama 3 and offers similar real-time AI conversation.

@lounes9777 21 days ago

Didn't you check Moshi from Kyutai?

@EfficientNLP 20 days ago

You are correct; this is a relevant model, and the field is evolving rapidly. However, the principles in this video should still apply.

@weizhou6544 28 days ago

Can it support RAG?

@EfficientNLP 27 days ago

Neither of the two models in this video supports RAG, but it is possible to add a retrieval system prior to generation, since text tokens can be interleaved into speech LLMs.
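
To make the retrieval idea above concrete, here is a minimal Python sketch, not taken from the video or from either model. It assumes a text transcript of the user's speech is available, uses a toy word-overlap retriever, and represents the model input as a list of tagged segments. The names `Segment`, `retrieve`, and `build_prompt`, and the placeholder audio tokens, are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One span of the model input (hypothetical representation)."""
    modality: str      # "text" or "speech"
    tokens: List[str]  # placeholder tokens; a real model would use token IDs


def retrieve(query: str, corpus: List[str], k: int = 1) -> List[str]:
    """Toy retriever: rank passages by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(
    transcript: str,
    speech_tokens: List[str],
    corpus: List[str],
    k: int = 1,
) -> List[Segment]:
    """Place retrieved text segments before the user's speech segment.

    Assumption: `transcript` comes from some upstream ASR step, and the
    speech LLM accepts text and speech tokens in one interleaved sequence.
    """
    passages = retrieve(transcript, corpus, k)
    segments = [Segment("text", passage.split()) for passage in passages]
    segments.append(Segment("speech", speech_tokens))
    return segments


if __name__ == "__main__":
    corpus = [
        "Moshi is a speech model from Kyutai",
        "RAG adds retrieval before generation",
    ]
    prompt = build_prompt("what is moshi", ["<a12>", "<a7>"], corpus)
    for segment in prompt:
        print(segment.modality, segment.tokens)
```

A real system would replace the word-overlap scoring with a proper retriever and map both segment types to the model's actual token vocabulary; the sketch only shows the ordering of retrieved text ahead of the speech input.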