The ONLY Real Time Speech AI that can run locally!!!

7,995 views

1littlecoder

Days ago

Comments: 92

@1littlecoder 1 month ago

I'm continuing to mess up French Names 😭

@adriangpuiu 1 month ago

ive got comforTABLE with that :))

@Nid_All 1 month ago

How can I run it on Windows?

@1littlecoder 1 month ago

If you have an NVIDIA GPU, download the PyTorch models or use the Candle ones.

@JankJank-om1op 1 month ago

the demo is comedy gold

@MasonJared-ms4tw 1 month ago

😂😂😂😂

@saintkamus14 1 month ago

I've noticed that this model is fast enough to start answering before you even finish asking a question. This is ideal for real time translations, and another use case I have in mind. This would be perfect for my use case if I could train custom voices.

@1littlecoder 1 month ago

Couldn't agree more! Just has to be more stable

@doglibrary 1 month ago

@@1littlecoder 🤣

@mohitranka9840 1 month ago

What is your use case?

@saintkamus14 1 month ago

@@mohitranka9840 similar to real time translations, but "style" translations instead.

@RadiantNij 1 month ago

Yeah, I'll wait until it's stable. Moshi is unhinged; I've heard it say terrible things on Wes Roth's channel.

@burncloud-com 1 month ago

Thank you for your work, I got it running.

@1littlecoder 1 month ago

Awesome, how was the experience?

@burncloud-com 1 month ago

@@1littlecoder I am unable to reach you on Discord.

@techfren 1 month ago

omg yes they finally released it!

@ParvathyKapoor 1 month ago

thank god it's available in Pinokio

@1littlecoder 1 month ago

Did you try it?

@ParvathyKapoor 1 month ago

@@1littlecoder Ya, works good! The model download is like 15 GB!! Still better.

@BabylonBaller 1 month ago

This has to be the funniest interaction I've ever seen

@OliNorwell 1 month ago

I agree, this is a great release. Everything is there: a PDF of the paper, the GitHub code, and it runs locally inside 24 GB of VRAM. I'm astonished it works on a local machine, and the way it records the calls too is extremely cool. Yeah, of course it is like talking to a moody 10-year-old, but hey, we have to start somewhere.

@1littlecoder 1 month ago

Absolutely, I need to explore further to see how to customize the demo and programmatic access

@oldfairy 1 month ago

what a funny conversation

@RickySupriyadi 1 month ago

So this one doesn't need to convert our speech into text and then feed the text into some LLM? It just feeds speech into the LLM directly?

@alx8439 1 month ago

"What's your humor setting, TARS?" (c)

@1littlecoder 1 month ago

🤣

@alx8439 1 month ago

@@1littlecoder I have a guess: with a bigger quant it might be a bit more intelligent. Anyway, that was a great demo. Thanks a lot mate, love all your videos.

@1littlecoder 1 month ago

@@alx8439 Thanks for the kind words! Appreciate the support. I didn't test the model with different configurations. Also, the one I tested was a quantized one.

@KevinKreger 1 month ago

Great fun. Should run on mobile soon?

@dr.mikeybee 1 month ago

I installed this on my M1 mac mini, and it's too slow to be usable.

@Ratonomist 1 month ago

normal it's shit for AI

@saintkamus14 1 month ago

Moshi is mean AF. it told me "So let's solve this problem already" to which I said: "what problem, what problem do you want to solve?" and it said: "the problem of your stupidity" 😆

@imaginationscene 1 month ago

💀

@shApYT 1 month ago

GLaDOS tier

@juanjesusligero391 1 month ago

Nice! :D Do you know what are the hardware requirements to run it on Windows? I've got an 8GB Nvidia GPU, and I'd like to test this, it seems really fun ^^

@1littlecoder 1 month ago

The quantized version should definitely run fine on this machine.

@juanjesusligero391 1 month ago

@@1littlecoder That's great! :D Which one? q8? q4? (I think they should add the hardware requirements, but nobody usually does that ^^U)

@lakshyakumarpandey382 1 month ago

Heyy!! Just a small question: can we set it up attached to another LLM (maybe GPT, Llama, or any other) to get the text, and then use it for the text-to-speech translation?

@aa-xn5hc 1 month ago

How to install in WSL? int8 version

@marcfruchtman9473 1 month ago

Traffic here was pretty awful too! heheh Not sure why you would pick an AI mode that was so adversarial? Thanks for the demo.

@kundanmitra34 1 month ago

Can we fine-tune this LLM?

@emmanuelkolawole6720 1 month ago

Can we add context for the LLM?

@bakuleshsuhasrane8734 1 month ago

Hey, any suggestions on building a micro VLM from scratch?

@1littlecoder 1 month ago

Curious, why do you want to build from scratch?

@bakuleshsuhasrane8734 1 month ago

@@1littlecoder There are specialised applications like geospatial, text-to-SQL, robotics, credit data, etc. Research suggests that fine-tuning only works well on data the model was already trained on; if done on new data, it hallucinates. Quality data = quality output on edge devices.

@bakuleshsuhasrane8734 1 month ago

Besides, something like Moondancer2 cannot read bill PDFs.

@bakuleshsuhasrane8734 1 month ago

@@1littlecoder Specialised usage: faster processing and better accuracy. Per research, fine-tuning causes more hallucinations if it's not done on existing trained data. Applications: geospatial, Text2SQL with screen, credit data, IoT, etc.

@rubbercable 26 days ago

I'm still pessimistic on this: I don't have ARM/RISC processors. My PC is AMD, and my Mac is Intel (the Apple version requires an M2 processor). I hope a viable option is provided one day.

@juliana.2120 1 month ago

i laughed so hard at 4:03 xD bro was just lying straight to your face before

@shruthirao7352 1 month ago

Is RAG possible with this, on custom knowledge, and can it be plugged in as an API or SDK?

@1littlecoder 1 month ago

I'm still exploring if there's a programmatic way to access this

@VigneshK-o3l 1 month ago

How to run on Windows?

@tvwithtiffani 1 month ago

I saw someone else demo this library and it had the same glitch. It really feels like something is inverted... a vector or a prompt or something is being flipped at some point in that library. There's probably a typo somewhere in the code.

@captainoddessy 1 month ago

Although it has a 5-minute limitation, I think we can use it for customer support.

@1littlecoder 1 month ago

I'm still not very sure if it is stable enough to be deployed in production. That'll be an interesting use case. We probably need more control over it

@rahim_khan_iitg 1 month ago

It is a cute AI in pronunciation

@xXWillyxWonkaXx 1 month ago

It's really good. Really. But it's nowhere close to the recently released Google Gemini Live's AI voice. It feels... a bit human-like. I'm not sure if it has something to do with the synthesizer engine.

@1littlecoder 1 month ago

The fascinating part for me is it just runs, nothing fancy. I don't know what hardware Google's AI voice is running on. I still don't have access to Live.

@gidmanone 1 month ago

The most important thing is if it can be interrupted while talking.

@jason_v12345 1 month ago

Why is this the second video on Moshi I've seen today? Isn't this old news?

@1littlecoder 1 month ago

They just released the models a couple of days back. They had made the announcement long back

@InAMinute-ws3yv 1 month ago

It's not supported on Windows. Also, the web demo version is lagging even though my internet speed is very fast. So overall, not as hyped as in these videos.

@darksushi9000 1 month ago

Seriously, I had it running on Windows 11 with a 3090. Was super responsive, but it also liked telling me that it lived in the Bronx, New York, and that J-Lo was a gangster's wife.

@__________________________6910 1 month ago

I think you should change the TTS, I don't like the TTS voice

@shApYT 1 month ago

They gave goody2 a voice

@mvasa2582 1 month ago

maybe ChatGPT can get it incorporated ...?

@surflaweb 1 month ago

I'm sorry but I don't 😅

@1littlecoder 1 month ago

😭

@MGTOWUNIVERSE 1 month ago

LMAO, This bot is a troll!

@juliana.2120 1 month ago

my moshi told me it tried heroin and it wasn't as bad as it thought lmaaoo

@MichealScott24 1 month ago

❤🫡😂😂

@Sujal-ow7cj 1 month ago

😂😂😂😂😂

@sykexz6793 1 month ago

And like always, TTS is the bottleneck.

@1littlecoder 1 month ago

why would you say so?

@sykexz6793 1 month ago

@@1littlecoder Because the quality is not that good; also, it is not multilingual.

@1littlecoder 1 month ago

Got it

@Lorv0 1 month ago

@@sykexz6793 Let's remember this is the worst an open source model like this will ever be. This will serve as a foundation for other, better models in no time...

@sykexz6793 1 month ago

@@Lorv0 I agree, but I don't see a lot of progress in the open source TTS department, especially on device. On the other hand, we already have really good solutions for ASR and LLMs on device.

@nauseouscustody1440 1 month ago

No. I'm sorry. 😂😂

@Macorelppa 1 month ago

Who cares about running it locally if you can use top class OpenAI advanced voice mode

@juanjesusligero391 1 month ago

Not your models, not your mind

@quebono100 1 month ago

Let me guess: you were never interested in playing chess with me.

@1littlecoder 1 month ago

I just play Bullet most of the time, so not a good chess player

@quebono100 1 month ago

@@1littlecoder I play bullet as well; 1-minute is what I play the most right now.

@shekharkumar1902 1 month ago

What's the point of playing with a mess and wasting time? I have created a talking RAG using OpenAI in the backend. It is fast and furious.😊

@1littlecoder 1 month ago

You can play with OpenAI if you're okay with sending your data to some server. This solution is not like that. In fact, it is completely different in terms of architecture. And if wasting time is what we are talking about: LLMs, when they started, were nothing like this. People thought we were wasting time. The same goes for Stable Diffusion; its initial outputs were so ugly that they became memes, but now we have some of the best realistic pictures. The future is only going to get better from now.

@xhridhar 1 month ago

What voice model are you using?