If you have an NVIDIA GPU, download the PyTorch models or use the Candle ones.
@JankJank-om1op 1 month ago
the demo is comedy gold
@MasonJared-ms4tw 1 month ago
😂😂😂😂
@saintkamus14 1 month ago
I've noticed that this model is fast enough to start answering before you even finish asking a question. This is ideal for real-time translation, and for another use case I have in mind. It would be perfect for my use case if I could train custom voices.
@1littlecoder 1 month ago
Couldn't agree more! It just has to be more stable.
@doglibrary 1 month ago
@1littlecoder 🤣
@mohitranka9840 1 month ago
What is your use case?
@saintkamus14 1 month ago
@mohitranka9840 Similar to real-time translation, but "style" translation instead.
@RadiantNij 1 month ago
Yeah, I'll wait until it's stable. Moshi is unhinged; I've heard it say terrible things on Wes Roth's channel.
@burncloud-com 1 month ago
Thank you for your work, I got it running.
@1littlecoder 1 month ago
Awesome! How was the experience?
@burncloud-com 1 month ago
@1littlecoder I am unable to reach you on Discord.
@techfren 1 month ago
omg yes, they finally released it!
@ParvathyKapoor 1 month ago
Thank god it's available in Pinokio.
@1littlecoder 1 month ago
Did you try it?
@ParvathyKapoor 1 month ago
@1littlecoder Yeah, it works well! The model download is like 15 GB!! Still better.
@BabylonBaller 1 month ago
This has to be the funniest interaction I've ever seen
@OliNorwell 1 month ago
I agree, this is a great release. Everything is there: PDF of the paper, GitHub code, and it runs locally inside 24 GB of VRAM. I'm astonished it works on a local machine, and the way it records the calls too is extremely cool. Yeah, of course it is like talking to a moody 10-year-old, but hey, we have to start somewhere.
@1littlecoder 1 month ago
Absolutely, I need to explore further to see how to customize the demo and get programmatic access.
@oldfairy 1 month ago
what a funny conversation
@RickySupriyadi 1 month ago
So this one doesn't need to convert our speech into text and then feed the text into some LLM? It just feeds speech into the LLM directly?
@alx8439 1 month ago
"What's your humor setting, TARS?" (c)
@1littlecoder 1 month ago
🤣
@alx8439 1 month ago
@1littlecoder I have a guess: with a bigger quant it might be a bit more intelligent. Anyway, that was a great demo. Thanks a lot mate, love all your videos.
@1littlecoder 1 month ago
@alx8439 Thanks for the kind words! Appreciate the support. I didn't test the model with different configurations. Also, the one I tested was a quantized one.
@KevinKreger 1 month ago
Great fun. Should run on mobile soon?
@dr.mikeybee 1 month ago
I installed this on my M1 Mac mini, and it's too slow to be usable.
@Ratonomist 1 month ago
Of course, it's shit for AI.
@saintkamus14 1 month ago
Moshi is mean AF. It told me "So let's solve this problem already", to which I said: "What problem, what problem do you want to solve?" and it said: "the problem of your stupidity" 😆
@imaginationscene 1 month ago
💀
@shApYT 1 month ago
GLaDOS tier
@juanjesusligero391 1 month ago
Nice! :D Do you know what the hardware requirements are to run it on Windows? I've got an 8 GB Nvidia GPU, and I'd like to test this, it seems really fun ^^
@1littlecoder 1 month ago
The quantized version should definitely run fine on this machine.
@juanjesusligero391 1 month ago
@1littlecoder That's great! :D Which one? q8? q4? (I think they should add the hardware requirements, but nobody usually does that ^^U)
@lakshyakumarpandey382 1 month ago
Hey!! Just a small question: can we set it up so that it is attached to another LLM (maybe GPT, Llama, or any other) to get the text, and then use it for the text-to-speech conversion?
@aa-xn5hc 1 month ago
How do I install the int8 version in WSL?
@marcfruchtman9473 1 month ago
Traffic here was pretty awful too! heheh. Not sure why you would pick an AI mode that was so adversarial. Thanks for the demo.
@kundanmitra34 1 month ago
Can we fine-tune this LLM?
@emmanuelkolawole6720 1 month ago
Can we add context for the LLM?
@bakuleshsuhasrane8734 1 month ago
Hey, any suggestions on building a micro VLM from scratch?
@1littlecoder 1 month ago
Curious, why do you want to build from scratch?
@bakuleshsuhasrane8734 1 month ago
@1littlecoder There are specialised applications like geospatial, text-to-SQL, robotics, credit data, etc. According to research, fine-tuning only works well on data the model was already trained on; if it is done on new data, the model hallucinates. Quality data = quality output on edge devices.
@bakuleshsuhasrane8734 1 month ago
Besides, something like Moondancer2 cannot read bill PDFs.
@bakuleshsuhasrane8734 1 month ago
@1littlecoder Specialised usage, faster processing, more accurate output. Research says fine-tuning causes more hallucinations if it's not done on data the model was already trained on. Applications: geospatial, Text2SQL with screen, credit data, IoT, etc.
@rubbercable26 days ago
I'm still pessimistic about this: I don't have ARM/RISC processors. My PC is AMD and my Mac is Intel (the Apple version requires an M2 processor). I hope a viable option is provided one day.
@juliana.2120 1 month ago
i laughed so hard at 4:03 xD bro was just lying straight to your face before
@shruthirao7352 1 month ago
Is RAG possible with this, on custom knowledge, and can it be plugged in as an API or SDK?
@1littlecoder 1 month ago
I'm still exploring if there's a programmatic way to access this
@VigneshK-o3l 1 month ago
How do I run it on Windows?
@tvwithtiffani 1 month ago
I saw someone else demo this library and it had the same glitch. It really feels like something is inverted... a vector or a prompt or something is being flipped at some point in that library. There's probably a typo somewhere in the code.
@captainoddessy 1 month ago
Although it has a 5-minute limitation, I think we can use it for customer support.
@1littlecoder 1 month ago
I'm still not very sure if it is stable enough to be deployed in production. That'll be an interesting use case. We probably need more control over it.
@rahim_khan_iitg 1 month ago
It is a cute AI in its pronunciation.
@xXWillyxWonkaXx 1 month ago
It's really good. Really. But it's nowhere close to the recently released Google Gemini Live's AI voice. It feels... a bit human-like. I'm not sure if it has something to do with the synthesizer engine.
@1littlecoder 1 month ago
The fascinating part for me is it just runs, nothing fancy. I don't know what hardware Google's AI voice is running on. I still don't have access to Live.
@gidmanone 1 month ago
The most important thing is if it can be interrupted while talking.
@jason_v12345 1 month ago
Why is this the second video on Moshi I've seen today? Isn't this old news?
@1littlecoder 1 month ago
They just released the models a couple of days ago. They made the announcement a long time ago.
@InAMinute-ws3yv 1 month ago
It's not supported on Windows. Also, the web demo version lags even though my internet speed is very fast. So overall, it's not as good as hyped in these videos.
@darksushi9000 1 month ago
Seriously? I had it running on Windows 11 with a 3090. It was super responsive, but it also liked telling me that it lived in the Bronx, New York, and that J-Lo was a gangster's wife.
@__________________________6910 1 month ago
I think you should change the TTS, I don't like the TTS voice
@shApYT 1 month ago
They gave goody2 a voice
@mvasa2582 1 month ago
maybe ChatGPT can get it incorporated ...?
@surflaweb 1 month ago
I'm sorry but I don't 😅
@1littlecoder 1 month ago
😭
@MGTOWUNIVERSE 1 month ago
LMAO, This bot is a troll!
@juliana.2120 1 month ago
my moshi told me it tried heroin and it wasn't as bad as it thought lmaaoo
@MichealScott24 1 month ago
❤🫡😂😂
@Sujal-ow7cj 1 month ago
😂😂😂😂😂
@sykexz6793 1 month ago
And like always, TTS is the bottleneck.
@1littlecoder 1 month ago
Why would you say so?
@sykexz6793 1 month ago
@1littlecoder Because the quality is not that good; also, it is not multilingual.
@1littlecoder 1 month ago
Got it
@Lorv0 1 month ago
@sykexz6793 Let's remember this is the worst an open-source model like this will ever be. It will serve as a foundation for other, better models in no time...
@sykexz6793 1 month ago
@Lorv0 I agree, but I don't see a lot of progress in the open-source TTS department, especially on-device. On the other hand, we already have really good solutions for ASR and LLMs on-device.
@nauseouscustody1440 1 month ago
No. I'm sorry. 😂😂
@Macorelppa 1 month ago
Who cares about running it locally if you can use top-class OpenAI Advanced Voice Mode?
@juanjesusligero391 1 month ago
Not your models, not your mind
@quebono100 1 month ago
Let me guess: you were never interested in playing chess with me.
@1littlecoder 1 month ago
I just play Bullet most of the time, so not a good chess player
@quebono100 1 month ago
@1littlecoder I play bullet as well; right now 1-minute is what I play the most.
@shekharkumar1902 1 month ago
What's the point of playing with a mess and wasting time? I have created a talking RAG, using OpenAI in the backend. It is fast and furious. 😊
@1littlecoder 1 month ago
You can play with OpenAI if you're okay with sending your data to some server. This solution is not like that. In fact, it is completely different in terms of architecture. And if wasting time is what we are talking about: LLMs, when they started, were nothing like they are now, and people thought we were wasting time. The same goes for Stable Diffusion; the initial outputs were so ugly that they became memes, but now we get some of the most realistic pictures. The future is only going to get better from now.
@xhridhar 1 month ago
What voice model are you using?
@zyxwvutsrqponmlkh 1 month ago
This is STT to text LLM to TTS, it's not S2S. Useless garbage.
@RickySupriyadi 1 month ago
Kyutai CEO: "We must innovate on something!" Employee: releases an "I don't know" bot.