The ONLY Real Time Speech AI that can run locally!!!

7,995 views

1littlecoder

A day ago

Comments: 92
@1littlecoder a month ago
I keep messing up French names 😭
@adriangpuiu a month ago
ive got comforTABLE with that :))
@Nid_All a month ago
How can I run it on Windows?
@1littlecoder a month ago
If you have an NVIDIA GPU, download the PyTorch models, or use the Candle ones.
@JankJank-om1op a month ago
the demo is comedy gold
@MasonJared-ms4tw a month ago
😂😂😂😂
@saintkamus14 a month ago
I've noticed that this model is fast enough to start answering before you even finish asking a question. This is ideal for real time translations, and another use case I have in mind. This would be perfect for my use case if I could train custom voices.
@1littlecoder a month ago
Couldn't agree more! It just has to be more stable.
@doglibrary a month ago
@1littlecoder 🤣
@mohitranka9840 a month ago
What is your use case?
@saintkamus14 a month ago
@mohitranka9840 Similar to real-time translations, but "style" translations instead.
@RadiantNij a month ago
Yeah, I'll wait until it's stable. Moshi is unhinged; I've heard it say terrible things on Wes Roth's channel.
@burncloud-com a month ago
Thank you for your work, I got it running.
@1littlecoder a month ago
Awesome! How was the experience?
@burncloud-com a month ago
@1littlecoder I am unable to reach you on Discord.
@techfren a month ago
omg yes they finally released it!
@ParvathyKapoor a month ago
Thank god it's available in Pinokio.
@1littlecoder a month ago
Did you try it?
@ParvathyKapoor a month ago
@1littlecoder Ya, works good! The model download is like 15 GB!! Still better.
@BabylonBaller a month ago
This has to be the funniest interaction I've ever seen
@OliNorwell a month ago
I agree, this is a great release. Everything is there: a PDF of the paper, the GitHub code, and it runs locally inside 24 GB of VRAM. I'm astonished it works on a local machine, and the way it records the calls too is extremely cool. Yeah, of course it is like talking to a moody 10-year-old, but hey, we have to start somewhere.
@1littlecoder a month ago
Absolutely. I need to explore further to see how to customize the demo and get programmatic access.
@oldfairy a month ago
what a funny conversation
@RickySupriyadi a month ago
So this one doesn't need to convert our speech into text and then feed the text into some LLM? It just feeds speech into the LLM directly?
@alx8439 a month ago
"What's your humor setting, TARS?" (c)
@1littlecoder a month ago
🤣
@alx8439 a month ago
@1littlecoder My guess is that with a bigger quant it might be a bit more intelligent. Anyway, that was a great demo. Thanks a lot mate, love all your videos.
@1littlecoder a month ago
@alx8439 Thanks for the kind words! Appreciate the support. I didn't test the model with different configurations. Also, the one I tested was a quantized one.
@KevinKreger a month ago
Great fun. Should run on mobile soon?
@dr.mikeybee a month ago
I installed this on my M1 mac mini, and it's too slow to be usable.
@Ratonomist a month ago
normal it's shit for AI
@saintkamus14 a month ago
Moshi is mean AF. It told me "So let's solve this problem already", to which I said: "What problem, what problem do you want to solve?" And it said: "The problem of your stupidity" 😆
@imaginationscene a month ago
💀
@shApYT a month ago
GLaDOS tier
@juanjesusligero391 a month ago
Nice! :D Do you know what the hardware requirements are to run it on Windows? I've got an 8 GB Nvidia GPU, and I'd like to test this, it seems really fun ^^
@1littlecoder a month ago
The quantized version should definitely run fine on this machine.
@juanjesusligero391 a month ago
@1littlecoder That's great! :D Which one? q8? q4? (I think they should add the hardware requirements, but nobody usually does that ^^U)
@lakshyakumarpandey382 a month ago
Hey! Just a small question: can we attach it to another LLM (maybe GPT, Llama, or any other) to get the text, and then use it for the text-to-speech translation?
@aa-xn5hc a month ago
How do I install the int8 version in WSL?
@marcfruchtman9473 a month ago
Traffic here was pretty awful too! heheh Not sure why you would pick an AI mode that was so adversarial? Thanks for the demo.
@kundanmitra34 a month ago
Can we fine-tune this LLM?
@emmanuelkolawole6720 a month ago
Can we add context for the LLM?
@bakuleshsuhasrane8734 a month ago
Hey, any suggestions on building a micro VLM from scratch?
@1littlecoder a month ago
Curious, why do you want to build from scratch?
@bakuleshsuhasrane8734 a month ago
@1littlecoder There are specialised applications in fields like geospatial, text-to-SQL, robotics, credit data, etc. Research suggests that fine-tuning only works well on data the model was already trained on; if it is done on new data, the model hallucinates. Quality data = quality output on edge devices.
@bakuleshsuhasrane8734 a month ago
Besides, models like Moondancer2 cannot read bill PDFs.
@bakuleshsuhasrane8734 a month ago
@1littlecoder Specialised usage, faster processing, and better accuracy. Research says fine-tuning causes more hallucinations if it's not done on data the model was already trained on. Applications: geospatial, Text2SQL with screen, credit data, IoT, etc.
@rubbercable 26 days ago
I'm still pessimistic on this: I don't have ARM/RISC processors. My PC is AMD, and my Mac is Intel (the Apple version requires an M2 processor). I hope a viable option is provided one day.
@juliana.2120 a month ago
i laughed so hard at 4:03 xD bro was just lying straight to your face before
@shruthirao7352 a month ago
Is RAG possible with this, on custom knowledge, plugged in as an API or SDK?
@1littlecoder a month ago
I'm still exploring if there's a programmatic way to access this
@VigneshK-o3l a month ago
How do I run it on Windows?
@tvwithtiffani a month ago
I saw someone else demo this library and it had the same glitch. It really feels like something is inverted... a vector or a prompt or something is being flipped at some point in that library. There's probably a typo somewhere in the code.
@captainoddessy a month ago
Although it has a 5-minute limitation, I think we can use it for customer support.
@1littlecoder a month ago
I'm still not very sure if it is stable enough to be deployed in production. That'll be an interesting use case. We probably need more control over it
@rahim_khan_iitg a month ago
It's a cute AI in its pronunciation.
@xXWillyxWonkaXx a month ago
It's really good. Really. But it's nowhere close to the recently released Google Gemini Live AI voice. It feels... a bit human-like. I'm not sure if it has something to do with the synthesizer engine.
@1littlecoder a month ago
The fascinating part for me is it just runs, nothing fancy. I don't know what hardware Google's AI voice is running on. I still don't have access to Live.
@gidmanone a month ago
The most important thing is if it can be interrupted while talking.
@jason_v12345 a month ago
Why is this the second video on Moshi I've seen today? Isn't this old news?
@1littlecoder a month ago
They just released the models a couple of days back. They had made the announcement long back
@InAMinute-ws3yv a month ago
It's not supported on Windows. Also, the web demo version is lagging even though my internet speed is very fast. So overall, it doesn't live up to the hype in these videos.
@darksushi9000 a month ago
Seriously? I had it running on Windows 11 with a 3090. It was super responsive, but it also liked telling me that it lived in the Bronx, New York, and that J-Lo was a gangster's wife.
@__________________________6910 a month ago
I think you should change the TTS, I don't like the TTS voice
@shApYT a month ago
They gave goody2 a voice
@mvasa2582 a month ago
maybe ChatGPT can get it incorporated ...?
@surflaweb a month ago
I'm sorry, but I don't 😅
@1littlecoder a month ago
😭
@MGTOWUNIVERSE a month ago
LMAO, This bot is a troll!
@juliana.2120 a month ago
My Moshi told me it tried heroin and it wasn't as bad as it thought lmaaoo
@MichealScott24 a month ago
❤🫡😂😂
@Sujal-ow7cj a month ago
😂😂😂😂😂
@sykexz6793 a month ago
And like always, TTS is the bottleneck.
@1littlecoder a month ago
Why would you say so?
@sykexz6793 a month ago
@1littlecoder Because the quality is not that good; also, it is not multilingual.
@1littlecoder a month ago
Got it
@Lorv0 a month ago
@sykexz6793 Let's remember this is the worst an open-source model like this will ever be. It will serve as a foundation for other, better models in no time...
@sykexz6793 a month ago
@Lorv0 I agree, but I don't see a lot of progress in the open-source TTS department, especially on-device. On the other hand, we already have really good solutions for ASR and LLMs on-device.
@nauseouscustody1440 a month ago
No. I'm sorry. 😂😂
@Macorelppa a month ago
Who cares about running it locally if you can use the top-class OpenAI Advanced Voice Mode?
@juanjesusligero391 a month ago
Not your models, not your mind
@quebono100 a month ago
Let me guess, you were never interested in playing chess with me.
@1littlecoder a month ago
I just play Bullet most of the time, so not a good chess player
@quebono100 a month ago
@1littlecoder I play bullet as well; 1-minute is what I play the most right now.
@shekharkumar1902 a month ago
What's the point of playing with a mess and wasting time? I have created a talking RAG, using OpenAI in the backend. It is fast and furious. 😊
@1littlecoder a month ago
You can play with OpenAI if you're okay with sending your data to some server. This solution is not like that. In fact, it is completely different in terms of architecture. As for wasting time: LLMs, when they started, were nothing like this, and people thought we were wasting time. The same goes for Stable Diffusion. The initial images were so ugly that they became memes, but now we have some of the most realistic pictures. The future is only going to get better from here.
@xhridhar a month ago
Which voice model are you using?
@zyxwvutsrqponmlkh a month ago
This is STT to text LLM to TTS; it's not S2S. Useless garbage.
@RickySupriyadi a month ago
Kyutai CEO: "We must innovate on something!" Employee: releases an "I don't know" bot.