Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python

Views: 52,046

AssemblyAI

1 day ago

Comments: 77

@NatGreenOnline 8 months ago

Using Groq / Mistral AI instead of OpenAI will greatly reduce the latency issue you have in your demo.

@logannon 8 months ago

Can you fine-tune Groq?

@AssemblyAI 7 months ago

Great suggestion, we will explore this in the next tutorial. This one was meant to be as accessible as possible so that people could build quickly.

@조바이든-r6r 6 months ago

@logannon No, it's impossible to fine-tune Groq; that's the problem. You have to use RAG instead of fine-tuning. But if you want to make a chatbot for a specific domain, you should try another service.

@TrilioniME 3 months ago

How much does Mistral API cost?

@fatmayonca1723 4 months ago

How is it from scratch? You are using 3 APIs. Also, the AssemblyAI API doesn't transcribe live audio streams without setting up billing. You have to put a minimum of 10 dollars in it for that too. I don't have a problem with that. But I have a problem with you not telling us this in advance, at the start of the video. You never actually mention this anywhere in the video. The program doesn't respond after the introduction. That's how you find out the problem is billing, not from the video. That was quite annoying, to be honest. A potentially great video ruined by lack of transparency.

@yashmehta9299 14 days ago

If you wish to make an apple pie from scratch, you must first invent the universe - Carl Sagan

@thesohailjafri 1 day ago

I guess they have "Start building with the $50 free credit!" policy now

@PalashDandge 6 months ago

I am getting the error "Cannot find reference 'generate' in '__init__.py'" on the line: from elevenlabs import generate, stream. Can you please help me resolve this issue?

@JeffreyJohnson-vy1zm 8 months ago

Two questions: How can we improve the latency between the patient's response and the AI voice reply? And what can be done for the AI voice to account for patient input if the patient speaks while the AI voice is speaking?

@AssemblyAI 7 months ago

Hi Jeffrey, two very good questions! These deserve a video of their own, to be honest. To improve latency, one thing you could try is running the LLM locally so you can get faster inference than calling OpenAI's API. As for handling overlapping speech, I've written the program to stop listening while the AI voice is responding. But what you could do is run another thread that keeps listening while the AI voice is speaking.
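
The "another thread that keeps listening" idea from the reply above could be sketched roughly as follows. This is not the tutorial's code: the microphone is simulated with a queue, playback is simulated with short sleeps, and the names speak, listen, and run_demo are hypothetical. It only illustrates how a shared threading.Event could let a listener thread cut the bot's reply short.

```python
import queue
import threading
import time

def speak(chunks, interrupted, spoken):
    """Play reply chunks one at a time, stopping early if the user interrupts."""
    for chunk in chunks:
        if interrupted.is_set():
            break                  # user started talking: stop the reply
        spoken.append(chunk)       # stand-in for sending audio to the speaker
        time.sleep(0.05)           # stand-in for the time playback takes

def listen(mic, interrupted):
    """Keep reading (simulated) microphone events while the bot is talking."""
    while True:
        event = mic.get()
        if event is None:          # sentinel: stop listening
            break
        if event == "speech":
            interrupted.set()      # signal the speaker thread to stop

def run_demo():
    mic = queue.Queue()
    interrupted = threading.Event()
    spoken = []
    chunks = [f"chunk-{i}" for i in range(20)]

    listener = threading.Thread(target=listen, args=(mic, interrupted))
    speaker = threading.Thread(target=speak, args=(chunks, interrupted, spoken))
    listener.start()
    speaker.start()

    time.sleep(0.12)               # the user lets the bot talk briefly...
    mic.put("speech")              # ...then interrupts
    speaker.join()
    mic.put(None)
    listener.join()
    return spoken, interrupted.is_set()

if __name__ == "__main__":
    spoken, was_interrupted = run_demo()
    print(f"played {len(spoken)} of 20 chunks, interrupted={was_interrupted}")
```

In a real bot the queue would be replaced by the speech-to-text stream, and the speaker loop by whatever audio player is in use; the Event stays the same.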

@EvertvanBrussel 7 months ago

As for the latency, I was assuming the majority of the latency was actually coming from ElevenLabs? And likely also from whatever functions might be needed to actually check the availability of the dentist and then also to schedule the actual appointment in the end. Am I wrong? So yeah I think running the LLM locally will surely help, or using Groq, but I'm not convinced yet that that is the biggest bottleneck.
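
One way to settle where the latency actually comes from is to time each stage separately. The sketch below assumes nothing about the tutorial's code: fake_llm and fake_tts are hypothetical stand-ins whose sleeps simulate network time, and the real LLM and text-to-speech calls would be wrapped in their place.

```python
import time

def timed(label, func, *args, timings):
    """Run func(*args), record its wall-clock duration under label, return its result."""
    start = time.perf_counter()
    result = func(*args)
    timings[label] = time.perf_counter() - start
    return result

# Hypothetical stand-ins for the real API calls; the sleeps simulate network time.
def fake_llm(prompt):
    time.sleep(0.01)
    return "reply to: " + prompt

def fake_tts(text):
    time.sleep(0.05)
    return b"audio-bytes"

def measure_turn(prompt):
    """Time one conversational turn and report which stage was slowest."""
    timings = {}
    reply = timed("llm", fake_llm, prompt, timings=timings)
    timed("tts", fake_tts, reply, timings=timings)
    slowest = max(timings, key=timings.get)
    return timings, slowest

if __name__ == "__main__":
    timings, slowest = measure_turn("I need a dentist appointment")
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds * 1000:.1f} ms")
    print("slowest stage:", slowest)
```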

@mehmetbakideniz 3 months ago

Would you consider adding a web UI like Gradio to this app so that we can send the demo to anyone if needed? This version only works if you run the actual code in your own environment.

@simonsandeep4977 6 months ago

The program is not responding after the first introduction, unlike what is shown in the video, even when using the GitHub code. Is there any alternative with a step-by-step instruction video?

@FaisalKhrisan 6 months ago

But I still have problems. It says: [from elevenlabs import generate, stream ImportError: cannot import name 'generate' from 'elevenlabs']. How come?

@Ghosty0069 5 months ago

I have the exact same error. Did you fix it?

@LO-FI_walah_BABA 22 days ago

change the version of python to 1.10 or +
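
For the ImportError reported in this thread, a useful first step is to check which release of the package is installed and whether it still exports the name the tutorial imports. The sketch below uses only the standard library; describe_package is a hypothetical helper, and it assumes the distribution and the module are both named elevenlabs. It diagnoses the mismatch but does not fix it.

```python
import importlib
from importlib import metadata

def describe_package(dist_name, module_name, attr):
    """Return (installed version or None, whether module_name exposes attr)."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None                      # not installed as a distribution
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return version, False               # module cannot be imported at all
    return version, hasattr(module, attr)

if __name__ == "__main__":
    version, has_generate = describe_package("elevenlabs", "elevenlabs", "generate")
    print("installed elevenlabs version:", version)
    print("top-level 'generate' available:", has_generate)
```

If the second line prints False, the installed release does not provide a top-level generate, so the import in the tutorial cannot work as written with that release.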

@JokerJarvis-cy2sw 8 months ago

Please make a tutorial on the LLaVA vision model to analyze live video with cv2. Also, I am unable to get my API token from the AssemblyAI website; please fix it.

@randotkatsenko5157 5 months ago

Hi, nice tutorial. I have coded a real-time voice bot for phone conversations in Twilio. The latency comes mostly from text-to-speech and GPT response time. I'm guessing that if either one's latency can be reduced by about 2-3x, the response time would be fast enough. In human conversation, we expect a response within 1 second, and anything above that seems unnatural. I'm sure the speed issues will be solved with new Nvidia GPUs or other hardware innovations.

@rammohanbethi 5 months ago

Hi, can you please let me know how you developed the voice bot using Twilio? I'm looking for that kind of bot too. It would be helpful.

@randotkatsenko5157 5 months ago

@rammohanbethi Hi, how can I let you know? It's a lot of complicated server-side code in Node.js and some Python... The setup is too complex to explain in a comment. We build this as part of AI automation services for businesses.

@Sibixpur 3 months ago

@randotkatsenko5157 Bro is speaking as if he coded all the voice bot logic himself. Bruh, you're just hitting APIs; that ain't complex...

@yitaowang8547 2 months ago

Thank you! Such a useful application and well explained ❤

@uttamdwivedi7709 7 months ago

I followed this tutorial, then at the end I realized AssemblyAI doesn't support the Japanese language in the live RealtimeTranscriber. Which sucks, lol; I can't use it. Any help? @assemblyAI

@bens4446 5 months ago

Thanks. This is the first time I've heard of AssemblyAI. Everyone talks about faster_whisper and Deepgram. Is AssemblyAI better for STT?

@GoDFazel2 1 month ago

No, it's not.

@daeralbra 7 months ago

The only downside is the fact it takes a while to respond with voice.

@iainhmunro 6 months ago

Hi there. I was just looking at the code. Where are the appointment-setting details coming from?

@AssemblyAI 6 months ago

All that is coming from the LLM we are using, so it's not hard-coded.

@shissncg 2 months ago

How do you grab the audio once the RealtimeTranscript has finalized? For example, could you pass the audio rather than the text to generate_ai_response?

@TheBestgoku 7 months ago

Why not chunk the text and output it piece by piece instead of waiting until all the text is generated?
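
The chunking idea in this comment could look roughly like the sketch below: buffer the streamed tokens and hand each complete sentence to text-to-speech as soon as it is finished, instead of waiting for the whole reply. The function name sentences_from_stream and the simple punctuation rule are assumptions for illustration; the token list stands in for a streaming LLM response.

```python
import re

# A sentence ends at ., ! or ? followed by whitespace (a deliberately simple rule).
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

def sentences_from_stream(tokens):
    """Yield complete sentences as soon as they appear in a stream of text tokens."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence; the last may be partial.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()       # flush whatever is left when the stream ends

if __name__ == "__main__":
    stream = ["Hello", " there.", " How", " can I", " help", " you?", " Bye"]
    for sentence in sentences_from_stream(stream):
        print(sentence)            # each sentence could be sent to TTS right here
```

With this shape, the first sentence can start playing while the model is still generating the rest, which shortens the perceived delay without changing the total generation time.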

@thebackpainmiracle 6 months ago

Exactly what I was intending on making. Thanks!

@MuskaanKhan.31 4 months ago

Hey there, are you learning to create generative AI models? If yes, please reply; I have a project for you. By building this project you can practice creating an AI model, you can include it in your resume for your job search, and it will also be helpful for me.

@avataraang3334 4 months ago

@MuskaanKhan.31 I am interested in a project! I just need the required data and the objective you have in mind.

@jhinaouiroudayna4275 4 months ago

The AssemblyAI APIs require a credit card for this task.

@nagarajdoddamani697 1 month ago

On my laptop, brew is not installing, and the program is not working either.

@abdulazad8432 2 months ago

Can it be integrated into an Arduino board?

@yuchengpeng7706 7 months ago

This video is so great! I'm following along, but I ran into a problem. I can install the package in PyCharm on Windows, but I get this error: OSError: Cannot find mpv-1.dll, mpv-2.dll or libmpv-2.dll in your system %PATH%. I'm a researcher in the art field with only beginner-level Python knowledge. Could you help me solve this problem? Thanks a lot!

@abibusiness1085 20 days ago

How do I install mpv on Windows?

@euginekholmogorov5196 7 months ago

amazing lady and also an engineer omg)) thank you a million, I'll just add this to my stack

@urekmazino1327 6 months ago

Why are you saying "from scratch" if you're only using APIs?

@theghostyced 6 months ago

How would you handle interruptions while the AI is talking?

@sarap.sadegh4691 7 months ago

Hi, thanks for your video. I want an API for real-time conversation with Python in Farsi. Does the LLM support Farsi?

@Akash-nb9sv 2 months ago

How do I install brew? It isn't available for Windows. Is there another option for Windows?

@Alex-qo5je 7 months ago

How can I connect it to my phone number and Google Calendar? 🙏🏼

@AssemblyAI 6 months ago

You can make use of the Google API for Google Calendar and something like Twilio's API for making phone calls.

@vishalsaichindepalli2798 8 months ago

For some reason, the microphone isn't picking up my voice. I enabled all permissions on my mac and am still having trouble. Is there any way to fix this?

@michaelnumnum 7 months ago

I think you need to pay for the real-time transcription for this at AssemblyAI

@Vrilogs 6 months ago

Streaming from AssemblyAI is a paid service, so you first need to add balance to your account if you have not done that yet. Hope that helps :)

@JR-joren 1 month ago

nice but the lagging time is too long unfortunately.

@jeevanjaison9646 6 months ago

The AssemblyAI API is not free.

@urekmazino1327 6 months ago

any way to make one with adam voice like the one in elevenlabs?😊

@alifetechgenius3804 2 months ago

Source code Not Available

@CharlesZulu-v8g 1 month ago

your free api does not work in my project

@mrunexpected10 8 months ago

Can you make just a text-to-voice chatbot?

@pawanmaurya1554 27 days ago

❤❤❤❤❤so wonderful project

@ac3inlondon531 5 months ago

why are you using Mac omg

@viditsharma6990 7 months ago

I am facing the mpv value error on Windows. I have already installed it many times. How can I fix that?

@sethuraman9884 6 months ago

Just use VLC instead of mpv, bro.

@조바이든-r6r 6 months ago

@sethuraman9884 Thank you, guys.

@조바이든-r6r 6 months ago

Or check the environment PATH for mpv. When you run mpv --version in cmd, you should see that it runs.
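
The PATH check suggested above can also be done from Python. The sketch below is a rough diagnostic, not a fix: find_on_path is a hypothetical helper that searches the PATH directories for the library files named in the OSError quoted earlier in the comments, and shutil.which looks for the mpv executable itself.

```python
import os
import shutil

def find_on_path(filenames, path=None):
    """Return the first file from filenames found in a PATH directory, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue                        # skip empty PATH entries
        for name in filenames:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                return candidate
    return None

if __name__ == "__main__":
    # File names taken from the error message reported in the comments.
    dll = find_on_path(["mpv-1.dll", "mpv-2.dll", "libmpv-2.dll"])
    print("mpv library on PATH:", dll)
    print("mpv executable on PATH:", shutil.which("mpv"))
```

If both lines print None, the folder that contains mpv is not on PATH for the Python process, which would match the error reported on Windows.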

@nithishreddy7684 6 months ago

An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997). What should I do about this error?

@islamicinterestofficial 6 months ago

Same error. Did you find a solution?

@chittisai47 6 months ago

Most likely your microphone is switched off; please check.

@rachid6904 5 months ago

I've got the same: An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)
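
A CERTIFICATE_VERIFY_FAILED error with "unable to get local issuer certificate" typically indicates that Python could not find a trusted CA certificate bundle, rather than anything related to the microphone. The sketch below only reports where this Python installation looks for certificates; describe_ca_setup is a hypothetical helper built on the standard ssl module, and what to do with the result depends on the system.

```python
import os
import ssl

def describe_ca_setup():
    """Report where this Python installation looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    ca_file = paths.openssl_cafile
    return {
        "cafile": paths.cafile,                      # None if the default file is missing
        "capath": paths.capath,                      # None if the default dir is missing
        "openssl_cafile": ca_file,                   # compiled-in default location
        "openssl_cafile_exists": bool(ca_file and os.path.exists(ca_file)),
        "env_override": os.environ.get(paths.openssl_cafile_env),
    }

if __name__ == "__main__":
    for key, value in describe_ca_setup().items():
        print(f"{key}: {value}")
```

If both cafile and capath come back as None, no default certificate store was found, which would be consistent with the error in this thread. On macOS installs from python.org, running the bundled "Install Certificates.command" script is a commonly suggested remedy.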