Creating JARVIS - Python Voice Virtual Assistant (ChatGPT, ElevenLabs, Deepgram, Taipy)

Рет қаралды 19,609

Күн бұрын

Check out the GitHub repository here:
github.com/AlexandreSajus/JARVIS
0:00 Talking to JARVIS
0:58 Intro
1:52 How JARVIS works
3:12 How to setup JARVIS
4:05 Getting API keys
5:05 Installing JARVIS
6:49 Running JARVIS
7:44 Talking to JARVIS
9:18 How to mod JARVIS for your use case
10:45 Recording audio using Pyaudio
12:25 Transcribing to text using Deepgram
12:45 Sending prompts to OpenAI GPT
13:14 Changing JARVIS' personality (context)
14:10 Generating voice using ElevenLabs
14:50 Playing audio using Pygame
15:15 Displaying the convo in a webpage with Taipy
16:40 Use cases and limitations

Пікірлер: 174

@joeternasky 7 ай бұрын

Fantastic project. Love how you connected these services and packages together. Thanks for going over the project, posting this video, etc. I learned quite a bit.

@alexandresajus 7 ай бұрын

Thank you very much!

@dwilson7230 5 ай бұрын

Bro this is sick as hell! Thanks for posting a video about it.

@alexandresajus 5 ай бұрын

Thanks! Glad you liked it!

@isagiyoichi5207 6 ай бұрын

this is actually really incredible thanks for the video

@alexandresajus 6 ай бұрын

Thanks!

@chrsl3 7 ай бұрын

Fantastic work and video, thank you!!

@alexandresajus 7 ай бұрын

Thanks!!!

@gr8tbigtreehugger 3 ай бұрын

Many thanks for this super helpful tutorial! My next step is voice ID, so the AI knows it's me!

@alexandresajus 3 ай бұрын

Thanks! Good luck!

@iandanforth 7 ай бұрын

Impressive! One key bit of the UX of ChatGPT mobile are the "clicks" that indicate when the model has 1. Stopped listening and 2. Stopped talking. A very small touch that makes a world of difference.

@alexandresajus 7 ай бұрын

Yes I should definitely find better ways to convey to the user when he is being listened to

@taylorsmith1720 4 ай бұрын

🎯 Key Takeaways for quick navigation: 01:02 *🚀 Overview of Voice Virtual Assistant Development* - Explanation of building a voice virtual assistant similar to Jarvis from Iron Man. - Overview of the backend workflow involving voice input, transcription, response generation, and audio output. - Introduction to third-party services like Deepgram, OpenAI, 11 Labs, and Taipy used in the development process. 03:21 *🔧 Installation Instructions for the Voice Virtual Assistant* - Cloning the GitHub repository and installing necessary requirements. - Setting up API keys for Deepgram, OpenAI, and 11 Labs. - Creating an environment file to store API keys securely. - Executing installation commands and waiting for requirements to install. 08:33 *🛠️ Running the Voice Virtual Assistant* - Instructions for running the display interface (`display.py`) and the main script (`main.py`). - Description of how the assistant listens, transcribes, generates responses, and displays conversations. - Example interaction demonstrating the assistant's response to user input. 09:28 *💡 Customization and Modification of the Voice Virtual Assistant* - Guidance on modifying the assistant for specific use cases. - Suggestions for changing context, models, and voices for customization. - Discussion of potential improvements, such as integrating news, adding memory, and overcoming latency limitations. Made with HARPA AI

@alexandresajus 4 ай бұрын

Now THAT is how you should advertise a product. Great summary!

@shawnmuok542 2 ай бұрын

hello i have a problem when i try to run main.py it shows me no moduel deepgram found

@xgodwhitex 7 ай бұрын

Amazing job!

@alexandresajus 7 ай бұрын

Thanks!!!

@JanikJanesch 22 күн бұрын

Do you know why thee is an error that it says i inly have 12 xaracters left but my request needs 42 caracters? even tho i have 20$ account balance on chatgpt.

@mikew2883 7 ай бұрын

Good stuff! 👍

@alexandresajus 7 ай бұрын

Thanks!

@muhammadilyasrasyid5817 7 ай бұрын

thank you very much sir

@PandaLorian14 Ай бұрын

dose noone get same code on deepgram me and zou dont got same code

@sebaperalta2001 7 ай бұрын

Nice work! Is it possible to have it answering only on activation word? Like if you don't say Jarvis, then it would not answer. So the program is always listening, but activates on context.

@alexandresajus 7 ай бұрын

Thanks! Yes this should be easy to do, just add a condition: if the activation word is not in the transcript, continue (restart the loop without answering)

@adben001 Ай бұрын

Will That generate Costs throught the API or is that for free?

@painperdu6740 7 ай бұрын

LETS GOOO NEW ALEXANDRE SAJUS VIDEO I CLICK LIKE I SUBSCRIBEEE

@alexandresajus 7 ай бұрын

XD J’en peux plus de toi

@crprp4769 5 ай бұрын

Awesome video! Thanks for sharing, but I've got a question. How can I implement a pre-trained OpenAI assistant into Taipy?

@alexandresajus 5 ай бұрын

Thanks! It should be quite simple. Just replace the model variable line 53 at 12:52 with your own model ("ft:gpt-3.5-turbo:my-org:custom_suffix:id") and it should work. Let me know if you need more help.

@Threecommaaclub 5 ай бұрын

Hey Alex, I'm using a Linux Device running python 3.11 venv, when i try to run main.py i get the following error " no module name pyaudio. i go about using the simple command pip install pyaudio, however when running that command i get greeted with this error, "could not build wheels for py audio, which is required to install pyproject.toml-based projects, i was hoping you may be able to share some insight into why this may be happening. Great video btw, i await your speedy response :)

@alexandresajus 5 ай бұрын

Were you able to solve this by creating a new virtual environment. Otherwise, I have no idea how to fix this, let me know if you find a solution

@Threecommaaclub 5 ай бұрын

@@alexandresajusyeah man we were able to make it happen once we used the virtual env thanks again

@alexandresajus 5 ай бұрын

@@Threecommaaclub Perfect!

@Firebabys89 3 ай бұрын

u are amazing dude

@alexandresajus 3 ай бұрын

Thank you!

@tismine 2 ай бұрын

Hey Alex! Thanks a lot for the video, can you please explain a good way to create a neat requirements.txt file after I'm done with a project?

@alexandresajus 2 ай бұрын

Sure! Use « pip list » in terminal to check which package versions you are using. Then create a requirements.txt at the root of your project with on each line « package_name==version » for only the packages you import within the code (not their dependencies)

@handlepersonthing 7 ай бұрын

Awesome work! I wonder if using the GPT-4 model would speed things up a bit?

@alexandresajus 7 ай бұрын

Thank you very much! Unfortunately, I don’t think switching the model would do a lot. Profiling here is 1s for transcribing, 1s for gpt and 2s for generating audio. The best way to reduce latency would be using smaller/quantized models or streaming data instead of doing each task sequentially

@serenditymuse 7 ай бұрын

@@alexandresajus larger models often take longer thinking.

@marouane9682 7 ай бұрын

i love it maaaaaaaan thank u for sharing .. pls keep sharing wiith us ur magic

@alexandresajus 7 ай бұрын

Thank you!

@marouane9682 6 ай бұрын

@@alexandresajus brother help me pls on my questions, .. how can i make jarvis able to transcribe and talk in french instead of english ?

@alexandresajus 6 ай бұрын

@@marouane9682 This shoud not be too hard, you just need to add a few parameters for Deepgram and Elevenlabs. For Elevenlabs, just change the voice parameter to "Pierre" or another french voice at line 116 of main.py. For Deepgram it is a bit more complicated, you will have to add a PrerecordedOptions parameter at line 72 of main.py which contains a language="fr" parameter. It's a bit too much to write in a comment so I invite you to take a look at the Deepgram doc (github.com/deepgram/deepgram-python-sdk/blob/main/README.md) Let me know if you need more help

@marouane9682 6 ай бұрын

@@alexandresajus thank you so much cheef

@pntra1220 7 ай бұрын

Nice project bro! Do you know how can I use deepgram to transcribe spanish voice? I already figured it out for elevenlabs but not for deeprgram. Thank you for taking the time to read this and continue making this videos!

@alexandresajus 7 ай бұрын

Thanks! I have not tried but there does seem to be the option to transcribe Spanish voice by using their nova-2 model and adding the parameter "language=es" to the query developers.deepgram.com/docs/language developers.deepgram.com/docs/models-languages-overview

@omjondhalefyco-9953 4 ай бұрын

What alternative can be used for elvenlabs

@alexandresajus 3 ай бұрын

I have not tried anything apart from Elevenlabs and google_tts. I was not impressed with the quality of google_tts, but it was way faster. I'm sure you'll find better answers online

@nightmare6159 3 ай бұрын

I need help, When I do pip install -requirements.txt it says there is no such directory even tho I see the file

@alexandresajus 3 ай бұрын

Make sure that you are in the right directory in your terminal. You can use ls in the terminal to check the contents of the directory you are in. You can switch directory using cd in the terminal or using "Open Folder..." in VSCode. In general, the syntax should be "pip install -r [PATH-TO-TXT]"

@edosetiawan9589 3 ай бұрын

Awesome!! How to make this project to access custom data

@alexandresajus 3 ай бұрын

A quick way to do this would simply be adding the data as a string in the context. This has its limitations (the context has a max length). If you want a chatbot that knows information from documents. I suggest you look into RAG models

@aashishkumarlohra277 2 ай бұрын

when i run python main.py . i get this error Traceback (most recent call last): File "E:\JARVIS_TEST\JARVIS\main.py", line 15, in from record import speech_to_text File "E:\JARVIS_TEST\JARVIS ecord.py", line 8, in from rhasspysilence import WebRtcVadRecorder, VoiceCommand, VoiceCommandResult ModuleNotFoundError: No module named 'rhasspysilence'

@alexandresajus 2 ай бұрын

Check this issue: github.com/AlexandreSajus/JARVIS/issues/4 Also try creating a new clean virtual env before installing requirements. Check if there are no errors during installation. Check that you are running main.py from that env. Check that rhasspysilence is installed with pip list

@PenguinjitsuX 6 ай бұрын

This is awesome! I am wondering though how much this project is costing you from API calls (if you were to use this daily and pretty often)? I'm planning to build a home assistant that can control all of my home gadgets and perform actions on my computer, but I'm trying to decide whether I should use all local models (whisper, coqui, and mistral) instead of the paid online services. The quality and speed is a bit lower locally, but it's free so I'm thinking about the tradeoff. Please let me know what you think, thanks!

@alexandresajus 6 ай бұрын

Hey! Thanks, glad you liked it! I recommend going the paid online route. ElevenLabs is a paid subscription at 5$/month for 30,000 characters. OpenAI and Deepgram are pay-per-request but are dirt cheap: for this whole project, I probably talked for an entire hour with JARVIS, and it cost me 12 cents on OpenAI and 40 cents on Deepgram. If you want to lower cost, find an ElevenLabs equivalent that is pay-per-request, and you'll be good. Going local will drastically reduce performance and speed unless you have proper hardware, i.e., a dedicated GPU cluster at home. You'll have to use open-source, quantized to 8Gb models. If you have adequate hardware though, going local might be a good idea since you'll keep performance, and you can reduce latency by half by hosting locally, doing code shenanigans to parallelize each task instead of running them sequentially, and generally optimizing the pipeline. Latency is the biggest drawback; JARVIS is at 4 seconds of latency. Even if it was 2 seconds, it is still too awkward for a conversation.

@PenguinjitsuX 6 ай бұрын

@@alexandresajus Thanks for the in-depth reply! That's awesome to see that it's so cheap. I was actually really lucky and got got a 4090 last week. I've been running tests - On whisper and llm inference, I got performance at almost real-time,

@alexandresajus 6 ай бұрын

@@PenguinjitsuX Wow you already made a lot of progress! Yeah unfortunately I think we are just a few years away to solve that performance-latency tradeoff for TTS, then we'll be able to have a proper conversational Jarvis. Is your project open-source? I would love to take a look if you'd let me. I don't have a Discord server but I'd love to keep in touch on Discord. Here's my username: alex_1337

@oldspammer 4 ай бұрын

Some operating system API exists for text to speech are free and can act instantly without having to transact information flows through the internet to some central system that might get bogged down with excess usage. I have noticed that if one becomes dependent upon something or someone, a monopoly situation may well result and you end up potentially having to pay pay pay for things that your local PC could have done for free on its own without the need of network data interactions. Often the distant server has a better sounding voice and it does not mispronounce as many words, but soon you shall be out sourcing too many things to outside entities where you become too dependent on them. If a set of 10 words or so are known to be mispronounced by the local speech api in your PC is there a way to have your PC handle those exception words with specialized processing where a sylable at a time is custom handled per each of the 10 exception words to save you from having to use an api key that can be withdrawn from handy use by the flick of a switch by the third party provider?

@s.gveeronstart4794 5 ай бұрын

sir can u teach how to made it i mean to say that if u make a play list according to this topic

@alexandresajus 5 ай бұрын

Unfortunately, I won't be making an extended tutorial on this in the near future. But I'm sure there are many tutorials on the tools I used on KZbin. You can just look up "ElevenLabs tutorial" or "OpenAI API tutorial".

@EnnoAI431 5 ай бұрын

Great Project!! Would it also run on a RaspberryPi? Recently I ran a project also called Jarvis on a Pi . You don't need the API's from Deepgram & Elevenlabs and also latency is pretty good. Although the voice was horrible.... unless you like robots :-).

@alexandresajus 5 ай бұрын

Thanks! Sure this should be able to run on Raspberry since all of the heavy stuff is third party services that are hosted so barely anything runs in local. Cool! Where can I take a look at your project?

@FantasyDark-ub3xh 4 ай бұрын

Sir i want to do like this sir is there any Free API available if not in OpenAI means, pls tell some other AI APIs to do ai tasks sir!

@alexandresajus 4 ай бұрын

Sir! If you search for them online, there should be free alternatives for the models I used in the video! I recommend looking at HuggingFace for an OpenAI alternative, sir! For example, the Mistral model has a free inference API that is only rate-limited, sir!

@charliepersonalaccount5276 3 ай бұрын

Great stuff man! What's the best way to chat with you? I have an mvp i want to run by you and maybe have you help me build it out

@alexandresajus 3 ай бұрын

Thanks. Feel free to reach out on Linkedin: www.linkedin.com/in/alexandre-sajus/ I don't have much time because of work, but I can take a look.

@PilotsPitstop 2 ай бұрын

what exactly did u purchase on the open ai api thingy for it not to return "exceeded current quota"? i payed for chat gpt "hobbyist" plan and thought that would help but nah i wasted 20 $. and u should def start a discord good stuff

@alexandresajus 2 ай бұрын

Ah I see, you’re not supposed to pay a chatgpt subscription. OpenAI have a website for their API where you just have to enter billing details and maybe add a dollar of credit to use. They charge per request and not on a subscription basis. It should be on the same site where you got your API key

@PilotsPitstop 2 ай бұрын

@@alexandresajus AH MY HERO SO FAST, so i just add some money to my account and boom it works?

@undeadgaming2102 4 ай бұрын

i want to ask can you make a video on how we can make it do different tasks???

@alexandresajus 4 ай бұрын

What task are you thinking about? If it's just asking about the weather, you can add the current weather to the context so Jarvis knows about the current weather

@undeadgaming2102 4 ай бұрын

@@alexandresajus i was thinking like a google assistant

@user-qw6zz7pr2x 5 ай бұрын

When I run display.py to start the web interface, it shows "ModuleNotFoundError: No module named 'taipy'". But then after I install taipy (version 3.0.0), it still gives me the same error message. I have tried to uninstall and install taipy but same error message...

@alexandresajus 5 ай бұрын

Are you sure you are running display.py from the Python environment where taipy is installed? Use `pip list` to check that taipy is installed and then `python display.py` to run the file. If this does not work, I suggest creating a new virtual environment and re-installing the requirements. Bear in mind that taipy only works with Python 3.8 to 3.11

@user-qw6zz7pr2x 5 ай бұрын

Thanks! Instead of click to run display.py, I typed in "python display.py" and it open the website! @@alexandresajus One more question- when I ran "python main.py", I got the error message "TypeError: 'ABCMeta' object is not subscriptable". I am using Python3.8.10 in Visual Studio.

@GameXnationOfficial 2 ай бұрын

"You exceeded your current quota, please check your plan and billing details" its showing something like this and jarvis is not replying after an error

@alexandresajus 2 ай бұрын

You've exceeded your free quota on one of the APIs, check on which function call this error gets triggered to see which API needs billing

@AdeniranFrancis Ай бұрын

whenever i see videos like these, i clone the repos and i am never, ever able to successfully install all the dependencies or requirements.txt. makes me want to give up writing code altogether.

@anirvindhch1209 3 ай бұрын

What are you using to code this Alexandre??

@alexandresajus 3 ай бұрын

What do you mean? I'm coding in Python using VSCode, I used external APIs like ElevenLabs, OpenAI, Deepgram. Libraries like Taipy for the interface. I use GitHub Copilot to help me code faster as well.

@AndroidePulpico 4 ай бұрын

The latency is preaty Bad, have you tried Whisper Jax or Faster Whisper ??

@alexandresajus 4 ай бұрын

Yeah, the latency issue is currently the worst one. I have not tried these services. Let me know if it speeds up things. Currently, the consensus for reducing latency seems to be streaming data, running the tasks in parallel instead of sequentially, and hosting local and smaller models.

@niyatibalsara9409 5 ай бұрын

im encountering webrtcvad installation error..please let me know what to do..its urgent.. i need it for my project

@niyatibalsara9409 5 ай бұрын

@alexandresajus

@alexandresajus 5 ай бұрын

Please refer to this fix, let me know if it works: github.com/AlexandreSajus/JARVIS/issues/3

@niyatibalsara9409 5 ай бұрын

PS C:\Users\HP\Desktop\JARVIS2> & c:/Users/HP/Desktop/JARVIS2/myvenv/Scripts/python.exe c:/Users/HP/Desktop/JARVIS2/JARVIS/main.py Traceback (most recent call last): File "c:\Users\HP\Desktop\JARVIS2\JARVIS\main.py", line 8, in from dotenv import load_dotenv ModuleNotFoundError: No module named 'dotenv' PS C:\Users\HP\Desktop\JARVIS2> pip install python-dotenv Collecting python-dotenv Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB) Installing collected packages: python-dotenv Successfully installed python-dotenv-1.0.1 [notice] A new release of pip available: 22.3.1 -> 24.0 [notice] To update, run: python.exe -m pip install --upgrade pip PS C:\Users\HP\Desktop\JARVIS2> .\venv\Scripts\Activate (venv) PS C:\Users\HP\Desktop\JARVIS2> python JARVIS\main.py pygame 2.5.2 (SDL 2.28.3, Python 3.11.2) Hello from the pygame community. www.pygame.org/contribute.html Traceback (most recent call last): File "C:\Users\HP\Desktop\JARVIS2\JARVIS\main.py", line 13, in import elevenlabs File "C:\Users\HP\Desktop\JARVIS2\venv\Lib\site-packages\elevenlabs\__init__.py", line 2, in from .simple import * # noqa F403 ^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\HP\Desktop\JARVIS2\venv\Lib\site-packages\elevenlabs\simple.py", line 113, in elevenlabs.set_api_key(os.getenv("ELEVENLABS_API_KEY")) ^^^^^^^^^^^^^^^^^^^^^^ AttributeError: partially initialized module 'elevenlabs' has no attribute 'set_api_key' (most likely due to a circular import) Please solve this error.. its urgent not working.. please help

@rodrigodifederico 5 ай бұрын

I did the same a few months ago but i made it all through a real phone number so you can actually call a number and an assistant will pick the call and talk to you about the shop services or clinic procedures, etc. Pretty nice lab.

@alexandresajus 5 ай бұрын

That is a great use case. Were there any issues surrounding the latency? Were there any customer complaints from people who found the delay in answering too long or did not want to talk to an AI?

@rodrigodifederico 5 ай бұрын

@@alexandresajus I reduced the delay by 90% running all the systems locally. The speech to audio generator, audio transcription, the language model, etc. The only remote api that i used was for the phone number ( twillio ). If you run everything through remote apis, the delay will be a real problem, won't work as an assistant over the phone because it may take up to 10 seconds for an answer. But running everything locally it's almost instant. For the voice part, both to text and back, i don't generate an audio file, i stream it, so there is no delay. With a few tricks, you can make it almost real time 🙂

@alexandresajus 5 ай бұрын

@@rodrigodifederico Great! Is there anywhere where I could take a look at that project. Which text-to-speech model are you using?

@rodrigodifederico 5 ай бұрын

@@alexandresajus I am planning to transform it into a product so for now i won't share the code but i'll record a live interaction video and upload it to youtube soon, ill drop the link here if you are interested. About the text to speech, i created my own model.. pretty similar to elevenlabs. But i have to say that if you use elevenlabs streaming, this part of the process will have a similar delay, so i might switch to elevenlabs stream in the future, unless i want to keep it 100% free of costs, then i would keep my model.

@alexandresajus 5 ай бұрын

@@rodrigodifederico Sure I'd love to see a demo

@edbayliss1862 6 ай бұрын

This really interested me. I modified it a bit to add a listen button to the UI so it only listen when you select listen, this is easier than a “wake word” Then I thought, integration. I use MacOS. I build a folder called modules, added a second step that parse the text through GPT again to match a dictionary, and then GPT decide which function in the dictionary matched and ran it. It worked great for checking calendar events etc, and if no matches were found it defaulted to gpt chat reponse but the extra layer added more latency and just isn’t scalable

@alexandresajus 6 ай бұрын

Incredible! Good work! Is there anywhere where we could check out your project?

@edbayliss1862 6 ай бұрын

@@alexandresajus sure, is your GitHub open to branches? I can just push it as a branch for you check out on Monday

@alexandresajus 6 ай бұрын

@@edbayliss1862 I'm not sure, I think it is open to fork then pull request. I think I need to manually add you as a collaborator if you want to directly push to a branch. Your call. Or you could just share the link of your repo if it is public.

@Jordan-tr3fn 6 ай бұрын

hey cool vids, why not using OpenAI for translation instead on Deepgram ? you could stream the audio and not have audio files

@alexandresajus 6 ай бұрын

This is indeed probably a better approach. I was not aware of it at the time

@tismine 2 ай бұрын

Are you sure OpenAI supports streamed audio input? I looked around all the places no one was able to do that...

@Jordan-tr3fn 2 ай бұрын

@@tismine « openai stream audio » on Google …

@DalazG 3 ай бұрын

Incredible material! Thanks bro, you're tutorials are super helpful for those learning to code. I'm trying to follow along Not sure if you've taken any subscriber requests. I've really wanted to find a tutorial on creating a machine learning model on python that can figure out its own strategy for successfully trading forex and integrating it with mql4 or 5. Definitely possible but there's next to no tutorials on this anywhere i noticed

@alexandresajus 3 ай бұрын

Thanks! Glad to know the video is helpful. This indeed seems to be a niche topic. I don’t think I could help you with this unfortunately since I don’t know anything about forex or mql.

@DalazG 3 ай бұрын

@alexandresajus no worries, this tutorial was super useful anyway! Subscribed. Curious, would ask these apis you used for this jarvis application cost a lot of money though? I know chatgpt api isn't free (just the free credits)

@alexandresajus 3 ай бұрын

@@DalazG The APIs did not cost that much: for the whole project I talked for about 2 hours to JARVIS. It cost less than a dollar for both Deepgram and OpenAI. ElevenLabs cost me 5$ only because they have a subscription based fee.

@DalazG 3 ай бұрын

@@alexandresajus gotcha, elevenlabs has a brilliant voice api. But just because it adds up, i would probably prefer to use a cheaper worse one 😅 .

@felipemartinez1924 5 ай бұрын

How do I change the speech recognition to spanish? Btw amazing work!

@alexandresajus 5 ай бұрын

Thanks! I have not tried another language but there does seem to be the option in Deepgram's API to transcribe Spanish voice by using their nova-2 model and adding the parameter "language=es" to the query developers.deepgram.com/docs/language developers.deepgram.com/docs/models-languages-overview

@felipemartinez1924 5 ай бұрын

@@alexandresajus Thanks, you're amazing! You should do a series of this kind of videos, maybe a Jarvis like this one but that is able to take action like opening a program, or saving reminders, stuff like that. Thank you very much and looking forward to more videos. :)

@jan-peterbornsen8506 5 ай бұрын

@@felipemartinez1924 Hey were you able to change the language of Deepgram's API? I want to change it to german but all my attempts failed so far... i tried just adding a language=de but its not helping in anyway...

@tomasrochaakemi 7 ай бұрын

hey alex! can you help me with this error? "ERROR: Failed building wheel for webrtcvad Failed to build webrtcvad ERROR: Could not build wheels for webrtcvad, which is required to install pyproject.toml-based projects"

@alexandresajus 7 ай бұрын

Sure! This is because you don't have Microsoft Visual C++ installed properly. I have written a guide on how to fix this here: github.com/AlexandreSajus/JARVIS/issues/3

@tomasrochaakemi 7 ай бұрын

@@alexandresajus hey man. it worked but now i got another error. while running python main.py this error apears: line 17, in set_api_key os.environ["ELEVEN_API_KEY"] = api_key ~~~~~~~~~~^^^^^^^^^^^^^^^^^^ File "", line 684, in __setitem__ File "", line 744, in check_str TypeError: str expected, not NoneType

@alexandresajus 7 ай бұрын

@@tomasrochaakemi This means that Python has tried to find a .env file with ELEVEN_API_KEY but has not found either the file or the key in the file. You'll need to create a .env file a the same level of main.py containing ELEVENLABS_API_KEY=[your-API-key] Please follow the Requirements and the How to Install Step 3 of my repository ( github.com/AlexandreSajus/JARVIS ). I mention these steps at 4:06 and 6:06 of the video.

@tomasrochaakemi 7 ай бұрын

@@alexandresajus I did it still shows this

@alexandresajus 7 ай бұрын

@@tomasrochaakemi Hmmm weird issue. As a workaround, just replace the 3 lines of os.getenv("...") by simply the API key as a string. For example: OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") -> OPENAI_API_KEY = "YOUR-API-KEY"

@ezzeldinhany7301 4 ай бұрын

hi alex, it says no module named 'deepgram' after running python main.py in terminal what should i do?

@ezzeldinhany7301 4 ай бұрын

i also tried pip install deepgram and it did not work

@alexandresajus 4 ай бұрын

@@ezzeldinhany7301 Using the same terminal where you ran "python main.py", run "pip list" and check if deepgram if properly installed. I suggest you reinstall requirements into a clean environment for this. Let me know if this works.

@ezzeldinhany7301 4 ай бұрын

@@alexandresajus i did reinstall requirements during the process of trying to solve this problem

@alexandresajus 4 ай бұрын

@@ezzeldinhany7301 Did the terminal say that deepgram was successfully installed? Can you check with "pip list" if deepgram is installed? Can you check if you are running main.py from the environment where you installed deepgram? Once again, I strongly recommend creating a fresh Python environment using venv and installing the requirements there and checking everything above

@ezzeldinhany7301 4 ай бұрын

i now have fixed the deepgram issue but it says it cannot download rhasspysilence i tried with pip also @@alexandresajus

@NotZymsYT 3 ай бұрын

can anyone help be i keep getting "ERROR: Failed building wheel for pyarrow" ?

@alexandresajus 3 ай бұрын

Switch to Python 3.8 to 3.11. The Taipy version I am using is old and does not support Python 3.12. You can also try changing to taipy==3.1.0 in requirements.txt github.com/AlexandreSajus/JARVIS/issues/7

@NotZymsYT 3 ай бұрын

@alexandresajus you are awesome thank you so much !!!!

@NotZymsYT 3 ай бұрын

@@alexandresajus hey sorry to be a pest the original issue is fixed but now It seems like the api_key variable obtained from os.getenv("ELEVENLABS_API_KEY") is None, and the set_api_key function from the elevenlabs module is trying to set this None value as the value of the ELEVEN_API_KEY environment variable. However, environment variables must be strings, so attempting to assign None as the value raises a TypeError. im really new to all this and any help is super appreciated

@alexandresajus 3 ай бұрын

@@NotZymsYT os.getenv("ELEVENLABS_API_KEY") should not get None. Please make sure you properly do step 3 of the installation as described at 6:04: make sure you have a .env file at the same level as main.py and make sure it is filled with the API keys using the syntax described in the README

@NotZymsYT 3 ай бұрын

@@alexandresajus i ran through the whole video on extra slow and now its giving me Traceback (most recent call last): File "main.py", line 59, in file_name: Union[Union[str, bytes, PathLike[str], PathLike[bytes]], int] TypeError: 'ABCMeta' object is not subscriptable

@ibrahimqadirmustafa 6 ай бұрын

Amazing bro , I want create like this but in Kurdish language do you know how can i use it and speaking in Kurdish language?

@alexandresajus 6 ай бұрын

Thanks! Unfortunately this might be harder to do in Kurdish. You need to find services that support the Kurdish language which are quite rare: both Deepgram and Elevenlabs do not support Kurdish currently. I'd guess that OpenAI does support Kurdish but I am not sure, even if it does not you can use a service to do the English-Kurdish translation in the middle of the pipeline.

@ibrahimqadirmustafa 6 ай бұрын

@@alexandresajus Can I use Google translate package in python for translate the content response from AI

@alexandresajus 6 ай бұрын

@@ibrahimqadirmustafa Yes this would solve part of the problem

@ibrahimqadirmustafa 6 ай бұрын

@@alexandresajus ok thanks for you if i need help i can contact u 😁

@ashrafulislamemon8782 Ай бұрын

I am stuck at git clone

@_GIGABYTES 6 ай бұрын

Traceback (most recent call last): File "F:\va\New folder (3)\JARVIS\display.py", line 5, in from taipy.gui import Gui, State, invoke_callback, get_state_id ModuleNotFoundError: No module named 'taipy'

@alexandresajus 6 ай бұрын

Are you sure you installed the requirements of the project (5:33)?

@Threecommaaclub 5 ай бұрын

hey, im I'm not sure if you're still running into this issue however I was able to solve this dilemma by creating a virtual environment as stated in the video try creating a virtual environment and if you need help there is a another video on KZbin that should solve that issue.

@olakunleogunseye9657 7 ай бұрын

aye this is so cool but there is no wake up keep and end key buh this the greatest and I know you know

@blazzycrafter 6 ай бұрын

YOU STOLE MY WORK?........ ...... ...... ..... ..... ...... HOW THE HEK DID IT WORK? XD

@alexandresajus 6 ай бұрын

skill issue

@tchen8124 7 ай бұрын

What’s the point of using elevenlabs? Without carefully finetuning, the voice sounds robotic anyway. Kinda a waste of money

@alexandresajus 7 ай бұрын

What do you suggest I use? I looked for fast TTS AI services and stumbled upon Elevenlabs and did not ask too many questions. The whole point was trying to recreate Jarvis from Iron Man which has a robotic voice. It cost me a dollar for 30,000 characters

@kyouko5363 7 ай бұрын

@@alexandresajus I'm tempted to make a suggestion here but.. if it gets too popular I might not be able to use it anymore. I can't afford API keys, and rely on it every day to ingest documentation and large pieces of text without interrupting my programming. Even made a private Neovim plugin for it.. as for LLMs.. I am *this* close to saying to hell with it and writing a daemon or local webserver or something that'll instruct Selenium to forward queries and responses on a headless Chromium instance. I'm tired of there being no free API keys for LLMs, not even rate limited ones, when the browser experience is free to begin with, but the moment I want to see the text in my terminal and respond in my terminal, it suddenly costs money, despite me technically having reduced their server load by skipping all the unnecessary CSS, HTML and JS every time I want to just send and receive a goddamned string? I *thought* ChatGPT had a free rate limited API key, and conveniently around the time it became part of my workflow, the API credits equivalent of a free trial runs out, almost as if to give you a cake and then take it right back after the first bite. I'm rambling. But hey, at least I've got good TTS for free.

@GreggHoush 7 ай бұрын

You should disable those API keys and blur API keys in videos like these. Everybody wants free API keys.

@alexandresajus 7 ай бұрын

Good advice. I disabled these keys right after recording and they all have a hard rate limit

@PHG_Team 7 ай бұрын

bruh note: This error originates from a subprocess, and is likely not a problem with pip. ERROR: Failed building wheel for pyarrow Failed to build pyarrow ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects

@alexandresajus 7 ай бұрын

This is probably due to a Python version issue: you are probably using Python 3.12 and this project uses Taipy which only supports Python 3.8 to 3.11. Please try using another Python version. If this does not help, do not hesitate to give more details on the issue here: github.com/AlexandreSajus/JARVIS/issues

@PHG_Team 7 ай бұрын

@@alexandresajusthx bro. If i delete display.py the assistant works? I want to create mine gui

@alexandresajus 7 ай бұрын

@@PHG_Team Yes you can delete display.py, both programs are independant.

@PHG_Team 7 ай бұрын

i'm italian adn i want to change speaking lenguage how can i do?@@alexandresajus

@Mirkolinori Ай бұрын

Good Idea but Eleven Labs is to expensive, the price is more then horrible for live tts… better you use the build in OpenAi tts. Also you can use the openai api whisper, assistant gpt and tts… all with easy tts. Quick cheap and easy

@n00ter99 7 ай бұрын

That latency is painful

@alexandresajus 7 ай бұрын

Agreed, unfortunately that latency is very hard to shave off. We could probably reduce it a bit by hosting locally, using quantized/smaller models and streaming the data instead of doing each task sequentially

@chrsl3 7 ай бұрын

it works so wonderfully, i wouldn't be bothered at all by the small latency.

@n00ter99 7 ай бұрын

@@alexandresajus Measure the latencies of the things you mentioned - you'll find that implementing streaming all the way across the stack will solve most of it. I have spent the last year building low latency streaming models in order to get sub 100-millisecond latencies for various audio/speech startups, it's the only way to get speeds and responsiveness that feels natural

@alexandresajus 7 ай бұрын

@@n00ter99 I did profiling on each task and we are at about 1s for transcribing, 1s for gpt and 2s for generating audio. Really? Where can I find how to do this? What models/services were you using?