If you had a problem with the vocab file download: it's vocab.bpe, not vocab.pbe. You just need to change this in the curl command and it should work just fine.
@jakeeh · 3 months ago
Thanks for the comment!
@joshuashepherd7189 · 7 months ago
4:36 I think it's Video RAM - basically the RAM available on whichever GPU you're using for inference.
@jakeeh · 7 months ago
Yeah I think you're right. Thanks! :)
@MyStuffWH · 7 months ago
Just out of interest. Do you have a GPU in your machine (laptop/desktop)? That would give some context to the performance you are getting.
@jakeeh · 7 months ago
Great question! I have an AMD Radeon RX 6800, so certainly not top of the line. Also, in my experience a lot of GPU-accelerated things have only worked with NVIDIA, with AMD being a 'TODO' on the developers' side :)
@Truther_gold · 11 days ago
The vocab.bpe file went down again, and I also don't have a .cache folder after installing Whisper.
@EduGuti9000 · 7 months ago
Awesome video! I'm mainly a GNU/Linux user and recently I've also been using MS Windows, so maybe this is a silly question: are you running this in WSL2? If so, is it easy to use the microphone and speakers with Python in WSL2?
@jakeeh · 7 months ago
Thank you! I'm running this on Windows. You might need to tinker around on GNU/Linux a bit more to get the microphone input working, but it shouldn't be too bad. I've seen a number of cases where Linux users were using the microphone input. Happy coding :)
@mohanpremathilake915 · 4 months ago
Thank you for the great content!
@jakeeh · 4 months ago
Thank you! ❤️/ Jake
@adish233 · 7 months ago
As part of my engineering project, I want to make a similar voice assistant specifically for agriculture, which answers farmers' queries and also gives crop suggestions based on local conditions. Can you please guide me through the project?
@jakeeh · 7 months ago
Wow! That sounds like a great project! I'm not sure I could guide you through the project, but you may want to try to find a machine learning model that is more specialized on plants and agriculture. You could even look into making one yourself if you have enough training data! :)
@joannezhu101 · 4 months ago
@@jakeeh I'm so curious to know how to train a domain-knowledge-only model; that would be brilliant. There must be a way of doing it. I'm also learning AI for fun outside of my day job.
@HassanAllaham · 2 months ago
There is no model specialized in plants and agriculture, BUT you can use RAG (Retrieval-Augmented Generation). This natural language processing (NLP) technique combines the strengths of both retrieval-based and generative AI models.

In this method, you need to gather all the data and info you want the model to be specialized in; it can be any file or files. The data is extracted from the files and split into chunks of some size. Each chunk is then converted into numbers that capture its semantic meaning, which the LLM can work with. This step uses what is called an embedding model, which is specialized in turning text chunks into these semantic vectors (embeddings), and the embeddings are stored in a vector database. This indexing is done only once.

After that, when the user asks a question, the question is also embedded and used to search the vector database for any related info (the search here is done not by words but by meaning and context). By the way, this method can be made much more powerful with a stage called re-ranking. The results of that query are given to the model along with the question, so the model has the info it needs to give a good answer. This is how you make any "small" LLM able to do the job you want it to do.

The other method is to fine-tune the model, but that needs a very strong GPU, takes a lot of time, and requires a special dataset (the problem with this method is that you will not find a ready dataset, and preparing a good one takes a very long time).

Short answer: read about RAG (then re-ranking); you can find many YT videos that explain it. RAG is the way to make any LLM specialized in any domain for which you have the data files needed to teach/help the model. 💥
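A tiny, self-contained sketch of the retrieval step described above. A real setup would use an actual embedding model and a vector database; the bag-of-words embed() below is just a hypothetical stand-in so the example runs anywhere, and the farming chunks are made up for illustration.

```python
# Toy sketch of RAG retrieval: index chunks once, then fetch the most
# semantically similar chunk(s) for a question at query time.
from collections import Counter
import math

def embed(text):
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One-time indexing: chunk the domain documents and store their vectors.
chunks = [
    "Tomatoes grow best in well-drained soil with full sun.",
    "Wheat is usually sown in autumn and harvested in summer.",
    "Drip irrigation reduces water use for row crops.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question, k=1):
    """Embed the question and return the k most similar chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

# The retrieved chunk(s) would be prepended to the LLM prompt as context.
print(retrieve("What soil do tomatoes need?"))
```

At query time you would paste the retrieved chunks plus the question into the prompt of whichever local model you run; the re-ranking stage mentioned above would simply re-score `ranked` with a stronger model before taking the top k.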
@ishmeetsingh5553 · 2 months ago
Awesome video! Quick question: since you already used the speech_recognition library, why didn't you use its recognize_whisper method instead of using the whisper library directly?
@jakeeh2 ай бұрын
Thanks! To be honest, I had no idea speech_recognition had that. How does it compare?
@vidadeperros9763 · 5 months ago
Hi Jake. Where do you import pyautogui from?
@jakeeh · 5 months ago
Hey, you need to install it using pip. Run "python -m pip install pyautogui", then you can just import pyautogui in your file. Make sure you use the same Python when running your file as you do when installing with pip.
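The "same Python" point above can be checked from inside a script: sys.executable is the interpreter actually running the file, so installing with that exact path guarantees pip targets the right environment.

```python
# Print which interpreter is running this script; use that path with
# "-m pip install pyautogui" to install into the same environment.
import sys

print(sys.executable)
```

If the printed path differs from the python you ran pip with, that mismatch is the usual cause of "module not found" right after a seemingly successful install.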
@ShaanKevin · 1 month ago
I encountered an error when I try to run the program:

Traceback (most recent call last):
  File "C:\Users\shaan\Desktop\AI\assissant1\assissant.py", line 4, in <module>
    import whisper
  File "C:\Users\shaan\Desktop\AI\Venv\Lib\site-packages\whisper\__init__.py", line 8, in <module>
    import torch
  File "C:\Users\shaan\Desktop\AI\Venv\Lib\site-packages\torch\__init__.py", line 148, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\shaan\Desktop\AI\Venv\Lib\site-packages\torch\lib\fbgemm.dll" or one of its dependencies.

Could you help me to resolve this issue?
@jakeeh · 1 month ago
Hmm, that error usually means Windows can't load one of torch's DLLs rather than torch being missing. Try reinstalling torch with pip install --force-reinstall torch, then close your terminal and try again.
@joshuashepherd7189 · 7 months ago
Heyo! Awesome Video! Thanks so much for doing this man. So insightful
@jakeeh · 7 months ago
Appreciate it! Really happy you enjoyed it :)
@jacklee4691 · 4 months ago
Thanks for the awesome video! Just curious: if I want to make the offline Python text-to-speech more realistic with a model (like one from Hugging Face), is that possible?
@jakeeh · 4 months ago
Yeah, it should be possible! There are some great Ollama models available now too :)
@atharvchaudhary3109 · 2 months ago
Hi, I was trying to follow along and understand this project, but I ran into an error where it can't find my command.wav file. I've exhausted my options for solving this, so if you could help, that would be great.
@jakeeh · 2 months ago
Hmm, interesting. Are you sure that it’s saving the .wav file to the same directory as your py file?
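One robust way to rule out the directory mismatch suggested above is to build the .wav path from the script's own location instead of relying on the current working directory. This is a sketch; command_wav_path is a hypothetical helper name, and base_model.transcribe is the Whisper call from the video.

```python
# Build an absolute path to command.wav next to the running script, so
# it doesn't matter which directory the script was launched from.
import os

def command_wav_path(script_file):
    """Absolute path to command.wav in the same folder as script_file."""
    return os.path.join(os.path.dirname(os.path.abspath(script_file)), "command.wav")

# In the assistant you would save the recording to this path and then
# transcribe the very same path, e.g.:
#   wav = command_wav_path(__file__)
#   ...write recorded audio to wav...
#   result = base_model.transcribe(wav)
```

Using one constant for both the save and the transcribe call removes the whole class of "works from my IDE but not from the terminal" path bugs.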
@wethraccoon9480 · 4 months ago
Please do more advanced versions of this. I'm a web dev and would love to start integrating my own voice assistant; I'm just a bit of a newbie to AI.
@jakeeh · 4 months ago
Thanks for the comment! Yeah, I'd be happy to do some more stuff on this. I think the new version would use Ollama, although I'd like to go over how to train your own model too.
@snapfacts41 · 3 months ago
I think Gemma could be a better option than this because I don't think it would have the token restrictions that GPT4All had, and it's pretty easy to install using Ollama. Even with an integrated GPU from 5 years ago, I was able to get a comfortable experience with the LLM.
@jakeeh · 3 months ago
Yeah, at the time Llama wasn't easily available for Windows. There are definitely some better options available now.
@Kevin-th5rw · 2 months ago
Thx for the help
@jakeeh · 2 months ago
Happy you found it useful! :)
@ianhampton9210 · 22 days ago
Hey, at 11:25 I keep getting this error when it tries to read the data from the command.wav file:

Traceback (most recent call last):
  File "c:\Users\username\Desktop\GPT\Python\LocalGPT\assistant.py", line 106, in <module>
    main()
  File "c:\Users\username\Desktop\GPT\Python\LocalGPT\assistant.py", line 97, in main
    command = listen_for_command()
  File "c:\Users\username\Desktop\GPT\Python\LocalGPT\assistant.py", line 46, in listen_for_command
    command = base_model.transcribe("C:/Users/recent /Desktop/GPT/Python/LocalGPT/command.wav")
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper\transcribe.py", line 122, in transcribe
    mel = log_mel_spectrogram(audio, model.dims.n_mels, padding=N_SAMPLES)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper\audio.py", line 140, in log_mel_spectrogram
    audio = load_audio(audio)
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\site-packages\whisper\audio.py", line 58, in load_audio
    out = run(cmd, capture_output=True, check=True).stdout
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 501, in run
    with Popen(*popenargs, **kwargs) as process:
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 966, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1435, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

(Sorry for the long text.) I'm assuming this means it can't find the file, but I don't know for sure. I'm not super well versed in Python. Please help!
@jakeeh · 20 days ago
Hey Ian, Yeah, my initial guess would be it can't find the audio file? Can you confirm that it's in the same directory as the script that's running?
@ianhampton9210 · 20 days ago
@@jakeeh Thanks for the reply. It is indeed. I did a bunch of research, and it seems that Whisper needs a tool called ffmpeg, but the problem is that every way people online have suggested installing it hasn't worked.
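That diagnosis fits the traceback above: Whisper shells out to the ffmpeg executable to decode audio, so ffmpeg has to be on the PATH of the process running the script, not just installed somewhere on disk. A quick way to check, from the same terminal that runs the assistant:

```python
# Check whether the ffmpeg executable is visible to this Python process.
import shutil

def ffmpeg_on_path():
    """Return the full path to ffmpeg if Python can find it, else None."""
    return shutil.which("ffmpeg")

if ffmpeg_on_path() is None:
    print("ffmpeg not found - add its bin folder to PATH and reopen the terminal")
else:
    print("ffmpeg found at:", ffmpeg_on_path())
```

If this prints "not found" even after installing, the PATH change hasn't reached the terminal (or IDE) you launch the script from; closing and reopening it usually picks it up.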
@jamesnorrington8231 · 2 months ago
Hey man, loved the video. I just have one question: someone built a replica of the TARS robot from Interstellar that can communicate in TARS's voice and answer any question. Is there any way to give this assistant a custom voice?
@jakeeh · 2 months ago
Hey, yeah, that should definitely be possible! You would need a model that's capable of doing that, but there are already some free libraries that can help with it. One example is Tortoise. Maybe I'll make a video about it in the future :)
@lucygelz · 4 months ago
Is this possible on Linux, and if so, can you make a tutorial or link a text guide to something similar?
@jakeeh · 4 months ago
Yes, absolutely, you should be able to do this on Linux as well. You could take a look at medium.com/@vndee.huynh/build-your-own-voice-assistant-and-run-it-locally-whisper-ollama-bark-c80e6f815cba, which also uses Ollama; that's probably a great option nowadays too :)
@lucygelz · 4 months ago
@@jakeeh thank you
@prabhatadvait6171 · 4 months ago
I'm using Linux. Can you tell me how to do it on Linux?
@jakeeh · 4 months ago
Which part are you having trouble with in Linux? :)
@yashikant5819 · 6 months ago
Can you combine it with a frontend?
@jakeeh · 6 months ago
You could absolutely make a front end for this so you could interact with it through a GUI and/or voice.
@inout3394 · 5 months ago
Thx
@jakeeh · 5 months ago
Thanks for your comment!
@MyStuffWH · 7 months ago
It is (very) clear you do not have a technical AI background, but you inspired me to try and make my own local assistant. Thanks!
@jakeeh · 7 months ago
Oh absolutely! I certainly have a technical background, but I'm far from having much experience beyond scratching the surface of AI. Happy you felt inspired to give things a shot! You've got to start somewhere! :)
@leodark_animations2084 · 1 month ago
@@jakeeh You motivated me to learn Python and attempt an assistant program. I'm using keyboard input instead of a vocal one, as my laptop probably can't handle the stress of AI. So for now it's more of an auto-task system driven by keywords, with a voice line linked to each action it does to sell the AI part lol
@fnice1971 · 5 months ago
I mostly used LM Studio, as it loads multiple models; with 3x 24 GB GPUs (~70 GB of usable VRAM) you can run something like 10 models at the same time. It's more polished than GPT4All, but both work and are free.
@jakeeh · 5 months ago
Oh that sounds great! I certainly don't have those specs, but that does sound great nonetheless. Thanks for your comment! :)
@joannezhu101 · 4 months ago
@@jakeeh I wonder if it's worth comparing those, like Ollama, LM Studio, or just the way you've shown in the video, though I don't quite get how Ollama or LM Studio works (I thought GGUF was the only way to work with a local offline model; I didn't know what's inside Ollama). Do they really help speed things up?
@jakeeh · 4 months ago
I think it really depends on the hardware of your machine. If they can utilize your GPU then they can likely greatly improve the performance. Although I'm not an expert on them :)