How to use

Views: 57,600

Brandon Jacobson (Jacobson Enterprises)

1 day ago

Comments: 106

@-RobGPT- 1 year ago

focken hell, this video was buffering in the background and suddenly started playing while i was coding, thought my language model had come alive for a few seconds there lmao

@BrandonJacobson 1 year ago

LOL, I don't think my voice would be a great language output.

@georgechu8281 2 months ago

Thank you so much. This is the only source that works perfectly with no holding back or alteration of material. You are great. You saved my day.

@BrandonJacobson 2 months ago

Yay! I'm glad it helped!

@S-Technology 1 year ago

Great tutorial! I used it to add vosk to my own JARVIS system. I don't know if others have mentioned it, but to fix the problem with the mic not being shareable, you can change one line of code in your example. Change: data = stream.read(4096) to: data = stream.read(4096, exception_on_overflow=False). Now I can talk to my assistant and have OBS recording from the same microphone at the same time.

@BrandonJacobson 1 year ago

Awesome. Thanks for the contribution too. I will need that for sure.
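A minimal sketch of the one-line change suggested in this thread, wrapped in a helper so it can be tried without a microphone. `read_chunk` is a hypothetical name, and `stream` is assumed to be a PyAudio input stream opened as in the video. `exception_on_overflow=False` asks PyAudio not to raise when its input buffer overflows; whether that also fixes microphone sharing on a given machine is the commenter's report, not something this sketch proves.

```python
def read_chunk(stream, frames=4096):
    """Read one chunk of audio without raising on input overflow.

    `stream` is assumed to be a PyAudio input stream, or any object
    whose read() accepts the same arguments.
    """
    return stream.read(frames, exception_on_overflow=False)
```

In the listening loop this would replace the bare `stream.read(4096)` call.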

@six1free 1 year ago

I've wanted to do this project ever since the IBM activa commercial way back in like 1990 :D looks like the tech is finally getting close!

@BrandonJacobson 1 year ago

Me too. Well circa 1999/2000, but then I abandoned my programming dreams for nearly two decades before starting again.

@edsonservi 2 years ago

Very cool, brother!!! I know you speak English, but translating is easy. I loved the video. I'm just starting out in Python and was struggling to convert speech to text, mainly because it has to be "offline", which is what I need! Thank you very much!!!

@daramolaoluwafemimichael37 2 years ago

This is just what I've been looking for. Thanks brandon🖒

@BrandonJacobson 6 months ago

I didn't see this notification one year ago, but just wanted to say I'm glad it helped and I hope you've done some great projects since then.

@aaronstarkweather2427 6 months ago

Didn't realize that this would solve my biggest nemesis which has been installing PyAudio. Omg thank you!

@BrandonJacobson 6 months ago

Awesome! It opens the path to so many more projects.

@code-grammardude5974 2 years ago

Very glad vosk is getting some more attention. I think it's very underrated.

@BrandonJacobson 2 years ago

I'm using the speech_recognition library for my digital assistant, but I'm going to replace it with vosk going forward.

@amoawesomeart6074 2 years ago

Got vosk and pyaudio installed without any problems, but when I try to run the script as written in the video it can't find the model and throws this error at me:

File "C:\Python\Python37\lib\site-packages\vosk\__init__.py", line 138, in __init__
    self._handle = _c.vosk_recognizer_new(args[0]._handle, args[1])
AttributeError: 'str' object has no attribute '_handle'
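Reading the traceback, the recognizer's first argument (`args[0]`) was a `str`, which suggests the model folder path was handed straight to `KaldiRecognizer` instead of a loaded `Model`. A minimal sketch of the usual order, assuming the `vosk` package is installed; `make_recognizer` and `model_dir` are hypothetical names for illustration:

```python
def make_recognizer(model_dir, sample_rate=16000):
    """Build a recognizer from a model folder path (hypothetical helper)."""
    # Imported here so the sketch can be read without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)                    # load the model from its folder first
    return KaldiRecognizer(model, sample_rate)  # pass the Model object, not the path string
```

If the error persists, it is worth checking that the variable passed to the recognizer really is the `Model` and was not reassigned to the path.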

@ardavanorakzade 2 years ago

It was an awesome tutorial and exactly what I was looking for, thanks very much.

@BrandonJacobson 2 years ago

Thank you for the positive feedback!

@RobotCoder1951 1 year ago

Awesome tutorial! Exactly what I needed.

@BrandonJacobson 1 year ago

Awesome! I'm glad it helped.

@jumpinjohnnyruss 3 months ago

4:06 Can't i just copy and paste the file itself instead of screwing with command prompt and all the other?

@skepziev2565 25 days ago

With vosk, can you manually interrupt the recording process? Meaning, let's say I press a button to take input from vosk: can I interrupt that process and turn off recording whenever I want?

@crunckNATIon 9 months ago

Awesome. I was looking for documentation for vosk online and there wasn't much, and the few other sources were either Medium articles or Stack Overflow, but the most solid source I've seen is this video. I just wish it showed how to write the text to a file, and the key interrupt that you were talking about. Sorry for the rant, but awesome video.

@BrandonJacobson 9 months ago

I'm self taught so the lack of information on Vosk is what inspired me to create the video. I'm glad it helped.

@crunckNATIon 9 months ago

@@BrandonJacobson thanks im self teaching as well
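For the key interrupt asked about in this thread, one common pattern is to catch `KeyboardInterrupt` (Ctrl+C) around the listening loop. The sketch below is an assumption about how that could look, not the code from the video: `listen`, `chunks` and `on_text` are hypothetical names, `chunks` stands for anything that yields raw audio bytes (for example repeated `stream.read(...)` calls), and `recognizer` is assumed to be a Vosk `KaldiRecognizer`.

```python
import json


def listen(chunks, recognizer, on_text):
    """Feed audio chunks to the recognizer until they run out or Ctrl+C is pressed."""
    try:
        for data in chunks:
            if recognizer.AcceptWaveform(data):
                text = json.loads(recognizer.Result()).get("text", "")
                if text:
                    on_text(text)  # e.g. print it or append it to a file
    except KeyboardInterrupt:
        pass  # Ctrl+C ends listening without a traceback
    # Whatever was still being decoded when the loop ended.
    return json.loads(recognizer.FinalResult()).get("text", "")
```

Passing `print` as `on_text` reproduces the behaviour of printing each phrase; a function that writes to a file can be passed instead.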

@CountryHouseIncubators 1 year ago

Is there a way to take the vosk model and add words that it can recognize?

@jloibman 2 years ago

I'm getting an error when trying to create a Model: "Exception: Failed to create a model"... Do you know how to fix? Thanks.

@profredstone 8 months ago

It works very well for French model thank you !

@xxlarrytfvwxx9531 11 months ago

7:28 This isn't quite correct, yes it does store that string in that variable, which is the absolute path. The error mentioned at 7:34 isn't because you're not giving the absolute path, the reason why it may give an error, is because there may be an invalid character after the \ character
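The backslash point above can be seen in a few lines. This is a small illustration with a made-up path (`C:\temp\new-model` is hypothetical): in an ordinary string literal, `\t` and `\n` are escape sequences, while a raw string or forward slashes keep the path intact.

```python
from pathlib import PureWindowsPath

plain = "C:\temp\new-model"    # "\t" becomes a tab and "\n" a newline
raw = r"C:\temp\new-model"     # raw string: backslashes are kept literally
forward = "C:/temp/new-model"  # forward slashes avoid the issue as well

print(len(plain), len(raw))    # the plain string is shorter: two escapes collapsed
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

Either the raw string or the forward-slash form can then be passed as the model path.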

@MicrobeHg 11 months ago

Thanks for the tutorial, it's really helpful XD

@BrandonJacobson 11 months ago

Awesome. I'm glad it helped.

@seraph8672 2 years ago

This tutorial is just GRAND. I have written a bot that goes and download various videos from reddit, and then goes and makes compilation videos and inserts my pre recorded intro outro and midrole clips. For the one channel about pets there is nothing more to do, but I am working on another one for dashcam clips which tend to have a lot of curses and wanted a way to programmatically find and bleep them out. There are pre made tools for this but they are overly complex and heavy. I think what I will do now that you have given me a basic understanding of how to instantiate vosk and fire an audio stream through it, is after my compilation is done do some "post processing" and just split the audio off with ffmpeg, fire it through vosk and when it finds a word I want to filter, notate the timestamp in a list, and then at each index in the list, throw my bleep audio clip in there, and then ffmpeg the new audio back onto the video and call it a day.
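The "note the timestamp of each word to filter" step could be sketched as follows. It assumes word-level output has been switched on with `recognizer.SetWords(True)`, in which case each Vosk result carries a `"result"` list of entries with `word`, `start` and `end` (in seconds). `find_word_spans` and the `banned` list are hypothetical names; the ffmpeg splitting and bleep mixing are left out.

```python
import json


def find_word_spans(result_json, banned):
    """Return (start, end) times in seconds for each banned word in one Vosk result."""
    result = json.loads(result_json)
    banned = {word.lower() for word in banned}
    return [
        (entry["start"], entry["end"])
        for entry in result.get("result", [])  # missing when SetWords(True) was not called
        if entry["word"].lower() in banned
    ]
```

Collecting the spans from every `Result()` call gives the list of positions where a bleep clip would be placed.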

@seraph8672 2 years ago

also pyaudio is now installable on python 3.10 on windows with pip I just tried it for giggles,, and it worked :D

@jc.daguila 1 year ago

Excellent tutorial, Thank You so much!

@BrandonJacobson 1 year ago

Awesome! I'm glad it helped!

@didnt_get_the_handle_i_wanted 2 years ago

This is very useful. Thank you :)

@BrandonJacobson 2 years ago

Awesome! I'm glad it helped. I released a video on how to do it on a Raspberry Pi today.

@ateafordsk8er 2 years ago

Thank you for this valuable information. It's helping me with a project for a homeless shelter. Unfortunately, I have run into a problem. I've made an "if" statement to perform an action if a hotword/phrase is heard by the mic, but I can't figure out how to interrupt the action if I say another hotword/phrase. I'd like to be able to say "stop" aloud and have it interrupt the action started by a previous phrase in the "if" statement. Do you have any advice for me on how I can figure this out? Thank you for any help and for the help already!

@BrandonJacobson 2 years ago

I haven't experimented with this yet, but it's on my list to figure out. You can try threading: create a function whose sole purpose is to listen for a STOP command and interrupt whatever is going on.

@ateafordsk8er 2 years ago

@@BrandonJacobson Thank you again! I will try studying your advice and continue to work on it.
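One way the threading idea in this thread could be sketched is with a shared `threading.Event`: the long-running action checks the flag between steps, and the listening loop sets it when it hears "stop". This is an untested-on-hardware sketch under stated assumptions; `long_action` and `handle_phrase` are hypothetical names, and the `time.sleep` stands in for real work.

```python
import threading
import time


def long_action(stop_event, steps=100, delay=0.05):
    """Stand-in for a slow task; checks the stop flag between steps."""
    completed = 0
    for _ in range(steps):
        if stop_event.is_set():
            break
        time.sleep(delay)  # placeholder for one unit of real work
        completed += 1
    return completed


def handle_phrase(text, stop_event):
    """Called with each recognized phrase; returns True if it was a stop command."""
    if text.strip().lower() == "stop":
        stop_event.set()
        return True
    return False
```

The action would be started with `threading.Thread(target=long_action, args=(stop_event,)).start()`, while the main loop keeps passing each recognized phrase to `handle_phrase`. The action can only stop at the points where it checks the flag, so it needs to be split into short steps.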

@saheraalreqeb 2 years ago

The output is a JSON string; that's why it's displayed like this. Use the json module to extract the text out of it.

@AhmadNaveedHamdard 2 years ago

@@Zaddish2 Can you please show me how to write the recognized text to a .txt file, with every spoken word on its own line?
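A minimal sketch of writing recognized text to a .txt file, assuming `result_json` is the string returned by the recognizer's `Result()` call. `append_transcript` and its `one_word_per_line` flag are hypothetical names added for illustration.

```python
import json


def append_transcript(result_json, path, one_word_per_line=False):
    """Append the recognized text from one Vosk result to a text file."""
    text = json.loads(result_json).get("text", "")
    if not text:
        return ""  # nothing recognized, leave the file untouched
    lines = text.split() if one_word_per_line else [text]
    with open(path, "a", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    return text
```

Calling it once per result inside the listening loop, for example `append_transcript(recognizer.Result(), "transcript.txt", one_word_per_line=True)`, would build up the file as you speak.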

@hartoflearning 2 years ago

Can I save the model on my D drive and run it from there with the absolute path?

@judelfenix1949 9 months ago

thank you bro, i can now start my project

@BrandonJacobson 9 months ago

Awesome. Good luck!

@mattdonnelly3743 8 months ago

11:38 Better way is to do Line 20: text = json.loads(text) Line 21: print(text['text'])
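The two lines suggested above can be folded into a small helper. This is a sketch rather than the video's code: `extract_text` is a hypothetical name, and the input is assumed to be the JSON string that the recognizer's `Result()` returns.

```python
import json


def extract_text(result_json):
    """Return just the recognized text from a Vosk result string."""
    return json.loads(result_json).get("text", "")


print(extract_text('{"text": "hello world"}'))
```

Using `.get("text", "")` rather than `["text"]` avoids a `KeyError` if a result ever lacks that key.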

@alx8439 1 year ago

You decided to go from scratch like the "Programming Hero" guy did? Well, it is fun up to a point. This is what you actually need:
1) Good modularity (for example, being able to replace the Vosk STT engine with Whisper / Whisper.cpp, DeepSpeech or AprilASR, or to replace the TTS)
2) Good architecture, ideally a multi-threaded one, so you can interrupt whatever else your assistant is doing
3) Wake word detection, ideally with a dedicated model that is trained to hear only that wake word and nothing else, but does that well and very efficiently
4) Dialog management and skills
You can code all that yourself, that's not a problem, but does it make sense to reinvent the wheel? I'd say it makes a lot more sense to take something like Rhasspy or Mycroft and start from that.

@paul_devos 2 years ago

This is great. super simple and effective. I am trying to do an NBA play-by-play (speech to text) app. It gets a lot of NBA players names and the "actions" (e.g. Rebound, Assist, Jump Shot, Dunk) correct. But that said, it doesn't get many names. So I was wondering if you knew if VOSK can train custom models? If not, what would give an OFFLINE inference for a custom model? Any recommendations? Thanks in advance!

@BrandonJacobson 2 years ago

I was just about to warn you about navigating through the diverse names in the NBA. According to the Github you can create your own custom model. Seems a bit beyond my knowledge level, but I hope it helps: github.com/matteo-39/vosk-build-model

@hamzaanwar5678 10 months ago

I get a "cannot import name model" error on the "from ... import model" line. Could you tell me how to fix this, please?

@sjjayaswal7621 2 years ago

Thank you for information sir 🙏

@BrandonJacobson 2 years ago

I'm glad it helped!

@snehaladbol0308 1 year ago

Great tutorial!! thank you

@Jorge_i_Norge 1 year ago

I use Vosk with Kdenlive, for Spanish and English. I have not found a Norwegian model. The first two languages work smoothly for me.

@BrandonJacobson 1 year ago

What kind of project are you using it for?

@Jorge_i_Norge 1 year ago

@@BrandonJacobson Just editing videos, but I have family in Argentina, the UK and I am living in Norway and learning the language.

@meetpatel4865 2 years ago

There is an issue with vosk website. The site is not working. Is there any other way I can download the model?

@NaderTheExpert 1 year ago

Hello Brandon, I'm very impressed with this tutorial, and it made me curious to dig further into your books. The books are even more impressive and I'm looking forward to getting my hands/eyes on some of them. I am a retired technology hobbyist and have some questions about your books. Q1. Have you provided the code from the YouTube videos and books on Github? Q2. I am not very familiar with how the Kindle version works. Can I copy and paste the text/code from a Kindle book? Thanks for your kind reply.

@fardinrahaman621 2 years ago

14:-3 was really helpful..... 🙌🙌

@BrandonJacobson 2 years ago

Awesome! I'm glad to hear that!

@samirsl7698 2 years ago

Hello. How do I make a language model for vosk?

@egrammar9750 2 years ago

Thanks for the details, Could you please help us understand how "elsa speaking app" pronunciation error identification works. What is the logic and code behind it.

@swapnilsharma1104 2 years ago

Thanks for this video, sir. I want to know: do we need to repeat the step of loading the model on each run? It takes a lot of time.

@BrandonJacobson 2 years ago

The model runs once when you start the program and then it shouldn't after that. It only takes 3 seconds on my computer.

@AyushKumar-qk2qg 2 years ago

Tell where we paste it in vs code

@micaelsantosdasilva773 2 years ago

You r so cute teaching.

@AmitChristian 2 years ago

Hi Brandon, this is an awesome tutorial. Can you pass an audio file (say wav or mp3) instead of Mic input? I have some speech audio that I would like to convert to Text. I can play that and have it captured by Mic, but if I can just pass an audio file, that would be an awesome feature. Please help.
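Feeding a file instead of the microphone could look roughly like this. It is a sketch under assumptions: the file is a mono, 16-bit PCM WAV, `recognizer` is a Vosk `KaldiRecognizer` created with the same sample rate as the file, and `transcribe_wav` is a hypothetical helper name. An MP3 would first need converting to WAV (for example with ffmpeg), since the standard `wave` module only reads WAV.

```python
import json
import wave


def transcribe_wav(path, recognizer, frames_per_chunk=4000):
    """Run a WAV file through the recognizer chunk by chunk and join the text."""
    pieces = []
    with wave.open(path, "rb") as wav_file:
        while True:
            data = wav_file.readframes(frames_per_chunk)
            if not data:
                break  # end of file
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
    # Pick up whatever was still being decoded at the end of the file.
    pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(piece for piece in pieces if piece)
```

The loop is the same as the microphone version; only the source of the audio bytes changes.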

@javadbagheri6576 2 years ago

@Alpha Group doesnt work Error : initializer for ctype 'char *' must be a bytes or list or tuple, not str

@younesyaaich3379 2 years ago

Thanks bro for this explication

@exploittutorial8689 2 years ago

thank you so much, I have been working on my voice assistant but i couldn't figure a way of working things around when am offline. I had issues using pocket sphinx.

@BrandonJacobson 2 years ago

I heard a lot of people were having trouble using pocket sphinx, that's why I tried Vosk. I'm going to try and put it on a Raspberry Pi soon too.

@Matt-iy2cf 2 years ago

It is a dictionary, so you can do print(text["text"]) and it should cleanly display voice commands.

@Dr.Wardskaiker 9 months ago

A great video, but I still don't know which model I need. I need Vosk so I can recognize speech in anime videos (I will convert the video to audio, of course) and then use an AI voice cloning technology to clone the voices in another language. I am not a programmer and I know nothing about programming. My device runs Win 7. What do you advise me to do so I can set up Vosk to be 100% compatible with my PC, given that my PC is not that high end? Thank you in advance.

@BrandonJacobson 9 months ago

Similar to English, there's a small and a large Japanese model you can find on this website: alphacephei.com/vosk/models. You should be able to use my code but replace the model with the Japanese model, and it should be able to understand most Japanese. I don't know the accuracy for the more "tonal" languages like Japanese or Chinese. I dabbled in Japanese, and I know that one wrong vowel pronunciation can have embarrassing results.

@Dr.Wardskaiker 9 months ago

@@BrandonJacobson Thanks

@tobiaskarl4939 2 years ago

How do I select the 4th sound input device, which is my mic? I found it: stream = mic.open( ..... input_device_index=4, ....... ) For me it was index 4. YEAH, it worked very nicely. Many thanks for this tut. ❤
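To find which index to pass as `input_device_index`, the available devices can be listed first. A minimal sketch, assuming `pa` is a `pyaudio.PyAudio()` instance; `list_input_devices` is a hypothetical helper name.

```python
def list_input_devices(pa):
    """Return (index, name) for every device that can record audio."""
    devices = []
    for index in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(index)
        if info.get("maxInputChannels", 0) > 0:  # skip output-only devices
            devices.append((index, info.get("name")))
    return devices


# Typical use on a machine with PyAudio installed:
#   import pyaudio
#   for index, name in list_input_devices(pyaudio.PyAudio()):
#       print(index, name)
```

The index printed next to the microphone's name is the value to use in `mic.open(..., input_device_index=...)`.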

@howardbaxter2514 2 years ago

For some odd reason I still cannot import PyAudio. Why is installing PyAudio such an overcomplicated bs issue? Edit: Finally figured it out. If you are still having issues installing PyAudio, follow these steps:
1. Open up PyCharm and create a new Python file.
2. Type in the following: import os print(os.system("cmd /k pip install pipwin"))
3. Run this script. This will install pipwin in the project folder.
4. Locate '(venv) [insert Project Folder Location]>' in the Run module.
5. Type in the following and press Enter: pipwin install pyaudio
Hopefully this helps anyone who is stuck with the import pyaudio error like I was.

@BrandonJacobson 2 years ago

Thank you for contributing like this! It was one of my hopes for my YouTube channel.

@howardbaxter2514 2 years ago

@@BrandonJacobson no problem Brandon. Your video is very good, but I just wanted to help others that were struggling over that PyAudio import since it isn’t exactly intuitive.

@MauricioRamcerva 10 months ago

Does this work for mac?

@tobiaskarl4939 2 years ago

How to create a model with only 20 words for example ? I pay for it.
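Short of training a new model, a small fixed vocabulary can often be approximated by restricting an existing one. As far as I know, Vosk's `KaldiRecognizer` accepts an optional third argument: a JSON list of allowed phrases, with `"[unk]"` as the catch-all for everything else. This is reported to work only with models that support a dynamic vocabulary (typically the small ones), so treat the sketch as an assumption to verify. `build_grammar` is a hypothetical helper name.

```python
import json


def build_grammar(words):
    """Build the JSON phrase list for a restricted-vocabulary recognizer."""
    phrases = [word.lower() for word in words]
    phrases.append("[unk]")  # catch-all for anything outside the list
    return json.dumps(phrases)


# Assumed usage with vosk installed and a model already loaded:
#   recognizer = KaldiRecognizer(model, 16000, build_grammar(["start", "stop"]))
print(build_grammar(["Start", "Stop"]))
```

Words that the underlying model does not know at all still cannot be recognized this way; that case needs a custom-built model.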

@jacobjones7413 2 years ago

Thank you so much.

@BrandonJacobson 2 years ago

I'm glad it helped!

@stinger220 10 months ago

@@BrandonJacobson lol only responding to comments that compliment you, grow a spine

@BrandonJacobson 10 months ago

@@stinger220 Whoa, chill. YouTube is horrible at sending notifications. Luckily, I saw this one. It said you responded just now, but the comment says 1 day ago. Sorry it seems like I'm cherry-picking comments to respond to. I'll do better going forward.

@miottraja8162 2 years ago

Is there a model for the Serbian language?

@SnippetBucketTechnologies 4 months ago

How do I deal with background noise or multiple people speaking at the same time, and how do I identify a unique voice?

@BrandonJacobson 4 months ago

I haven't used vosk in a while, but I think it has two methods to help with this by adjusting --max-alternatives and --word-penalty.

@AkshayaAileni 5 months ago

can u please provide the code here??

@aperson1181 1 year ago

Will it support other languages? I am trying to help my elderly mom, who speaks Ukrainian/Russian. I tried to have her speak into a Windows PC in Russian to transcribe this text and then translate it to English. Is there a tool for this? For some reason, Windows does not support Russian speech recognition. Yes, offline is a Big plus.

@BrandonJacobson 1 year ago

Yes, there are four Russian models: alphacephei.com/vosk/models

@aperson1181 1 year ago

@@BrandonJacobson Do you have a walk-through? I am new to programming. What do I need? I have a Windows PC

@mohitnamdev408 1 year ago

thank you

@okebanei 2 years ago

offline ?

@adribmahmud 2 years ago

plz make a video how to add Vosk in ai assistant

@BrandonJacobson 2 years ago

That's my plan in the future. I used speech_recognition in these videos for an ai assistant. kzbin.info/www/bejne/p5mshYBopsR5r7M and the original here: kzbin.info/www/bejne/fYqtfJ1-gqejeqs

@yorlandis0578 2 years ago

thank you very much bro i need it , crack

@Flab1 1 year ago

if everything works for me, then wait for the donation

@letscode4370 2 years ago

instead of manually slicing the result from recognizer.Result(), this worked for me said = self.recognizer.Result().splitlines()[1].split(sep=":")[1].strip().strip("\"") anyways.. thanks for the video

@BrandonJacobson 2 years ago

I bet that's way faster as well. Thanks for the contribution.

@USAIsrUKEUVngrdBLRckOccupiedUA 2 years ago

People like you must spread the idea of personal AI, not corporate AI. Stop giving private data to corporations. Each person must use their own AI. GOOGLE AND OTHER COMPANIES! I DO NOT ALLOW YOU TO USE MY IDEAS!