Live Speech to Text with Watson Speech to Text and Python | FREE Speech to Text API

Views: 62,292

Comments: 337

@StevePGLy 3 years ago

Hey Nicholas. I am Steve, and I am on a learning journey to become a data scientist. I would like to say that I really appreciate what you are sharing! I have been studying pretty much by watching your great videos :D

@NicholasRenotte 3 years ago

Thanks so much Steve! So glad you're enjoying the videos!!

@TimDownsAnimation 3 years ago

I'm an amateur at this stuff, but I'm trying to design a proof-of-concept that proposes a hybrid of something like this with lip-reading AI to produce real-time subtitles for the deaf and hard-of-hearing. In my research, I came across your channel and I love it! You're great at explaining things and it's easy to follow along. Instant sub!

@NicholasRenotte 3 years ago

Ayyyy, welcome to the channel. Also, that sounds like a sick MVP!

@KANEDAX1987 2 years ago

I am actually trying to do something similar. I am trying to create something that turns lip reading into audio, instead of subtitles, for people who are mute. Pretty similar but different at the same time. Nice. Also an amateur.

@byiringirooscar321 2 years ago

Hey dear, I am about to implement that kind of project. I need your advice and methodology.

@TimDownsAnimation 2 years ago

@@byiringirooscar321 oh I have no idea. By “proof of concept” I meant I was making a short film for a school project in animation and VFX lol. Sorry

@aoeu256 6 months ago

Use AI image recognition? Use someone else's project? Also, you can use a one-handed keyboard to disambiguate, for example, lip sounds (B, P, M, F, V), roof-of-the-mouth sounds (T, D, N, S, Z), velars (K, G, NG, X) and H; vowels would go by mouth size, like A is big and U is rounded lips. Detect nasal airflow to detect M, N, NG; you can detect eye movement for B vs P, S vs Z, T vs D, etc... Also would be nice to have a full-face anime girl mask and girl voice... @@byiringirooscar321

@guyincognito1985 3 years ago

Thanks for zooming in to make the text larger, and for using a larger font in VS Code. Also thanks for alerting me to pipwin!

@NicholasRenotte 3 years ago

Anytime @Guy Incognito, I actually picked it up while installing it, wanted to call it out to ensure it all worked!

@OliverHiggins 3 years ago

Great video. I got the 403 issues too (based in Sydney); the only location that would work was us-south.

@NicholasRenotte 3 years ago

Thanks for the share, yeah looks like it's a common issue across the regions but us-south is up.

@OliverHiggins 3 years ago

@@NicholasRenotte if it's set to record and it doesn't hear anything, does it process nothing or does it send something? I.e., if I ran a loop every few seconds to record for 2-3 seconds, would it eat through the allocated time from IBM?

@OliverHiggins 3 years ago

@@NicholasRenotte almost need a software vox or something 🤔

@NicholasRenotte 3 years ago

@@OliverHiggins let me get back to you on that, will reach out to the product teams!

@thinamG 3 years ago

This looks awesome, Nicholas. Thanks for sharing!

@NicholasRenotte 3 years ago

Thanks so much @Thinam!!

@nitinpatel35 3 years ago

If you are getting the error 'Handshake status 403 Forbidden' while running the code and have selected 'eu-gb' as the region, please change the region and it should work for you. I encountered a similar problem; after setting up my service in 'us-east', it worked for me.

@NicholasRenotte 3 years ago

This is awesome! Thanks so much for sharing @Nitin!

@andresg297 3 years ago

This channel is so underrated

@NicholasRenotte 3 years ago

It does an innate split @Mert, the default is set to 70%. More detail here: pycaret.org/train-test-split/

15 days ago

Hi Nicholas. First of all, I'd like to say I appreciate all the effort you put into your videos. They have helped me in various scenarios. I would like to raise a small request. Would you do a video on Meta's SeamlessM4T model and fine-tuning it, if possible? Thanks in advance.

@haisomeone2218 2 years ago

You don't know, or someone didn't tell you, that you are the best.

@sayelfujael6378 3 years ago

I'll check it! Great work man.

@NicholasRenotte 3 years ago

Awesome, let me know how you go @Sayel.

@ravindunawanjana7050 3 years ago

I already started following your videos and this is a great video, Nick. Cheers!

@viniciuslongo4622 3 years ago

Hands down the best ML youtube channel

@NicholasRenotte 3 years ago

Ahhhh shucks @Vinícius, you're too kind! Thanks sooo much!

@adibokay 3 years ago

This is so helpful, thank you Nick! I am trying to build a software that can convert ASL to text on the screen that can help ease communication on platforms such as Zoom and Microsoft Teams. Would be great if you could show us how to build it ^^

@NicholasRenotte 3 years ago

Got something coming soon on advanced conversion!

@fjizq 2 years ago

Good evening. Thanks for the video and your explanations! I've got this error: [Errno 11001] getaddrinfo failed. Any clue? Thanks a lot!

@9000richi 2 years ago

I'm stuck with the same problem, have you found a solution?

@fjizq 2 years ago

@@9000richi Not really. I was trying to make it work in different computers, but I usually received the 403 error. I think all of them have to do with the region you base your stt service in, but I was not able to solve it

@9000richi 2 years ago

@@fjizq I tried with 2 regions and none of them worked, so tbh I don't know if that's the problem, but thanks for answering. I'm going to try another method to use the API without Python, I'll let you know if I find one.

@andreabussolan2832 2 years ago

@@9000richi I'm also stuck with that error and I tried different regions. Did you find a solution?

@9000richi 2 years ago

@@andreabussolan2832 Not really, haven't looked into it a lot. I've been focused on other parts of my code, leaving this for last. Sorry.

@gokulkaruna1 3 years ago

"Handshake status 403 Forbidden" ERROR while running the transcribe.py file.

@NicholasRenotte 3 years ago

Heya @Gokul, can you double check your URL and API key. Normally this error is due to a slight typo in either.

@pidpyd7759 3 years ago

@@NicholasRenotte Still Not Working :/

@frenchcoder-developpementw2429 3 years ago

@@pidpyd7759 Problem solved. I changed the region

@pidpyd7759 3 years ago

@@frenchcoder-developpementw2429 what did you change it to?

@frenchcoder-developpementw2429 3 years ago

@@pidpyd7759 I changed the region to us-south

@avnishat24 2 years ago

@Nicholas You said this can be used for a live transcript of a meeting, but when I tried it, it transcribed only the voice from my microphone and not the other voices in the meeting. So how can it be used to transcribe a meeting? I would love to use this code to transcribe the entire meeting session.

@TheTomdeaf 3 years ago

I am deaf. I want live subtitles (not a transcript) in a small window that is always on top in Z-order. Can you or someone here show me, or give tips on, how to adapt it?

@NicholasRenotte 3 years ago

Heya @The Tomdeaf, you could take this code and apply it into something like a GUI. Tkinter perhaps?

@TheTomdeaf 3 years ago

I don't know anything about python, Only C++ and WinAPI. Can you give me some details in this direction

@NicholasRenotte 3 years ago

@@TheTomdeaf oh got it, could probably build a GUI with C++ as well. There's a fully documented API here: cloud.ibm.com/apidocs/speech-to-text

@kiss-bws 3 years ago

Bro, nice video, but the thing is I want to train my voice assistant for accuracy and I also want it to predict. Please make another tutorial, and if you are going to make one then please reply. Also, you got another subscriber🙂

@NicholasRenotte 3 years ago

Definitely, so fine tuning STT models?

@ComputerScienceSimplified 3 years ago

Amazing video, keep up the incredible work! :)

@NicholasRenotte 3 years ago

Thanks so much @Computer Science Simplified!

@rachidaboussaid501 2 years ago

@@NicholasRenotte Hey dear friend Nicholas, we appreciate your work too, thank you for that. I want to ask you: how can I get the real-time speech-to-text result in the browser rather than the terminal, to avoid copy-pasting or going back and forth between cmd and the browser? For example, I want to use the Google Translate extension in Chrome to quickly translate a part of the text (speech) with its click-to-translate feature. Can I use Jupyter Notebook for that rather than VS Code? In other words, can I run the command "python transcribe.py -t 20", for example, in Jupyter Notebook and show the result in the browser rather than the terminal (cmd)?

@ranjansutradhar1046 3 years ago

Thank you, but I want to save the speech output as a text file. How do I do that? Could you please answer?

@NicholasRenotte 3 years ago

Another subscriber shared the code. Add this in transcribe.py at line 126:
with open('output.txt', 'w') as out:
    out.writelines(data['results'][0]['alternatives'][0]['transcript'])

@ranjansutradhar1046 3 years ago

@@NicholasRenotte bro, I have tried what you mentioned in the above comment, but I am getting this error: out.writelines(data['results'][0]['alternatives'][0]['transcript']) NameError: name 'data' is not defined. The data variable is not recognised. I have tried as many configurations as I could, but I am still getting this error. Would you be able to rectify this? I'm doing a project on speech-to-text transcription and then summarization of the transcription.

@ranjansutradhar1046 3 years ago

It's done , thank you buddy

@esmahanaldoseri8151 3 years ago

@@ranjansutradhar1046 Hi , can you share the way you solved the error as I got the same too

@ranjansutradhar1046 3 years ago

@@esmahanaldoseri8151 paste the same code snippet given by @Nicholas Renotte in the above reply into transcribe.py at line no. 98, after the print statement, matching the structure there. Please notify me here if it runs successfully, okay? Thanks.
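
The NameError in this thread is what one would expect when the snippet lands outside the function that parses each websocket message, since `data` only exists there. Below is a minimal sketch of the same idea as a helper, assuming each message is the JSON that Watson Speech to Text sends back (a `results` list whose entries carry `final` and `alternatives`). The helper name `append_final_transcript` and the default `output.txt` path are hypothetical, not part of the original transcribe.py. It opens the file in append mode, because mode 'w' would overwrite the file on every message.

```python
import json


def append_final_transcript(msg: str, path: str = "output.txt") -> bool:
    """Append the transcript of a final result to `path`.

    Returns True if a line was written, False for interim or empty messages.
    """
    data = json.loads(msg)  # `data` exists only inside this function
    results = data.get("results") or []
    if not results or not results[0].get("final"):
        return False  # interim hypothesis or a non-result message
    text = results[0]["alternatives"][0]["transcript"].strip()
    # Append mode: opening with "w" would overwrite the file on every message.
    with open(path, "a", encoding="utf-8") as out:
        out.write(text + "\n")
    return True
```

Calling `append_final_transcript(msg)` from inside the message handler, next to the existing print statement, should then leave one line per finished utterance in the file.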

@潘凡雯SHAIKRESHMAPARVEENQ36 2 years ago

Hi, I am getting this error: "[Errno 11001] getaddrinfo failed" when running the transcribe.py file at the end. Please suggest what is needed. Thank you.

@seymuromarov9287 6 months ago

Hi, I also have such problem, did you solve it?

@manikagarwal5415 2 years ago

I got these errors: "list index out of range" and "on_close() takes 1 positional argument but 3 were given". Please help me out; it is also not recording my audio. Previously I got an error with pyaudio while importing, but it was resolved by declaring a variable for it.

@darkmasterbatista2815 6 months ago

I got "[Errno 11001] getaddrinfo failed" and "on_close() takes 1 positional argument but 3 were given" as well. Did you solve it?
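
The "on_close() takes 1 positional argument but 3 were given" message that recurs in these comments is consistent with newer releases of the websocket-client package, which call the close callback as on_close(ws, close_status_code, close_msg). Below is a minimal sketch of a handler that accepts both call styles, assuming the FINALS and LAST globals from the on_close snippet shared elsewhere in the comments. It only restores the final transcript printout; a 403, 401 or getaddrinfo failure printed before it is a separate connection problem.

```python
FINALS = []   # final result messages collected while streaming
LAST = None   # most recent interim message, if any


def on_close(ws, close_status_code=None, close_msg=None):
    """Upon close, print and return the complete transcript.

    The two optional parameters let the same function work whether the
    websocket library passes one argument or three.
    """
    global LAST
    if LAST:
        FINALS.append(LAST)
        LAST = None
    transcript = "".join(
        m["results"][0]["alternatives"][0]["transcript"] for m in FINALS
    )
    print(transcript)
    return transcript
```

Pinning the websocket-client version listed in the project's requirements file would be another way around the mismatch, if that version is still installable.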

@valeriofaraone388 3 years ago

hey nicholas thank you very much, you are great.

@NicholasRenotte 3 years ago

You're most welcome @Valerio, glad you enjoyed it!

@WisKy64VT 1 year ago

Nice! what if you wanted to have it automatically trigger when someone starts talking, and stops after like 2 seconds of silence?
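
One way to approach this kind of voice-activated start and stop is a simple energy gate in front of the recognizer. The sketch below is not from the video: it assumes 16-bit mono PCM chunks (the kind PyAudio delivers with paInt16), and the `SilenceGate` class, the threshold of 500 and the chunk count are all hypothetical values that would need tuning for a real microphone. With chunks of roughly 0.1 s, 20 silent chunks approximate the 2 seconds mentioned above.

```python
import math
from array import array


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit native-endian PCM."""
    samples = array("h", chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceGate:
    """Start passing audio on the first loud chunk; stop after a quiet run."""

    def __init__(self, threshold: float = 500.0, max_silent_chunks: int = 20):
        self.threshold = threshold
        self.max_silent_chunks = max_silent_chunks
        self.started = False
        self.finished = False
        self.silent_chunks = 0

    def feed(self, chunk: bytes) -> bool:
        """Return True if this chunk should be sent to the recognizer."""
        if self.finished:
            return False
        if rms(chunk) >= self.threshold:
            self.started = True
            self.silent_chunks = 0
            return True
        if not self.started:
            return False  # still waiting for someone to start talking
        self.silent_chunks += 1
        if self.silent_chunks >= self.max_silent_chunks:
            self.finished = True  # enough silence: stop streaming
            return False
        return True  # short pause inside speech, keep streaming
```

In a recording loop, each chunk read from the microphone would go through `gate.feed(chunk)` and only be sent when it returns True; the loop can exit once `gate.finished` is set. Nothing is streamed while the gate is waiting for speech.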

@zeroranger 3 years ago

I couldn't run it because it said that there was no SSL available :( Please help!

@NicholasRenotte 3 years ago

Heya @JumpNShootMan, was there a broader error you can share?

@pragatisharma3703 3 years ago

Hey Nicholas. I am stuck with "Handshake status 401 Unauthorized" can you plz help me out.

@NicholasRenotte 3 years ago

Heya @Pragati, just double checking you've updated the API key? If so, try creating an instance in a different region.

@oluwatimilehinfolarin5758 2 years ago

Thank you Nicholas for the wonderful video. But I got an error - [Errno 11001] getaddrinfo failed. How can I solve this problem? It's urgent. Thank you.

@lpmotivations7672 23 days ago

im also getting the

@lpmotivations7672 23 days ago

if you resolved this help me out!!!!!11111

@ameerazam3269 3 years ago

Already i cover this but credit goes to you sir

@ameerazam3269 3 years ago

because of you that work

@NicholasRenotte 3 years ago

Awesome work @Ameer, so you're using?!

@ameerazam3269 3 years ago

@@NicholasRenotte yes already I did by watching previous video and deploy on herokuapp ..already send you my work linkedin

@oswaldmboussa3798 3 years ago

Thank you sir for this great video! Can it work with streamlit?

@NicholasRenotte 3 years ago

I believe so, will add it to the list of upcoming vids @Oswald!

@SaumyaSharma007 2 years ago

Thanks Nicholas Sir for this awesome video 😀 Plzzzz can u help me in this, For example if a village person is not able to speak Hindi language then is it possible to design a solution that converts local ethnic language into Hindi or English text format and that too in live conversation... Plzzzz Sir reply..... U have already made a video on where u have converted eng audio of "Hello World" into Hindi text.....But that was not live.... Plzz I just want your views about this.... Will be waiting for your reply 🥺

@razmandhamarasheed4325 3 years ago

Hey man, great work! I love your explanation. Want to get into data science and learning it from you makes it easy. I have a request though, are you able/willing to look more into process time series data where you can make continuous predictions based on historical data? Anyways, keep up the great work!

@NicholasRenotte 3 years ago

Definitely! Just checking, have you seen this: kzbin.info/www/bejne/gaeveGSriahggbc

@razmandhamarasheed4325 3 years ago

Hi @@NicholasRenotte, thank you for your reaction, yes I think that was the first one I saw from your work. I kinda get the batch like prediction, where you have a csv file or something else to predict in batches. But is it possible to get a continuous prediction that kinda adapts to new situations. For example: predicting process behavior like the effect of heat and fluid flow on pressure in a continuous fashion. If you are interested I can give a more detailed description of my question. Anyways, thanks for your reaction as a follower I appreciate your work. Keep it up. Razmand

@NicholasRenotte 3 years ago

@@razmandhamarasheed4325 so predicting in real time? As in, new data comes in and you forecast the next couple of time steps?

@razmandhamarasheed4325 3 years ago

@@NicholasRenotte, yes exactly. I see that the problem is a bit more in the corner of programming, but until now I have not seen anyone on YouTube try to make the predictions automatic. In my case I would like to use these predictions for a continuous stream of data coming from an SQL server (Pi OSISOFT). I would appreciate it if you could explain how this would be set up, like all your other explanations.

@NicholasRenotte 3 years ago

@@razmandhamarasheed4325 hmmm, to be honest if I was just running on a SQL server table I would have a stored proc set up to reforecast each time new data was added to the table. That would probably be the easiest way rather than having to work with a stream.

@-alfeim2919 3 years ago

When I ran python transcribe.py -t 20 I got 7 errors in the configparser.py file, and all of them are: "Refactor this function to reduce its Cognitive Complexity from 20 to the 15 allowed". Any help?

@NicholasRenotte 3 years ago

Hmmm, haven't encountered that one before, can you share the full error? Any changes made to the baseline code?

@-alfeim2919 3 years ago

@@NicholasRenotte it turns out that I've forgot about adding the apikey, and other errors as well, but hopefully enough after week of trying, I figured it out and it worked!! Many thanks for your concern, and your amazing job!!

@NicholasRenotte 3 years ago

@@-alfeim2919 stoked you got it working! Nice!!

@shoaqa16 3 years ago

Amazing video, thank you. I'm having a problem: the last text is not being printed and idk why. I added a print statement to on_close to see if it reaches it, but it didn't. It was working fine before and then it stopped, so idk what to do :(

@NicholasRenotte 3 years ago

Got any errors for me?

@el3412 3 years ago

Great video, thank you! How can I write the text to a .txt file? I use out.writelines(), but there's an error.

@NicholasRenotte 3 years ago

Might be easier to record the live audio then output: kzbin.info/www/bejne/d2rCYYKdjJZ_kLc

@samarqasem9558 3 years ago

Did you figure it out? I'm going through the same thing :(

@el3412 3 years ago

@@samarqasem9558 yes, add this in transcribe.py at line 126:
with open('output.txt', 'w') as out:
    out.writelines(data['results'][0]['alternatives'][0]['transcript'])

@nighthawk6414 2 years ago

If you're on a Mac and have issues installing PyAudio, run these commands:
brew update
brew install portaudio
brew link --overwrite portaudio
pip install pyaudio
;)

@enigmaticpuzzle9654 3 years ago

also, I need assistance with how to set up the command prompt and vs visual code. To make it clear I am stuck from 7:41 to forward. I need help with that. Thank you

@NicholasRenotte 3 years ago

Shoot, what's up?

@anishaudayakumar1778 3 years ago

Amazing Tutorial !!! I'm stuck with Pyaudio installation in my windows :( And when I tried with colab my final step throws "OSError: No Default Input Device Available"... Any leads to help?

@NicholasRenotte 3 years ago

Heya @Anisha, this won't work in Colab, you'll need access to the microphone from your local machine!

@aldorojas1918 3 years ago

I've got one question. I use TextNow to call people from my computer for interviews, and I would like to transcribe not my voice (through the microphone) but what the people I'm calling are saying over this program (TextNow). Is that possible, and how can I do it? Thanks

@NicholasRenotte 3 years ago

Would suggest recording the video from TextNow then running it through a transcription tool like Watson STT. Got a vid on the channel about how to do it from video!

@pauljones5476 2 years ago

Hi Nicholas, I've watched a few videos on how to use speech to text, with the majority being with Google. I must say your explanation is more detailed, and the audio quality is very good (crisp and clear). It seems like Watson is easier to install than Google's version. I'd like to install Watson on my MacBook. Is there anything that I need to do differently, or can I follow the exact steps you used to get it installed on my Mac? Will Watson detect speech if I have a video playing on my Mac and then transcribe that audio to text? Kind Regards Paul

@ajkadhim6058 3 years ago

thank you so much for an excellent video. When it is transcribing live, it outputs several lines as you are speaking. However, after "done recording", you see the complete output of the text. When I am running the command, I do not have that finalized output at the end. Is there a way to get that? thank you again in advance.

@NicholasRenotte 3 years ago

Heya AJ, as in you don't see it or need the output saved somewhere?

@Louisljz 2 years ago

@@NicholasRenotte same problem here too..I don't see the final text come out in the end, after the text "done recording"..it says on_close() takes 1 positional argument but 3 were given..thanks in advance

@NeoAAnderson 4 months ago

How do I do this in django? This is perfect, I need to include it in my final paper

@thibautbouexiere1881 3 years ago

Hey Nicholas, thanks for your video. I'm a code beginner. How can I get that transcription written in another text editor like Google Doc? I'm trying to find a solution to subtitle a live conference. Thank you so much.

@NicholasRenotte 3 years ago

Heya @Thibaut, do you need to subtitle live or just in post?

@thibautbouexiere1881 3 years ago

Hey @@NicholasRenotte I need to subtitle live, the way it would work, you know, on a YouTube video being subtitled live. How can I add my subtitles to the live video? Thank you!!

@houralghasham 3 years ago

Thank you for your hard work. I’m wondering if there is a way to merge it with watson assistant chatbot so it will give me a response?

@NicholasRenotte 3 years ago

Sure can, there's actually a voice agent integration available for WA that's specifically designed for it.

@shikhajoshi8961 4 months ago

Can i get a little help.. while installing pyaudio it is showing error while building wheels

@Cdawgw 3 years ago

Hey Nicholas, great tutorial, I am working on a project that can classify information based on a key pressed and then stores it. Imagine pressing 'B' key and the streamed text will flow into text file 1, but when pressing the 'N' key on my keyboard it would stream to text file 2. Could you show how the text can actually be retrieved? Thank you so much you saved my project!!

@NicholasRenotte 3 years ago

Hmmmm, I'm not too sure I get the project. So it would allow you to break up the speech into multiple files?

@Cdawgw 3 years ago

Yes, so imagine you want to take notes, but only from a specific part of a sentence (part of a string); that is one of the key things I am missing with speech transcription. I could do learning and repetition using books, read out the important parts and highlight parts of that sentence using, for example, a key press on my computer. Hope that makes sense! Thanks for the input!🚀🚀

@NicholasRenotte 3 years ago

@@Cdawgw ah, got it! Could add some logic into the code to capture your keypress and change the routing of the output in response.

@nadaessam4603 3 years ago

Thanks for sharing this video. I have a question: when I try to install pyaudio I get the error 'pipwin is not recognized as an internal or external command', so what should I do? Thanks in advance

@NicholasRenotte 3 years ago

Try installing pipwin @Nada, check this out: pypi.org/project/pipwin/

@tkipkemboi 3 years ago

pip install pipwin

@esmahanaldoseri8151 3 years ago

Hi, I got this error. Can you please help? Error: A connection attempt failed because the connected party did not properly respond .........

@esmahanaldoseri8151 3 years ago

Finally fixed this problem. It worked well, but it is not concatenating the whole text together. Error: on_close() takes 1 positional argument but 3 were given

@lenover12 3 years ago

this looks really awesome, I was wondering if it was possible to output the phonemes instead of text. that would be something I could really use in a project!

@NicholasRenotte 3 years ago

Ooooh, I don't think that's possible unfortunately @lenover12.

@jianbintang554 7 months ago

Nick, need an updated version on this :) I tried, the existing seems not working any more.

@arif5615 3 years ago

hey nicholas, great job. One thing, May i know, are there any videos of you doing live speech to text translation with Python?

@NicholasRenotte 3 years ago

Heya @Arif, this one?

@arif5615 3 years ago

@@NicholasRenotte yeah this is live STT, what I mean was live speech translation. You speak English and then, directly translate to other languages in text. 😊

@NicholasRenotte 3 years ago

@@arif5615 oh, long day at work, wasn't paying attention. Ah, nope, no vid on it yet!

@mailtoraj76 2 years ago

Great work!!, but can I get the final text somewhere into the file preferably in JSON? I need to pass this to my app. Advise pls.
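
For handing the result to another app, one option is to dump the collected final results to a JSON file when the stream closes. This is a minimal sketch, assuming a list of parsed final messages shaped like the Watson Speech to Text responses quoted elsewhere in these comments (the FINALS list in transcribe.py). The function name, the output file name and the `transcript` and `segments` keys are hypothetical choices, not something the video defines.

```python
import json


def save_transcript_json(finals, path="transcript.json"):
    """Write the final transcripts to `path` as JSON and return the payload."""
    segments = [
        m["results"][0]["alternatives"][0]["transcript"].strip()
        for m in finals
        if m.get("results")
    ]
    payload = {"transcript": " ".join(segments), "segments": segments}
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, ensure_ascii=False, indent=2)
    return payload
```

Calling `save_transcript_json(FINALS)` at the end of the close handler would leave a file that the consuming app can read with any JSON parser.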

@__Hrishi__ 3 years ago

(Handshake status 403 Forbidden on_close() takes 1 positional argument but 3 were given) How to resolve this error??

@NicholasRenotte 3 years ago

Try using the us-south region!

@gustavoluz8983 3 years ago

Great job Nicholas! What about consuming a YouTube live stream as the audio input? Any thoughts on that?

@NicholasRenotte 3 years ago

Ooooh, a live stream feed. I think you might be able to do it using PyAudio. Need to dig into it a little more though.

@gustavoluz8983 3 years ago

@@NicholasRenotte Nice suggestion, thanks for the answer! Found some reddit discussion but it seems kind of messy . The best way would be to integrate with the youtube api but guess the delay would be very high and the integration is not very clear. My idea would be to subtitle and translate at real time a youtube live stream (maybe should try with twitch or others)

@NicholasRenotte 3 years ago

@@gustavoluz8983 ya, I've added it to my video list. I think it'd be sick to do a video on it!

@gustavoluz8983 3 years ago

@@NicholasRenotte i will be the first to watch it! will try to develop some things on my free time and if i succeed i put it at my forked repo and let you know, thanks

@NicholasRenotte 3 years ago

@@gustavoluz8983 yess, thanks so much excited to hear about it!

@rangadiyyala7546 3 years ago

This looks really cool, Nicholas. How can we make speaker audio as input, here we are using mic as input...

@NicholasRenotte 3 years ago

Would need to do some digging into this? Want me to make a vid on it?

@rangadiyyala7546 3 years ago

Yeah try to make a video on that it would be supercool

@funkedelic_bob 3 years ago

@@NicholasRenotte Did you ever manage to put together a video or dig into this? I'm also looking for it to transcribe whatever the system audio is playing. Thanks!

@NicholasRenotte 2 years ago

@@funkedelic_bob never got around to it Justin, might bump it up on the list, keen to get back into some of the Watson stack.

@yashuandchikusfunworld3208 3 years ago

Thanks for the nice explanation. When I ran last statement to run the program, getting error as "Handshake status 401 Unauthorized on_close() takes 1 positional argument but 3 were given". Can somebody help me on this.

@NicholasRenotte 3 years ago

Try using the us-south region!

@yashuandchikusfunworld3208 3 years ago

Thanks… I just realized that my stt service was not launching properly.. so i restarted that and its working now..

@ibrahimisrafilov1248 3 years ago

Nic, I'm trying to use the device output (speakers) as the input wave; however, I get low-quality transcription. I think it's because of the quality of the waves, since audio from the speakers doesn't register as well as from the mic. What do you suggest I do?

@NicholasRenotte 3 years ago

Heya @Ibrahim, does it need to come through the speakers or could you use a recording perhaps?

@ibrahimisrafilov1248 3 years ago

@@NicholasRenotte From speakers :)

@NicholasRenotte 3 years ago

@@ibrahimisrafilov1248 hmmm, that's a tough one, I wonder if you could tap into the output signal and use that instead?

@ibrahimisrafilov1248 3 years ago

@@NicholasRenotte I did exactly as you say. I used the sounddevice lib and set up the output device as an input, so basically the speakers. However, the quality is not as good as it was in the MP4, so the transcription is not good enough. I was thinking it could be due to the Hz,

@NicholasRenotte 3 years ago

@@ibrahimisrafilov1248 hmmm, yeah that's a tough one! I think the audio signal would impact the result significantly.

@sameermishra3598 3 years ago

Sir, I'm getting Handshake error 403 forbidden

@NicholasRenotte 3 years ago

Heya @Sameer, can you try creating a new instance in a different region?

@AmandeepKaurDhillon 3 years ago

Hii sir really I like your coding style, but this code returning me following error "Handshake status 403 Forbidden " any suggestions?

@NicholasRenotte 3 years ago

Heya, try spinning up an STT service in another region. This error sometimes pops up.

@AmandeepKaurDhillon 3 years ago

@@NicholasRenotte thanks, Yes, solved with changing the location

@mazaharhulhaque4482 3 years ago

I want to transcribe names I speak in microphone. How can I give some meta data (kind of hint) to improve the transcription

@NicholasRenotte 2 years ago

Can fine tune the model using Watson STT! medium.com/ibm-data-ai/tune-by-example-how-to-tune-watson-text-to-speech-for-better-intonations-bcee8404d927

@عمرالقرني-ه6ي 3 years ago

Hey Nicholas, thank you for this video. I have a problem with this code: when I run it, it gives me a "Handshake status 401 Unauthorized" error. Can you tell me how I can fix this? I am sure of my API key and region.

@NicholasRenotte 3 years ago

Can you try using the us-south region?

@nitinpatel35 3 years ago

I am trying to pickle my final text output to process it using the NLP service. However, I am not able to. Could anyone guide how can I pickle the final text output, please?

@nitinpatel35 3 years ago

I found it. Just in case anyone is looking, here are the details. Replace the on_close function with the code below, and make sure "import pickle" is at the top of transcribe.py. This will generate a transcribe.pkl file in the same folder, which you can use for further analytics.

import pickle

def on_close(ws):
    """Upon close, print the complete and final transcript."""
    global LAST
    if LAST:
        FINALS.append(LAST)
    transcript = "".join([x['results'][0]['alternatives'][0]['transcript'] for x in FINALS])
    print(transcript)
    # Save the transcript so it can be reloaded later with pickle.load()
    with open("transcribe.pkl", "wb") as file:
        pickle.dump(transcript, file)

@NicholasRenotte 3 years ago

Thanks for sharing @Nitin

@esmahanaldoseri8151 3 years ago

@@nitinpatel35 Hi , I added the code you've mentioned above but it said ( pickle is not defined)

@ziyadcodes 3 years ago

NICHOLAAAS help 😥, so I was getting the 403 error so I changed it to London which is closest to where I live then I tried again and now I'm getting error 503 service unavailable please anyone help

@ziyadcodes 3 years ago

So I somehow got it to work😅🥳. All I did was run py -m pip install -r requirements.txt, and it told me the requirements were already satisfied. Then I ran python transcribe.py -t 10 and it gave me error 403 instead of error 503, which meant that I was using the wrong region when I was 100% sure that I was using the right one. So I decided to try us-south because I saw a comment saying that it worked for him. I switched to it, ran py -m pip install -r requirements.txt again, then ran python transcribe.py -t 10 and it worked, somehow😂 CONCLUSION: use us-south no matter what, then reinstall the requirements, then type python transcribe.py -t 10 and pray that it works ( :

@NicholasRenotte 3 years ago

I love the journey here 😂, thanks for sharing. US-South seems to be the way to go!

@usus8420 5 months ago

hi great works but what about smartphones?

@souparnaroy5283 3 years ago

Hi Nicholas, I followed along with your video but when I try to run live transcription in the end it shows: Handshake status 403 Forbidden on_close() takes 1 positional argument but 3 were given Any idea how I can go about solving this?

@NicholasRenotte 3 years ago

Can you try using the us-south region?

@souparnaroy5283 3 years ago

@@NicholasRenotte Thanks mate. Changed that and it's working now.

@brown_canadian 1 year ago

Hey, I am getting the error, but I already had the us-south. Any fix?

@amitjena1556 7 months ago

Hey Nicholas, how do I make it work on my Teams meeting or YouTube??

@mohamedhasib5037 3 years ago

i have this error and i can't solve it Handshake status 403 Forbidden

@NicholasRenotte 3 years ago

Heya @Mohamed, can you double check your APIKEY and Region are correct?

@mohamedhasib5037 3 years ago

@@NicholasRenotte i checked them many times but i don't know where is the problem

@NicholasRenotte 3 years ago

@@mohamedhasib5037 can you show me the full output?

@andreasweilinghoff9075 3 years ago

@@NicholasRenotte I've got exactly the same problem using Windows 10 with Python 3.9.1.. I also checked the apikey and the region as specified in my cloud profile

@allyg1383 3 years ago

@@NicholasRenotte Probably a problem in the region? I chose en-GB, otherwise I come from Slovenia.

@manishsharma2211 3 years ago

Handshake status 403 Forbidden Error while running the transcribe.py I have double checked the API and Region Any help Nic ?

@NicholasRenotte 3 years ago

Heya @Manish, can you try using a different region when you setup your service? Try us-east.

@srivatsavm3892 2 years ago

Can I get a code where I get only one line where the words are added as I speak(instead of the whole line printing again and again)?
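
One common way to get a single growing line in a terminal is to redraw it with a carriage return instead of printing a new line for every interim result. This is a minimal sketch of that idea, not code from the video: `render_interim` and `render_final` are hypothetical helpers that would be called from the message handler with the current transcript text in place of the existing print statement.

```python
import sys


def render_interim(text: str, previous_len: int = 0, stream=sys.stdout) -> int:
    """Redraw the current hypothesis in place and return its length."""
    # Pad with spaces so leftovers from a longer previous line are erased.
    padding = " " * max(0, previous_len - len(text))
    stream.write("\r" + text + padding)
    stream.flush()
    return len(text)


def render_final(text: str, previous_len: int = 0, stream=sys.stdout) -> int:
    """Redraw a final result, then move to a new line for the next utterance."""
    render_interim(text, previous_len, stream)
    stream.write("\n")
    stream.flush()
    return 0
```

The handler would keep the returned length between calls and pass it back in as `previous_len`. Lines longer than the terminal width will still wrap, so this works best for short utterances.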

@Kishi1969 3 years ago

Wow, you are amazing! But please explain: our minutes in IBM Watson can run out, so what else can we do?

@NicholasRenotte 3 years ago

You can upgrade the plan or you might need to delete and create a new free tier but keep in mind that the API key will change!

@Kishi1969 3 years ago

@@NicholasRenotte Thanks for your response, Please im having problem with FFmpeg you did the other time i sent you text but not response, if you want me to re-send sir

@cajwan 3 years ago

Hi! Thanks for the video. Is it possible to convert the recorded speech into a text file?

@NicholasRenotte 3 years ago

Yup, another subscriber shared the code. Add this in transcribe.py at line 126: with open('output.txt', 'w') as out: out.writelines(data['results'][0]['alternatives'][0]['transcript'])
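
The one-liner above opens the file in `'w'` mode, so each new message overwrites the previous one. A hedged sketch of a variant that appends only finished sentences follows. It assumes the message has the same `results[0]['alternatives'][0]['transcript']` layout used above plus a `final` flag on each result; `extract_final_transcript` and `append_transcript` are hypothetical helper names, not part of transcribe.py.

```python
def extract_final_transcript(data):
    """Return the top transcript if `data` holds a final result, else ''."""
    results = data.get("results") or []
    # Skip empty messages and interim (non-final) hypotheses.
    if not results or not results[0].get("final", False):
        return ""
    alternatives = results[0].get("alternatives") or []
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "").strip()


def append_transcript(data, path="output.txt"):
    """Append a final transcript to `path`; return True if a line was written."""
    text = extract_final_transcript(data)
    if not text:
        return False
    # Mode "a" keeps earlier sentences; mode "w" would replace them each time.
    with open(path, "a", encoding="utf-8") as out:
        out.write(text + "\n")
    return True


if __name__ == "__main__":
    # Hand-written sample message in the assumed shape.
    sample = {"results": [{"final": True,
                           "alternatives": [{"transcript": "hello world "}]}]}
    append_transcript(sample)
```

Calling `append_transcript(data)` wherever the script already parses a message into `data` would leave one line per finished sentence in output.txt.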

@sindugokulapati9834 3 years ago

Hey, I got the following error: Handshake status 403 Forbidden. Could anyone please help me out?

@NicholasRenotte 2 years ago

Try using the us-south region :)

@avnishat24 2 years ago

This is very useful. Thanks a lot. I am facing a "Handshake status 403 Forbidden on_close() takes 1 positional argument but 3 were given" error. Based on the comments below I tried changing the region to "us-south", but I still get the same error. I tried creating a new service in IBM Cloud in us-south multiple times. As the region and API key do not seem to be the issue here, I would like to know if I must change the URL in the code, as I saw a few comments about changing the URL. If yes, which file has the URL to be changed? When I ran a sample test with the curl command using an input audio file (the example shown on the IBM Cloud page), I do get a response with the transcription text.

@NicholasRenotte 2 years ago

Try using the us-south region instead!

@avnishat24 2 years ago

@@NicholasRenotte Thanks. The region did not help, but I changed the URL in the transcribe.py file and it fixed the 403 error. I still get "on_close() takes 1 positional argument but 3 were given", but at least it works. This one transcribes only the microphone voice. Is there any way to transcribe all the voices in a meeting? If there are 2 or more people in the meeting I would love to transcribe all of them.
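
The `on_close() takes 1 positional argument but 3 were given` message is consistent with a newer websocket-client release, which passes a close status code and a close message to the callback in addition to the socket. A minimal sketch of a callback that tolerates both call styles follows; the body is illustrative and is not the original transcribe.py code.

```python
def on_close(ws, close_status_code=None, close_msg=None):
    """Close callback that works with one argument or with three.

    Older websocket-client versions call on_close(ws); newer ones call
    on_close(ws, close_status_code, close_msg). The defaults cover both.
    """
    summary = "connection closed"
    if close_status_code is not None:
        summary += f" (code={close_status_code}, reason={close_msg})"
    print(summary)
    return summary


if __name__ == "__main__":
    on_close(None)                 # old call style
    on_close(None, 1000, "done")   # new call style
```

Since the failing callback only runs when the connection closes, the transcription itself can still work, which matches the report above.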

@sarindrathereserandriambel417 2 years ago

@@avnishat24 Hey, can you help me fix this too? I did not manage to solve this problem.

@rafaelprudencioleite7291 2 years ago

@@avnishat24 How did you solve it?

@michelemetta23 2 years ago

How do you solve it? I don't know.
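
For those asking how the URL change was done: the thread does not show the exact edit, so the following is only a sketch under assumptions. It assumes the service URL is copied from the credentials page of your own Speech to Text instance, and that the WebSocket endpoint is that URL with a `wss` scheme and `/v1/recognize` appended. `to_websocket_url` is a hypothetical helper and the example URL is a placeholder, not a real endpoint.

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_url(service_url, path="/v1/recognize"):
    """Turn an https service URL into the matching wss recognize URL."""
    parts = urlsplit(service_url.strip())
    # Map the HTTP scheme to its WebSocket counterpart.
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme, parts.scheme)
    # Avoid a double slash when the service URL ends with "/".
    full_path = parts.path.rstrip("/") + path
    return urlunsplit((scheme, parts.netloc, full_path, parts.query, ""))


if __name__ == "__main__":
    # Placeholder URL; use the one shown for your own service instance.
    print(to_websocket_url("https://example.com/instances/my-instance-id"))
```

Comparing the result with the URL currently hard-coded in transcribe.py may show whether the script is still pointing at an older host name.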

@Snakebite0 7 months ago

Sir, I'm trying to fine-tune Mozilla DeepSpeech with my custom data in Colab. It's not working; I've tried many ways. Can you give me any ideas? Or can you do a video on it 🥺

@katherinezhang7194 3 years ago

Thanks for sharing this! Question: when I run pip install -r requirements.txt I get ERROR: Command errored out with exit status 1. Do you know how to fix this? (I'm on a Mac.) Thanks!!

@NicholasRenotte 3 years ago

Heya @Katherine, is there a larger error? If not, can you try individually installing the packages from the requirements.txt file, ideally one should error out, then we can work through it!

@katherinezhang7194 3 years ago

@@NicholasRenotte Thanks! (I was missing Homebrew PortAudio; it finally works now!) When I run python transcribe.py -t 20 it shows ImportError: No module named configparser. Any ideas?

@NicholasRenotte 3 years ago

@@katherinezhang7194 try running pip install configparser

@keatlck 3 years ago

@@katherinezhang7194 python3 transcribe.py -t 20
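
`No module named configparser` usually means the script was started with Python 2, where the module is called `ConfigParser`; running it with `python3`, as suggested above, avoids the problem. A small sketch that makes the cause visible, with `interpreter_summary` as a hypothetical helper:

```python
import sys

try:
    import configparser  # Python 3 name
except ImportError:
    # Python 2 ships the same module under a capitalised name.
    import ConfigParser as configparser


def interpreter_summary():
    """Report which interpreter is actually running the script."""
    return "Python {}.{} at {}".format(
        sys.version_info[0], sys.version_info[1], sys.executable
    )


if __name__ == "__main__":
    print(interpreter_summary())
```

If this prints a 2.x version when started with `python`, the packages were probably installed for Python 3 and the script needs to be launched with `python3`.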

@aprosflumine9074 3 years ago

Is there any way to make the computer read this text aloud after you've recorded it?

@Van088 3 years ago

Same question, please.

@NitheshS-gm6cf 4 months ago

I cannot create my IBM account; they are asking for a credit card.

@Prateikx 1 year ago

How can I transcribe a live audio stream in real time with the Whisper AI model?

@raha5985 3 years ago

I followed your code and everything works, but for some reason the output does not show when I'm done recording.

@NicholasRenotte 3 years ago

Got any errors?

@raha5985 3 years ago

@@NicholasRenotte no

@NicholasRenotte 3 years ago

@@raha5985 is your mic connected? Might be muted.

@raha5985 3 years ago

@@NicholasRenotte No, it's not muted. Everything I say shows up, but when the recording is done the final result doesn't show.

@NicholasRenotte 3 years ago

@@raha5985 hmmm, I'm not too sure unfortunately. If the API keys and the URLs are set and you're not getting errors, kinda hard to debug.

@meg33333 2 years ago

Which algorithm is used in this speech to text?

@BtechF15 5 months ago

Really, really love you bro

@vamsivuyyuru671 2 years ago

Hi Nicholas, thank you for the video; as always it is crystal clear and short. I tried to implement it and am facing an issue. Could you please guide me? Error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123) on_close() takes 1 positional argument but 3 were given. I have tried the solutions from the comments, such as changing the region, but none of them worked. Please help. Thank you in advance.

@KuzieJames 3 years ago

I get a "certificate verify failed" error when I try this. I'm very familiar with using the IBM Python SDKs, so I'm wondering if this authentication method is not correct, because they ask you to use their authenticator nowadays. Any idea why my authentication might be failing? I've double and triple checked my URL and API key; everything should be good. Thanks in advance!

@NicholasRenotte 3 years ago

Heya @Kuzie, can you share the full error?

@samuelsilva4665 3 years ago

Hey. I'm getting this "Handshake status 403 Forbidden" when I try to run the command. I've already tried setting the same model as my region, but I'm still getting the same problem.

@NicholasRenotte 3 years ago

Heya @Samuel, can you try the us-south region. Looks like some of the others were dropping out.

@samuelsilva4665 3 years ago

@@NicholasRenotte If I'm not wrong, I think I did that when I was trying to use this approach to create the voice assistant and it didn't work (I believe I actually tried all the regions), but I'll try again. Thanks for helping.

@samuelsilva4665 3 years ago

@Zain Lokhandwalla Yeah, that's what happened to me too. I've tried lots of different regions and it didn't work.

@vmars316 2 years ago

Can this live speech be used for youtube videos ?

@abdullahhashmi5423 2 years ago

I get this error: [Errno 11001] getaddrinfo failed on_close() takes 1 positional argument but 3 were given. Can you help? I also changed between three regions as you suggested below, but it didn't work.

@seymuromarov9287 6 months ago

Hi, I have the same problem, did you solve it?
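
`getaddrinfo failed` is raised when the host name in the service URL cannot be resolved, so changing the region only helps if the resulting host name actually exists. A quick hedged check follows, where `can_resolve` is a hypothetical helper and the host you pass in should be the one from your own service URL:

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the system can resolve `hostname`, False otherwise."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # This is the failure that surfaces as "getaddrinfo failed".
        return False
    return True


if __name__ == "__main__":
    # Replace "localhost" with the host part of your service URL.
    print(can_resolve("localhost"))
```

If the host from the script does not resolve while other sites do, the URL in transcribe.py is the likely culprit rather than the API key.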

@stephanetollec9776 3 years ago

Hi Nicholas, thanks for sharing this content. I successfully installed the requested software, but when running the python -t transcribe.py command I get the error message "Handshake status 403 Forbidden". Looking closer, it seems the URLs are deprecated. Is there a new transcribe.py available with new URLs? Thanks for your help.

@NicholasRenotte 3 years ago

Heya @Stéphane can you try setting up the API in an alternate region?

@سناءدهلوي 3 years ago

How could I save the output here as text?

@NicholasRenotte 3 years ago

You can tweak the underlying code to output the results as a text file once the connection closes.

@YuvarajSR-m3l 11 months ago

Hey Nicholas, while creating an account in IBM Cloud, it shows an error. How can we solve this?

@hadjerBrioua 10 months ago

Did you solve it?

@YuvarajSR-m3l 10 months ago

@@hadjerBrioua Not solved; I went with another platform.

@hadjerBrioua 10 months ago

@@YuvarajSR-m3l Which platform did you use? And did it give the same results and efficiency?

@canislupus2661 2 years ago

I get this error and would highly appreciate it if you could help. I double checked everything, so I followed the steps correctly, and I also tried a different region: [Errno -2] Name or service not known on_close() takes 1 positional argument but 3 were given

@kevwesophia 1 year ago

Hello, did you figure out the problem and a solution? I am getting the same issue.

@canislupus2661 1 year ago

@@kevwesophia No, sorry. I assume there are far more effective options available nowadays, though.

@fuzzyreplex2033 2 years ago

'[Errno 11001] getaddrinfo failed on_close() takes 1 positional argument but 3 were given' error. Any information from anyone would be very appreciated.

@fuzzyreplex2033 2 years ago

I've tried changing regions and I couldn't get it to function. My main desire is to get a solid base for speech to text and then have the command prompt return to me individual words that I've said. I want to use this data to code some trigger words which will activate commands.
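
Once transcripts are coming back, picking out trigger words is mostly string handling. The following is a minimal sketch, not part of transcribe.py: the `TRIGGERS` table, the command names, and the helpers `words_in` and `match_triggers` are all hypothetical.

```python
import re

# Hypothetical mapping from a spoken trigger word to a command name.
TRIGGERS = {"lights": "toggle_lights", "stop": "stop_recording"}


def words_in(transcript):
    """Split a transcript into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z']+", transcript.lower())


def match_triggers(transcript, triggers=TRIGGERS):
    """Return the commands for every trigger word, in spoken order."""
    return [triggers[word] for word in words_in(transcript) if word in triggers]


if __name__ == "__main__":
    print(match_triggers("Please stop, then turn the lights on"))
```

Running this only on finished sentences, rather than on every interim guess, would avoid firing the same command several times while a phrase is still being recognised.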

@eilonlevi3927 1 year ago

I am the 1K like :D

@amritkumar8876 3 years ago

I tried to run the transcribe Python file but it showed an error (Handshake status 403 Forbidden). Please help 🙏

@NicholasRenotte 3 years ago

Can you try using the us-south region for me @Amrit?

@fatimahjabr1269 3 years ago

Great video, thanks :) I'm having trouble with the last step (Running Live Speech to Text). It first had a problem importing configparser, but I googled the solution and solved it. Now it says:

@NicholasRenotte 3 years ago

Got the rest of the error for me?

@fatimahjabr1269 3 years ago

@@NicholasRenotte I have no idea what happened to the rest of the comment 😂! Anyway, the error I'm having now is: Traceback (most recent call last): File "transcribe.py", line 27, in <module> import pyaudio ImportError: No module named pyaudio, even though I installed it successfully. I also tried installing it again just to be sure, and the requirement was already satisfied. I don't know what to do! I'm using a Mac, so pipwin is not working.

@fatimahjabr1269 3 years ago

To anyone having the same problem, I FIGURED IT OUT! All you have to do is add the number 3, so it becomes: python3 transcribe.py -t 10, and of course you can change the number of seconds.

@jordondraggon1459 2 years ago

@@fatimahjabr1269 Bit late, but how did you solve the configparser problem, if you remember? I get a "no module found" error when importing it even though I've installed it successfully. I'm using Python 2.7.15: File "transcribe.py", line 19, in <module> import configparser ImportError: No module named configparser