Transcribe Video to Text with Python and Watson in 15 Minutes

Views: 51,254

Nicholas Renotte



Comments: 169
@M310GL 3 years ago
Amazing tutorial, everything worked smoothly. Hopefully, IBM will provide better models for non-English languages in the future.
@juandavidruizcohen1380 3 years ago
Such a good tutorial! Would love to see some content on actually writing and training these models. Keep up the good work.
@NicholasRenotte 3 years ago
Heya @Juan, thanks so much! More of this coming this year!
@harry_code 3 years ago
Really Informative and concise Nicholas...! Thanks a ton for this awesome tutorial!
@NicholasRenotte 3 years ago
Thanks so much @Hariharan, glad you enjoyed it!
@DennyBaso 2 years ago
When I try to run "!brew install ffmpeg", this message shows: 'brew' is not recognized as an internal or external command, operable program or batch file. How do I fix this? I use Windows.
@MarcVerwerft 2 years ago
Absolutely spot on - good content, good explanation, fast tutorial with all the basics. Thanks a million ;-)
@jloibman 2 years ago
Hi! When I tried the command "subprocess.call(command, shell=True)" for an mp4 video, it returned 1. Do you know how I can fix that to generate the .wav? Thanks!
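A return code of 1 from `subprocess.call` only says that ffmpeg failed, not why. Below is a minimal sketch, assuming ffmpeg is installed and on PATH, that runs the same conversion and hands back ffmpeg's own error text. The helper names `run_and_report` and `extract_audio` are made up for illustration, and the `-y` overwrite flag is an addition that is not in the tutorial's command.

```python
import subprocess


def run_and_report(args):
    """Run a command without a shell and return (return_code, stderr_text)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stderr


def extract_audio(video_path, wav_path):
    """Run the tutorial's ffmpeg options, passed as a list instead of one shell string."""
    args = [
        "ffmpeg",
        "-y",            # assumption: overwrite an existing output file without prompting
        "-i", video_path,
        "-ab", "160k",
        "-ar", "44100",
        "-vn",           # drop the video stream, keep audio only
        wav_path,
    ]
    return run_and_report(args)
```

Calling `code, err = extract_audio('video.mp4', 'audio.wav')` (hypothetical file names) and printing `err` when `code` is non-zero usually shows the cause, such as a missing input file. If ffmpeg itself cannot be found, `subprocess.run` raises `FileNotFoundError` instead of returning a code.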
@francycharuto 3 years ago
You're the man! Thanks for putting it together.
@NicholasRenotte 3 years ago
Anytime!!! Pumped it’s proving useful!
@danieljuca 1 year ago
Is it possible to convert this text into a subtitle file...?
@guyincognito1985 3 years ago
Every so often I find a YT channel so awesome, that I say to it... "Where have you been all my life?" Are any of these ML speech to text services accurate enough to use? I paused the video and read the transcript and it seemed pretty "garbled".
@NicholasRenotte 3 years ago
Hahahah thanks so much @Guy Incognito! They're good, a lot of the time it really boils down to the quality of the audio and using accent specific models e.g. a lot of models would suck for me unless I used an Australian specific model!
@rohithkumarbairy6034 3 years ago
I'm not able to store the audio. I'm using Windows 10 and Jupyter Notebook. Any suggestions?
@NicholasRenotte 3 years ago
Heya @Rohith, does the folder you're trying to put it in exist?
@kei4838 1 year ago
Great! You can highlight and download all the transcripts with one click with Glasp.
@varunvora816 3 years ago
Great Job. Really Helpful!
@DevsLikeUs 4 years ago
Awesome tutorial, thank you!
@NicholasRenotte 4 years ago
Thanks a billion! Glad you enjoyed it!! #happycoding
@ansh6848 2 years ago
Wow! That's amazing, but is there any way we can convert text to video?
@khubir.4483 3 years ago
'brew' is not recognized as an internal or external command, operable program or batch file. I don't know what the problem is.
@NicholasRenotte 3 years ago
Try installing homebrew: brew.sh/
@clairematthews2255 3 years ago
Thanks for the great videos @nicholas! At step 3 I am getting a "NameError: name 'stt' is not defined". Any tips? I also wondered how this source code would change if you were using already prepared .wav audio files in the Jupyter folder.
@NicholasRenotte 3 years ago
Heya @Claire! Thanks so much 🙏. Just checking, did the code below run successfully? It's possible that if you weren't able to authenticate then the stt variable wouldn't be available; once the code below runs fine you should be good to go.
authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator=authenticator)
stt.set_service_url(url)
Ah, if you've already got preprocessed audio files you can skip the audio extraction. Check this out, it's straight Speech (Audio WAV) to Text: kzbin.info/www/bejne/d2rCYYKdjJZ_kLc
@thebigbigdaddy 1 year ago
Would you have something that also detects different speakers? Great video!
@yogesharora-g6w 1 month ago
Hey, please help. I want the transcript of every video on one YouTube channel. How is that possible?
@eugeneshilow 3 years ago
Worked for me ONE time. But then at Step 3 I started getting this error: ApiException: Error: "The server encountered an internal error or misconfiguration and was unable to complete your request. Reference #4.cdc7b5c.1609227022.c04aea , Code: 503" How to fix it?
@eugeneshilow 3 years ago
More info on the error: "Traceback (most recent call last): File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/ibm_cloud_sdk_core/base_service.py", line 224, in send raise ApiException( ibm_cloud_sdk_core.api_exception.ApiException: "
@eugeneshilow 3 years ago
Fixed this. The error was due to the typo in the model's name: en-US_NarrowbandModel
@NicholasRenotte 3 years ago
Heya @Eugene, awesome you got it working!
@patrickjane276 3 years ago
@@eugeneshilow fireeeeeee. Now I love YouTube comments again. Thank you so much for posting this.
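Since a misspelled model name surfaced here as a generic 503 rather than a clear message, checking the name before calling the service can save some time. Below is a minimal sketch using only the standard library. `KNOWN_MODELS` holds just the model names that appear in these comments, not the service's full catalogue, and `check_model_name` is a hypothetical helper.

```python
import difflib

# Only the model names mentioned in this thread; the service offers more.
KNOWN_MODELS = [
    "en-US_NarrowbandModel",
    "en-AU_NarrowbandModel",
    "en-GB_BroadbandModel",
]


def check_model_name(name, known=KNOWN_MODELS):
    """Return (is_known, suggestions) for a model name."""
    if name in known:
        return True, []
    # Suggest the closest known spellings for a name that did not match exactly.
    return False, difflib.get_close_matches(name, known, n=3, cutoff=0.6)
```

For an authoritative list, the Python SDK's `list_models()` call on the `SpeechToTextV1` client should return the models available to your instance; feeding those names into `known` would be more reliable than the short list assumed here.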
@SurajSingh-lu8ei 1 year ago
How can I transcribe a 9-hour video within minutes? Is it possible? Please reply, I am working on a project.
@wannaknowme2841 2 years ago
What command do we type in cmd? On Mac we used "open ."; what about Windows?
@alishaansari9086 3 years ago
Amazing!!! This is exactly what I was looking for. I got this error during execution; I don't know if this is just a server issue or due to my code.
ApiException Traceback (most recent call last) in
1 with open('audio1.wav','rb') as f:
----> 2 res = stt.recognize(audio=f, content_type='audio1/wav', model='en-US_NarrowbandModel', timestamp=True, continuous=True).get_result()
~\anaconda3\lib\site-packages\ibm_watson\speech_to_text_v1.py in recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, **kwargs)
504 data=data)
505
--> 506 response = self.send(request)
507 return response
508
~\anaconda3\lib\site-packages\ibm_cloud_sdk_core\base_service.py in send(self, request, **kwargs)
265 status_code=response.status_code)
266
--> 267 raise ApiException(response.status_code, http_response=response)
268 except requests.exceptions.SSLError:
269 logging.exception(self.ERROR_MSG_DISABLE_SSL)
ApiException: Error: Internal Server Error Internal Server Error - Write The server encountered an internal error or misconfiguration and was unable to complete your request. Reference #4.debd7768.1616270159.1230d52 , Code: 503
I'd really appreciate it if you could help me with this.
@NicholasRenotte 3 years ago
Hmmm, without seeing your code I've got a feeling it might be an incorrect API key. Can you check this is correct and also compare your code to the sample code in the description?
@emchivi5780 3 years ago
Hey man, did you find out what the problem was? I'm getting the same error, thanks!
@stanleymwangi6524 4 years ago
Awesome tutorial. What if you want to generate srt files instead of a transcript?
@NicholasRenotte 4 years ago
Thanks so much @Stanley Mwangi, I started looking into this yesterday. I've got it added to the list of upcoming vids 👨‍💻
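On the SRT question: when `recognize` is called with `timestamps=True`, each alternative in the response carries a `timestamps` list of `[word, start_seconds, end_seconds]` entries, which is enough to build subtitle cues. Below is a rough sketch, not the author's method, that writes one cue per recognised segment. The function names are hypothetical, and it assumes the response dictionary has the `results` / `alternatives` layout used elsewhere in this thread.

```python
def format_srt_time(seconds):
    """Convert seconds (float) to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(response):
    """Build SRT text with one numbered cue per recognised segment."""
    cues = []
    for result in response.get("results", []):
        best = result["alternatives"][0]
        words = best.get("timestamps") or []
        if not words:
            continue  # no timing information for this segment, so no cue
        start, end = words[0][1], words[-1][2]
        index = len(cues) + 1
        cues.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{best['transcript'].strip()}\n"
        )
    # A blank line separates consecutive cues in SRT.
    return "\n".join(cues)
```

Writing `to_srt(res)` to a file with an `.srt` extension should give something most players accept. Long segments produce long cues, so splitting on word timestamps every few seconds would be a natural refinement.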
@JustKamKam 1 year ago
Why is it impossible to set up an IBM account? I'm trying to replicate this and I can't create an IBM Cloud account.
@patrickjane276 3 years ago
awesome man thanks so much! any idea if there's a Watson model that knows when to add exclamation points or question marks? Trying to come up with a way to show sentence importance.
@NicholasRenotte 3 years ago
Ooooh, I don't think so @Max. What's the goal regarding importance?
@datareactor4143 2 years ago
Hi Nicholas, I'm unable to get the audio file here. I tried different short video files; do I need to change any other parameters accordingly? I'm not getting any error, but the audio.wav file is not extracted.
@destinibuckner1773 3 years ago
Is there a different format for the command if you're using Windows? I'm not getting an audio file output.
@NicholasRenotte 3 years ago
It should be the same @Destini, I'm using Windows here as well :)
@sahilgarg4850 3 years ago
Hey man! Great Content. Just a quick question - What is the alternative for 'apiKey' & 'URL' if we have to use it for multiple videos and unlimited minutes without paying anything (Free). Is there any other way to do that?
@NicholasRenotte 3 years ago
Could look at open source alternatives. Haven't dug into too many myself.
@NicholasPadilha 2 years ago
Is it possible to get the timestamps of what was said in the video/audio?
@vaishaligunjal582 1 year ago
Getting an error while importing libraries: AttributeError: partially initialized module 'charset_normalizer' has no attribute 'md__mypyc' (most likely due to a circular import)
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
@asmitamondal705 2 years ago
Hello, awesome video. I was just wondering if this would work for videos which do not have a YouTube transcript.
@bronsonranga5770 4 years ago
Very informative, thank you bro ❤️
@NicholasRenotte 4 years ago
Thanks so much @Bronson Ranga, glad you enjoyed it!
@MrMwenesi 3 years ago
How would you do the same for a live YouTube video?
@NicholasRenotte 3 years ago
Pass through the audio feed to the API!
@devpriyashivani1855 1 year ago
Hey, I'm getting this error: TypeError: Session.request() got an unexpected keyword argument 'continuous'
@devpriyashivani1855 1 year ago
When I removed continuous=True, I got the error below: ApiException: Error: Internal Server Error Internal Server Error - Write The server encountered an internal error or misconfiguration and was unable to complete your request. Reference #4.4d752c31.1672464316.3358daf9 , Code: 503
@tennisboi1 2 years ago
Not sure if this is because of updates they have made. I'm on step 3 using the recognize command and keep getting the error "request() got an unexpected keyword argument 'continuous'". I wonder if you know how they have updated the library to get this step working again.
@user-dg8ys 1 year ago
same problem here
@tennisboi1 1 year ago
@@user-dg8ys I ended up just deleting the continuous argument after looking through the library. That worked for me, though it took me running it twice.
@user-dg8ys 1 year ago
@@tennisboi1 it didn't work for me like that, actually. Did you also delete the comma to the left of continuous?
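The tracebacks pasted elsewhere in these comments show why `continuous=True` fails on newer SDK versions: `recognize` no longer names a `continuous` parameter, so it falls into `**kwargs` and is forwarded to the HTTP client, which rejects it. Removing `, continuous=True` (comma included) is the direct fix. For code that has to run against different SDK versions, one possible approach is to drop any option the method does not name. The sketch below uses a stand-in function, `fake_recognize`, rather than the real service, and `supported_kwargs` is a hypothetical helper.

```python
import inspect


def supported_kwargs(func, kwargs):
    """Keep only keyword arguments that `func` names explicitly in its signature."""
    named = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    # Anything that would only be absorbed by **kwargs is dropped.
    return {key: value for key, value in kwargs.items() if key in named}


def fake_recognize(audio, content_type=None, model=None, timestamps=None, **kwargs):
    """Stand-in with the same shape as the SDK method: named options plus **kwargs."""
    return {"model": model, "extra": kwargs}
```

With a real client this would look like `stt.recognize(audio=f, **supported_kwargs(stt.recognize, options))`, where `options` is your own dictionary of settings. The trade-off is that a misspelled option is dropped silently instead of raising an error.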
@alex-vq1yy 6 months ago
Bro, if you have another method that doesn't use an IBM key, please tell me.
@madhurir9646 2 years ago
I am getting this error: ApiException: Error: Internal Server Error Internal Server Error - Write The server encountered an internal error or misconfiguration and was unable to complete your request. Reference #4.733a2f17.1655198247.4e83735e , Code: 503
@ju1042 2 years ago
Question: would it work with a video in a different format, like mp4?
@NicholasRenotte 2 years ago
Yup!
@praveenshahani5339 2 years ago
I have a question: is it possible to add a command which can search for a YouTube video and articulate the speech?
@sneh5496 5 months ago
youtube-dl didn't work for me; pip install yt-dlp did.
@raphaelradespiel9970 3 years ago
Hi, so, I'm trying this project out so that I can speed up my transcription tasks. I was able to fix some previous problems and learn a bit more about Jupyter Notebook, but now I've hit a wall that I just can't seem to find a solution for. In the "Open Audio Source and Convert" part, the first code cell, I've been getting this error that says: ERROR:root:Error in service call. Then the info appears, and there were three consecutive "ConnectionAbortedError: [WinError 10053]" errors. Could you help me out? I've read it has to do with antivirus or my firewall blocking the connection, but I've disabled them all to test this out. Maybe I just haven't configured Python correctly or I missed a step. Anyway, do you have any idea what this could be? I can send you the error message if you want. (Thanks in advance.)
@NicholasRenotte 3 years ago
Heya @Raphael, yup, shoot through the error!
@raphaelradespiel9970 3 years ago
@@NicholasRenotte ok, here it goes: ERROR:root:Error in service call Traceback (most recent call last): File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 706, in urlopen chunked=chunked, File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 394, in _make_request conn.request(method, url, **httplib_request_kw) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connection.py", line 234, in request super(HTTPConnection, self).request(method, url, body=body, headers=headers) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1239, in request self._send_request(method, url, body, headers, encode_chunked) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1285, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1234, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1065, in _send_output self.send(chunk) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 986, in send self.sock.sendall(data) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 972, in sendall v = self.send(byte_view[count:]) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 941, in send return self._sslobj.write(data) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 642, in write return self._sslobj.write(data) ConnectionAbortedError: [WinError 10053] Uma conexão estabelecida foi anulada pelo software no computador 
host During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages equests\adapters.py", line 449, in send timeout=timeout File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 756, in urlopen method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2] File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\util etry.py", line 531, in increment raise six.reraise(type(error), error, _stacktrace) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\packages\six.py", line 734, in reraise raise value.with_traceback(tb) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 706, in urlopen chunked=chunked, File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connectionpool.py", line 394, in _make_request conn.request(method, url, **httplib_request_kw) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\urllib3\connection.py", line 234, in request super(HTTPConnection, self).request(method, url, body=body, headers=headers) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1239, in request self._send_request(method, url, body, headers, encode_chunked) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1285, in _send_request self.endheaders(body, encode_chunked=encode_chunked) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 1234, in endheaders self._send_output(message_body, encode_chunked=encode_chunked) File "c:\users\ana carolina 
valle\appdata\local\programs\python\python36\lib\http\client.py", line 1065, in _send_output self.send(chunk) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\http\client.py", line 986, in send self.sock.sendall(data) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 972, in sendall v = self.send(byte_view[count:]) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 941, in send return self._sslobj.write(data) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\ssl.py", line 642, in write return self._sslobj.write(data) urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionAbortedError(10053, 'Uma conexão estabelecida foi anulada pelo software no computador host', None, 10053, None)) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages\ibm_cloud_sdk_core\base_service.py", line 227, in send response = requests.request(**request, cookies=self.jar, **kwargs) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages equests\api.py", line 61, in request return session.request(method=method, url=url, **kwargs) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages equests\sessions.py", line 542, in request resp = self.send(prep, **send_kwargs) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages equests\sessions.py", line 655, in send r = adapter.send(request, **kwargs) File "c:\users\ana carolina valle\appdata\local\programs\python\python36\lib\site-packages equests\adapters.py", line 498, in send raise ConnectionError(err, request=request) requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionAbortedError(10053, 'Uma conexão estabelecida foi anulada pelo software 
no computador host', None, 10053, None))
@raphaelradespiel9970 3 years ago
@@NicholasRenotte Hey, never mind, I found out there is a 100MB limit on the audio files. I fixed it and it's working just fine right now. Thanks for the tutorial, my dude.
@NicholasRenotte 3 years ago
@@raphaelradespiel9970 anytime!! Glad you got it!
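Given the 100MB limit reported above, a quick size check before uploading avoids the confusing connection errors. Below is a minimal sketch: `needs_split` and `segment_command` are hypothetical helper names, the limit is assumed to mean 100 MiB, and the ffmpeg options are the segment settings the author posts elsewhere in these comments.

```python
import os

# Limit as reported in this thread; assumed here to mean 100 MiB.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def needs_split(path, limit=MAX_UPLOAD_BYTES):
    """Return True when the file is larger than the upload limit."""
    return os.path.getsize(path) > limit


def segment_command(path, seconds=240, pattern="out%03d.wav"):
    """ffmpeg arguments that cut `path` into fixed-length pieces without re-encoding."""
    return [
        "ffmpeg", "-i", path,
        "-f", "segment",
        "-segment_time", str(seconds),
        "-c", "copy",
        pattern,
    ]
```

If `needs_split('audio.wav')` is true, passing `segment_command('audio.wav')` to `subprocess.run` should produce `out000.wav`, `out001.wav` and so on, each of which can then be sent to the service in turn.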
@gauravmalik3911 1 year ago
Worked for me, cheers
@anirudhc426 3 years ago
Awesome video!
@NicholasRenotte 3 years ago
Thanks so much 🙏!!
@xiaohanzhang4052 3 years ago
Fantastic video! Thanks~ I tried your code with your video file, but I got the following error in the final API call: "It is required that you pass in a value for the "algorithms" argument when calling decode()". Is anyone else seeing that error? Do you happen to know the reason for it?
@NicholasRenotte 3 years ago
Heya @Xiaohan, yup, one of the other subscribers figured out this was an issue with PyJWT. Try installing 1.7.1 and it should fix the issue. Example install: pip install PyJWT==1.7.1
@vijayasekaran3144 4 months ago
How can I do it for Instagram videos ?
@aryanvijay6081 3 years ago
Hey, I had a question: what happens when we do it on an mp4 file?
@NicholasRenotte 3 years ago
Should still work, just need to change your input format.
@alcidesneves2807 1 year ago
What is it for? What about searching for an exact word and finding a video with that word on YouTube?
@foxtrothu2831 3 years ago
Doesn't work for me! Return Code: 1 instead of 0
@NicholasRenotte 3 years ago
Heya @Foxtrot, any additional errors?
@foxtrothu2831 3 years ago
@@NicholasRenotte thank you for your reply. I wrote all the code the same as yours in the video.
In the part extracting audio from video, it returns code 1 instead of 0. I have no idea what it means and no additional error explanation is shown. After googling this issue, I decided it was better to give up, so I used PR to convert the mp4 to wav.
In the part converting audio to text through Watson STT, my connection gets aborted after several minutes of running the code. Maybe it's because I'm in China? Even using a VPN, I am still not able to access it.
Thanks for your video after all. I know it's not easy to make a video. Keep up the great work!
@NicholasRenotte 3 years ago
@@foxtrothu2831 thanks appreciate the feedback. Weird, I would've thought you could access the API regardless of location.
@butternuts842 1 year ago
How can I make it tell the difference between two or more speakers?
@rachidaboussaid501 3 years ago
How can I get subtitles from Watson, my dear fellow, as an SRT or VTT file to launch with the video?
@NicholasRenotte 3 years ago
Oooh, haven't gone that far unfortunately rachid!
@PakistanInstitute 1 year ago
Bro, I have the VS Code editor. I installed youtube-dl in VS Code but am unable to download a YouTube video. Which editor are you running youtube-dl in, where you paste a video link and it starts downloading? Please guide me step by step 🙂
@nobleson685 4 years ago
In the third step 3. Open Audio Source and Convert, the program throws me this error. I am unable to correct it. ConnectionError: ('Connection aborted.', OSError("(32, 'EPIPE')")). Do you know why it happened? Thanks
@NicholasRenotte 4 years ago
Let's dig a little further, what machine/OS are you using and can you paste your code below? :)
@nobleson685 4 years ago
@@NicholasRenotte macOS Mojave 10.14.6
with open('audio.wav', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/wav', model='en-US_NarrowbandModel', continuous=True).get_result()
@NicholasRenotte 4 years ago
Is your audio file greater than 100MB? It might be because the file is too large, as the STT service can only handle audio up to 100MB. Try the code below; it splits your audio file first and loops through the pieces to convert them! I commented # NEW where there are new code bits. Let me know how you go.
# 0. Install and Import Dependencies
!pip install ibm_watson
!brew install ffmpeg
import subprocess
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
# NEW Import os to loop through directory
import os
# 1. Extract Audio
command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
subprocess.call(command, shell=True)
# NEW Split audio files into manageable chunks
command = 'ffmpeg -i audio.wav -f segment -segment_time 240 -c copy out%03d.wav'
subprocess.call(command, shell=True)
# 2. Setup STT Service
apikey = 'YOUR API KEY'
url = 'YOUR URL'
# Setup service
authenticator = IAMAuthenticator(apikey)
stt = SpeechToTextV1(authenticator=authenticator)
stt.set_service_url(url)
# 3. Open Audio Source and Convert
# NEW loop through the split audio files in order and convert
results = []
for filename in sorted(os.listdir('.')):
    # Only the out000.wav, out001.wav, ... chunks; skip the full-size audio.wav
    if filename.startswith('out') and filename.endswith('.wav'):
        with open(filename, 'rb') as f:
            res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()
            results.append(res)
# 4. Process Results and Output to Text
len(res['results'])
# Preprocess transcriptions
text = []
for file in results:
    for result in file['results']:
        text.append(result['alternatives'][0]['transcript'].rstrip() + '. ')
text = [para[0].title() + para[1:] for para in text]
transcript = ''.join(text)
with open('output.txt', 'w') as out:
    out.writelines(transcript)
@nobleson685 4 years ago
@@NicholasRenotte You are right. The audio file was above 100MB. Thank you for the updated code. I was able to split the audio file into
@NicholasRenotte 4 years ago
Awesome, we're through that! Is that the full error? Also two things, have you updated the API key and URL? And are you behind a firewall at the moment, the request will need to be able to go out to the cloud service?
@jackReme 2 years ago
Thanks a lot!
@pramodsurya.m 2 years ago
I'm stuck at step 4. Can you please help me with this After executing these lines that were provided -------------------------------------------------------------------------- with open('audio.wav', 'rb') as f: res = stt.recognize(audio=f, content_type='audio/wav', model='en-GB_BroadbandModel', continuous=True).get_result() --------------------------------------------------------------------------- TypeError Traceback (most recent call last) Input In [7], in () 1 with open('audio.wav', 'rb') as f: ----> 2 res = stt.recognize(audio=f, content_type='audio/wav', model='en-GB_BroadbandModel', continuous=True).get_result() File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ibm_watson\speech_to_text_v1.py:587, in SpeechToTextV1.recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, low_latency, **kwargs) 580 url = '/v1/recognize' 581 request = self.prepare_request(method='POST', 582 url=url, 583 headers=headers, 584 params=params, 585 data=data) --> 587 response = self.send(request, **kwargs) 588 return response File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\ibm_cloud_sdk_core\base_service.py:306, in BaseService.send(self, request, **kwargs) 304 logger.warning('"%s" has been removed from the request', key) 305 try: --> 306 response = self.http_client.request(**request, 307 cookies=self.jar, 308 **kwargs) 310 if 200
@pepedecastro3352 2 years ago
I keep getting an error that continuous is not an argument. Any help, please?
@davegamboa- 2 years ago
same
@MrMwenesi 3 years ago
And...can you also add translation with live video?
@NicholasRenotte 3 years ago
Heya man!! It's using PyTorch so the nearest equivalent would be ONNX or CoreML, check this out: github.com/ultralytics/yolov5/issues/251
@MrMwenesi 3 years ago
@@NicholasRenotte That requires more explanation. I am basically a no-code/low-code developer of conversational agents. How would I integrate it into the Jupyter notebooks?
@NicholasRenotte 3 years ago
@@MrMwenesi oh, I think my comments got mixed up. Ignore that one. Check this out kzbin.info/www/bejne/j3Tcpo2DaZpnlas handles the live audio bit. Could extract the audio feed in real time and do something like that.
@kavitham4526 3 years ago
Hi, this is very helpful.. is there any possibility to write the code for converting text to video animation...If you do so, it will be very helpful for us
@NicholasRenotte 3 years ago
I believe so, I haven't dug into it yet but you could look at using GANs!
@kavitham4526 3 years ago
@@NicholasRenotte Thank you so much
@dogs8113 2 years ago
How do I make pyttsx3 read subtitle text from YouTube and at the same time dub it with a translation?
@heartheart5543 2 years ago
Can you do it with R?
@lukajvv.7839 3 years ago
Hey man, I have python installed on my computer but I keep getting an error stating that 'brew' is not recognized as an internal or external command, operable program or batch file. What should I do?
@lukajvv.7839 3 years ago
Sorry, I am working on a Windows machine, but I am not sure how to install ffmpeg on it.
@NicholasRenotte 3 years ago
No stress, so it's a three step process for Windows:
1. Download the ffmpeg files from here: ffmpeg.org/download.html
2. Unzip the folder where you want the installation to be
3. Update your Windows environment variable PATH to include the path to the bin folder for ffmpeg
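Homebrew targets macOS and Linux, which is why `brew` is not recognised in a Windows command prompt; the PATH route above is the way to go there. After updating PATH (and restarting the notebook so it picks up the change), a small check along these lines can confirm that Python can actually see ffmpeg. `find_tool` and `describe` are hypothetical helper names.

```python
import shutil


def find_tool(name):
    """Return the full path of an executable on PATH, or None if it is not found."""
    return shutil.which(name)


def describe(name):
    """Build a one-line status message for an executable name."""
    path = find_tool(name)
    if path is None:
        return f"{name}: not found on PATH"
    return f"{name}: {path}"
```

Running `print(describe('ffmpeg'))` in the notebook should print the location of the ffmpeg executable once the install has worked; a "not found" message points to the PATH entry rather than to the tutorial code.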
@1UniverseGames 3 years ago
Is it possible to extract a YouTube video's subtitles/voice into text? Any suggestions?
@NicholasRenotte 3 years ago
Haven't tried it but in theory it should work.
@rohandevaki4349 1 year ago
Does this still work?
@lavanyakasu8852 3 years ago
Hey, please tell me how to deploy this model.
@NicholasRenotte 3 years ago
Heya @Iavanya, the tutorial uses a SaaS service via an API, there's no need to deploy it!
@ahmetozel5112 17 days ago
I was really disappointed. You translate the audio, not the video. I thought you did it by processing audio and video data. Very disappointing.
@nevaehthompson5818 2 years ago
Great vid, helped a lot. Sadly, however, I'll have to find a different method bc IBM deactivated my account for no reason with no warning.
@openmindjustdoit1306 3 years ago
Good job! Does this support Arabic or not?
@NicholasRenotte 3 years ago
Sure does! cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-models
@humzaali5980 3 years ago
Hello, first of all, amazing video, keep it up. I just want to ask if I can use a path for the input and output files.
@NicholasRenotte 3 years ago
Sure can, just add the following paths in the read and output sections of the code:
with open('PATH_TO_INPUT_FILE/audio.wav', 'rb') as f:
    res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()
...
text = [para[0].title() + para[1:] for para in text]
transcript = ''.join(text)
with open('PATH_TO_OUTPUT_FILE/output.txt', 'w') as out:
    out.writelines(transcript)
@humzaali5980 3 years ago
@@NicholasRenotte Thank you very much for replying, Nicholas. I have tried your code but it does not transcribe all videos properly. Can you tell me what I am doing wrong? Thanks
@NicholasRenotte 3 years ago
@@humzaali5980 definitely, what seems to be happening? Any errors?
@humzaali5980 3 years ago
@@NicholasRenotte just that sometimes it does not transcribe it properly, like it translate so its to solid it here to you. So sometimes it does not translate right, especially a movie clip.
@NicholasRenotte 3 years ago
Ahhhh @@humzaali5980, you might need to refine the model sometimes depending on the audio quality and accents! cloud.ibm.com/docs/speech-to-text?topic=speech-to-text-customization
@achendvankar 3 years ago
I am unable to extract audio using this code:
    command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
    subprocess.call(command, shell=True)
I get no audio output.
@achendvankar 3 years ago
Any help would be greatly appreciated. Thank you :)
@NicholasRenotte 3 years ago
Definitely @A C, what's the name of your audio file?
@achendvankar 3 years ago
The name of the audio file was audio.wav
@NicholasRenotte 3 years ago
@achendvankar Your input file was named audio.wav?
@achendvankar 3 years ago
@NicholasRenotte Yes, it was. Hope I am not doing anything wrong here and unnecessarily troubling you :)
@khubir.4483 3 years ago
This code:
    command = 'ffmpeg -i Spirit.mkv -ab 160k -ar 44100 -vn audio.wav'
    subprocess.call(command, shell=True)
is giving output 1, and brew is also not recognised as a command when I run !brew install ffmpeg on Windows 10. Can anyone help me?
@NicholasRenotte 3 years ago
Install brew: brew.sh/
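For the Windows reports in this thread, a return value of 1 from `subprocess.call` says little on its own. The sketch below, which is not from the video, checks whether ffmpeg can be found on PATH and surfaces ffmpeg's own error text instead of a bare exit code. The function name `extract_audio` is hypothetical, and the `-y` (overwrite output) flag is an addition to the tutorial command. Homebrew targets macOS and Linux, so on Windows ffmpeg would likely need to be installed another way and added to PATH.

```python
import shutil
import subprocess


def extract_audio(video_path, audio_path="audio.wav", ffmpeg="ffmpeg"):
    """Run ffmpeg and return (ok, message) rather than only an exit code."""
    # shutil.which returns None when the executable cannot be found on PATH.
    if shutil.which(ffmpeg) is None:
        return False, f"'{ffmpeg}' was not found on PATH; install it and restart the notebook."
    # Same options as the tutorial command, plus -y (an addition) to overwrite old output.
    # Passing a list avoids shell quoting problems with spaces in file names.
    cmd = [ffmpeg, "-y", "-i", str(video_path),
           "-ab", "160k", "-ar", "44100", "-vn", str(audio_path)]
    result = subprocess.run(cmd, capture_output=True, text=True, errors="replace")
    if result.returncode != 0:
        # ffmpeg writes its diagnostics to stderr; keep the last few lines.
        tail = "\n".join(result.stderr.strip().splitlines()[-5:])
        return False, f"ffmpeg exited with code {result.returncode}:\n{tail}"
    return True, f"Wrote {audio_path}"
```

Calling `ok, message = extract_audio('Spirit.mkv')` and printing `message` should show whether the problem is a missing ffmpeg install or an issue with the input file.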
@lynnwillis4332 3 years ago
Anyone getting this error?
ApiException Traceback (most recent call last)
in
      1 with open('audio.wav', 'rb') as f:
----> 2     res = stt.recognize(audio=f, content_type='audio/wav', model='en-AU_NarrowbandModel', continuous=True).get_result()

~\anaconda3\lib\site-packages\ibm_watson\speech_to_text_v1.py in recognize(self, audio, content_type, model, language_customization_id, acoustic_customization_id, base_model_version, customization_weight, inactivity_timeout, keywords, keywords_threshold, max_alternatives, word_alternatives_threshold, word_confidence, timestamps, profanity_filter, smart_formatting, speaker_labels, customization_id, grammar_name, redaction, audio_metrics, end_of_phrase_silence_time, split_transcript_at_phrase_end, speech_detector_sensitivity, background_audio_suppression, low_latency, **kwargs)
    564     data=data)
    565
--> 566     response = self.send(request)
    567     return response
    568

~\anaconda3\lib\site-packages\ibm_cloud_sdk_core\base_service.py in send(self, request, **kwargs)
    306     status_code=response.status_code)
    307
--> 308     raise ApiException(response.status_code, http_response=response)
    309     except requests.exceptions.SSLError:
    310     logging.exception(self.ERROR_MSG_DISABLE_SSL)

ApiException: Error: Internal Server Error
Internal Server Error - Write
The server encountered an internal error or misconfiguration and was unable to complete your request.
Reference #4.470b3017.1626566664.3077eed1
, Code: 503
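A 503 response usually points to a temporary server-side problem, so retrying after a pause may be enough. Below is a generic retry-with-backoff sketch; `call_with_retry` is a hypothetical helper and not part of the IBM SDK, and it will not help if the failure has a persistent cause.

```python
import time


def call_with_retry(func, retries=3, base_delay=2.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again with a doubling delay."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            # Out of attempts: let the last exception propagate to the caller.
            if attempt == retries:
                raise
            # Waits base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * (2 ** attempt))
```

In the tutorial's setting, `func` would be a small function that opens `audio.wav` and wraps the `stt.recognize(...)` call, and `retry_on` could be the `ApiException` class seen in the traceback; both are assumptions about how you would wire it in.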
@rahulkmail 3 years ago
Excellent
@NicholasRenotte 3 years ago
🙏 thanks so much @Rahul!
@ibrahelsheikh 1 year ago
Where is the code?
@youngboys7342 2 years ago
Thank you sir
@kvafsu225 2 years ago
Fascinating
@praneethsai8589 3 years ago
I can't download videos:
    C:\Users\user>youtube-dl kzbin.info/www/bejne/fH6ZnHuZjdWqb5o
    'youtube-dl' is not recognized as an internal or external command, operable program or batch file.
@RunyCalmera 1 year ago
Awesome❤
@johnclarkson6120 3 years ago
god...so great
@NicholasRenotte 3 years ago
Thanks sooo much 🙏
@lhlee1580 4 years ago
Can the code be used for mp4 videos?
@NicholasRenotte 4 years ago
Sure can! Just change the command line:
    # From this
    command = 'ffmpeg -i aiml.mkv -ab 160k -ar 44100 -vn audio.wav'
    # To this
    command = 'ffmpeg -i yourfilename.mp4 -ab 160k -ar 44100 -vn audio.wav'
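Since only the input file name changes between containers, the command can also be built from whatever file is supplied. This is a small sketch under that assumption; `build_ffmpeg_command` is a hypothetical helper, and the options are the ones from the reply above.

```python
from pathlib import Path


def build_ffmpeg_command(video_file, audio_file=None):
    """Build the tutorial's ffmpeg arguments for any input file (mkv, mp4, ...)."""
    video = Path(video_file)
    # Default output: same name as the video, with a .wav extension.
    audio = Path(audio_file) if audio_file else video.with_suffix(".wav")
    return ["ffmpeg", "-i", str(video), "-ab", "160k", "-ar", "44100", "-vn", str(audio)]
```

The returned list can be passed to `subprocess.call(...)` without `shell=True`, which sidesteps quoting issues when a file name contains spaces.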
@ЛеонидБорисов-ч5х 7 months ago
The colors of the Python snakes are like the flag of my country.
@muhammadnoval8787 3 years ago
No Arabic language?
@NicholasRenotte 3 years ago
I believe there is one for Arabic :)
@loverizer8414 2 years ago
I think this is very complicated. A layman like me who doesn't know anything about Python can't understand this. Try to make it simpler; I had to sign out after about 2 minutes because I was not following.
@銘鋒 3 years ago
Hi, may I know how to install ffmpeg on Windows for use in Jupyter Notebook? Your reply would be appreciated. Thank you!
@NicholasRenotte 3 years ago
www.wikihow.com/Install-FFmpeg-on-Windows
@ggcharlie8511 3 years ago
The real question is, can it translate Playboi Carti
@NicholasRenotte 3 years ago
😂😂😂 next project man, mumble rap decoder!
@ashilshah3376 1 year ago
4:27
@ashilshah3376 1 year ago
kzbin.info/www/bejne/fH6ZnHuZjdWqb5o
@armeniansnoocersnoocer 1 month ago
Shameful video! Dislike.
@barigerajesh1 2 years ago
Hi Nicholas, great job. When the following two instructions are run:
    command = 'ffmpeg -i Spirit.mkv -ab 160k -ar 44100 -vn audio.wav'
    subprocess.call(command, shell=True)
the output is 1 and the audio file is not created. I'm working on a Windows workstation. @Nicholas Renotte, I would really appreciate it if you could help me with this.