Build a Containerized Transcription API using Whisper Model and FastAPI

Views: 5,907

In this exciting tutorial, I'll guide you step by step on how to create your very own Containerized Transcription API using the powerful Whisper AI model and FastAPI as the backend framework.
We'll start by setting up a development environment and configuring FastAPI to build a robust web application. Then, we'll seamlessly integrate the open-source Whisper AI model, a cutting-edge solution for Speech-to-Text (STT) transcription, to enable accurate and efficient audio-to-text conversion.
But that's not all! We'll take it a step further by containerizing our application using Docker, ensuring that it runs consistently and efficiently in any environment. This approach not only simplifies deployment but also allows for scalability and easy management.
By the end of this tutorial, you'll have a fully functional, containerized Transcription API that can effortlessly convert audio files into text. Whether you want to automate your transcription tasks, enhance accessibility for your content, or explore the world of AI-powered applications, this project has you covered.
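
A minimal sketch of the core step described above, assuming the uploaded audio arrives as raw bytes: write it to a temporary file and pass the path to the model's `transcribe` method (the call `model.transcribe(temp.name)` is quoted in the comments). `transcribe_bytes` and `FakeModel` are hypothetical names used only for illustration; in the real application the model would presumably come from `whisper.load_model(...)` and the bytes from a FastAPI upload.

```python
import os
import tempfile


def transcribe_bytes(audio: bytes, model, suffix: str = ".wav") -> str:
    """Write uploaded audio to a temporary file and transcribe it.

    `model` is anything with a `transcribe(path)` method that returns a
    dict with a "text" key, which is how Whisper models behave.
    """
    # delete=False so the file can be reopened by name on every platform.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp:
        temp.write(audio)
    try:
        result = model.transcribe(temp.name)
        return result["text"]
    finally:
        os.remove(temp.name)  # always clean up the temporary file


class FakeModel:
    """Hypothetical stand-in so the sketch runs without Whisper installed."""

    def transcribe(self, path: str) -> dict:
        size = os.path.getsize(path)
        return {"text": f"received {size} bytes"}


text = transcribe_bytes(b"\x00" * 16, FakeModel())
```

Passing the model in as an argument keeps the helper independent of the web framework, so it can be exercised without Whisper or a running server.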
Don't forget to hit that "Like" button if you find this tutorial helpful, leave your questions and thoughts in the comments section below, and be sure to subscribe for more exciting AI and development tutorials.
GitHub Repo: github.com/AIAnytime
Whisper GitHub: github.com/openai/whisper
#openai #ai #python

Comments: 27

@shivamroy1775 8 months ago

Absolute quality content. So informative and I love how every step is explained in great detail.

@AIAnytime 8 months ago

Glad you liked it!

@chrisumali9841 4 months ago

Thanks for the demo and info, very informative and precise. I truly appreciate it. Easy to deploy. Have a great day.

@AIAnytime 4 months ago

Glad it was helpful!

@shubhbhalla3850 3 months ago

Great explanations, thank you so much for the tutorial!

@AIAnytime 3 months ago

You're very welcome!

@MrZelektronz 7 months ago

Judging solely from the title, this is exactly what I need. I hope it works as I expect :D Gonna keep watching.

@AIAnytime 7 months ago

Thanks 👍

@nicolassuarez2933 28 days ago

Outstanding!

@kshitizkhandelwal879 8 months ago

You are incredible. Can we get more end-to-end projects involving Docker?

@AIAnytime 8 months ago

Thanks... you can watch this as well. kzbin.info/www/bejne/bXTIcn1meseteqM

@nicolassuarez2933 27 days ago

What is the best way to deploy this container? AWS EC2 is kind of expensive, since it needs a lot of RAM.

@harshkadam3702 8 months ago

Hey, you made a video on the text-to-image API in the past. Can we create an API that uses checkpoints from Civitai, so that it can load multiple checkpoints and models, and then call that API? Is it possible?

@joshmay9531 4 months ago

Do you know if speaker diarization (breaking up the transcription by speaker) can be built into this?

@SonGoku-pc7jl 4 months ago

Thank you so much! One question: in the first version of Whisper you couldn't translate from English to Spanish. You could only do a .transcribe in one language or another, but not the translation. Do you know if Whisper v3 can now translate from English to Spanish? Or an updated WhisperX, or any other option? In truth, where I most want to use it is, for example, translating your videos, since the KZbin translator is very bad and it is difficult to follow you. If possible, could you make a video? ;)

@ryanbradbury3745 3 months ago

I notice you're pushing the audio file via the HTTP POST method. Is there any way to pull the file from a given location instead, e.g. an AWS S3 bucket or the file system?
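
One possible way to pull the audio instead of receiving it in the request body, sketched with the standard library only. The endpoint would accept a location string and resolve it to a local path before transcription. `fetch_audio` is a hypothetical helper, not code from the video, and the S3 case is assumed to be handled through a URL the server is allowed to read (for example a presigned URL) rather than through an AWS SDK.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def fetch_audio(source: str, suffix: str = ".wav") -> str:
    """Return a local file path for `source`.

    `source` may be a local path or a URL (http, https, or file).
    URL content is streamed to a temporary file, which the caller
    is responsible for deleting.
    """
    scheme = urlparse(source).scheme
    if scheme in ("http", "https", "file"):
        with urllib.request.urlopen(source) as response, \
                tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as temp:
            shutil.copyfileobj(response, temp)  # copy in chunks, not all at once
        return temp.name
    # No recognised URL scheme: treat it as a path on the local file system.
    path = Path(source)
    if not path.is_file():
        raise FileNotFoundError(source)
    return str(path)
```

The returned path can then be handed to `model.transcribe(...)`. Accepting arbitrary locations from clients has security implications, so a real service would likely restrict the allowed hosts or directories.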

@nguyentoanhnt a month ago

How can I use async with the code line: result = model.transcribe(temp.name)?
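
A minimal sketch of one option, assuming Python 3.9 or later: since `model.transcribe(...)` is a blocking call, it can be handed to a worker thread with `asyncio.to_thread` and awaited. `SlowModel` is a hypothetical stand-in for the Whisper model so that the example runs on its own.

```python
import asyncio
import time


class SlowModel:
    """Hypothetical stand-in for a Whisper model; transcribe() blocks."""

    def transcribe(self, path: str) -> dict:
        time.sleep(0.1)  # simulates the blocking inference call
        return {"text": f"transcript of {path}"}


model = SlowModel()


async def transcribe_async(path: str) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(model.transcribe, path)


async def main() -> list:
    # gather() returns the results in the order the awaitables were passed.
    return await asyncio.gather(
        transcribe_async("a.wav"),
        transcribe_async("b.wav"),
    )


results = asyncio.run(main())
```

This keeps the server responsive while a transcription is running; it does not make the inference itself faster. In FastAPI, declaring the endpoint with a plain `def` instead of `async def` has a similar effect, because such endpoints are run in a thread pool.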

@rois8888 7 months ago

When I run it in Postman, I put Content-Type: multipart/form-data in the headers, and in the Body I set the Key to "files" and upload the .wav file as the Value. For some reason I get "files: undefined". Maybe I'm supposed to do something different on Mac?

@josuechacon6240 5 months ago

I got the same error because I had named the parameter "Files". According to the FastAPI documentation, the parameter in the function has to be called "file": file: UploadFile. Then you can access the underlying file object: File = file.file
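
Two details are worth checking for this kind of error. First, FastAPI looks for a form field whose name matches the parameter name in the endpoint function, so the key sent by the client ("file" or "files") has to match whatever the function declares. Second, a multipart request needs a boundary in its Content-Type header, and a hand-typed `Content-Type: multipart/form-data` has none, so it is usually safer to let the client generate that header. The sketch below builds such a request body by hand to show where both pieces live; `build_multipart` is a hypothetical helper for illustration.

```python
import uuid


def build_multipart(field_name: str, filename: str, payload: bytes):
    """Build a multipart/form-data body with a single file part.

    Returns (content_type_header, body_bytes).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    # The same boundary must also appear in the Content-Type header.
    content_type = f"multipart/form-data; boundary={boundary}"
    return content_type, head + payload + tail


# "file" must match the parameter name declared by the endpoint.
content_type, body = build_multipart("file", "sample.wav", b"RIFF....")
```

In Postman this corresponds to removing the manual Content-Type header and setting the form-data key to the parameter name used in the endpoint.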

@RAVINDRACHOWDARY 4 months ago

Hi 👋, can you do this for Whisper JAX? 😉

@datasciencetoday7127 8 months ago

hero

@josuechacon6240 5 months ago

Does anyone know how to handle multiple requests and run them on different GPUs? I have four GPUs in the server, but the model and FastAPI only use one of them (number 0).
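
One way this could be approached, sketched under assumptions: load one model per GPU and hand out requests in round-robin order. `ModelPool` and `FakeModel` are hypothetical names; with real Whisper the loader might be something like `lambda d: whisper.load_model("base", device=d)`, where the model size "base" is only an example.

```python
import itertools
import threading


class ModelPool:
    """Hand out one model per device in round-robin order."""

    def __init__(self, load_model, devices):
        # One (model, lock) pair per device; the lock keeps each model
        # from serving two requests at the same time.
        self._slots = [(load_model(d), threading.Lock()) for d in devices]
        self._cycle = itertools.cycle(self._slots)
        self._cycle_lock = threading.Lock()

    def transcribe(self, path: str) -> dict:
        with self._cycle_lock:  # pick the next slot atomically
            model, lock = next(self._cycle)
        with lock:
            return model.transcribe(path)


class FakeModel:
    """Hypothetical stand-in that reports which device handled the call."""

    def __init__(self, device: str):
        self.device = device

    def transcribe(self, path: str) -> dict:
        return {"text": path, "device": self.device}


devices = [f"cuda:{i}" for i in range(4)]
pool = ModelPool(FakeModel, devices)
used = [pool.transcribe("a.wav")["device"] for _ in range(5)]
```

For the GPUs to work in parallel, the blocking `transcribe` calls still have to run in separate threads or processes. A simpler alternative is to start one container or worker per GPU and restrict each to a single device with the `CUDA_VISIBLE_DEVICES` environment variable.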

@concaption 8 months ago

The requirements file is incomplete. It is not working with the whisper library that I am using from PyPI.

@AIAnytime 8 months ago

You don't have to install Whisper from PyPI via requirements.txt. The Dockerfile takes care of it, as it builds directly from Git.

@concaption 8 months ago

@AIAnytime I finally figured it out. There were some issues in the newer version of the openai-whisper package. These pinned versions worked for me:
fastapi==0.78.0
uvicorn[standard]==0.23.2
aiofiles==23.2.1
python-multipart==0.0.6
torch==2.0.1
openai-whisper==20230314
tiktoken==0.3.1

@user-jf5ru5ow8u a month ago

What happens when I pass an 8 GB file?
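
How the API behaves with a file of that size depends on the deployment, which the video description does not specify. One defensive option, sketched here with hypothetical names (`copy_with_limit`, `UploadTooLarge`), is to copy the upload to disk in chunks and abort once a configured limit is exceeded, so that a huge file never has to fit in memory.

```python
import io


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured limit."""


def copy_with_limit(src, dst, max_bytes: int, chunk_size: int = 1024 * 1024) -> int:
    """Copy `src` to `dst` in chunks, aborting once `max_bytes` is exceeded.

    Returns the number of bytes written.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written
        written += len(chunk)
        if written > max_bytes:
            # Stop before writing the chunk that crosses the limit.
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        dst.write(chunk)


# Example with in-memory streams; a server would pass the upload's file
# object as `src` and a temporary file as `dst`.
out = io.BytesIO()
copied = copy_with_limit(io.BytesIO(b"x" * 10), out, max_bytes=10, chunk_size=4)
```

In an HTTP handler, `UploadTooLarge` would typically be translated into a 413 response.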

@nguyentoanhnt 13 days ago

How can I run this with a GPU? Currently, when I run the container, the code line DEVICE = "cuda" if torch.cuda.is_available() else "cpu" sets DEVICE to "cpu" even though my computer has a GPU. Thanks.
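
A likely cause is that a Docker container does not see the host's GPU unless it is started with the `--gpus` flag, which in turn requires the NVIDIA Container Toolkit on the host. The PyTorch build inside the image also has to be CUDA-enabled. The sketch below only assembles such a `docker run` command; the image name "whisper-api" and port 8000 are placeholders, not values taken from the video.

```python
import shlex


def docker_run_command(image: str, port: int = 8000, gpus: str = "all") -> list:
    """Build a `docker run` argument list that exposes GPUs to the container."""
    return [
        "docker", "run", "--rm",
        "--gpus", gpus,          # without this flag the container sees no GPU
        "-p", f"{port}:{port}",  # publish the API port
        image,
    ]


# "whisper-api" is a placeholder image name.
command = docker_run_command("whisper-api")
print(shlex.join(command))
```

If `torch.cuda.is_available()` still returns False with the flag set, running `nvidia-smi` inside the container is a quick way to check whether the driver is visible at all.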