Speech Recognition in Python | finetune wav2vec2 model for a custom ASR model

  11,763 views

Python Lessons

Comments: 33
@infinitewebrevolution 9 months ago
Thank you so much, sir. Your hard work and pretrained model have helped me a lot, and I will always thank you.
@PyLessons 9 months ago
Glad to hear that! You are welcome.
@filofilo9695 1 month ago
Why do you use batch size = 1? Is it just to save RAM? Great tutorial, thank you!❤
@hugok6212 10 months ago
Excellent video and explanation. I have a question: if I train a model this way, can I use it for real-time speech recognition? Thank you.
@PyLessons 10 months ago
Hey, yes and no. It depends on what hardware you'll run the model on (CPU, GPU, or other) and on your "real time" requirements. You need to test it and you'll see :)
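One way to run the test suggested above is to measure a real-time factor (RTF): processing time divided by audio duration, where a value below 1.0 means the model keeps up with the audio on that hardware. This is only a minimal sketch. The names `real_time_factor` and `dummy_transcribe` are hypothetical, and the dummy function stands in for whatever inference call your trained model exposes.

```python
import time


def real_time_factor(transcribe, audio, audio_seconds, runs=3):
    """Average processing time per run divided by the audio duration.

    An RTF below 1.0 means transcription is faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    start = time.perf_counter()
    for _ in range(runs):
        transcribe(audio)
    elapsed_per_run = (time.perf_counter() - start) / runs
    return elapsed_per_run / audio_seconds


def dummy_transcribe(audio):
    # Placeholder (assumption): replace with the real model's inference call.
    time.sleep(0.01)
    return "hello"


if __name__ == "__main__":
    one_second_of_silence = [0.0] * 16000  # 1 s of samples at 16 kHz
    rtf = real_time_factor(dummy_transcribe, one_second_of_silence, audio_seconds=1.0)
    print(f"RTF: {rtf:.3f}")
```

Run it on the target machine with audio clips of the length you expect in production, since the result depends on both the hardware and the clip length.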
@SagarPatel-h1g 1 month ago
After training and testing, how can we use this ONNX model for a speech-to-text task?
@shafiqrhmankeliwall8019 9 months ago
Hi, great job, keep it up. I have one question: I want to build and train a model for a low-resource language such as Pashto, and I will make a dataset from scratch. Any idea how to start, or any useful links? Thanks.
@PyLessons 9 months ago
Thanks! I don't recommend making a dataset from scratch on your own; I believe you should be able to find something open source. I don't have a dataset, but check my dataset structure and you'll see what format is required.
@BrightShoko-m7c 10 months ago
Good job👏 ... but I'm getting errors on ONNX installation. What Python version did you use?
@PyLessons 10 months ago
I used Python 3.10. What error do you get? Often it is related to the protobuf version.
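When reporting an installation error like this, it helps to include the Python version and the installed versions of the relevant packages. The sketch below uses only the standard library; the helper names and the default package list are assumptions for illustration, not something taken from the tutorial.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("onnx", "onnxruntime", "protobuf")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```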
@konami_cheater 1 month ago
Can you give me the link to the paper? 0:57
@victormessias107 9 months ago
When I'm training, it freezes at the end of the first epoch. Any idea?
@PyLessons 9 months ago
It shouldn't be like that; try to debug it. For example, iterate through the training data provider and the validation data provider (e.g. "for data in data_provider") and check whether each one can reach the end. If you still face these issues, open an issue on GitHub with more details.
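The debugging step described above could look roughly like this. It assumes only that the data provider is iterable, as the "for data in data_provider" hint implies; `check_provider` is a hypothetical helper, and the lists at the bottom stand in for the real training and validation providers.

```python
def check_provider(provider, name, report_every=100):
    """Iterate once over a data provider and return how many batches it yields.

    If this call never returns, the hang is in data loading, not in training.
    """
    count = 0
    for count, batch in enumerate(provider, start=1):
        if batch is None:
            raise ValueError(f"{name}: batch {count} is None")
        if count % report_every == 0:
            print(f"{name}: read {count} batches so far")
    print(f"{name}: reached the end after {count} batches")
    return count


if __name__ == "__main__":
    # Stand-ins (assumption): pass the real train and validation providers here.
    train_data_provider = [("audio", "label")] * 250
    val_data_provider = [("audio", "label")] * 30
    check_provider(train_data_provider, "train")
    check_provider(val_data_provider, "validation")
```

The progress print shows the last batch count reached before a hang, which narrows the problem down to a specific sample in the dataset.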
@AmitYadav-rp3ot 1 year ago
Hi there, great video! I wanted to know your opinion on training a model like this just for recognising numbers and a couple of words from an audio file. Will such custom training help to reduce the size of the model? I want to create a very small model so that I can run it on a sub-GHz CPU. Please share what you think. Many thanks.
@PyLessons 1 year ago
Hi, thanks! No, training a model on simpler data doesn't reduce the model size. Check my other videos on creating your own custom model for simpler data, such as numbers and words. But if your vocabulary is small, maybe you should treat it as a classification task. Also, to reduce the size of the model, look into quantization and pruning techniques.
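To illustrate why quantization shrinks a model: weights stored as 32-bit floats are mapped to 8-bit integers plus one shared scale, which cuts storage to roughly a quarter at the cost of a small rounding error. The sketch below shows symmetric int8 quantization on a plain list of numbers. It is a toy illustration with hypothetical function names, not the tutorial's code and not how any particular toolkit implements it.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("int8 values:", quantized)
    print(f"largest rounding error: {worst:.5f}")
```

Pruning is a separate technique that removes weights instead of storing them with fewer bits, and it is not shown here.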
@maimunahmaskur7525 7 months ago
It's great code! Could you please help? If I want to use this code for a dataset labeled with phonemes and use PER (Phoneme Error Rate) for testing and validation, what should I do? I mean, which parts of the code do I need to adjust? Thank you!
@PyLessons 6 months ago
I am not familiar with PER, so I can't tell you.
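For reference, PER is usually computed the same way as word error rate, only over phoneme tokens: the edit distance between the predicted and reference phoneme sequences, divided by the number of reference phonemes. The sketch below assumes each label is a string of space-separated phonemes. The function names are hypothetical, and it does not show which parts of the tutorial code would need to change.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two token sequences."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_token in enumerate(reference, start=1):
        current = [i]
        for j, hyp_token in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_token == hyp_token else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def phoneme_error_rate(references, hypotheses):
    """Total edit distance divided by total number of reference phonemes."""
    total_errors = 0
    total_phonemes = 0
    for reference, hypothesis in zip(references, hypotheses):
        ref_tokens = reference.split()
        total_errors += edit_distance(ref_tokens, hypothesis.split())
        total_phonemes += len(ref_tokens)
    if total_phonemes == 0:
        raise ValueError("references contain no phonemes")
    return total_errors / total_phonemes


if __name__ == "__main__":
    references = ["HH AH L OW", "K AE T"]
    hypotheses = ["HH AH L OW", "K AH T"]
    print(f"PER: {phoneme_error_rate(references, hypotheses):.3f}")
```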
@djrocks5678 1 year ago
Hi there! Thanks a lot for this. I wanted to ask you: I am working on a desktop voice assistant project as part of my university work, and I want to train my own speech recognition model. How would I go about this? I looked at datasets, and something like Mozilla's 79 GB dataset is too much for my needs, so I was wondering how I'd go about making a smaller-scale speech recognition model for my project.
@PyLessons 1 year ago
Hi, usually it's impossible to get great results without huge datasets and GPU computing. But you may try to create a custom ASR model with another tutorial of mine, which you can check here: kzbin.info/www/bejne/bpjCZX99Z9GjiJo. Also, there are a lot of trained ASR models that you usually only need to integrate (just an idea).
@BASDOURI 8 months ago
Your contact, please?
@N3ONGNCS 6 months ago
I want to create an ASR model for an African vernacular/local language. Could I use this for that? I'll create my own dataset if need be. Or what would you suggest? I'm attempting this for the first time and am a little lost and overwhelmed.
@daisy-bot-py 1 month ago
I'm working on a similar project and I'm just curious to know if you trained the model on an African language?
@mohamedabdiaziz5993 7 months ago
My final university project is a system like this. I need help; I have prepared my own dataset.
@PyLessons 6 months ago
I already helped by creating this step-by-step tutorial, mate :)
@mohamedabdiaziz5993 6 months ago
@PyLessons I am facing an error. I have already spent the last 4 months preparing my own dataset.
@mohamedabdiaziz5993 6 months ago
@PyLessons I tried, but the WER is 100% even after the model has trained for 7200 steps.
@Ogamp 1 year ago
Thank you for this. Could you please walk me through building an ASR model for recognizing regional accents? How can I contact you? Thanks.
@PyLessons 1 year ago
Your task is a classification task. Try to google it; usually it's done with only an encoder model.
@PyCode.academe 11 months ago
God bless you!
@PyLessons 11 months ago
You are welcome :)
@山田林-f5b 1 month ago
06:05
@aqibmumtaz1262 1 year ago
Great
@PyLessons 1 year ago
Thanks