What an outstanding explanation. GOATed this one. Keep going!
@bengonoobiang6633 · 1 month ago
Thank you for the explanation!! Also thank you for the blog.
@AzeAltmi · 2 months ago
How accurate?
@MAdMAX-rz8gx · 2 months ago
Hello, I need help with speech-to-speech translation. Can you contact me? I have a final project.
@drm8164 · 2 months ago
love you
@李薇-w3p · 2 months ago
How did you prepare the parallel speech data (SRC speech - TGT speech pairs)? Is there a detailed instructional video available? Thank you very much for sharing and your help.
@MAdMAX-rz8gx · 2 months ago
Hello, can you help me? I am doing a final college project on speech-to-speech machine translation. Can anyone contact me?
@cgyh68748 · 3 months ago
Nice!
@anbanglee98 · 3 months ago
Hey, waiting for your update!
@baazjhaj8125 · 3 months ago
Hello Prabhjot, were you able to get good results with your model? I trained mine on Fairseq's Japanese-to-English data, and after training for approximately 200 epochs the model only regenerated the training data when I ran inference.
@leemark7739 · 4 months ago
Will you update?
@jatinjoshi7549 · 4 months ago
Can I get the dataset you used?
@songli5145 · 4 months ago
How did you clip the fall and non-fall samples? Are they already sampled as fall and non-fall clips?
@shreyojitdas9333 · 4 months ago
Never seen someone explain it better. I love your content.
@medhnhadush9703 · 4 months ago
Thank you very much, great content.
@abdulwali2235 · 4 months ago
Thanks a lot, ma'am. Your research work is highly useful for me. I am working in this domain and facing difficulties; my area is also direct speech-to-speech translation. Please, I need help. Best regards.
@MAdMAX-rz8gx · 2 months ago
Same here. Can you contact me? I need help and don't know where to start.
@anbanglee2910 · 5 months ago
The training process is long and the data is large. Can I just terminate early and use one of the intermediate checkpoints?
@baazjhaj8125 · 5 months ago
Hello. I am attempting to preprocess the data from FLEURS as well. I was wondering how you went about renaming the wav files so that they have the same names in each folder? I am running into issues with reading from the tsv file. Thanks!
@prabhjotgosal2489 · 5 months ago
I will make a new video about processing the FLEURS data into the format that Meta's S2UT code needs. I have prior commitments at the moment and can't get to video-making for the next 2 weeks or so.
@baazjhaj8125 · 5 months ago
@@prabhjotgosal2489 that would be great. I feel like there are some issues with the dataset. I had to manually go through the en_us tsv file and update it which was pretty annoying.
@leemark7739 · 3 months ago
Did you make it?
@baazjhaj8125 · 3 months ago
@@leemark7739 yes. However, I ran into issues with model performance and training time.
@leemark7739 · 3 months ago
@@baazjhaj8125 Could you please share your code for processing the data? Which languages do you focus on?
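For anyone stuck on the same FLEURS preprocessing step discussed in this thread, here is a minimal sketch of one way to pair source- and target-language wavs by sample id. This is not the video author's actual code: the tab-separated layout, the column indices, and the id-based naming scheme are all assumptions to adjust for your split.

```python
# Sketch: give matching wav files the same name in each language folder by
# renaming them to "<sample_id>.wav", using a FLEURS-style tsv manifest.
# ASSUMPTIONS: tab-separated rows, sample id in column 0, wav filename in
# column 1. Check your tsv and adjust id_col/file_col accordingly.
import csv
import os
import shutil


def build_id_map(tsv_path, id_col=0, file_col=1):
    """Map sample id -> wav filename from a FLEURS-style tsv."""
    mapping = {}
    with open(tsv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) > max(id_col, file_col):
                mapping[row[id_col]] = row[file_col]
    return mapping


def rename_to_ids(tsv_path, wav_dir, out_dir, id_col=0, file_col=1):
    """Copy wavs into out_dir renamed to '<id>.wav', so the same sample
    ends up with the same filename in the source and target folders."""
    os.makedirs(out_dir, exist_ok=True)
    for sample_id, fname in build_id_map(tsv_path, id_col, file_col).items():
        src = os.path.join(wav_dir, fname)
        if os.path.exists(src):  # skip manifest rows whose audio is missing
            shutil.copy(src, os.path.join(out_dir, f"{sample_id}.wav"))
```

Running this once per language (with the matching tsv) yields folders where `0001.wav` in the source directory corresponds to `0001.wav` in the target directory.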
@leemark7739 · 5 months ago
What is the second column in dev.tsv? How do I get it?
@leemark7739 · 5 months ago
Will you talk about training unwritten languages like Hokkien?
@leemark7739 · 5 months ago
Hey, can you please share your code and files? By the way, what if a language is unwritten? How do you evaluate it with ASR?
@prabhjotgosal2489 · 5 months ago
Hi - I will share the code. Give me about 2 weeks or so. If we are dealing with an unwritten language, evaluation becomes harder. Take a look at the following post: ai.meta.com/blog/ai-translation-hokkien/ The authors developed their own method to evaluate an unwritten language, Hokkien. However, I have not read the details of their method yet, and it could be that it was specific to Hokkien.
@leemark7739 · 4 months ago
@@prabhjotgosal2489 thanks
@leemark7739 · 4 months ago
@@prabhjotgosal2489 Hey, are you free now?
@anbanglee98 · 3 months ago
@@prabhjotgosal2489 You are so busy 😂
@anbanglee98 · 22 days ago
@@prabhjotgosal2489 Waiting for you 😂
@SaifAbbas-c9p · 5 months ago
Clear, thanks.
@coryspencer7007 · 5 months ago
Hello Prabhjot! Thank you for your videos covering the SeamlessM4T library. I'm currently working with a cancer patient who suffered extensive damage to her vocalization abilities over the course of her cancer treatment. We are exploring whether we could use the SeamlessM4T model to help translate her current speech patterns back into a more easily understandable voice. Are you available for a quick chat via video call, by any chance? I would love to have your perspective.
@prabhjotgosal2489 · 5 months ago
Hi @coryspencer7007 - That is a very valuable use case! Kudos to you for using technology to give someone a better living experience. I doubt that SeamlessM4T can do what you are trying to do, but I have only scratched its surface so far. A new model called GAMA (arxiv.org/pdf/2406.11768) was just released today. Demos: 92c08e835dbe471711.gradio.live/ and 326e9a808b0c28063d.gradio.live/ Project website: lnkd.in/giZjmugU It is not an ASR model, however, so it cannot transcribe speech. But it can take non-speech audio as input and reason about it. I would explore this one: use the demo link, upload a sample audio you have, and see what the model can do for you. I would love to chat with you about this, but I am preoccupied with personal commitments for another 3 weeks. My LinkedIn contact is on my channel page; please feel free to use that for further discussion. In the near future, I plan to explore the model I shared above and make a video.
@Htk-es1ky · 5 months ago
Thanks for explaining this; there are very few resources available on Audio ML. Watched all of your videos in one go. Please come up with a video on a real-life audio classification problem.
@prabhjotgosal2489 · 5 months ago
Yes, will share an example.
@tiendungmai2292 · 5 months ago
Can you tell me how you get $SRC_AUDIO and $TGT_AUDIO? And what is the txt file in $TGT_AUDIO?
@santosotoso4287 · 6 months ago
Thanks for the explanation
@dr-bijay-kumar-singh · 6 months ago
I got no error message, but at print(len(transcripts_all_NoFall)) the output is zero, although I have kept 10 English audio samples in the directory. Because of this, when I proceed further, the output for WER is empty, with no error message. Can you share how you kept the audio samples in your directory? I am using your code on Colab.
@prabhjotgosal2489 · 6 months ago
Hi - The audio samples are simply kept as .wav files in the directory that I am reading from. This directory exists in my Google Drive. There are a few ways to root-cause your issue further. 1. Before you do any ASR with Whisper, check whether the audios are read properly. You could use librosa.load() to do so; check its output to see whether the values in the audio files are non-zero. 2. Make sure the audio files contain speech. If the Whisper model detects no speech in the audio, it will output nothing for the transcripts.
@dimtsel4044 · 6 months ago
You explained it very well, bravo
@ABAnuSaraReality · 7 months ago
Hi, I have a doubt. I uploaded the files to Google Colab via Google Drive, but the uploaded files and folders are not displayed even though I authenticated Google Drive with Colab. Is it okay to use VS Code, as I am familiar with it? Or is it compulsory to submit via Colab? Can I upload the VS Code-executed files to Colab?
@prabhjotgosal2489 · 7 months ago
Hi - What error do you get when you try to run the code in Colab? I am not sure I understand what you mean by "the VS Code executed files in Google Colab". Are you referring to audio files or something else?
@ABAnuSaraReality · 7 months ago
@@prabhjotgosal2489 I am referring to mounting the whole project in Google Colab, but I can't properly authenticate Google Drive in Colab and mount it.
@prabhjotgosal2489 · 7 months ago
@@ABAnuSaraReality You can run the entire code in VS Code locally on your machine. You may have to adjust some syntax and, of course, the file paths. I suggest creating a new file in VS Code and copying sections of code from my file little by little, rather than running the original file all at once. It will make debugging and understanding the code easier.
@arashroshanpoor1682 · 7 months ago
Really, this is the best presentation of attention. I googled many videos, but this one stands out.
@tarazdxs · 7 months ago
Thank you so much. Really good explanation!
@chezhian4747 · 7 months ago
Very good explanation. Thank you 🙏
@EngineerAnandu · 7 months ago
Clear.
@EngineerAnandu · 7 months ago
Very useful.
@itsworks723 · 7 months ago
Please help: when using expressivity_predict I get the error that expressivity_predict is gated. How do I fix it?
@yusufcan1304 · 7 months ago
Thanks, madam.
@yusufcan1304 · 8 months ago
Thank you, lifesaver.
@lindaflow5434 · 8 months ago
This was really helpful! Thank you very, very much!!
@Tarzan_of_the_Ocean · 8 months ago
HuBERT proposed by Hubert 😄
@prabhjotgosal2489 · 8 months ago
Haha! That would have been a great title for this paper.
@jacobdavies3761 · 8 months ago
Thanks for the help 👍
@pavangnetrakarnetrakar8019 · 8 months ago
Please make a video on DASpeech: arxiv.org/pdf/2310.07403.pdf
@kenturkey1971 · 8 months ago
What a waste! You showed literally nothing! Well, you showed you have no clue what you’re doing. Ugh.
@nutifafaakpeleasi2543 · 8 months ago
Thank you very much for this video. I love it.
@randalthor69 · 8 months ago
Can you provide the code or the GitHub repository? It would be helpful for a college project.
@SagarVerma-jl3cj · 9 months ago
Hi, thank you for this video. But I am getting an error that m4t_prediction was not found. Could you please help me resolve this issue?
@sauravmeher4899 · 9 months ago
Could you please create a tutorial on fairseq? Various modules, how to use it, how to use pre-trained models, etc. It would be really helpful.
@prabhjotgosal2489 · 9 months ago
Yes!! I will create more tutorials on fairseq in the near future. I am taking a short pause from making videos due to my upcoming PhD defense.
@sauravmeher4899 · 9 months ago
@@prabhjotgosal2489 Sure. Thanks for the tutorials so far! Best of luck with your defence! I wish that you walk out of that room with a PhD!