End-to-End Adversarial Text-to-Speech (Paper Explained)

14,488 views

Yannic Kilcher

A day ago

Comments: 39
@HarisGulzar-d9c 9 months ago
Never enjoyed paper explanations this much. Thanks, Yannic!
@MrAmirhossein1 4 years ago
Hey Yannic! Just wanted to thank you for the excellent content that you provide. Keep it up, man :)
@rishabhkumar722 10 months ago
Wow... why not more TTS paper explanations?
@zhivebelarus560 4 months ago
Yannic, thanks for doing this! Quick question: why, instead of fiddling with the aligner, did they not start training on smaller samples, e.g. one phoneme long, and then gradually increase the sample length to 2, 3, etc. as the loss drops? There seems to be too much black magic involved in training a TTS model. Do you have a suggestion for the cleanest architecture that works well? Is there a good review of one-step TTS models? How can a speaker embedding be integrated into such a model for voice cloning? Sorry for too many questions…
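On the voice-cloning question above: a minimal sketch of one common way a speaker embedding can condition a single-pass TTS generator, by broadcasting it along the time axis and concatenating it with the text encoding. This is not the paper's architecture; all module names and dimensions below are illustrative assumptions.

```python
# Minimal sketch (not the paper's architecture): condition a single-pass TTS
# generator on a speaker embedding by broadcasting it along the time axis and
# concatenating it with the text encoding. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class SpeakerConditionedTTS(nn.Module):
    def __init__(self, vocab_size=100, n_speakers=10, text_dim=256, spk_dim=64, hidden=256):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, text_dim)   # phoneme/character tokens -> vectors
        self.spk_embed = nn.Embedding(n_speakers, spk_dim)     # speaker lookup table
        self.encoder = nn.Conv1d(text_dim + spk_dim, hidden, kernel_size=3, padding=1)
        self.decoder = nn.Conv1d(hidden, 1, kernel_size=1)     # stand-in for an upsampling waveform decoder

    def forward(self, tokens, speaker_id):
        # tokens: (batch, seq_len) phoneme ids; speaker_id: (batch,)
        t = self.text_embed(tokens)                             # (batch, seq_len, text_dim)
        s = self.spk_embed(speaker_id).unsqueeze(1)             # (batch, 1, spk_dim)
        s = s.expand(-1, t.size(1), -1)                         # broadcast the speaker vector along time
        x = torch.cat([t, s], dim=-1).transpose(1, 2)           # (batch, channels, seq_len)
        return self.decoder(torch.relu(self.encoder(x)))        # (batch, 1, seq_len) toy "waveform"

model = SpeakerConditionedTTS()
wave = model(torch.randint(0, 100, (2, 50)), torch.tensor([3, 7]))
print(wave.shape)  # torch.Size([2, 1, 50])
```

For cloning an unseen voice from a reference clip, the lookup table would typically be replaced by a speaker encoder that produces the embedding from reference audio.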
@rvalusa 4 years ago
Awesome. Superb explanation. Love the channel and content 👍👏🙂
@revanthadiga329 3 years ago
Does anyone know where to find the code implementation?
@kimchi_taco 4 years ago
Thank you! It includes so many ad-hoc tricks. I wonder why it's better than the combination of Tacotron + WaveNet?
@motherbear55 3 years ago
Quality-wise it's not better than Tacotron (see the MOS scores in the paper: Tacotron is about 4.5, this approach is about 4.0). But unlike Tacotron, it's not autoregressive, so inference can be much faster.
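To illustrate the speed argument above: a non-autoregressive model emits the whole waveform in one forward pass, while an autoregressive model must run one step per sample. The toy sketch below uses trivial placeholder models purely to show the difference in inference structure; it is not the paper's or Tacotron's code.

```python
# Toy contrast of autoregressive vs. non-autoregressive inference (placeholder
# models, not the paper's): the autoregressive loop runs T sequential steps,
# while the feed-forward generator produces all T samples in a single pass.
import time
import torch
import torch.nn as nn

T = 4000                              # number of audio samples to generate

autoregressive = nn.Linear(1, 1)      # predicts the next sample from the previous one
feed_forward = nn.Linear(16, T)       # maps one conditioning vector to all T samples

with torch.no_grad():
    # Autoregressive: T dependent steps, cannot be parallelized over time.
    x = torch.zeros(1, 1)
    start = time.time()
    samples = []
    for _ in range(T):
        x = autoregressive(x)
        samples.append(x)
    print(f"autoregressive:     {time.time() - start:.4f}s for {T} samples")

    # Non-autoregressive: one parallel forward pass produces the whole waveform.
    start = time.time()
    waveform = feed_forward(torch.randn(1, 16))
    print(f"non-autoregressive: {time.time() - start:.4f}s for {waveform.shape[-1]} samples")
```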
@alaapdhall8541 4 years ago
Ah, always so fast! I heard Google released pre-trained weights for Big Transfer. Could you also make a video on BiT?
@alaapdhall8541 4 years ago
@Mallow Marsh Oh OK, I'll go through his videos then.
@Haapavuo 2 years ago
Whose videos? The comment was deleted. Thanks.
@СергейПавлович-г2и 4 years ago
Can I try it somewhere?
@YannicKilcher 4 years ago
Not sure. I've linked their website in the description.
@ushasr2821 3 years ago
Great explanation! Thank you so much.
@hannesstark5024 4 years ago
Visual Transformers tomorrow?
@shivamraisharma1474 4 years ago
Amazing! Do we have any GitHub code or pretrained model weights available?
@YannicKilcher 4 years ago
I don't think so
@avihudekel4709 3 years ago
Great work!
@bossgd100 4 years ago
Does it work in real time?
@herp_derpingson 4 years ago
Anything can be real time if you have enough compute
@YannicKilcher 4 years ago
I don't think so
@bossgd100 4 years ago
@herp_derpingson The singularity is far 😵
@koheimatsuura3610 4 years ago
@YannicKilcher Hi :) Why do you think so? This seems to be a non-autoregressive model, so I'd expect its inference to be quite fast...
@henkjekel4081 8 months ago
You're the best
@myungchulkang5716 4 years ago
Nice!
@screenapple1660 4 years ago
People want a realistic TTS voice that sounds like a high-quality human, not a robot voice. Robot voices are usually free, but they sound stupid. Most businesses use high-quality human voice synthesis.
@ziqiangshi8167 4 years ago
Awesome.
@DinaEl-Kholy-- 4 years ago
Thank you!!
@snippletrap 4 years ago
I think Tacotron sounds better
@yabdelm 4 years ago
I absolutely love the content, but I vote for not saying "As always, if you like this work, subscribe." I believe if people are exploring AI videos, they probably know where the subscribe button is, and if they like the videos, they'll probably subscribe. Plus, we've heard it a billion times in every video on YouTube ever made. It just becomes noise at a certain point. At this point I'm thinking of training an AI to skip ahead every time someone says that. Nevertheless, they're your videos and it's your personal choice, not a democracy. Feel free to disagree. Don't mean to be mean or anything.
@lakshay510 4 years ago
Hi, but I don't agree with you. When I'm doing any kind of research I just open tens of tabs and start exploring them one by one, and sometimes, if I find the right content, I learn the stuff and leave. Also, the analytics that YouTube provides might show that most of his viewers are not subscribers.
@yabdelm 4 years ago
@Lakshay Chhabra Do you think the majority of people will subscribe because he reminded them to subscribe? I don't doubt that it might occur, as I really have no way of checking that. I agree that some way of determining that from the analytics would be better.
@siyn007 4 years ago
For me, I usually watch a few trial videos before I subscribe, but I must admit that being told to subscribe prompts me to evaluate whether I should subscribe instead of just exiting, like Lakshay suggested. I agree with not telling people where the subscribe button is, though.
@yabdelm 4 years ago
@siyn007 Oh sorry, but I don't think Yannic specified where the subscribe button was; I just meant the part about asking whether or not to subscribe. I see, good to know there's the opposite take. It's definitely not the end of the world. :D I still love Yannic and his videos.
@YannicKilcher 4 years ago
This is one of the things that, yes, is slightly annoying, but you'd be surprised how many people who aren't subscribed go "oh yes, I could do that". So I try to give you the high level before I say that so that you can decide to skip the video without having to listen to it :)
@bossgd100 4 years ago
First!