I really love the way you explain. You use very standard language that the old pre-deep-learning folks are familiar with.
@Vroomerify 2 years ago
This was great! Your explanation of batch normalization is by far the most intuitive one I've found.
@lakerstv3021 6 years ago
Love your content! Some of the best explanations on the internet. Would be amazing if you could go through the Neural ODE or Taskonomy paper next.
@dvirginz4001 5 years ago
That's the best video on YouTube about batchnorm, thanks for going over the paper.
@deepakarora987 5 years ago
This is a really cool explanation, would love to hear more from you.
@KulkarniPrashant 7 months ago
Absolutely amazing explanation!
@黃一-h6b 4 years ago
Thank you so much for your intro. I had a hard time grasping the concepts; this helped a lot. Thank you :)
@madhavimehta6010 2 years ago
Thanks for putting in the effort to explain it in a simpler manner.
@ross825 5 years ago
Why don’t you have more subscribers? So helpful!!!
@manasagarwal9485 4 years ago
Hi, thanks a lot for these amazing paper reviews. Can you make a video about layer normalization as well, and why it is more suited for recurrent nets than batch normalization?
@payalbhatia5244 4 years ago
@Yannic Kilcher Thanks, it has been the best explanation. You simplified the maths as well. Would request you to explain all the recent papers in the same way. Thanks!
@tudor6210 4 years ago
Thanks for going over the paper!
@Konstantin-qk6hv 3 years ago
Thank you for the review. I like to watch your videos instead of reading the paper.
@Laszer271 4 years ago
Batch Normalization doesn't reduce internal covariate shift; see "How Does Batch Normalization Help Optimization?" (arXiv:1805.11604).
@Engidea 4 years ago
What app are you using to edit and write on the paper?
@MLDawn 3 years ago
Absolutely beautiful, man. I love how you mentioned the model.train and model.eval coding implication as well. Out of curiosity: 1) What software are you using to show the paper (not Adobe, right?) 2) What kind of drawing pad are you using? I have a Wacom, but since I cannot see what I'm doing on it, it is really annoying to teach with.
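For anyone wondering what that model.train()/model.eval() implication looks like in practice, here is a minimal PyTorch sketch (the feature count, batch, and shift/scale values are arbitrary choices for illustration): in training mode BatchNorm normalizes with the current batch's statistics and updates its running estimates; in eval mode it reuses those stored estimates instead.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(num_features=4)   # learnable gamma/beta plus running stats for 4 features
x = torch.randn(8, 4) * 3 + 5         # a batch that is clearly not zero-mean / unit-variance

bn.train()                            # training mode: normalize with this batch's mean/var
y_train = bn(x)                       # (also updates running_mean / running_var as a side effect)

bn.eval()                             # eval mode: normalize with the stored running estimates,
y_eval = bn(x)                        # so the output no longer depends on the rest of the batch

print(y_train.mean(dim=0), y_train.var(dim=0, unbiased=False))  # roughly 0 and 1 per feature
print(bn.running_mean, bn.running_var)                          # nudged toward ~5 and ~9
print(y_eval.mean(dim=0))             # not ~0: eval mode used the running estimates, which have
                                      # only partially caught up after a single update
```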
@abdulhkaimal0352 3 years ago
Why do we normalize the data and then multiply by gamma and add beta? I understand it produces a better distribution of the data, but can't we just multiply by gamma and add beta without even doing the normalization part?
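One way to see why the normalization step matters, sketched in NumPy (the shapes and the gamma/beta values here are made up for illustration): without the normalization, `gamma * x + beta` is just another affine map that the preceding linear layer could already express, whereas normalizing first pins the batch statistics, and gamma/beta then let the network pick whatever scale and shift it actually wants, including undoing the normalization entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 2))   # activations: mean ~5, std ~3 per feature

eps = 1e-5
mean, var = x.mean(axis=0), x.var(axis=0)
x_hat = (x - mean) / np.sqrt(var + eps)             # step 1: normalize -> mean ~0, var ~1

gamma, beta = np.array([2.0, 0.5]), np.array([-1.0, 3.0])
y = gamma * x_hat + beta                            # step 2: learned per-feature scale/shift

# Skipping step 1 would make "gamma * x + beta" just another affine map, which the previous
# linear layer could already express. Normalizing first fixes the batch statistics; gamma/beta
# then restore the freedom to choose the output scale (e.g. gamma = sqrt(var), beta = mean
# would recover the original activations).
print(y.mean(axis=0), y.std(axis=0))                # roughly beta and |gamma|
```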
@shinbi880928 5 years ago
I really like it, thank you! :)
@wolftribe66 5 years ago
Could you make a video about group normalization from FAIR?
@BlakeEdwards333 5 years ago
Awesome thanks!
@kynguyencao1630 a month ago
Thank you so much
@matthewtang1489 4 years ago
Wouldn't it be cool for some professors to make students derive the derivatives on the test =)
@hoangnhatpham8076 3 years ago
I had to do that in my DL exams. Just feedforward though, nothing this involved =)
@adhiyamaanpon4168 4 years ago
Can someone clear up the following doubt: will the gamma and beta values be different for each input feature in a particular layer?
@YannicKilcher 4 years ago
Yes
@adhiyamaanpon4168 4 years ago
@@YannicKilcher Thanks a lot!!
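A quick PyTorch check of this per-feature point (the feature and channel counts below are arbitrary); in these layers `weight` is gamma and `bias` is beta:

```python
import torch.nn as nn

bn = nn.BatchNorm1d(num_features=64)       # fully connected case: one (gamma, beta) per feature
print(bn.weight.shape, bn.bias.shape)      # torch.Size([64]) torch.Size([64])

bn2d = nn.BatchNorm2d(num_features=32)     # conv case: one (gamma, beta) per channel,
print(bn2d.weight.shape, bn2d.bias.shape)  # shared across all spatial positions: torch.Size([32]) each
```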
@AvielLivay 4 years ago
But why do you need the gamma/beta? What's wrong with just shifting to mean 0, variance 1? And also, how do you train them? You mean they are part of the network and so they are trained, but I thought we want things not to be shaky, yet you are actually adding these, which add to the 'shakiness'... what's the point?
@rbhambriiit a year ago
The idea is to learn the better representation: identity, or normalized, or something in between. Think of it as data preprocessing.
@rbhambriiit a year ago
Agreed, it does question the original hypothesis/definition of normalization at the input layer as well.
@rbhambriiit a year ago
It's not that shaky. It's another layer trying to learn better data dimensions. With images, identity layers work well, so the batchnorm learning should effectively reverse the mean/variance shift.
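On the "how do you train them" part of the question above: in frameworks such as PyTorch, gamma and beta are ordinary learnable parameters, so the same backward pass and optimizer step that update the weights also update them. A minimal sketch, with made-up layer sizes:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 4), nn.BatchNorm1d(4), nn.ReLU(), nn.Linear(4, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)    # gamma/beta are inside model.parameters()

x, target = torch.randn(16, 10), torch.randn(16, 1)
loss = nn.MSELoss()(model(x), target)
loss.backward()                                      # gradients flow into gamma/beta as well
opt.step()                                           # ...and the optimizer updates them like any weight

bn = model[1]
print(bn.weight.grad.shape, bn.bias.grad.shape)      # torch.Size([4]) each: one gradient per feature
```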
@yamiyagami7141 6 years ago
Nice video! You might also want to check out "How Does Batch Normalization Help Optimization?" (arxiv.org/abs/1805.11604), presented at NeurIPS18, which casts doubt on the idea that batchnorm improves performance through reduction in internal covariate shift.
@dlisetteb 3 years ago
I really can't understand it.
@yasseraziz1287 4 years ago
YOU DA MAN LONG LIVE YANNIC KILCHER
@michaelcarlon1831 6 years ago
All of the cool kids use SELU
@ssshukla26 4 years ago
If not more, then an explanation almost as complicated as the one in the paper.
@garrettosborne4364 4 years ago
You got lost in the weeds on this one.
@rbhambriiit a year ago
The backprop was a bit unclear and perhaps the hardest bit.
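For anyone who found the backprop section hard to follow, here is a compact NumPy rendering of the backward pass implied by the paper's gradient equations (the function and variable names are chosen here, not the video's notation, and this is a sketch rather than a drop-in implementation):

```python
import numpy as np

def batchnorm_backward(dout, x, gamma, eps=1e-5):
    """Gradients of y = gamma * x_hat + beta w.r.t. x, gamma, and beta.
    dout, x have shape (N, D); gamma has shape (D,).
    Chains through x_hat, the batch variance, and the batch mean."""
    N = x.shape[0]
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    std = np.sqrt(var + eps)
    x_hat = (x - mean) / std

    dgamma = np.sum(dout * x_hat, axis=0)            # gradient w.r.t. the scale
    dbeta = np.sum(dout, axis=0)                     # gradient w.r.t. the shift

    dx_hat = dout * gamma                            # back through the affine step
    dvar = np.sum(dx_hat * (x - mean) * -0.5 * (var + eps) ** -1.5, axis=0)
    dmean = np.sum(-dx_hat / std, axis=0) + dvar * np.mean(-2.0 * (x - mean), axis=0)
    dx = dx_hat / std + dvar * 2.0 * (x - mean) / N + dmean / N
    return dx, dgamma, dbeta
```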