Deep Learning 20: (2) Variational AutoEncoder : Explaining KL (Kullback-Leibler) Divergence

36,273 views

Ahlad Kumar

A day ago

It's part of a mini lecture series on Variational Auto-Encoders, which is divided into six lectures. This is the second lecture, which goes through the detailed derivation of the Kullback-Leibler divergence and the insight behind it.
#autoencoder #variational #generative
Implementation by Andrew Spanopoulos
github.com/And...

Comments: 54
@vatsalamolly 3 years ago
Can't thank you enough! Thank you for explaining in such detail. I had been stuck understanding this proof for hours and it was driving me crazy!
@xruan6582 4 years ago
Super detailed explanation. Human beings need more teachers like this. A small question: at 11:47, why did the 1/2 disappear in tr[∑₁(1/2)∑₁⁻¹]? Why do we end up with k rather than k/2?
@yashkhurana9170 4 years ago
I have the same question. Did you find an answer to it?
@xruan6582 4 years ago
@@yashkhurana9170 not yet
@MrSouvikdey 4 years ago
There will be a 1/2 term as well. In the final expression of D_KL (after the proof), he takes the 1/2 out as a common factor, so it has to be k/2.
@moliv8927 2 years ago
It's a typo; he missed it.
@moliv8927 2 years ago
@@yashkhurana9170 It's a typo; he missed it.
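For anyone stuck on the same step, here is a sketch of the calculation the thread is discussing, assuming p = N(μ₁, Σ₁) and q = N(μ₂, Σ₂) are k-dimensional Gaussians and the expectation is taken under p:

```latex
\[
\mathbb{E}_{p}\left[\tfrac{1}{2}(x-\mu_1)^{\top}\Sigma_1^{-1}(x-\mu_1)\right]
  = \tfrac{1}{2}\,\mathbb{E}_{p}\left[\operatorname{tr}\left(\Sigma_1^{-1}(x-\mu_1)(x-\mu_1)^{\top}\right)\right]
  = \tfrac{1}{2}\operatorname{tr}\left(\Sigma_1^{-1}\Sigma_1\right)
  = \tfrac{k}{2}
\]
```

On its own the term is k/2; in the final formula for D_KL the overall factor of 1/2 is pulled out front, which is why a bare k appears inside the bracket, consistent with the replies above.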
@oriabnu1 4 years ago
Truly excellent. I have studied a lot of papers and lectures but could not remove my confusion; now I feel comfortable. Thanks again.
@sabazainab1524 3 years ago
Wow, Amazing tutorial. Thank you for all of your videos on every topic.
@dailyDesi_abhrant 4 years ago
I am so happy I found you, I was struggling to get a clear idea of this concept !
@VarounsVlogs 5 years ago
Damn, this is awesome. THANKS!!!!
@IgorAherne 2 years ago
Thank you so much for this. The way you structure it, the examples, is great
@omniscienceisdead8837 2 years ago
simple and straight to the point , thank you
@BiranchiNarayanNayak 5 years ago
Excellent tutorial.
@yangzhou6314 2 years ago
Best explanation I have seen so far!
@SP-db6sh 3 years ago
Thank you very much for such a simple explanation.
@bekaluderbew19 6 months ago
God bless you brother
@nitinkumarmittal4369 3 years ago
Thank you Sir for the explanation!
@rosyluo7710 4 years ago
very clean explanation! Thanks dr.
@johnbmisquith2198 2 years ago
Great details on WGAN
@navaneethmahadevan2458 4 years ago
Gaussian distributions are supposed to be for continuous random variables, right? How are we using them for a discrete random variable that can take K possible states here? Shouldn't we consider an integral here? Please correct me if I am wrong; I'm totally new to machine learning.
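On the continuous-versus-discrete point: K here is the dimensionality of the continuous Gaussian vector, not a count of discrete states, and the expectations are indeed integrals. A sketch of the definition being used, assuming densities p and q on R^K:

```latex
\[
D_{\mathrm{KL}}(p \,\|\, q)
  = \int_{\mathbb{R}^{K}} p(x)\,\log\frac{p(x)}{q(x)}\,dx
  = \mathbb{E}_{x\sim p}\left[\log p(x) - \log q(x)\right]
\]
```

The summation form of the same quantity is the discrete analogue; writing everything as E_p[·] covers both cases, which is what the lecture does.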
@nikiamini2768 3 years ago
Thank you so, so much! This is super helpful!
@bozhaoliu2050 3 years ago
Excellent tutorial
@adityaprakash256 5 years ago
this is too good. thank You
@bosepukur 5 years ago
one of the very best
@Vighneshbalaji1 4 years ago
E_p(x) = µ is fine, but at 16:05 you said that E_p(x) = µ_1. I think that term was coming from Q(x), so it should be µ_2, right?
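A short note on this point, going only from the definition used in the derivation: every expectation in D_KL(p‖q) is taken under p, even for terms that originate from log Q(x), so the mean that appears is µ₁:

```latex
\[
D_{\mathrm{KL}}(p \,\|\, q) = \mathbb{E}_{x\sim p}\left[\log p(x) - \log q(x)\right],
\qquad
\mathbb{E}_{x\sim p}[x] = \mu_1
\]
```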
@saadanwer2106 5 years ago
Amazing tutorial
@RealMcDudu 4 years ago
Excellent video for showing the KL for 2 Gaussians. A bit too fast, but luckily YouTube has 0.75 speed :-)
@guofangtt 5 years ago
Thanks for your tutorial. I have one question: at 13:49, why does (B^T)A = (A^T)B? Thanks!
@NeerajAithani 4 years ago
Earlier we applied the trace trick because A^T X A is a scalar. Here B^T X A has the same dimensions as A^T X A, so B^T X A is a scalar as well, and we can add the two scalars.
@coolankush100 4 years ago
Yes, we can certainly add them because they have the same dimension, i.e. 1x1, a scalar. It is also worth mentioning that they are indeed equal. You might want to take some matrices and vectors to try to convince yourself.
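To spell out why the two cross terms are not only addable but equal: a 1×1 matrix equals its own transpose, and the inverse covariance is symmetric. A sketch in the thread's notation, with X standing for the symmetric inverse-covariance matrix:

```latex
\[
B^{\top} X A
  = \left(B^{\top} X A\right)^{\top}
  = A^{\top} X^{\top} B
  = A^{\top} X B
\qquad \text{since } X = X^{\top} \text{ and } B^{\top} X A \text{ is } 1 \times 1
\]
```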
@jerrycheung8158 2 years ago
Great video! But I wonder why we can't use the same operation on the second term, (1/2)(x − mean2)^T × (inverse of covariance matrix 2) × ..., to get K, the same as for the first term?
@compilations6358 4 years ago
Great lectures, but I think you should have shown the derivation of the loss through MLE.
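For reference, the MLE route mentioned here is usually sketched as follows (standard VAE notation assumed, with encoder q_φ(z|x), decoder p_θ(x|z), and prior p(z)); maximizing the log-likelihood directly is intractable, so one maximizes the ELBO instead:

```latex
\[
\log p_{\theta}(x)
  = \underbrace{\mathbb{E}_{q_{\phi}(z|x)}\left[\log p_{\theta}(x|z)\right]
    - D_{\mathrm{KL}}\left(q_{\phi}(z|x)\,\|\,p(z)\right)}_{\text{ELBO}}
  + D_{\mathrm{KL}}\left(q_{\phi}(z|x)\,\|\,p_{\theta}(z|x)\right)
  \;\ge\; \text{ELBO}
\]
```

The KL term inside the ELBO is exactly the Gaussian-versus-Gaussian KL derived in this lecture when both q_φ(z|x) and p(z) are Gaussian.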
@apocalypt0723 4 years ago
awesome
@not_a_human_being 4 years ago
great content, but sometimes we can hear your mic scratching against something, 6:48 . Not a big deal!
@satyamdubey4110 7 months ago
💖💖
@codewithyouml8994 2 years ago
I have a question: when you wrote the first part at 8:37, we got a clean answer, K (via the trace), but I don't understand the need for replacing mu_2 with mu_1 in the second part at 12:30. Couldn't we do the same as in the first part, get some other constant K2, and take their difference? What is the need for that? Thank you for the video.
@kwippo 4 years ago
14:31 - the constant expression should be multiplied by 0.5. Doesn't really matter because it was denoted by beta.
@husamalsayed8036 3 years ago
What is the intuition behind the KL divergence of two distributions not being symmetric? To me it seems like it should be symmetric.
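A small numerical example (my own, not from the video) showing why the two directions differ: take Bernoulli distributions p and q with success probabilities 0.5 and 0.9. Because the average is weighted by the first argument, the two directions penalize the disagreement differently:

```latex
\[
D_{\mathrm{KL}}(p\,\|\,q) = 0.5\ln\tfrac{0.5}{0.9} + 0.5\ln\tfrac{0.5}{0.1} \approx 0.51,
\qquad
D_{\mathrm{KL}}(q\,\|\,p) = 0.9\ln\tfrac{0.9}{0.5} + 0.1\ln\tfrac{0.1}{0.5} \approx 0.37
\]
```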
@choungyoungjae8271 5 years ago
so cool!
@songsbyharsha 4 years ago
In the probability refresher it is said that the sum of x·p(x) is called the expectation, but at timestamp 9:08 a sum weighted by p(x) is also treated as an expectation. Please help :)
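On the 9:08 point: both are the same definition. The expectation of any function f under p is the p-weighted sum of f(x); in the KL case that function happens to be the log ratio. A sketch for a discrete variable:

```latex
\[
\mathbb{E}_{p}\left[f(X)\right] = \sum_{x} f(x)\,p(x),
\qquad
D_{\mathrm{KL}}(p\,\|\,q)
  = \sum_{x} p(x)\,\log\frac{p(x)}{q(x)}
  = \mathbb{E}_{p}\left[\log\frac{p(X)}{q(X)}\right]
\]
```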
@zachwolpe9665 5 years ago
Minute 11: what about the 1/2?
@arielbernal 5 years ago
I think the result is K/2, but I might be missing something else
@darkmythos4457 5 years ago
Yeah, we see in the results a factor of 1/2 appears again.
@abubakarali6399 3 years ago
Where can we learn this algebra and probability?
@naklecha 4 years ago
This is the error function for the auto-encoder right?
@jakevikoren 4 years ago
KL divergence is part of the loss function. The other part is the reconstruction loss.
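A minimal PyTorch-style sketch of that combination (my own illustration, not the implementation linked in the description); it assumes an encoder that outputs `mu` and `log_var` for a diagonal Gaussian q(z|x) and uses the closed-form KL against a standard normal prior:

```python
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, log_var):
    """Reconstruction term plus KL(q(z|x) || N(0, I)) for a diagonal Gaussian encoder."""
    # Reconstruction loss: binary cross-entropy summed over pixels and batch
    # (assumes x and x_recon are in [0, 1]).
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    # Closed-form KL between N(mu, diag(exp(log_var))) and N(0, I):
    # 0.5 * sum(exp(log_var) + mu^2 - 1 - log_var)
    kl = 0.5 * torch.sum(log_var.exp() + mu.pow(2) - 1.0 - log_var)
    return recon + kl
```

The "- 1" summed over the latent dimensions is where the k from this lecture's formula shows up.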
@Darkev77 2 years ago
12:12 shouldn't the simplified result for the other expression (containing mu2 and sigma2) also be "k" by symmetry?
@rajinish0 2 years ago
I thought the same, but mu2 is q(x)'s mean and the expectation is with respect to p(x); so when you do E[(x-u2)(x-u2)^T] it won't simplify to sigma2.
@Darkev77 2 years ago
@@rajinish0 I love you, thanks a lot!
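To make rajinish0's point concrete, here is the expansion of that second quadratic term, with the expectation still taken under p = N(μ₁, Σ₁); the cross terms vanish and the result involves both covariances and the mean difference rather than collapsing to k:

```latex
\[
\mathbb{E}_{p}\left[(x-\mu_2)^{\top}\Sigma_2^{-1}(x-\mu_2)\right]
  = \operatorname{tr}\left(\Sigma_2^{-1}\Sigma_1\right)
  + (\mu_1-\mu_2)^{\top}\Sigma_2^{-1}(\mu_1-\mu_2)
\]
```

It only reduces to tr(Σ₁⁻¹Σ₁) = k when the mean and covariance inside the quadratic form match the distribution being averaged over, as in the first term.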
@mukundsrinivas8426 3 years ago
Bro. What did u study?
@chaitanyakalagara3045 4 years ago
Great explanation, but too many ads.
@aniketsinha101 4 years ago
ad ad ad
@decrepitCrescendo 5 years ago
Thanks Ahlad
Kullback-Leibler divergence (KL divergence) intuitions
11:23
CabbageCat
2.8K views
KL Divergence - How to tell how different two distributions are
13:48
Serrano.Academy
6K views
Intuitively Understanding the KL Divergence
5:13
Adian Liusie
85K views
A Short Introduction to Entropy, Cross-Entropy and KL-Divergence
10:41
Aurélien Géron
350K views
KL Divergence - CLEARLY EXPLAINED!
11:35
Kapil Sachdeva
28K views
Variational Autoencoder from scratch in PyTorch
39:34
Aladdin Persson
31K views
Explaining the Kullback-Liebler divergence through secret codes
10:08