L17.3 The Log-Var Trick

6,333 views

Sebastian Raschka

3 years ago

Slides: sebastianraschka.com/pdf/lect...
-------
This video is part of my Introduction to Deep Learning course.
Next video: • L17.4 Variational Auto...
The complete playlist: • Intro to Deep Learning...
A handy overview page with links to the materials: sebastianraschka.com/blog/202...
-------
If you want to be notified about future videos, please consider subscribing to my channel: / sebastianraschka

Comments: 7
@Darkev77
@Darkev77 2 years ago
Hey professor, thanks for these brilliant lectures! One question, if I may: at 4:56, "this allows us for + and - values", which I guess makes sense, since it gives our model more flexibility. However, in the sampling process when generating the "z", the term that represents the standard deviation (the exp term) will always be positive, so doesn't that go against our initial intention? Sorry if I got things confused!
@SebastianRaschka
@SebastianRaschka 2 years ago
Thanks for the feedback! The exp term will always be positive, that's true. But you only use it for sampling the eps term from the random normal distribution that is centered at 0 and can be positive and negative.
@Darkev77
@Darkev77 2 years ago
@@SebastianRaschka Hey professor, thanks a lot for your response, I really do appreciate it. So I guess my question now is: why didn't we just use the variance vector as it is? Why did we have to take its log (to allow for negative values initially) but then exponentiate it, if the outcome is the same? Again, really sorry if I'm mixing things up!
@SebastianRaschka
@SebastianRaschka 2 years ago
@@Darkev77 Good question! The log one is the one that is being "optimized" via backpropagation. But then, to sample from a normal distribution, we need the exponentiated one, because the sampling function expects a standard deviation on the original scale.
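To make the exchange above concrete, here is a minimal sketch of the log-var reparameterization (not taken from the lecture; it assumes PyTorch, and the names `z_mean`, `z_log_var`, and `reparameterize` are illustrative). The network outputs the log-variance, which can be any real number and is what backpropagation optimizes; it is exponentiated only when a standard deviation on the original scale is needed for sampling, and the sign of the sampled z comes from the eps term drawn from N(0, 1).

```python
import torch

def reparameterize(z_mean, z_log_var):
    # z_log_var is unconstrained (can be negative or positive);
    # this is the quantity the network outputs and that gradients flow through.
    std = torch.exp(0.5 * z_log_var)   # exp(0.5 * log var) = sqrt(var) = std > 0
    eps = torch.randn_like(std)        # eps ~ N(0, 1), centered at 0, so +/- values
    return z_mean + eps * std          # z can land on either side of the mean

# Illustrative usage: a 2-dimensional latent code for a batch of 4 inputs
z_mean = torch.zeros(4, 2)
z_log_var = torch.tensor([[-1.0, 0.5]]).repeat(4, 1)  # negative log-variances are fine
z = reparameterize(z_mean, z_log_var)
print(z.shape)  # torch.Size([4, 2]); entries can be negative or positive
```

Optimizing the log-variance rather than the variance itself means the network output never needs to be constrained to non-negative values.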
@shashankshekhar7052
@shashankshekhar7052 2 years ago
From one of the NYU lectures I understood this; please let me know if my understanding is correct. If we have a latent space of 10 dimensions, then we will have 10 means and 10 standard deviations to sample a single point. When we add noise during sampling and call reparametrize 100 times, it will create a cluster of points around that mean and std, forming a single bubble. For different means and std devs we would have different bubbles, and all these bubbles will represent different classes. There will be a big bubble containing all the small bubbles, something like a bubble of bubbles. The big bubble is a Gaussian N(0,1) and the small bubbles are also N(0,1).
@SebastianRaschka
@SebastianRaschka 2 years ago
Yeah, if your latent space is 10-dimensional, we are sampling from a 10-dimensional normal (i.e., Gaussian) distribution. Since you can see a multi-dimensional Gaussian as a multi-dimensional bubble, this sounds about right. The smaller bubbles inside can occur if you have different classes and there is some relationship between the samples from each class that makes cases within a class more similar than cases outside that class; e.g., in MNIST this might be the case. However, VAEs are unsupervised, so the smaller bubbles may not occur in practice. The bubbles might be more related to the dimensions: in each dimension, the points will be around its mean (which we usually choose to be 0). One last thing regarding the N(0, 1): you might be right in most cases, but it depends on the distribution we sample from. If you choose a standard normal distribution, N(0, 1) is correct. (A short sampling sketch follows this thread.)
@shashankshekhar7052
@shashankshekhar7052 2 жыл бұрын
@@SebastianRaschka thanks for clarifying!!
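To make the sampling described in the thread above concrete, here is a minimal, purely illustrative sketch (assuming PyTorch; the specific mean and log-variance values are made up): for a single 10-dimensional mean/log-variance pair, drawing 100 reparameterized samples produces a cluster of points, the "bubble", around that mean.

```python
import torch

torch.manual_seed(0)

latent_dim = 10
# One encoded input: 10 means and 10 log-variances (illustrative values)
z_mean = torch.randn(latent_dim)
z_log_var = torch.full((latent_dim,), -1.0)

# Calling the reparameterization 100 times gives 100 points around z_mean
eps = torch.randn(100, latent_dim)                    # eps ~ N(0, I), centered at 0
samples = z_mean + eps * torch.exp(0.5 * z_log_var)

print(samples.shape)        # torch.Size([100, 10])
print(samples.mean(dim=0))  # roughly z_mean: the center of the "bubble"
print(samples.std(dim=0))   # roughly exp(0.5 * z_log_var): the spread of the "bubble"
```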