L17.5 A Variational Autoencoder for Handwritten Digits in PyTorch -- Code Example

16,407 views

Sebastian Raschka


1 day ago

Sebastian's books: sebastianrasch...
Slides: sebastianrasch...
Code:
github.com/ras...
-------
This video is part of my Introduction to Deep Learning course.
Next video: • L17.6 A Variational Au...
The complete playlist: • Intro to Deep Learning...
A handy overview page with links to the materials: sebastianrasch...
-------
If you want to be notified about future videos, please consider subscribing to my channel: / sebastianraschka

Comments: 19
@raghavamorusupalli7557 3 years ago
Thank you for hand-holding DL aspirants on the way to new destinations. A great service to knowledge.
@736939 3 years ago
3:59 In the decoder's first linear layer you have only 2 input neurons. I mean, if you have 2 neurons from z_mean and 2 neurons from z_log_var, then the decoder's first linear layer must contain 4 neurons instead of 2. I don't get it.
@SebastianRaschka 3 years ago
Good question! The best way to see how these actually get used is by looking at the forward method:

    def forward(self, x):
        x = self.encoder(x)
        z_mean, z_log_var = self.z_mean(x), self.z_log_var(x)
        encoded = self.reparameterize(z_mean, z_log_var)
        decoded = self.decoder(encoded)

So here we can see that z_mean and z_log_var get passed to self.reparameterize, which returns "encoded", which is then passed to the decoder. Upon inspecting self.reparameterize, you will see that we use z_mean and z_log_var as parameters of a normal distribution to sample the vector "z" (same as "encoded"):

    def reparameterize(self, z_mu, z_log_var):
        eps = torch.randn(z_mu.size(0), z_mu.size(1)).to(z_mu.get_device())
        z = z_mu + eps * torch.exp(z_log_var/2.)
        return z

In other words, the two-dimensional vectors z_mean & z_log_var are not fed directly into the decoder; they are just used to sample a 2D vector z from a 2D Gaussian distribution via torch.randn. I.e., the input to the decoder is the 2D vector "z" (aka "encoded").
@MohitGupta-zf8kx 3 years ago
Your video is really amazing. Thank you very much for giving us so much knowledge. Can you please tell us how we can get the validation loss curves? Thanks :)
@SebastianRaschka 3 years ago
Glad to hear you are liking it. I plotted the losses with matplotlib; you can find the code at the top of this file: github.com/rasbt/stat453-deep-learning-ss21/blob/main/L17/helper_plotting.py
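For a rough idea of what such a plot involves, here is a minimal sketch (not the code from helper_plotting.py); it assumes you have already collected one averaged loss value per epoch for the training and validation sets:

    import matplotlib.pyplot as plt

    def plot_loss_curves(train_losses, valid_losses):
        # Both arguments: one averaged loss value per epoch, collected during training
        epochs = range(1, len(train_losses) + 1)
        plt.plot(epochs, train_losses, label="training loss")
        plt.plot(epochs, valid_losses, label="validation loss")
        plt.xlabel("epoch")
        plt.ylabel("loss")
        plt.legend()
        plt.show()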
@yogendra-yatnalkar 6 months ago
Thanks a lot for the VAE series. A small question: since we need the encoder output to be as close to a standard distribution as possible, why don't we enforce an activation function on the encoder's linear layers? For example, the mean layer could have a sigmoid activation function and the variance layer a tanh, something like that?
@vineetgundecha7872 10 months ago
Thanks for the explanation! Unlike the reconstruction loss, which is interpretable, how should we interpret the KL-divergence loss? What is an acceptable value? How would the sampled images look if we had a low reconstruction error but a high KL divergence?
@siddhantverma532 2 years ago
First of all, thanks a lot! The scatter plot really gives a nice intuition about the latent space. But it got me thinking: will every trained 2D latent space look like this, or does it depend on the architecture and how it was trained? Then I saw that your plot was different from mine, so I guess it's not universal; if it were universal, that would be a huge thing! Another question: if I'm not wrong, we are trying to learn a probability distribution. I would like to visualize the distribution that our network has learned. How can we do that? Since it's 2D, it could be visualized as a 3D graph.
@SebastianRaschka 2 years ago
The latent space will depend a bit on the weight of the KL-divergence term (if it is too weak, the latent space will resemble a 2D Gaussian less). Also, since random sampling is involved, the plot may look different every time. Btw, regarding the plot: to plot the distribution in 3D, you'd need some sort of density estimation. This reminds me, I actually wrote a blog post about this a long time ago: sebastianraschka.com/Articles/2014_kernel_density_est.html
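As a rough illustration of that idea, here is a minimal sketch (not code from the course repository) that collects the 2D latent means over a data loader and runs a kernel density estimate on them; it assumes a VAE object with encoder and z_mean submodules, as in the forward method quoted above:

    import numpy as np
    import torch
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde

    @torch.no_grad()
    def plot_latent_density(model, data_loader, device):
        # Collect the 2D latent means for all images in the loader
        codes = []
        for features, _ in data_loader:
            hidden = model.encoder(features.to(device))
            codes.append(model.z_mean(hidden).cpu().numpy())
        codes = np.concatenate(codes)              # shape: (num_images, 2)

        # Kernel density estimate over the 2D latent space
        kde = gaussian_kde(codes.T)
        grid_x, grid_y = np.mgrid[codes[:, 0].min():codes[:, 0].max():100j,
                                  codes[:, 1].min():codes[:, 1].max():100j]
        density = kde(np.vstack([grid_x.ravel(), grid_y.ravel()])).reshape(grid_x.shape)

        plt.contourf(grid_x, grid_y, density, levels=30)
        plt.xlabel("latent dim 1")
        plt.ylabel("latent dim 2")
        plt.colorbar(label="estimated density")
        plt.show()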
@kadrimufti4295 2 years ago
Thank you for the lecture. If we sample from a multivariate random normal distribution (to decode and see what numbers we can get), then will we be more likely to decode some digits over others due to the nature of the distribution we are sampling from? And so based on your 2-D plot, would we get the digits at the center more often than the ones on the periphery?
@SebastianRaschka 2 years ago
Yes, this sounds about right. Really, to sample specific classes efficiently, you would consider a conditional VAE in practice: github.com/rasbt/deeplearning-models/blob/master/pytorch_ipynb/autoencoder/ae-cvae.ipynb
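To connect this with the sampling question above, here is a minimal sketch of unconditional sampling from the 2D standard-normal prior; the decoder output shape and value range are assumptions (28x28 MNIST-sized images in [0, 1]), not necessarily the exact course code:

    import torch
    import matplotlib.pyplot as plt

    @torch.no_grad()
    def sample_digits(model, num_images=10, latent_dim=2, device="cpu"):
        # Draw latent vectors from the standard normal prior and decode them
        z = torch.randn(num_images, latent_dim, device=device)
        images = model.decoder(z)                      # assumed output: values in [0, 1]
        images = images.view(num_images, 28, 28).cpu() # assumed MNIST-sized outputs

        fig, axes = plt.subplots(1, num_images, figsize=(num_images, 1.5))
        for ax, img in zip(axes, images):
            ax.imshow(img, cmap="gray")
            ax.axis("off")
        plt.show()

Digits whose latent codes sit near the origin of the 2D plot would indeed be sampled more often than those on the periphery, since the prior places most of its mass there.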
@cricketcricket20 1 year ago
Hello, at 13:53 you said that you are summing over the latent dimension. But aren't the z_mean and z_log_var tensors of the shape (batch size, channels, latent dimension)? In that case wouldn't you sum over axis = 2? Thanks a lot for the videos!
@cricketcricket20 1 year ago
Following up on this, I think the sum over axis = 1 is correct because it carries out the KL-divergence formula element-wise, the way it should be done. This outputs a tensor of shape (batch size, channels, latent dimension), and then you compute the average of this tensor. This is analogous to taking the MSE loss (with reduction = 'mean'), which first computes the squared differences element-wise and then takes the average.
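For reference, the KL term discussed in the video can be written as in the following sketch. It assumes z_mean and z_log_var have shape (batch size, latent dimension), as they would coming straight out of the two linear layers, so axis 1 is the latent dimension; the exact reduction used in the course code may differ:

    import torch

    def kl_divergence_term(z_mean, z_log_var):
        # Closed-form KL divergence between N(z_mean, exp(z_log_var)) and N(0, I),
        # summed over the latent dimensions (axis 1), then averaged over the batch
        kl_per_example = -0.5 * torch.sum(
            1 + z_log_var - z_mean**2 - torch.exp(z_log_var), axis=1
        )
        return kl_per_example.mean()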
@prashantjaiswal5260 1 year ago
Running the code on Google Colab, it shows an error in the model.to(DEVICE) part. How can it be corrected?

    set_all_seeds(RANDOM_SEED)
    model = VAE()
    model.to(DEVICE)
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
@hillarykavagi7349 2 years ago
Hi Sebastian, I like your videos; they have helped me. I am working on a personal project on variational autoencoders using a Dirichlet distribution, and I am stuck at the point of calculating the binary cross-entropy loss. I would kindly like to request assistance.
@jalv1499 1 year ago
Thank you for the video! What's the formula for backpropagation? I did not see the code for the backward-propagation part.
@SaschaRobitzki 7 months ago
It's part of PyTorch.
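In other words, there is no hand-written backward pass; PyTorch's autograd derives all gradients from the forward computation when loss.backward() is called. A minimal sketch of one training step (the model's return signature, the MSE reconstruction term, and the KL weighting are assumptions, not necessarily the exact course code):

    import torch
    import torch.nn.functional as F

    def train_step(model, features, optimizer, device, kl_weight=1.0):
        features = features.to(device)
        encoded, z_mean, z_log_var, decoded = model(features)  # assumed return signature

        # Reconstruction term (pixel-wise MSE here, as one common choice)
        recon_loss = F.mse_loss(decoded, features)
        # KL term, summed over the latent dimensions, averaged over the batch
        kl_loss = -0.5 * torch.mean(
            torch.sum(1 + z_log_var - z_mean**2 - torch.exp(z_log_var), axis=1)
        )
        loss = recon_loss + kl_weight * kl_loss

        optimizer.zero_grad()
        loss.backward()    # autograd computes all gradients; no manual backward code needed
        optimizer.step()   # the optimizer applies the weight updates
        return loss.item()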
@MrBeefSlapper 2 years ago
5:00 Could you please explain how just using a linear layer (nn.Linear) is able to calculate the mean and log variance of the latent space for z_mean and z_log_var? It looks like z_mean and z_log_var compress the space into 2 latent dimensions, but shouldn't there be an additional step to explicitly compute the mean and variance of the 2 latent dimensions before sampling?
@SebastianRaschka 2 years ago
Oh yeah, this is a bit tricky at first glance. So here the values that are output by the linear layers represent the mean and variance directly. So if the mean is a two-dimensional vector, it corresponds to a two-dimensional normal distribution. These values are combined with the noise from "torch.randn(...)" to sample from that distribution. How do the values for the mean and the variance get adjusted? That happens via backpropagation (together with the KL-divergence loss term that tries to make the distribution similar to a standard normal distribution).
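A minimal, self-contained sketch of that idea (layer and batch sizes are placeholders, not the exact ones from the video):

    import torch

    # The encoder's final hidden representation is mapped by two separate linear
    # layers to the mean and log-variance vectors of the latent distribution.
    hidden_dim, latent_dim = 64, 2

    encoder_output = torch.randn(32, hidden_dim)   # stand-in for a batch of hidden features
    z_mean_layer = torch.nn.Linear(hidden_dim, latent_dim)
    z_log_var_layer = torch.nn.Linear(hidden_dim, latent_dim)

    z_mean = z_mean_layer(encoder_output)          # (32, 2): predicted means
    z_log_var = z_log_var_layer(encoder_output)    # (32, 2): predicted log-variances

    # Reparameterization: noise from torch.randn is scaled and shifted by these
    # outputs; backpropagation through the loss adjusts the linear layers' weights.
    eps = torch.randn_like(z_mean)
    z = z_mean + eps * torch.exp(z_log_var / 2.0)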