Variational Autoencoder from scratch in PyTorch

31,759 views

Aladdin Persson

1 day ago

Comments: 70
@nikitaandriievskyi3448 2 years ago
100% agree that if you write everything from scratch line by line it is much better than having it prewritten.
@chrisoman87 2 years ago
Hey, just watched your video, really good! But it's obvious this is a new area for you (which is not bad), so I thought I'd give you some pointers to improve your algorithm.
1. In practice VAEs are typically trained by estimating the log variance rather than the std; this is for numerical stability and improves convergence. So your loss term would go from `- torch.sum(1 + torch.log(sigma.pow(2)) - mu.pow(2) - sigma.pow(2))` to `-0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())` (where log_var is the output of your encoder; you're also missing a factor of 0.5 for the numerically stable ELBO). Also, the ELBO is the expectation of the reconstruction loss (the mean in this case) plus the negative sum of the KL divergence.
2. The ELBO (the loss) is based on a variational lower bound; it's not just two losses stuck together, so arbitrarily weighting the reconstruction loss and the KL divergence will give you unstable results. That being said, your intuition was on the right path. VAEs are getting long in the tooth now and there are heavily improved versions that focus specifically on being "explainable". If you want to understand them, I would look at the Beta-VAE paper (which weights the KL divergence), then look into disentangled VAEs (see: "Structured Disentangled Representations", "Disentangling by Factorising"). These methodologies force each "factor" into a normal Gaussian distribution rather than mixing the latent variables. The result, for MNIST with a z dim of 10, would be each factor theoretically representing a variation of each number, so sampling from each factor gives you "explainable" generations.
3. Finally, your reconstruction loss should be coupled with your epsilon (your variational prior); typically (with some huge simplifications) MSE => epsilon ~ Gaussian distribution, BCE => epsilon ~ Bernoulli distribution.
@Retroiu99_01 2 months ago
I was thinking the same and I kinda got lost in that part, thank you for clearing that up, mate.
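A minimal sketch of the log-variance loss suggested above (the names log_var, x_hat, and vae_loss are illustrative, not taken from the video's code):

```python
import torch
import torch.nn.functional as F

def vae_loss(x_hat, x, mu, log_var):
    # Reconstruction term; BCE assumes the decoder output is in [0, 1] (sigmoid)
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL(N(mu, sigma^2) || N(0, 1)) written with log_var for numerical stability,
    # including the factor of 0.5 mentioned in the comment above
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl
```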
@marcocastangia4394 2 years ago
Great content. I've always loved your "from scratch" tutorials.
@AladdinPersson 2 years ago
Thanks Marco!
@tode2227 6 months ago
Again an awesome from-scratch video! I have never seen programming videos in which it is so simple to follow what the person is coding, thank you. Currently, there are no videos about Stable Diffusion from scratch that include the training scripts. It would be great to see a video on this!
@starlite5097 2 years ago
Awesome work. Please do more stuff with GANs or visual transformers.
@CrypticPulsar 2 months ago
You are incredible.. thanks for sharing all this knowledge and skill with the world..
@dr_rahmani_m 1 year ago
I like the thought process. So, thanks for the 'from scratch' tutorials.
@pheel_flex 2 years ago
Great video, thank you! Please don't change to having pre-written code. Your approach is the best that can be found these days.
@Uminaty 2 years ago
As usual, it's amazing content! Thank you so much for your work.
@LucaBovelli 4 months ago
Are you the son of Notch (Markus Persson)?
@oceanwave4502 23 days ago
10:03 I think the sigma here, which is the output of "self.hid_2sigma", should be interpreted as "log of variance". Why? Because the Linear layer can output negative values, while the variance of a distribution can't be negative. By interpreting it as "log of variance", we can recover the variance by using exp(). As a result, we need two code changes: applying exp() to this "log of variance" when calculating z, and again when calculating the loss.
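A small standalone sketch of the change described above (self.hid_2sigma is the name from the comment; the stand-in tensors, the mu head, and the 0.5 on the exponent are assumptions for illustration):

```python
import torch

mu = torch.zeros(4, 20)           # stand-in for the encoder's mu head
log_var = torch.randn(4, 20)      # stand-in for self.hid_2sigma(h); may be negative
std = torch.exp(0.5 * log_var)    # sigma = exp(log_var / 2) is always positive
epsilon = torch.randn_like(std)
z = mu + std * epsilon            # reparameterization trick
```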
@nathantrance7558 1 year ago
You are truly a lifesaver, sir. Thank you for keeping everything simple instead of using programming shenanigans just to make it more complicated and unreadable. Love your tutorials, I learned a lot from your line of thinking, including the rants.
@СемёнСемёныч-е4д 9 months ago
Code from 15:05 so you don't need to type it all:
import torch
import torchvision.datasets as datasets
from tqdm import tqdm
from torch import nn, optim
from model import VariationalAutoEncoder
from torchvision import transforms
from torchvision.utils import save_image
from torch.utils.data import DataLoader
@carlaquispe9738 2 years ago
Maybe the "Attention Is All You Need" paper is worth going through.
@에움길-f8n 1 year ago
Great tutorials!! Now I understand how VAEs work!! ☺☺☺☺
@donfeto7636 1 year ago
23:09 Since you used sigmoid, your pixels will be between 0 and 1, so the loss is okay to use in this case; otherwise, if you use no activation function in the last layer of the decoder, you would need to switch the reconstruction term to MSE. That's what I think.
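A hypothetical side-by-side of the point above (random tensors stand in for real data and decoder outputs; none of these names are from the video):

```python
import torch
import torch.nn.functional as F

x = torch.rand(8, 784)           # MNIST-style targets scaled to [0, 1]
logits = torch.randn(8, 784)     # raw decoder outputs, no final activation

# BCE requires predictions in [0, 1], hence the sigmoid on the decoder output
bce = F.binary_cross_entropy(torch.sigmoid(logits), x, reduction="sum")
# With an unbounded final layer, MSE is the usual reconstruction term instead
mse = F.mse_loss(logits, x, reduction="sum")
```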
@prabhavkaula9697 2 years ago
Awesome implementation tutorial❤️
@chickenp7038 2 years ago
please do not have the code prewritten
@AladdinPersson 2 years ago
Agree. I get overwhelmed if someone shows the entire code. Much easier to get guided through it step by step imo, but open to the idea that there might be better ways to explain code
@zetadoop8910 2 years ago
Your videos are shockingly good! Among programming channels, yours is the best one.
@AladdinPersson 2 years ago
Appreciate you saying that
@sahhaf1234 10 months ago
First of all, thank you very much... Secondly, in line 74, shouldn't we have epsilon = torch.randn_like(1) instead of epsilon = torch.randn_like(sigma)? Because we want an epsilon distributed as N(0,1), and then the next line will generate z, which will be distributed as N(sigma, epsilon).
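A brief note on the question above (an aside, not from the video): torch.randn_like requires a tensor argument and already samples every element from N(0, 1); passing sigma only fixes the output shape, it does not change the distribution. A minimal check, with illustrative shapes:

```python
import torch

mu = torch.zeros(4, 20)            # illustrative shapes, not the video's exact code
sigma = torch.ones(4, 20)
epsilon = torch.randn_like(sigma)  # each element ~ N(0, 1), same shape as sigma
z = mu + sigma * epsilon           # reparameterized sample
# torch.randn_like(1) would raise a TypeError: it expects a tensor, not an int.
```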
@Ismail_Qayyum 16 days ago
Which color theme is it?
@mizhou1409 2 years ago
Really helpful! You are awesome!!!
@parthsoni1076 1 year ago
Thanks for the tutorial, it was simple yet insightful. Can you also make a video where you combine different architectures, such as Transformers or residual blocks, in the encoder-decoder blocks of a VAE?
@TsiHang 2 months ago
I wouldn't have passed my FYP and graduated without you. THANK YOU
@TsiHang 11 months ago
Had to learn about VAE with zero experience in coding or ML. Thank God I found this video 😅
@danyahhussein1073 5 months ago
Thanks Aladdin, you helped me a lot. Thanks for the unique explanation, keep up the good work!
@0liver19 4 months ago
You are awesome. Thank you for this immensely valuable resource!!
@teetanrobotics5363 2 years ago
ALADDIN PERSSON. YOUR CONTENT IS AMAAAZZZIIINNGGG !!! THANK YOU FOR PRACTICAL DEEP LEARNING WITH PYTORCH
@AladdinPersson 2 years ago
Thanks & np!!
@teetanrobotics5363 2 years ago
@AladdinPersson If possible, please include this in the playlist and make more tutorials. Loving it!!
@ensabinha 2 years ago
I see a few videos of yours about GANs, so you probably want to have a look at Adversarial Autoencoders. Instead of using the KLD, you can impose a prior on the latent space using a discriminator.
@kl_moon 1 year ago
I love the "from scratch" series, please make more videos!! And thank you so much!!!
@riis08 2 years ago
It's always good to write the code from scratch...
@manolisnikolakakis7292 1 year ago
Thank you very much for your tutorials. They have been incredibly helpful and insightful.
@benx1326 2 years ago
do more vids about vision transformers
@YoungS-y7d 1 year ago
Why is machine learning easy to learn? Because a lot of amazing guys are making videos explaining papers and writing code line by line.
@AladdinPersson 1 year ago
Thanks for the kind words ❤️
@edgarromeroherrera2886 11 months ago
Lovely video man, thank you.
@nikitaandriievskyi3448 2 years ago
I'm speechless, the content is too good.
@fizipcfx 2 years ago
Isn't self.activation = nn.ReLU() better?
@AladdinPersson 1 year ago
Yeah, though maybe slightly confusing if we were using multiple activations?
@fizipcfx 1 year ago
@Aladdin Persson I guess you are right, that way it's more clear.
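A tiny sketch of the two styles discussed in this thread (the class name and layer sizes are made up for illustration, not the video's model):

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(784, 200)
        self.relu = nn.ReLU()          # store the activation as a module attribute

    def forward(self, x):
        h = self.relu(self.fc(x))      # use the stored module...
        # h = F.relu(self.fc(x))       # ...or call the functional form directly
        return h

out = TinyEncoder()(torch.rand(2, 784))
```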
@sangrammishra699 2 years ago
Very informative, love the explanation of the content and the implementation from scratch.
@davidlourenco6989 1 year ago
I prefer from-scratch too, for all the reasons you've mentioned. Thanks for the content.
@chyldstudios 2 years ago
Doing it from scratch is way better than just typing some pre-written code.
@AladdinPersson 2 years ago
What do you mean by this?
@lorandy_lo2283 2 years ago
Thanks for your amazing implementation and interpretation!
@travelthetropics6190 2 years ago
Thanks a lot, is there any recommendation for a TensorFlow VAE tutorial?
@AladdinPersson 2 years ago
My recommendation is to use PyTorch ;)
@avp300 1 year ago
great explanation, thanks!
@mitchellphiri5054 2 years ago
Finally kicking off this series, I've been waiting for years. Curious if you'll do VQ-VAEs like in the Jukebox example from OpenAI?
@AladdinPersson 2 years ago
Yeah.. Don't have a structured plan for what's next but VQ-VAEs would be cool to understand
@LiquidMasti 2 years ago
Very informative content. Also, can you make shorts that explain small topics?
@AladdinPersson 2 years ago
Thanks Dhairya! Good idea, haven't figured out what to make them on yet but will think about it :)
@AmeyMindex 1 year ago
Awesome Dude!!!! So great!!
@mjmoosavizade8355 2 years ago
PyTorch has a loss function for KL divergence; I was wondering if it's possible to use that instead of writing it out?
@AladdinPersson 2 years ago
Yeah that should be possible.. haven’t tried it though.
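A possible answer to the question above (an aside, not from the video): nn.KLDivLoss expects log-probabilities and targets, so it isn't a drop-in replacement for the closed-form Gaussian KL term, but torch.distributions has a registered KL for two Normals that matches it. A minimal sketch with illustrative shapes:

```python
import torch
from torch.distributions import Normal, kl_divergence

mu = torch.zeros(4, 20)                                    # illustrative encoder outputs
sigma = torch.ones(4, 20)

q = Normal(mu, sigma)                                      # approximate posterior q(z|x)
p = Normal(torch.zeros_like(mu), torch.ones_like(sigma))   # standard normal prior
kl = kl_divergence(q, p).sum()   # equals -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
```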
@francescolee8233 2 years ago
So fast! Awesome!
@silencedspec1711 2 years ago
Miss it!
@gtg238s 2 years ago
Dope stuff!
@bennisyoutube 2 years ago
Amazing!
@agsantiago22 1 year ago
Thank you!
@marcel2711 9 months ago
The MNIST dataset, lol. All samples/videos use the same dataset. So boring. Create your own dataset, implement something interesting.
@macx7760 26 days ago
The dataset is not the interesting part, it's a means to an end; the model is the interesting part. But I guess that's subjective.
@GoldenMunkee 1 year ago
I just have to say that, even as someone with a Master's in Data Science from a top university, I still use your tutorials for my work and my projects. Your stuff is incredibly helpful from a practical perspective. In school, they teach you theory with little to no instruction on how to actually build anything. Thank you so much for your hard work!!