How I Understand Diffusion Models

18,292 views

Jia-Bin Huang

4 months ago

Diffusion models are powerful generative models that enable many successful applications, such as image, video, and 3D generation from text.
In this tutorial, I share my understanding of the diffusion model basics, including training, guidance, resolution, and speed.
Below are some other great resources to learn more about diffusion models.
===== Slides =====
Here are the slides used in this video
Training: bit.ly/3WudEPH
Guidance: bit.ly/3wedCky
Resolution: bit.ly/4bqxHmo
Speed: bit.ly/4bpJzoJ
===== Tutorials =====
[CVPR 2022 Tutorial] Denoising Diffusion-based Generative Modeling: Foundations and Applications
cvpr2022-tutorial-diffusion-m...
[CVPR 2023 Tutorial] Denoising Diffusion Models: A Generative Learning Big Bang
cvpr2023-tutorial-diffusion-m...
[A short course by DeepLearning.AI] How Diffusion Models Work
• How Diffusion Models W...
===== Training =====
[Sohl-Dickstein et al. 2015] Deep Unsupervised Learning using Nonequilibrium Thermodynamics
arxiv.org/abs/1503.03585
[Ho et al. 2020] Denoising Diffusion Probabilistic Models
arxiv.org/abs/2006.11239
[Luo 2022] Understanding Diffusion Models: A Unified Perspective
arxiv.org/abs/2208.11970
[Karras et al. 2022] Elucidating the Design Space of Diffusion-Based Generative Models
arxiv.org/abs/2206.00364
[Karras et al. 2023] Analyzing and Improving the Training Dynamics of Diffusion Models
arxiv.org/abs/2312.02696
===== Guidance =====
[Dhariwal and Nichol 2021] Diffusion Models Beat GANs on Image Synthesis
arxiv.org/abs/2105.05233
[Ho and Salimans 2022] Classifier-Free Diffusion Guidance
arxiv.org/abs/2207.12598
[Sander Dieleman 2022] Guidance: a cheat code for diffusion models
sander.ai/2022/05/26/guidance...
[Sander Dieleman 2023] The geometry of diffusion guidance
sander.ai/2023/08/28/geometry...
===== Resolution =====
[Ho et al. 2021] Cascaded Diffusion Models for High Fidelity Image Generation
arxiv.org/abs/2106.15282
[Saharia et al. 2022] Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
arxiv.org/abs/2205.11487
[Rombach et al. 2021] High-Resolution Image Synthesis with Latent Diffusion Models
arxiv.org/abs/2112.10752
[Vahdat et al. 2021] Score-based Generative Modeling in Latent Space
proceedings.neurips.cc/paper_...
[Podell et al. 2023] SDXL: Improving Latent Diffusion Models for High-resolution Image Synthesis
arxiv.org/abs/2307.01952
[Hoogeboom et al. 2023] Simple diffusion: End-to-end diffusion for high resolution images
arxiv.org/abs/2301.11093
[Chen et al. 2023] On the importance of noise scheduling for diffusion models
arxiv.org/abs/2301.10972
[Gu et al. 2023] Matryoshka Diffusion Models
arxiv.org/abs/2310.15111
===== Speed =====
[Song et al. 2021] Denoising Diffusion Implicit Models
arxiv.org/abs/2010.02502
[Salimans and Ho 2022] Progressive Distillation for Fast Sampling of Diffusion Models
arxiv.org/abs/2202.00512
[Meng et al. 2023] On Distillation of Guided Diffusion Models
arxiv.org/abs/2210.03142
[Song et al. 2023] Consistency Models
arxiv.org/abs/2303.01469
[Luo et al. 2023] Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
arxiv.org/abs/2310.04378
[Luo et al. 2023] LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
arxiv.org/abs/2311.05556
[Sauer et al. 2023] Adversarial Diffusion Distillation
arxiv.org/abs/2311.17042
[Yin et al. 2023] One-step Diffusion with Distribution Matching Distillation
arxiv.org/abs/2311.18828

Comments: 60
@4thlord51 4 days ago
I'm building my own diffusion model. This is the best breakdown and visualization of the mathematics and implementation. Well done.
@jbhuang0604 4 days ago
Thank you! This comment just made my day!
@rtluo1546 1 month ago
This is truly a great tutorial video, so well made. I cannot believe it covers so many things in only 17 minutes.
@jbhuang0604 1 month ago
Thanks a lot! Happy that you enjoyed the video!
@ayushsaraf8421 4 months ago
Incredible explanation with so much detail packed into so little time. Looking forward to more of these.
@jbhuang0604 4 months ago
Thanks, Ayush! Glad that you like it!
@curiousobserver2006 1 month ago
Seriously one of the best educational videos I've ever watched.
@jbhuang0604 1 month ago
Thank you so much!
@wangy01 1 month ago
Thank you for your great work in removing the need for the audience to have much prior knowledge before they can enjoy your video. For example, you mentioned maximum likelihood and explained what it is immediately. It is such a challenge to lay all of this out in a 17-minute video, but you did great work. Thank you!
@jbhuang0604 1 month ago
Glad that you liked it! Appreciate your kind words! This made my day!
@khalilsabri7978 15 days ago
Just one minute into the video, you know it's extremely well done. Thanks for the video!
@jbhuang0604 15 days ago
Glad you liked it! Thanks so much for the comment!
@alexpeng6705 4 months ago
Thanks for your efforts in making such a high-quality video! I like the way you break down such complex ideas in a concise manner and visualize them intuitively and elegantly. I wish I could have had this video six months ago, lol.
@jbhuang0604 4 months ago
Thanks for your kind words! It was a fun video to make, and I also learned a lot about diffusion models through the process.
@pinkpig7505 4 months ago
What timing 🙌 needed this explanation so bad... thanks ✌️
@jbhuang0604 4 months ago
Glad it helps! Thanks a lot!
@youtube_showcase 21 days ago
Amazing work! Thank you for sharing 😀
@jbhuang0604 20 days ago
Thank you! Cheers!
@JionghaoWang-fs1uq 4 months ago
You are a true educator! Great video!
@jbhuang0604 4 months ago
Thank you so much! Glad that you like the video.
@AIwithAndy 3 months ago
I appreciated the explanation of conditional generation. Nice job!
@jbhuang0604 3 months ago
Thanks so much! Glad that you like it.
@Charles-my2pb 4 months ago
Thank you so much for your contribution. This tutorial made diffusion clear to me as a beginner.
@jbhuang0604 4 months ago
You are welcome. Glad it was helpful!
@bingzha6099 4 months ago
Really enjoyed watching this video and learned a lot. Hope to see more such videos in the future.
@jbhuang0604 4 months ago
Will do! Stay tuned! 😊
@Funnyshoes321 4 months ago
Thanks a lot for the videos! I've been self-studying diffusion models on the side for a few months now and this is the only video I've seen that gives an in-depth yet intuitive explanation of the math.
@jbhuang0604 4 months ago
Glad it was helpful!
@ElLoza 4 months ago
I would say top quality video! Congratulations! 🎉 More like this would be awesome!
@jbhuang0604 4 months ago
Thank you! Will do!
@emreakbas9289 4 months ago
Great explanation, Jia-Bin! Thanks!
@jbhuang0604 4 months ago
Thanks, Emre!
@user-pk4yz7wn3s 2 months ago
BRAVO! No one has ever explained the diffusion model in such an easy way with all the details.
@jbhuang0604 2 months ago
Thank you so much for your kind words! This makes my day!
@fwahhablums 4 months ago
Very comprehensive and precise. Thanks. Also thanks for covering Tweedie's formula and simplifying score-based models. That is the most convoluted part in most papers. Looking forward to demystified NeRFs from you!
@jbhuang0604 4 months ago
Glad it was helpful!
@nikitadrobyshev7953 2 months ago
OK, this is the best video explanation of diffusion models I have seen. Ideal ratio between simplification and depth ☺👏
@jbhuang0604 2 months ago
Glad it was helpful! Thank you so much for your kind words!
@wangy01 1 month ago
I agree. The author must have carefully chosen the most efficient way to cut into the complex concept hierarchy, and every single word, to achieve that efficiency.
@orisenbazuru 27 days ago
Great video! At 1:21 it should be maximizing the similarity between the two distributions, or minimizing the distance between the two distributions.
@jbhuang0604 27 days ago
Thanks for pointing this out! Yes, you are right! It should be *maximizing* the similarity between the two distributions.
@420_gunna 4 months ago
Awesome video, hope I'm smarter when I try to rewatch it in 3 months ;)
@jbhuang0604 4 months ago
Glad you liked it! Let me know if you have questions.
@nutshell1811 1 month ago
Best video on diffusion!!
@jbhuang0604 1 month ago
Great! Glad that it’s helpful!
@Raymond-zv5gr 1 month ago
BRO YOU ARE EPIC
@jbhuang0604 27 days ago
Thank you thank you!
@jasoncampbell1464 4 months ago
Saw the cow, heard the moo. 5 stars.
@jbhuang0604 4 months ago
🤣🤣🤣
@pedroenriquelopezdeteruela6545 2 months ago
Awesome post, Jiang, thank you so much for the great job! A small comment/question on your video (not too important, I assume). At 5:56 you say (a direct derivation from formula (7) in the paper "Denoising Diffusion Probabilistic Models") that mu^hat_t(x_t, x_0) is on the line joining x_0 and x_t. While this is approximately true for "normal" beta_t scheduling, I think the estimated mean as a function of x_0 and x_t need not be exactly on that line, since in general the respective multipliers of x_0 and x_t in that equation need not add up to one. In fact, with "normal" scheduling, as t increases the sum seems to keep moving progressively away from 1, so although mu_t is obviously still a simple linear combination of x_t and x_0, it progressively moves away (by a small amount) from this line. Would you agree with this observation? Greetings, and again, congratulations on the video and thank you very much for clarifying the inner workings of diffusion models for us!
@jbhuang0604 1 month ago
Thank you so much for your comment! You are right! It won’t be on the line when the multipliers do not add up to one.
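The point in this thread can be checked numerically. The sketch below is not from the video: it evaluates the two multipliers in the posterior mean of Ho et al. 2020, Eq. (7), and sums them at each step. It assumes that "normal" scheduling means the linear schedule from that paper (beta from 1e-4 to 0.02 over T = 1000 steps); the function name `posterior_mean_coefficients` is made up for illustration.

```python
import math


def posterior_mean_coefficients(betas):
    """Return a list of (coef_x0, coef_xt) pairs, one per timestep t = 1..T.

    Follows the posterior mean in Ho et al. 2020, Eq. (7):
        mu_t = sqrt(abar_{t-1}) * beta_t / (1 - abar_t) * x_0
             + sqrt(alpha_t) * (1 - abar_{t-1}) / (1 - abar_t) * x_t
    """
    coefs = []
    alpha_bar_prev = 1.0  # abar_0 = 1 by convention (no noise added yet)
    for beta in betas:
        alpha = 1.0 - beta
        alpha_bar = alpha * alpha_bar_prev
        coef_x0 = math.sqrt(alpha_bar_prev) * beta / (1.0 - alpha_bar)
        coef_xt = math.sqrt(alpha) * (1.0 - alpha_bar_prev) / (1.0 - alpha_bar)
        coefs.append((coef_x0, coef_xt))
        alpha_bar_prev = alpha_bar
    return coefs


# Assumed schedule: linear betas from 1e-4 to 0.02, T = 1000 (Ho et al. 2020).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
coefs = posterior_mean_coefficients(betas)
sums = [c0 + ct for c0, ct in coefs]

# The mean lies on the line through x_0 and x_t only when the sum is exactly 1.
for t in (1, 10, 100, 500, 1000):
    print(f"t={t:4d}  coef_x0 + coef_xt = {sums[t - 1]:.6f}")
```

Under this schedule the sum equals 1 at t = 1 and then drifts slightly below 1, ending at roughly 0.99 at t = T, which matches the "small amount" described in the comment. Algebraically the sum simplifies to (a + b) / (1 + ab) with a = sqrt(alpha_t) and b = sqrt(abar_{t-1}), which is at most 1 whenever a and b are in (0, 1].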
@mcarletti 12 days ago
My like comes with the 5th Symphony (9:39) 😸🎶
@jbhuang0604 12 days ago
Oh my! Finally one person noticed that! (Spent a lot of time making that lol)
@yuktikaura 4 months ago
@Jia-Bin Huang We want to maximize likelihood and also minimize KL divergence so that we can "maximize" the similarity between the two distributions. It is stated the other way round at timestamp 1:19 to 1:21.
@jbhuang0604 4 months ago
Yes! You are right! Maximize likelihood -> Minimize KL divergence -> Maximize similarity between the two distributions. I got confused with too many negations. :-P
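The chain in the reply above can be written out in one line. This is the standard argument rather than a transcription of the video's slide, and the symbols are assumptions for illustration: p_data is the data distribution and p_theta is the model distribution with parameters theta.

```latex
\begin{aligned}
\arg\min_\theta D_{\mathrm{KL}}\!\left(p_{\text{data}} \,\|\, p_\theta\right)
&= \arg\min_\theta \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_{\text{data}}(x) - \log p_\theta(x)\right] \\
&= \arg\max_\theta \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log p_\theta(x)\right]
\end{aligned}
```

The first term inside the expectation does not depend on theta, so dropping it leaves the negative expected log-likelihood. Minimizing the KL divergence is therefore the same as maximizing the expected log-likelihood, and a smaller divergence means the two distributions are more similar.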
@truonggiangnguyen8844 1 month ago
I have a question: are all the distributions mentioned distributions of continuous variables, since we're using integrals here?
@jbhuang0604 1 month ago
Good question! I think there has been some development of discrete variational autoencoders and diffusion models. Those methods can deal with discrete variables.
@herrbonk3635 4 months ago
Wish I could hear what you say: 0:36 "this stickholder"? 0:43 "hyber we do not know" 1:13 "just the cadirabigdes" and so on
@jbhuang0604 4 months ago
You can see the full script by turning on the subtitles/CC. Hope this helps.
@herrbonk3635 4 months ago
@jbhuang0604 I will try, thanks!