Рет қаралды 30,490
Johno shows us what is happening behind the scenes when we create an image with Stable Diffusion, looking at the different components and processes and how each can be modified for further control over the generation process. The notebook is available in this repository: github.com/fastai/diffusion-nbs
This was made as a companion to lesson 9 of the FastAI 2022 course by Jonathan Whitaker (his channel: kzbin.info/door/P6gT9X2....
00:00 - Introduction
00:40 - Replicating the sampling loop
01:17 - The Auto-Encoder
03:55 - Adding Noise and image-to-image
08:43 - The Text Encoding Process
15:15 - Textual Inversion
18:36 - The UNET and classifier free guidance
24:41 - Sampling explanation
36:30 - Additional guidance
Errata: there should be some scaling done to the model inputs for the unet demo in cell 49 (19 minutes in) - see scheduler.scale_model_input in all the loops for the code that is missing. And in the autoencoder part the 'compression' isn't exactly 64 times since there are 4 channels in the latent representation and only 3 in the input.