Lesson 25: Deep Learning Foundations to Stable Diffusion

6,792 views

Jeremy Howard


A day ago

(All lesson resources are available at course.fast.ai.) In this final lesson of the series, Johno begins by showing us how we can convert sounds into pictures, and then take advantage of what we've learned in this course to generate audio! He builds and demonstrates a very effective bird-song generator using this approach.
Then Jeremy wraps up "Stable diffusion from scratch" by showing how to use the latents of a variational autoencoder (VAE) as the "pixels" in a regular diffusion model. He also describes an intriguing new idea for students to follow up: what if you use latents for other purposes, such as training a classification model? Perhaps this would open up a whole world of possibilities, such as latents-FID, latents-perceptual-loss, and new approaches to diffusion guidance!
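To make the "sounds into pictures" idea concrete, here is a minimal sketch of turning a waveform into a log-magnitude spectrogram, a 2-D array that can be treated like an image. It is an illustration under stated assumptions, not the code from the lesson: the function name `log_spectrogram`, the frame and hop sizes, and the synthetic 1 kHz test tone are all hypothetical, and it uses a plain linear-frequency short-time Fourier transform rather than the mel spectrogram mentioned in the comments below.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128, eps=1e-10):
    """Return a (freq_bins, frames) array of log-magnitudes.

    Hypothetical helper for illustration; frame_len and hop are
    arbitrary choices, not values taken from the lesson.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # The magnitude of the real FFT gives the energy per frequency bin.
    mags = np.abs(np.fft.rfft(frames, axis=1))
    # Log scaling compresses the dynamic range; eps avoids log(0).
    return np.log(mags + eps).T

# Synthetic input (an assumption): one second of a 1 kHz sine at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = log_spectrogram(tone)
print(spec.shape)  # (frequency bins, time frames)
```

Each column of `spec` is one time slice and each row one frequency band, so a steady tone shows up as a single bright horizontal line. A real pipeline would typically swap in a mel filterbank from an audio library and also need a way to convert the generated picture back into audio.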

Comments: 12
@ADHDOCD, a year ago
Amazing lecture as always! Cannot wait for the LLM lectures. MIT's lectures pale in comparison to what Jeremy and his team produce.
@abhishekmann, a year ago
There are going to be LLM lectures?
@ADHDOCD, a year ago
@abhishekmann Yes
@NewMateo, a year ago
@ADHDOCD Any link to where I can read more about this to confirm? That would be wonderful! I'll be taking Hugging Face's course after, but I would love to see Jeremy and his team really dig into it as their own series.
@briansmithphotos, a year ago
@NewMateo If you watch just the first few minutes and the last few minutes, they mention the next course, on LLMs ;) - but there is plenty of value in watching all the parts in between too!
@NewMateo, a year ago
@lunchwithalens I'm on Lesson 10 lol. I just skipped ahead to see how far the course went. Nonetheless, thanks for the heads up!
@hacklife8363, a year ago
How do I join the live classes for LLM?
@kashifsiddhu, 6 months ago
00:02 Introduction to NLP and delay in completing stable diffusion
02:19 Creating a subset dataset from longer call recordings for deep learning analysis
06:16 Mel spectrogram focuses on human audible frequencies and transforms them into a log space
08:26 Converting audio to spectrogram and back to audio
12:46 The model uses Transformer blocks for stable diffusion
14:58 Generating fake bird calls with spectrogram diffusion
19:23 Creating a simple autoencoder with a single hidden layer MLP
21:35 The simple autoencoder compresses and regenerates images
26:00 Log variance affects standard deviation in deep learning
28:10 Utilizing BCE loss for deeper learning stability
32:21 Minimize log variance in deep learning foundations
34:47 Mapping inputs to a restricted range for better decoding
38:39 Introducing new metrics for model evaluation
40:45 VAE benefits from pre-trained models for efficient generation
44:51 Creating a data set and pre-processing images for deep learning
47:02 Using parallel processing to speed up image reading in deep learning
51:14 Discussion on spatial resolution and training objectives
53:21 Deep learning foundations include perceptual loss and adversarial loss
57:54 Pre-training generator and discriminator for GANs
59:49 Using memory mapped numpy files to save latents efficiently
1:04:01 Creating memory-mapped numpy array of latents
1:06:02 Training and validation set creation
1:10:06 Creating high-quality 256 by 256 pixel images in a few hours with stable diffusion VAE
1:11:56 Experimenting with diffusers and stable diffusion models for better results
1:16:03 Data set acquisition process explained
1:18:01 Creating a cache for quicker access to files
1:22:11 Preparing and transforming training data for deep learning
1:23:58 Implementing data augmentation techniques in deep learning training process
1:27:47 Achieved 66% accuracy after 40 epochs of training a new model
1:30:07 Pre-training with perceptual loss yields promising results
1:33:40 Congratulations on completing the course, consider experimenting and collaborating further
1:35:36 Deep Learning Foundations to Stable Diffusion
Crafted by Merlin AI.
@shubh9207, 4 months ago
Please roll out the next series, although I'm in the first part, I just can't wait to reach here and learn from such amazing tutors.
@starlite5097, a year ago
Thank you so much for this course. Even though I don't have the time to try things on my own, I noted down 207 useful things for me.
@pj-nz6nm, 9 months ago
This course is literally overwhelming for me. Sometimes I feel it's not for me, this lecture included. It's hard.
@ankile, a year ago
When are the next episodes coming out, does anyone know?