That e^a trick shows that, even though algebra is such a pain, it comes in handy so often to make things run smoothly. It reminds me of the trick to avoid overflow in binary search: mid = low + ((high - low) / 2). My favorite thing about these lectures is the small hints for math and Python along the way. Thanks for being so detail-oriented!
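For anyone skimming the comments, here is a minimal sketch of the kind of stability trick being referred to, assuming it means the usual subtract-the-max rewrite of softmax / log-sum-exp; the function and variable names are illustrative, not the lecture's notebook code:

```python
import numpy as np

def log_softmax(a):
    # Naive version: np.log(np.exp(a) / np.exp(a).sum()) overflows for large a.
    # Since exp(a) / sum(exp(a)) == exp(a - m) / sum(exp(a - m)) for any constant m,
    # subtracting the max keeps every exponent <= 0 and avoids overflow.
    m = a.max()
    return (a - m) - np.log(np.exp(a - m).sum())

logits = np.array([1000.0, 1001.0, 1002.0])   # naive softmax would overflow here
print(log_softmax(logits))                     # finite, well-behaved values
```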
@michaelmuller136 · 5 months ago
Great, very enlightening. I liked the small details too, thank you!
@make_education · 9 days ago
Thanks a lot!
@MichaelChenAdventures · 7 months ago
Thank you Jeremy!
@markozege · 1 year ago
When we compare the result of the softmax with the one-hot vector (at 1:21:00), we take only the value of the softmax where the one-hot vector is one. Isn't this a missed opportunity to incorporate the other "wrong" predictions into the loss function? E.g. if the model is highly confident in its prediction for some other, wrong class (e.g. numbers that look similar), then penalising it more for this could further speed up training?
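For context, a minimal sketch of what the comment describes, assuming the standard cross-entropy with a one-hot target (illustrative names, not the exact lecture code): multiplying by the one-hot vector keeps only the predicted log-probability of the correct class.

```python
import numpy as np

def cross_entropy(log_probs, target):
    # With a one-hot target, the dot product keeps only the log-probability
    # of the correct class; every other entry is multiplied by zero.
    one_hot = np.zeros_like(log_probs)
    one_hot[target] = 1.0
    return -(one_hot * log_probs).sum()   # == -log_probs[target]

log_probs = np.log(np.array([0.7, 0.2, 0.1]))  # softmax output for 3 classes
print(cross_entropy(log_probs, target=0))      # small loss: confident and correct
print(cross_entropy(log_probs, target=2))      # large loss: correct class only got 0.1
```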
@SKULDROPR · 1 year ago
I think I understand what you are getting at. Focal loss lets you control the amount of penalty you are talking about for the other wrong predictions.
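For reference, a minimal sketch of focal loss as usually formulated, FL = -(1 - p_t)^gamma * log(p_t), applied on top of a softmax; the gamma value and names here are illustrative:

```python
import numpy as np

def focal_loss(logits, target, gamma=2.0):
    # Standard focal loss: scale cross-entropy by (1 - p_t)^gamma so that
    # examples the model already classifies confidently contribute little,
    # while confidently wrong examples keep (almost) their full penalty.
    a = logits - logits.max()              # subtract max for numerical stability
    probs = np.exp(a) / np.exp(a).sum()
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

logits = np.array([4.0, 1.0, 0.5])
print(focal_loss(logits, target=0))  # easy example: heavily down-weighted
print(focal_loss(logits, target=1))  # hard example: close to plain cross-entropy
```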
@amitaswal7359 · 1 year ago
If our prediction is wrong, the probability assigned to the correct class is small, so its log is a large negative number and the loss is already large; adding explicit penalties for the other classes wouldn't matter much.
@SKULDROPR · 1 year ago
@amitaswal7359 Now that I think about it, you are correct; it is no big deal either way.
@maxim_ml · 1 year ago
Softmax makes it so that the larger the probability for a wrong class is, the smaller the probability for the right class is. So there already is a penalty for having a high probability for the wrong class. Maybe having a loss that penalizes an uneven distribution of probabilities among the wrong classes would be useful. I guess Soft Labels already end up doing that.
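A tiny illustration of the coupling described above (made-up numbers, not from the lecture): raising the logit of a wrong class automatically lowers the softmax probability of the right class, so the loss -log p(correct) increases even though only the correct entry is read.

```python
import numpy as np

def softmax(a):
    a = a - a.max()                      # subtract max for numerical stability
    e = np.exp(a)
    return e / e.sum()

logits = np.array([2.0, 0.5, 0.5])       # class 0 is the correct one
bumped = np.array([2.0, 3.0, 0.5])       # same logits, but a wrong class is boosted

for x in (logits, bumped):
    p = softmax(x)
    print(p[0], -np.log(p[0]))           # p(correct) drops, so -log p(correct) rises
```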
@myfolder4561 · 4 months ago
I found the walkthrough of backpropagation in this lesson a bit lacking and jumpy. I highly recommend Andrej Karpathy's Zero to Hero series for those who are interested in digging a bit deeper into the math and stepping through how the chain rule is applied when deriving gradients.
@bomb3r422 · 3 months ago
I second that; it was a bit rushed and unclear. Andrej does a fantastic job of explaining backprop.
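For those who want a taste of that chain-rule derivation before diving into either series, here is a minimal sketch of a manual backward pass for a single linear layer with MSE loss (illustrative code, not the lesson's notebook):

```python
import numpy as np

# Forward: y_hat = x @ w + b, loss = mean((y_hat - y)**2)
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
w = rng.normal(size=(3, 1))
b = np.zeros(1)
y = rng.normal(size=(5, 1))

y_hat = x @ w + b
loss = ((y_hat - y) ** 2).mean()

# Backward, applying the chain rule one step at a time:
d_y_hat = 2.0 * (y_hat - y) / y_hat.size   # d(loss)/d(y_hat) for the mean of squares
d_w = x.T @ d_y_hat                        # d(loss)/d(w) = x^T @ d_y_hat
d_b = d_y_hat.sum(axis=0)                  # d(loss)/d(b): sum over the batch
d_x = d_y_hat @ w.T                        # d(loss)/d(x), passed to earlier layers

print(d_w.shape, d_b.shape, d_x.shape)
```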