Deriving Matrix Equations for Backpropagation on a Linear Layer

  5,718 views

Alex-AI

1 day ago

Doing the index tracking to figure out the matrix form of backpropagation is one of the more tedious aspects of working with neural networks, but it's still quite useful to go through in detail every now and then. I can't claim you'll find this video entertaining or particularly interesting, but I hope some of you will find it useful.
Note that at 1:53 I made a mistake. It should be that b ∈ R^N. The batch dimension B was already accounted for when I wrote the bias matrix as repeated rows of b.
Sections:
0:00 - Setting up notation
6:50 - ∂L / ∂W
20:10 - ∂L / ∂b
23:30 - ∂L / ∂x
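As a sanity check on the three gradients listed in the sections above, here is a minimal NumPy sketch (variable names are my own, not from the video) of the forward pass y = xW + b and the three matrix-form derivatives, with a finite-difference spot check on one entry of W:

```python
import numpy as np

# Shapes follow the video's setup: x in R^(B x D), W in R^(D x N), b in R^N,
# forward pass y = x @ W + b (b broadcast across the batch dimension B).
B, D, N = 4, 3, 2
rng = np.random.default_rng(0)
x = rng.standard_normal((B, D))
W = rng.standard_normal((D, N))
b = rng.standard_normal(N)

y = x @ W + b                        # forward pass: shape (B, N)
dL_dy = rng.standard_normal((B, N))  # upstream gradient, left general as in the video

# The matrix forms of the three derivatives:
dL_dW = x.T @ dL_dy        # (D, N), same shape as W
dL_db = dL_dy.sum(axis=0)  # (N,),  sums out the batch dimension
dL_dx = dL_dy @ W.T        # (B, D), same shape as x

# Finite-difference spot check: perturb W[0, 0] and compare the change in
# the (locally linearized) loss sum(y * dL_dy) against dL_dW[0, 0].
eps = 1e-6
W2 = W.copy()
W2[0, 0] += eps
num = ((x @ W2 + b) * dL_dy).sum() - (y * dL_dy).sum()
assert abs(num / eps - dL_dW[0, 0]) < 1e-4
```

The key shape intuition: each gradient has the same shape as the thing it is a gradient with respect to, which is often enough to reconstruct where the transposes go.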

Comments: 8
@alex-ai7517
@alex-ai7517 1 year ago
Note that at 1:53 I made a mistake. It should be that b ∈ R^N. The batch dimension B was already accounted for when I wrote the bias matrix as repeated rows of b.
@sagarshravane961
@sagarshravane961 9 months ago
Yeah, to be precise, b ∈ R^(1×N), and how it is added to each instance really depends on the implementation in the code. It is frustratingly confusing for beginners, as the rows need not be repeated in PyTorch or NumPy thanks to their elementwise broadcasting. Thanks for the awesome lecture.
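The broadcasting behavior this comment describes can be seen in a small NumPy sketch (values are illustrative): explicitly repeating the bias rows and letting NumPy broadcast give identical results.

```python
import numpy as np

x = np.arange(6.0).reshape(3, 2)  # batch of 3 instances, N = 2 features
b = np.array([10.0, 20.0])        # bias b in R^N (equivalently R^(1 x N))

# The bias matrix with repeated rows, as written on the board in the video:
explicit = x + np.repeat(b[None, :], 3, axis=0)

# What you actually write in code: NumPy broadcasts b across the batch dim.
broadcast = x + b

assert np.array_equal(explicit, broadcast)
```

So the "repeated rows" matrix is a notational device for the derivation; in code the repetition is implicit in the broadcasting rules.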
@huseyinsenol1769
@huseyinsenol1769 7 months ago
That was one of the greatest lecture videos I've ever seen. Thanks.
@guoguowg1443
@guoguowg1443 1 month ago
great stuff Alex
@ludwigkraken9935
@ludwigkraken9935 9 months ago
great explanation!
@Glaszg
@Glaszg 1 year ago
Great video, really useful! If you could also do dL/d sigmoid(y) and dL/d PReLU(h), that'd help a lot!
@beniaminradomir9798
@beniaminradomir9798 7 months ago
This is a very helpful video! I'm learning backpropagation for the first time and was totally confused by the shapes of the matrices, which would never align for me. However, one thing makes me wonder: can this method only be used for (linear) NNs that don't use activation functions? This appears to be the case. Does that mean that if I wanted to do the same derivations for NNs that do use activation functions, it would be even more complicated? Oh man, and here's me thinking this would be easy and straightforward, haha
@swazza9999
@swazza9999 7 months ago
I'd say this video covers the "hardest" bit. Activations are easy to incorporate because they typically act on one neuron at a time, so there's no index tracking to do; it's just i -> i.

In fact, this video already shows how to deal with activations. If you look at my final expressions, they still have dL/dy in them. I left the loss function general, leaving you to fill that in depending on which specific loss function you are using. But suppose this were a layer somewhere in the middle of the neural network, and I were really calculating da/dx, da/dW and da/db (a is for "activation"). All the math in the video would be exactly the same, but instead of dL/dy in the final expressions, you'd have da/dy.

So incorporating an activation just amounts to incorporating its derivative into the chain rule, and since the activation is a scalar-to-scalar function, there's no matrix multiplication; it's just a scalar. For example, suppose my activation is a(y) = y**2 / 2. Then da/dy = y, and you just plug y in wherever I have dL/dy in the video.

If it's still not clear to you after reading this, I'd encourage you to sit with it for some time and try to work through it. I feel like you are close.
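A minimal NumPy sketch of this reply (variable names are my own), using the a(y) = y**2 / 2 example to show an elementwise activation entering the chain rule as an elementwise multiply, with the video's matrix expressions otherwise unchanged:

```python
import numpy as np

# Same linear-layer setup as the video: y = x @ W + b, followed here by
# an elementwise activation a(y) = y**2 / 2, whose derivative is da/dy = y.
B, D, N = 4, 3, 2
rng = np.random.default_rng(1)
x = rng.standard_normal((B, D))
W = rng.standard_normal((D, N))
b = rng.standard_normal(N)

y = x @ W + b
a = y**2 / 2                         # elementwise activation
dL_da = rng.standard_normal((B, N))  # upstream gradient from the rest of the net

# Chain rule through the activation: an elementwise product, no matrix
# multiplication, because each entry of a depends on one entry of y.
dL_dy = dL_da * y

# From here the video's final expressions apply verbatim:
dL_dW = x.T @ dL_dy
dL_db = dL_dy.sum(axis=0)
dL_dx = dL_dy @ W.T
```

The only new ingredient relative to the pure linear layer is the `dL_da * y` line; everything downstream of it is the same index tracking the video already did.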
The Most Important Algorithm in Machine Learning
40:08
Artem Kirsanov
258K views
Neural Networks Matrix Math and NumPy
13:58
Kie Codes
9K views
Backpropagation in Convolutional Neural Networks (CNNs)
9:21
Application of Calculus in Backpropagation
14:45
Orblitz
15K views
Back-Propagation with Tensors
8:46
Deep Foundations
898 views
Understanding Backpropagation In Neural Networks with Basic Calculus
24:28
Backpropagation Algorithm | Neural Networks
13:14
First Principles of Computer Vision
32K views
Neural Network from Scratch | Mathematics & Python Code
32:32
The Independent Code
118K views
Solve any equation using gradient descent
9:05
Edgar Programmator
52K views
Essential Matrix Algebra for Neural Networks, Clearly Explained!!!
30:01
StatQuest with Josh Starmer
44K views
Backpropagation : Data Science Concepts
19:29
ritvikmath
33K views
Vector and matrix derivatives
12:39
Herman Kamper
30K views