Do you know any other theories about why residual connections work?
@aaronguo607 · a year ago
Thank you for making this explanation video! It helped a lot!
@datamlistic · a year ago
You're welcome! Glad it was helpful! :)
A month ago
Thanks for the video! But you said that in the residual architecture, the network can use any of the previous outputs. What did you mean by that? It's not like it's connected to each of its previous outputs, is it?
@horserl1872 · 18 days ago
Actually, it is, in effect. With residual connections, layer 38 obviously receives layer 37's output, but since layer 37's output is layer 36's output plus a residual term, layer 38 effectively sees layer 36's output as well. Unrolling that all the way down, layer 38 receives a direct additive contribution from layer 12, layer 15, and every other earlier layer, which is part of what makes residual connections so powerful. The equations below make this clearer.
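A minimal sketch of the equations that reply points to, assuming the standard residual update from the ResNet paper (the notation h_l and F_l here is illustrative, not the video's):

```latex
% Each residual block adds a learned correction F_l to its input:
h_{l+1} = h_l + F_l(h_l)

% Unrolling the recursion from any earlier layer k up to layer L:
h_L = h_k + \sum_{i=k}^{L-1} F_i(h_i)
```

With k = 12 and L = 38, layer 38's input contains layer 12's output as a plain additive term, plus the residual contributions of every layer in between. That is the precise sense in which any layer can "use" any earlier output.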
@coolerthenu458 · 11 months ago
Great, succinct explanation
@datamlistic · 11 months ago
Thanks! Glad you liked it! :)
@alsonyang230 · 4 months ago
Thanks for the explanation! There's one thing I hope you can elaborate on: with the weight matrix initialized randomly, why is it easier for the NN to learn the zero matrix compared to the identity matrix?
@datamlistic · 4 months ago
Good question! IMO it's mainly because you also usually use weight regularization (e.g., L2) in your final loss, so the NN can easily shrink the weight matrix's values to 0, whereas pushing them toward the identity matrix gets no such help from the regularizer; see the sketch below.
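A minimal PyTorch sketch of that point (the ResidualBlock class and its dimensions are illustrative assumptions, not the video's code). With the skip connection, weight decay pulling the residual branch toward zero makes the whole block collapse to the identity; without the skip, representing the identity would require the weight matrix itself to converge to the identity matrix, which weight decay actively fights:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Computes y = x + F(x); if F's weights shrink to 0, the block is the identity."""
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x):
        # Skip connection: a zero residual branch leaves x untouched.
        return x + self.f(x)

block = ResidualBlock(64)
# L2 regularization (weight_decay) pulls every parameter toward 0,
# which here pulls the whole block toward the identity mapping for free.
optimizer = torch.optim.SGD(block.parameters(), lr=0.1, weight_decay=1e-2)
```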
@alsonyang230 · 4 months ago
@datamlistic Makes sense, thanks for the explanation!
@dominiquebijnens9718 · 9 months ago
In contrast with the title, you did NOT explain why ResNets work. Based on the original paper, you only explained how they are built. A good start would be to elaborate a bit more on the identity functionality.
@datamlistic · 9 months ago
Thanks for the feedback! I've added to my list a follow-up video about ResNet where I dig deeper into the identity functionality. :)