Do you know any other theories about why residual connections work?
@aaronguo607 · a year ago
Thank you for making this explanation video! It helped a lot!
@datamlistic · a year ago
You're welcome! Glad it was helpful! :)
A month ago
Thanks for the video! But you said that in the residual architecture, the network can use any of the previous outputs. What did you mean by that? It's not like it's connected to each of its previous outputs, is it?
@horserl1872 · 18 days ago
Actually, it is, in effect. With residual connections, layer 38 obviously receives layer 37's output, but since layer 37's output is layer 36's output plus a residual term, layer 38 effectively sees layer 36's output as well. Unrolling that all the way down, layer 38 receives a direct additive contribution from layer 12, layer 15, and every other earlier layer, which is part of what makes residual connections so powerful. The equations below make this clearer.
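A minimal sketch of the equations that reply points to, assuming the standard residual update from the ResNet paper (the notation h_l and F_l here is illustrative, not the video's):

```latex
% Each residual block adds a learned correction F_l to its input:
h_{l+1} = h_l + F_l(h_l)

% Unrolling the recursion from any earlier layer k up to layer L:
h_L = h_k + \sum_{i=k}^{L-1} F_i(h_i)
```

With k = 12 and L = 38, layer 38's input contains layer 12's output as a plain additive term, plus the residual contributions of every layer in between. That is the precise sense in which any layer can "use" any earlier output.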
@coolerthenu458 · 11 months ago
Great, succinct explanation
@datamlistic · 11 months ago
Thanks! Glad you liked it! :)
@alsonyang230 · 4 months ago
Thanks for the explanation! There's one thing I hope you can elaborate on: with the weight matrix initialized randomly, why is it easier for the NN to learn the zero matrix compared to the identity matrix?
@datamlistic · 4 months ago
Good question! IMO it's mainly because you also usually use weight regularization (e.g., L2) in your final loss, so the NN can easily shrink the weight matrix's values to 0, whereas pushing them toward the identity matrix gets no such help from the regularizer; see the sketch below.
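A minimal PyTorch sketch of that point (the ResidualBlock class and its dimensions are illustrative assumptions, not the video's code). With the skip connection, weight decay pulling the residual branch toward zero makes the whole block collapse to the identity; without the skip, representing the identity would require the weight matrix itself to converge to the identity matrix, which weight decay actively fights:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Computes y = x + F(x); if F's weights shrink to 0, the block is the identity."""
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x):
        # Skip connection: a zero residual branch leaves x untouched.
        return x + self.f(x)

block = ResidualBlock(64)
# L2 regularization (weight_decay) pulls every parameter toward 0,
# which here pulls the whole block toward the identity mapping for free.
optimizer = torch.optim.SGD(block.parameters(), lr=0.1, weight_decay=1e-2)
```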
@alsonyang230 · 4 months ago
@datamlistic Makes sense, thanks for the explanation!
@dominiquebijnens9718 · 9 months ago
In contrast with the title, you did NOT explain why ResNets work. Based on the original paper, you only explained how they are built. A good start would be to elaborate a bit more on the identity functionality.
@datamlistic · 9 months ago
Thanks for the feedback! I've added to my list a follow-up video about ResNet where I dig deeper into the identity functionality. :)