C4W2L03 Resnets

  191,648 views

DeepLearningAI

Days ago

Comments: 59
@joshsmit779
@joshsmit779 6 years ago
your videos are iconic and should be preserved in the national library
@zy4663
@zy4663 3 years ago
The Internet is a nice global digital library, right?
@saranghaeyoo8239
@saranghaeyoo8239 2 years ago
*international library
@JonesDTaylor
@JonesDTaylor 4 years ago
I am doing the DL specialization after finishing your old ML course. By far you are the best teacher out there. Thank you so much for this.
@gravitycuda
@gravitycuda 6 years ago
Hi Andrew, you are the first professor who taught me ML. I studied your course on Coursera; nice seeing you again.
@viniciussantana8737
@viniciussantana8737 4 years ago
Andrew is simply the best instructor on the subject of neural networks out there. Helped me a lot.
@zekarias9888
@zekarias9888 4 years ago
WooW. After I watched 5 other videos about ResNets, I was still lost. Then I found this video and it cleared up my misunderstandings. Super cool!
@JohnDoe-vr4et
@JohnDoe-vr4et 4 years ago
Me after listening to most people explaining ResNet: "What? Why? Why do you do this?" Me after listening to Andrew: "Makes sense. Easy peasy."
@Cicatka.Michal
@Cicatka.Michal 3 years ago
Finally got it! Sometimes it is hard for me to grasp even the basic concepts if I don't know what I should understand and take away from the topic. Thank you very much for these videos that first tell you what problem you are trying to solve and why you should solve it, and then clearly explain the solutions. Thumbs up! :)
@sanjurydv
@sanjurydv 5 years ago
Never seen tutorial videos with such a clear explanation. He is the best.
@Alex-xx8ij
@Alex-xx8ij 2 years ago
Your explanation is very clear! Thank you for the lecture.
@swfsql
@swfsql A year ago
I think we could use the ResNet concept to improve Dropout, creating a "shutdown" regularization: select a layer (or rather, nodes from that layer) that ought to be shut down, and instead of shutting it down directly, only act on the cost function by adding a cost for that layer not being an identity layer. The network is then free to gradually adapt itself (hopefully by reducing train-set overfitting and generalizing) so as to push that layer ever closer to being an identity. Once that layer manages to be an identity, it can be permanently shut down. This could be a way to reduce the network size, and maybe it could be applied automatically in cases of high variance with low bias.
As far as linear Z functions go, one way for a layer to be an identity is if it has the same number of nodes as inputs and you add a cost for each node[j] so that only its weight[j] is 1 while all other weights are 0; this would be similar to an "identity" Z layer. Trying to also make the activation function an identity is a hassle, but even ignoring the activation function, if you could manage to shut down the Z-function nodes and fold the posterior activation back into the previous activation, that would already be a network simplification.
Edit: We could also try to simplify the activation functions if we generalize and re-parametrize them. E.g., a ReLU activation could be turned into a leaky ReLU whose leaky-side parameter starts at zero (so it's just a normal ReLU); we then add a cost for that parameter staying at zero and let backprop start pushing it towards 1, at which point the previously-ReLU activation has turned into the identity activation and can be gracefully shut down.
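To make the "Edit" part concrete, here is a minimal sketch of such a re-parametrized activation (PyTorch-style; the class and method names are made up for illustration and are not from the lecture or from this commenter):

```python
# Hypothetical sketch of the commenter's idea, not anything from the lecture:
# a leaky ReLU whose negative-side slope is trainable, plus a penalty that
# pushes the slope toward 1 so the activation can relax into the identity.
import torch
import torch.nn as nn

class RelaxableReLU(nn.Module):
    def __init__(self):
        super().__init__()
        # slope = 0 behaves like a plain ReLU, slope = 1 is the identity
        self.slope = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        return torch.where(x >= 0, x, self.slope * x)

    def identity_penalty(self):
        # cost of this activation not (yet) being the identity
        return (1.0 - self.slope).pow(2).sum()

# usage idea: loss = task_loss + lam * act.identity_penalty()
```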
@HexagonalClosePacked
@HexagonalClosePacked 5 years ago
I'm trying to understand the components behind Semantic Segmentation and your videos really helped!
@MeAndCola
@MeAndCola 5 years ago
"man pain" 2:10 😂
@altunbikubra
@altunbikubra 4 years ago
omg he is really writing that :D
@davidtorres5012
@davidtorres5012 3 years ago
You are the best, Andrew
@altunbikubra
@altunbikubra 4 years ago
Thank you, it was a very brief and simplified explanation, loved it.
@Ganitadava
@Ganitadava 2 years ago
Sir, very nice explanation as always, thanks a lot.
@promethful
@promethful 4 years ago
So the skip connections don't literally skip layers, but rather add the original input onto the output of the 'skipped' layers?
@АннаКопатько
@АннаКопатько 3 years ago
I think so too; at least it is the only explanation that I understand.
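That reading is right: the two layers are still computed, and the shortcut just adds a[l] to z[l+2] before the second ReLU. A minimal NumPy sketch, with made-up shapes and weights (not code from the course):

```python
# Minimal NumPy sketch of one residual block, a[l+2] = g(z[l+2] + a[l]).
import numpy as np

def relu(z):
    return np.maximum(0, z)

def residual_block(a_l, W1, b1, W2, b2):
    z1 = W1 @ a_l + b1          # z[l+1]
    a1 = relu(z1)               # a[l+1]  (still computed, nothing is skipped)
    z2 = W2 @ a1 + b2           # z[l+2]
    return relu(z2 + a_l)       # shortcut: add a[l] before the final ReLU

n = 4
rng = np.random.default_rng(0)
a_l = rng.standard_normal(n)
W1, b1 = rng.standard_normal((n, n)), np.zeros(n)
W2, b2 = rng.standard_normal((n, n)), np.zeros(n)
print(residual_block(a_l, W1, b1, W2, b2))
```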
@iasonaschristoulakis6932
@iasonaschristoulakis6932 2 years ago
Excellent both theoretically and technically
@MrRynRules
@MrRynRules 3 years ago
Thank you!
@rahuldogra7171
@rahuldogra7171 3 years ago
What is the benefit of adding identity blocks and then skipping them? Instead of skipping, why are we adding?
@ravivaghasiya5680
@ravivaghasiya5680 2 years ago
Hello everyone. In this video at 5:20 it is mentioned that as the number of layers increases in a plain network, training error increases in practice. Could you please explain, or share some references on, why this actually occurs? One reason I found is the vanishing gradient problem, which can be addressed using ReLU. Thus, one can use ReLU in a plain network. Then why is ResNet still needed?
@academicconnorshorten6171
@academicconnorshorten6171 6 years ago
Do you broadcast a[l] to make it match the dimensionality of a[l+2]?
@rahulrathnakumar785
@rahulrathnakumar785 5 years ago
If a[l] skips two layers to directly enter the final ReLU, how do we get the z[l+2] in the final equation a[l+2] = g(z[l+2] + a[l])? Thanks!
@IvanOrsolic
@IvanOrsolic 5 years ago
You still calculate them; you just keep a copy of the original a[l] value and add it back into the network before calculating a[l+2]. Why would you even do that? It's explained in the next video.
@mohe4ever514
@mohe4ever514 3 years ago
@@IvanOrsolic If we plug this value into the network, then what is the benefit of skipping the layers? We still go through the layers to calculate a[l+2]; we just added one more term, so how does the skip connection help?
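One way to see the benefit, previewing the next video ("Why ResNets Work"): the added term makes the identity function easy for the block to learn, so the extra layers can't easily hurt training error. A quick numeric check, reusing the residual_block sketch above (again illustrative, not course code):

```python
# If W2 and b2 shrink to 0, then z[l+2] = 0 and
# a[l+2] = relu(0 + a[l]) = a[l] whenever a[l] >= 0 (a[l] is itself a ReLU
# output in the lecture), i.e. the block becomes the identity function.
import numpy as np

a_l = np.array([0.5, 0.0, 2.0, 1.3])   # nonnegative, like a ReLU output
W2, b2 = np.zeros((4, 4)), np.zeros(4)
z2 = W2 @ np.ones(4) + b2               # = 0 regardless of a[l+1]
a_l2 = np.maximum(0, z2 + a_l)
print(np.allclose(a_l2, a_l))           # True: the deeper net can fall back to the identity
```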
@chrlemes
@chrlemes 2 years ago
I think the title of the paper is wrong. The correct one is "Deep residual learning for image recognition".
@patrickyu8470
@patrickyu8470 2 years ago
Just a question for those out there - has anyone been able to use techniques from ResNets to improve the convergence speed of deep fully connected networks? Usually people use skip connections in the context of convolutional neural nets, but I haven't seen much gain in performance with fully connected ResNets, so I'm just wondering if there's something else I may be missing.
@astropiu4753
@astropiu4753 4 years ago
There's some high-frequency noise in many of this specialization's videos which is hurting my ears.
@ntumbaeliensampi6305
@ntumbaeliensampi6305 4 years ago
Use a low-pass filter. Lol
@swapnildubey6428
@swapnildubey6428 5 years ago
How are the dimensions handled? I mean, the dimension of a[l] could happen to be unequal to that of a[l+2].
@SuperVaio123
@SuperVaio123 5 years ago
Padding
@s.s.1930
@s.s.1930 3 years ago
Padding with zeros, or using a 1x1 Conv inside the skip connection
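For completeness, the two options mentioned in this thread can be sketched as follows (NumPy, illustrative names only; in the ResNet paper the projection is a learned matrix W_s, implemented as a 1x1 convolution in the convolutional layers):

```python
# Two ways to make a[l] match a[l+2] when their sizes differ:
#  1) zero-pad a[l] up to the larger size, or
#  2) multiply by a learned projection W_s (channel-wise 1x1 conv in ConvNets).
import numpy as np

def match_by_zero_padding(a_l, target_dim):
    out = np.zeros(target_dim)
    out[: a_l.shape[0]] = a_l
    return out

def match_by_projection(a_l, W_s):
    # W_s has shape (target_dim, len(a_l)); it is learned like any other weight
    return W_s @ a_l

a_l = np.array([1.0, 2.0, 3.0])
print(match_by_zero_padding(a_l, 5))              # [1. 2. 3. 0. 0.]
print(match_by_projection(a_l, np.ones((5, 3))))  # projected to 5 dims
```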
@whyitdoesmatter2814
@whyitdoesmatter2814 4 years ago
Wait!!!!! Shouldn't z_{l+1} normally be equal to W_{l}a_{l} + b_{l+1}??
@kumarabhishek5652
@kumarabhishek5652 3 years ago
Why does training error increase in practice, as opposed to theory, in a plain model?
@amir06a
@amir06a 4 years ago
I have a very silly doubt: if the skip layers/connections exist, aren't the real layers in play = total layers / 2?
@adityaniet
@adityaniet 7 years ago
Hi Andrew, I have a question: when we are calculating a[l+2] we need both a[l] and z[l+2]. But z[l+2] can only be calculated by calculating a[l+1], so how will we get that? ...Many thanks :-)
@larryguo2529
@larryguo2529 7 years ago
If I understand your question correctly, a[l+2] = activation of z[l+2]......
@freee8838
@freee8838 6 years ago
just like in the formula a[l+1] = g(z[l+1])...
@_ashout
@_ashout 6 years ago
Yep, this confuses me as well. How can we have both z[l+2] and a[l+1] if a[l+1] skips a layer?
@steeltwistercoaster
@steeltwistercoaster 4 years ago
+1 this is great
@mannansheikh
@mannansheikh 4 years ago
Great
@lorenzosorgi6088
@lorenzosorgi6088 4 years ago
Is there any theoretical motivation justifying the increasing error of a deep plain network during training?
@mohnish.physics
@mohnish.physics 4 years ago
Theoretically, the error should go down. But in practice, I think the exploding gradients in a network with a large number of layers increase the error.
@tumaaatum
@tumaaatum 4 years ago
Yes there is. I am not sure why Andrew Ng didn't touch on this. Basically, once you add the skip connection you are including an additive term inside the non-linearity. The additive term can only increase the function space (the range of the function, www.intmath.com/functions-and-graphs/2a-domain-and-range.php), since the original function sits nested inside the new one (nested function classes). Hence, you allow the network to have more approximating/predictive capacity in each layer. You can visit the D2L lectures about this: d2l.ai/chapter_convolutional-modern/resnet.html?highlight=resnet
@kavorkagames
@kavorkagames 5 years ago
I find a ResNet behaves like a shallower net. It gives a solution that resembles that of a (roughly) four- to six-layer net when it is eight layers deep. ResNets are out for me.
@okktok
@okktok 5 years ago
Kavorka Games ResNets are now the state of the art for image recognition. Every new architecture uses them, and it doesn't make sense anymore to use plain networks.
@amir06a
@amir06a 4 years ago
@@okktok But aren't the actual layers in play = total layers / 2, since we are providing a shortcut? So, on a broader note, aren't they just like plain networks that look bigger?
@mikebot5361
@mikebot5361 5 years ago
If we use ResNets, are we losing the information in between the layers?
@s.s.1930
@s.s.1930 3 years ago
No, we're not losing it - we just add x back after a number of layers (in this example, 2 layers) - this is our ResNet block
@wliw3034
@wliw3034 3 years ago
Good
@shivani404sheth4
@shivani404sheth4 3 years ago
Meet the ML god
@trexmidnite
@trexmidnite 3 years ago
Sounds like the Terminator..
@ahmedb2559
@ahmedb2559 A year ago
Thank you !
C4W2L04 Why ResNets Work
9:13
DeepLearningAI
145K views
Residual Networks and Skip Connections (DL 15)
17:00
Professor Bryce
51K views
MIT Introduction to Deep Learning | 6.S191
1:09:58
Alexander Amini
795K views
Why Does Diffusion Work Better than Auto-Regression?
20:18
Algorithmic Simplicity
401K views
ResNet (actually) explained in under 10 minutes
9:47
rupert ai
117K views
CS 152 NN-17:  CNN Architectures: Resnet
16:26
Neil Rhodes
2.9K views
Learn Machine Learning Like a GENIUS and Not Waste Time
15:03
Infinite Codes
279K views
Residual Networks (ResNet) [Physics Informed Machine Learning]
17:26
Why Does Batch Norm Work? (C2W3L06)
11:40
DeepLearningAI
202K views
C4W3L01 Object Localization
11:54
DeepLearningAI
159K views
CNN Fundamental 3- Why Residual Networks ResNet Works
14:11
KGP Talkie
4.3K views