You are amazing! Thank you! 7 years later still the best explanation!🎉
@pharmacist66 4 years ago
Whenever I don't understand something I immediately come to your channel because I *know* you will make me understand it
@isaacmares5590 5 years ago
You are the master of explaining complicated concepts effectively... my dog sitting next to me now understands backpropagation of neural networks better than roll over.
@phumlanigumedze9762 3 years ago
Amazing
@TheRainHarvester 2 years ago
I just made a video without subscripts to explain multi-hidden-layer backpropagation. It's easy to understand without so many sub/superscripts.
@vigneshreddy1213 4 months ago
It has been 7 years since this video came out, and he still rocks
@abhinavarg 4 years ago
Sir, I don't even know how to express my joy after hearing this from you. Nicely done!!!
@programmingsoftwaredesign7887 3 years ago
You are very good at breaking things down. I've been through a few videos trying to understand how to code my backpropagation. You are the first one to really give a visual of what it's doing at each level for my little math mind.
@vaibhavsingh1049 5 years ago
I'm on day 3 of trying to understand backpropagation; you made me cry "Finally"
@kae4881 4 years ago
Dude. Best. Explanation. Ever. Straight Facts. EXCELLENT DAN. You, sir, are a legend.
@SigSelect 5 years ago
I read quite a few comprehensive tutorials on backprop with the full derivation of all the calculus, yet this was the first source I found which explicitly highlighted the method for finding the error term in layers preceding the output layer, which is a huge component of the overall algorithm! Good job for sniffing that out as something worth making clear!
@TheCodingTrain 5 years ago
Thanks for the nice feedback!
@gnorts_mr_alien 2 years ago
Exactly. Watched at least 20 videos on backprop but this one made sense finally.
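For anyone coding along, a minimal sketch of that hidden-layer error step in JavaScript (assuming a small Matrix helper with subtract/transpose/multiply static methods, similar to the toy library built earlier in the series; the names here are placeholders, not necessarily the video's exact code):

  // output error = target - guess
  let outputErrors = Matrix.subtract(targets, outputs);
  // push the error backwards: each hidden node collects a share of the
  // output error in proportion to the weights leaving it
  let weightsHoT = Matrix.transpose(weightsHo);
  let hiddenErrors = Matrix.multiply(weightsHoT, outputErrors);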
@Twitchi 7 years ago
The whole series has been amazing, but I particularly enjoy these theory breakdowns :D
@manavverma4836 7 years ago
Man you are awesome. Someday I'll be able to understand this.
@drdeath2667 4 years ago
do u now? :D
@prashantdwivedi9073 4 years ago
@@drdeath2667 🤣🤣😂
@billykotsos4642 4 years ago
The day for me to understand is today.... 2 years later !!!!!
@jxl721 3 years ago
do you understand it now :)
@kaishang6406 2 years ago
has the day come yet?
@TopchetoEU 4 years ago
I'm quite honestly impressed by the simplicity of the explanation you gave. Right now I'm trying to get started with AI but could not find a good explanation of backpropagation. That is, until I found your tutorial. The only thing I didn't like is that this tutorial doesn't include any bias-related information. Regardless, this tutorial is simply great.
@sidanthdayal8620 6 years ago
When I start working I am going to support this channel on Patreon. Helped me so much.
@artyomchernyaev730 3 years ago
Did u start working?
@Meuporgman 7 years ago
Thanks Daniel for your involvement in this series and in all the others! You're probably the best programming teacher on YouTube; we can see that you put a lot of effort into making us understand all the concepts you go through! Much love
@Ezio-Auditore94 6 years ago
I love YouTube University
@giocaste619 4 years ago
Nicolas Licastro io o ok oo Olivetti
@matig 5 years ago
Even though we speak different languages you are a thousand times clearer than my teacher. Thanks a lot for this, you are the best
@narenderrawal 2 years ago
Thanks for all the effort you put into helping us understand. Best I've come across so far.
@drugziro2275 5 years ago
I am studying these things in Korea. Before I found this lecture I couldn't follow my classes, but now I can show my professor a smile instead of a frustrated face. So..... thanks for being my light.
@sanchitverma2892 5 years ago
Wow, I'm actually impressed you managed to make me understand all of that
@onionike4198 5 years ago
That was an excellent stroll through the topic. I feel like I can implement it in code now, it was one of the few hang ups for me. Thank you very much 😁
@phishhammer9733 6 years ago
These videos are immensely helpful and informative. You have a very clear way of explaining concepts, and think through problems in an intuitive manner. Also, I like your shirt in this video.
@dayanrodriguez1392 2 years ago
I always love your honesty and sense of humor
@santiagocalvo 3 years ago
You should stop selling yourself short; I've seen dozens of videos on this exact subject because I've struggled a lot trying to understand backprop, and I have to tell you this might be the best one I've seen so far. Great work!! Keep it up!!
@nemis123 1 year ago
After watching the whole of YouTube I had no idea what backprop is; thankfully I found yours.
@MircoHeinzel 6 years ago
You're such a good teacher! It's fun to watch your high quality videos!
@mrshurukan 7 years ago
Incredible, as always! Thank you so much for this Neural Network series, they are very interesting and helpful
@lucrezian2024 6 years ago
I swear this is the one video which made me understand the delta of weights!!! THANK YOU!!!!!
@robv3872 2 years ago
I commend you for great videos and such an honest video! You are a great person! Thank you for the effort you put into this content; you are helping people and playing a big part in us solving important problems in the future. I just commend you for being a great person, which comes out in this video!
@CrystalMusicProductions 5 years ago
I have used backpropagation in my NNs in the past but I never knew how the math works. Thank you so much ❤❤ I finally understand this weird stuff
@magnuswootton6181 2 years ago
really awesome doing this lesson, everywhere else is cryptic as hell on this subject!!!
@codemaster1768 3 years ago
This concept is taught way better here than by my university professors.
@aakash10975 5 years ago
The best explanation of backpropagation I have ever seen
@roshanpawara8717 7 years ago
I'm glad that you came up with this series of videos on neural networks. It has inspired me to choose this as a domain to work on as a mini project for this semester. Love you. Big fan. God bless!! :-)
@lehw916 3 years ago
Man, watching your video after 3Blue1Brown series on back-propagation is a breeze. Thanks for sharing!
@SistOPe 6 years ago
Bro, I admire you so much! Someday I wanna teach algorithms the way you do! Thanks a lot, greetings from Ecuador :)
@moganesanm973 1 year ago
The best teacher I have ever seen ☺
@artania06 7 years ago
Awesome video ! Keep it up :) I love your way of teaching to code with happiness
@qkloh6804 4 years ago
3Blue1Brown + this video is all we need. Great content as always.
@c1231166 6 years ago
Would you mind making a video about how you learn things? because it seems to me you can learn basically everything and be thorough about it. This is a skill I would like to own.
@dolevgo8535 6 years ago
When you try to study something, just practice it. Learning how to create a neural network? Sweet, try to create one yourself while doing so. There's actually no way you'd do it perfectly, and you WILL come back to where you study from, or google things that popped into your head that you started wondering about, and that is how you become thorough about these things. It's mostly about curiosity and practicing :)
@yolomein415 5 years ago
Find a book for beginners, look at the reviews, buy the book, read it, try it out, watch YouTube videos, google your questions (if not answered, ask on Stack Overflow)
@mzsdfd 4 years ago
That was amazing!! You explained it very well.
@mikaelrindmyr 5 years ago
What if one of the weights is negative? Is it the same formula when you calculate the magnitude of error-Hidden-1, or should I use Math.abs on all the terms in the denominator? Like Weight-1 = -5, Weight-2 = 5, error = 0.5; then it should look like this, right? Error-Hidden-1 = error * ( W1 / ( Math.abs(W1) + Math.abs(W2) ) ) ty
@mikaelrindmyr 5 years ago
//mike from Sweden
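For anyone wondering the same thing, a rough JavaScript sketch of that magnitude-based split (a hypothetical helper; whether the video's formula intends absolute values is exactly the open question above):

  function hiddenErrorShare(error, w1, w2) {
    // use magnitudes so a negative weight still gets a positive share
    let total = Math.abs(w1) + Math.abs(w2);
    if (total === 0) return 0; // avoid dividing by zero
    return error * (Math.abs(w1) / total);
  }
  // hiddenErrorShare(0.5, -5, 5) gives 0.25, i.e. an even split for this example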
@minipy3164 3 years ago
When you are about to give up on Neural network and you see this awesome video😍😍😍😍😍
@luvtv7433 3 years ago
You know what would be nice: if you could teach algorithms on graphs using matrices. I feel that helped me a lot to practice and understand the importance of matrices in other topics, including neural networks. Some exercises are to find whether two graphs are isomorphic, find cycles through a vertex, check whether a graph is complete, planar, or bipartite, and find trees and paths using matrices. I am not sure, but that might be called spectral graph theory.
@kalebbruwer 5 years ago
Thanks, man! This makes it easier to debug code I wrote months ago that still doesn't work, because this is NOT what I did
@sirdondaniel 6 years ago
Hi, Daniel. At 8:44 I have to point something out: you said that h1 is 67% responsible for the error because it has W1 = 0.2, which is 2 times bigger than W2. Well, I think that is false. If in this particular case the value stored by h1 is 0, then nothing is coming from it, and h2 is responsible for the entire 0.7 output with its W2 = 0.1. Check 5:14 of "What is backpropagation really doing? | Chapter 3, deep learning" from 3Blue1Brown. I'm not 100% sure I'm right. Anyway, you are doing a really good job with this series. I've watched some videos about this topic on Pluralsight, but the way you explain it makes way more sense than over there. I really look forward to seeing you implement the digit-recognition thing. If you need some assistance please don't hesitate to message me.
@TheCodingTrain 6 years ago
Thank you for this feedback, I will rewatch the 3blue1brown video now!
@sirdondaniel 6 years ago
I've watched the entire playlist and I saw that you do actually take the node's value (x) into account in the ΔW equation. These error equations that you present in this video are just a technique for spreading the error inside the NN. They also appear in "Make Your Own Neural Network" by Tariq Rashid, so they should be right :)
@quickdudley 6 years ago
I actually made the same mistake when I implemented a neural network the first time. Surprisingly: it actually worked, but needed a lot more hidden nodes than it would have if I'd done it right.
@sirdondaniel 6 years ago
Wait... Which mistake do you mean Jeremy?
@landsgevaer 6 years ago
Yeah, I noticed this too! Although understandable, the explanation is wrong. Also, what if w1 and w2 cancel, that is, w1+w2=0? Then the suggested formulas lead to division by zero, so infinite adjustments. I find it more intuitive to treat every weight and every bias simply as a parameter. Then you look at what happens to the NN's final output when you change any such parameter by a small (infinitesimal) amount, keeping all others constant. If you know delta_parameter and the corresponding delta_output, you know the derivative of the output with respect to the parameter, equal to delta_output/delta_parameter. Gradient descent then dictates that you nudge the parameter in proportion to that derivative (times error times learning rate). Finally, the derivative can be expanded using the chain rule to include the effects of all the intermediate weights and sigmoids separately. Backpropagation is "merely" a clever trick to keep track of these products of derivatives. Apart from that, kudos for this great video series!
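A tiny numerical illustration of that nudge-one-parameter idea in JavaScript (finite differences rather than real backpropagation; lossFn and params are hypothetical placeholders):

  function numericalDerivative(lossFn, params, i, eps = 1e-6) {
    // loss with parameter i nudged up, minus loss with it nudged down
    const up = params.slice();   up[i] += eps;
    const down = params.slice(); down[i] -= eps;
    return (lossFn(up) - lossFn(down)) / (2 * eps);
  }
  // gradient descent step for one parameter:
  // params[i] -= learningRate * numericalDerivative(lossFn, params, i);

Backpropagation arrives at the same derivatives analytically via the chain rule, which is far cheaper than nudging every parameter separately.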
@jiwon5315 5 years ago
You probably know already but you are amazing 💕
@merlinak1878 5 years ago
Question: if w2 is 0.1 and it gets tuned by 1/3 of 0.3, the new value of w2 is 0.2. And now the error of that is new w2 - old w2. So the error of hidden2 is 0.1? Is that correct? And do I need a learning rate for that?
@amanmahendroo1784 5 years ago
That seems correct. And you do need a learning rate because the formula dy/dx = ∆y/∆x is only accurate for small changes (i.e. small ∆x). Good luck!
@merlinak1878 5 years ago
Aman Mahendroo ok thank you
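A sketch of that update with a learning rate, in JavaScript (simplified: the full rule in the later videos also multiplies in the gradient of the activation function and the input feeding the weight, so these names are placeholders rather than the video's exact code):

  const learningRate = 0.1;
  // the error assigned to this connection suggests a correction...
  let deltaW2 = hidden2Error;        // simplified stand-in for the real gradient
  // ...but we only move a fraction of the way there
  w2 += learningRate * deltaW2;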
@carlosromerogarcia1769 2 years ago
Daniel, I have a little doubt. When I see the weights here I think of the Markowitz portfolio model, and I wonder if the sum of the weights in neural networks should be one: w1 + w2 + w3 + ... + wn = 1. Do you know if it's possible to impose this type of constraint in Keras... just to experiment. Thank you, I love your videos
@lirongsun5848 4 years ago
Best teacher ever
@ulrichwake1656 6 years ago
good video man. it really helps a lot. u explain it clearly. thank u very much
@snackbob100 4 years ago
Are all errors across all layers calculated first, and then gradient descent is done? Or are they done in concert with each other?
@YashPatel-fo2ec 5 years ago
what a detailed and simple explanation. thank you so much.
@michaelelkin9542 4 years ago
I think you answered my question, but to be sure: is backward propagation only 1 layer at a time? As in, you calculate the errors in the weights to the final layer, then treat it as if the last layer went away, then use the new expected values that were just computed to adjust the previous layer's weights, and so on. The key is that you do not simultaneously adjust all weights in all layers, just one layer at a time. Seems like a very simple question but I have never found a clear answer. Thank you.
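One common ordering, sketched in JavaScript (a hypothetical structure, not the video's exact code: weights[l] connects layer l to layer l+1, and adjustWeights stands in for the update rule):

  // walk backwards and compute every layer's error first
  let errors = new Array(numLayers);
  errors[numLayers - 1] = outputError;
  for (let l = numLayers - 2; l > 0; l--) {
    // a hidden layer's error is the next layer's error pushed back through the weights
    errors[l] = Matrix.multiply(Matrix.transpose(weights[l]), errors[l + 1]);
  }
  // only then nudge the weights, one layer at a time
  for (let l = 0; l < numLayers - 1; l++) {
    adjustWeights(weights[l], errors[l + 1], outputs[l]);
  }

With this ordering all the errors are propagated back before any weights change, so old and new weights never get mixed in the same pass.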
@jyothishmohan5613 4 years ago
Why do we need to backpropagate through all the hidden layers, rather than only to the layer just before the output?
@ganeshhananda 6 years ago
A really awesome explanation which can be understood by a mere human being like me ;)
@lornebarnaby7476 6 years ago
Brilliant series, have been following the whole thing, but I am writing it in Go
@atharvapagare7188 6 years ago
Thank you, I am finally able to grasp this concept slowly
@sanchitverma2892 5 years ago
hello
@justassaltyasthesea5533 6 years ago
Does The Coding Train have a coding challenge about missiles slowly turning towards their target, trying to intercept it? And maybe instead of flying to where the target is, the missile uses some advanced navigation? On Wikipedia there is Proportional Navigation, where they talk about a LOS rate. I think this would be a nice coding challenge, but where do I suggest it?
@roger109z 5 years ago
Thank you so much. I watched the 3Blue1Brown videos and read a few books but this never clicked for me; watching you made it click.
@Vedantavani3100BCE 4 years ago
HELP!!!!! In an RNN we have only 3 unique weight parameters, so during backprop there will be only 3 parameters to update. Then why does backprop in an RNN go all the way back to the 1st input, creating long-term dependencies and thereby the vanishing gradient problem????
@zareenosamakhan9780 3 years ago
Hi, can you please explain backpropagation with the cross-entropy loss?
@SetTheCurve 5 years ago
I would love it if you told us how to include activation in these calculations, because in this example you're only including weights. A high activation and low weight can have the same impact on error as a low activation and high weight.
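A sketch of how the activations usually enter the update, in JavaScript (the standard delta rule for a sigmoid unit, written for a single scalar connection; placeholder names, not the video's exact code):

  // for a sigmoid output y, the derivative is y * (1 - y)
  function dsigmoid(y) { return y * (1 - y); }

  let gradient = outputError * dsigmoid(output) * learningRate;
  // the change is also scaled by how active the hidden node was,
  // so a low activation with a high weight gets a small update
  let deltaWeightHO = gradient * hiddenActivation;
  weightHO += deltaWeightHO;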
@jchhjchh 6 years ago
Hi, I am confused. ReLU will kill the neuron only during the forward pass? Or also during the backward pass?
@hamitdes7865 5 years ago
Sir thank you for teaching backpropagation😊😊
@ksideth 2 years ago
Many thanks for simplifying.
@priyasingh9984 4 years ago
Awesome person, you taught this so well and kept a tough topic interesting.
@johnniemeredith9141 5 years ago
At timestamp 15:15 why was error 2 changed from .3 to .4... it seems like he saw something wrong there that he didn't explain.
@iagosoriano3734 5 years ago
He just didn't want two equal values of error
@12mkamran 5 years ago
How would you deal with the fact that in some cases the error of h1 and h2 may be 0? Do you just not adjust them, or is there a bias associated with them as well? Thanks
@marcosvolpato8135 6 years ago
Do we have to update all the weights before we calculate all the errors, or first calculate all the errors and then update all the weights?
@shimronalakkal523 3 years ago
Oh yeah. Thank you so much. This one helped a lot.
@BrettClimb 5 years ago
I feel like the derivative of the activation function is also a part of the equation for calculating the error of the hidden nodes, but maybe it's unnecessary if you aren't using an activation function?
@MRarvai 5 years ago
Is this the same thing as in the 3Blue1Brown video about the calculus of backpropagation, just way less mathematical?
@TheCodingTrain 5 years ago
This is the same! Only 3Blue1Brown's explanation is much better!
@surtmcgert5087 4 years ago
I understand everything here, I am just confused. I watched a video by 3Blue1Brown on backpropagation and their explanation was nothing like this: they had a cost function and derivatives, and they were averaging their formulae over every single training example. I'm confused because I've got two completely different takes on the exact same topic.
@arindambharati2312 4 years ago
same here..
@wakeatmethree4023 7 years ago
Hey Dan! You might want to check out computational graphs as a way of explaining backpropagation (Colah's blog post on computational graphs and Andrew Ng's video on computational graph/derivatives as well. )
@TheCodingTrain 7 years ago
Thank you for this, I will take a look!
@phumlanigumedze9762 3 years ago
@@TheCodingTrain humility appreciated, thank God
@borin2882 3 years ago
Hi guys, why don't we calculate the output at every hidden layer to get the value of the error?
@FilippoMomesso 7 years ago
In "Make Your Own Neural Network" by Tariq Rashid the weight notation is reversed. For example, in the book the weight of the connection between input node x1 and hidden node h2 is written as w1,2, but in your videos it is w2,1. Which one is more correct? Or is it only a convention?
@TheCodingTrain 7 years ago
I would also like to know the answer to this question. But I am specifically using w(2,1) since it shows up in row 2 and column 1 in the matrix. And I believe rows x columns is the convention for linear algebra stuff?
@volfegan 7 years ago
The notation is a convention. As long as you keep using the same notation system, it should not be a problem. Let mathematicians argue about the details. Engineers just have to make the stuff work.
@FilippoMomesso 6 years ago
The Coding Train ok, just asked my math professor. He said the right convention is (row, column). I read the section of the book where it talks about matrices again. The author contradicts himself. On page 52 he says "it is convention to use rows first, then columns", but then, when he applies matrices to weight notation, he does the opposite. His notation is W(i,h) (i for the input node number and h for the hidden node number), but it is column, row. Your notation is W(h,i), with the right convention for matrices, row, column. So in the end, using one notation or the other is the exact same thing, because weight w(1,2) in the book is w(2,1) in your videos. Hope I've been clear enough :-P
@jonastjepkema 6 years ago
Tariq Rashid actually doesn't use the conventional matrix notation because he looks at it layer per layer, not as a matrix with rows and columns: he writes w11 meaning "weight from the first neuron of the first layer to the first neuron of the second layer". And he goes on so that the weights leaving one neuron have the same first number, which is his own way of representing this. Both work though, as someone said before me, it's just notation; he just doesn't look at it as a matrix (which unfortunately makes the matrix notation for calculating the outputs less readable). Hope I managed to make myself clear hahaha
@TheRainHarvester 2 years ago
The reason for the swapped positions is so that when multiplying matrices the dimensions line up: M(2x3) x M(3x2), where the 3s need to be in the inner positions next to the multiplication symbol.
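To make that concrete, a small JavaScript sketch of the matrix-vector product with plain 2D arrays (hypothetical names; row index = destination node, column index = source node):

  // weightsHo has outputCount rows and hiddenCount columns,
  // so weightsHo[i][j] is the weight from hidden node j to output node i (w_ij)
  let output = new Array(outputCount).fill(0);
  for (let i = 0; i < outputCount; i++) {
    for (let j = 0; j < hiddenCount; j++) {
      output[i] += weightsHo[i][j] * hidden[j];
    }
  }

With that layout, the (outputCount x hiddenCount) matrix times the (hiddenCount x 1) vector has matching inner dimensions, just like the M(2x3) x M(3x2) example above.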
@pythondoesstuff2969 4 years ago
What if there are more neurons in the hidden layer? Then how do you calculate the error?
@alexanderfreyschuss1307 7 years ago
So after each guess you're getting like a new generation of weights, right? Could you figure out, like after a couple of generations, whether any specific tuning was correct, or whether it would have been correct to adjust just a single weight instead of all of them? Maybe it is sometimes right to "blame" just one weight out of a huge number of them in order to get perfect guesses in the future, even though it might be unlikely. So my question comes down to: how do you check whether the tunings you (or the network) made were right, or whether there were tunings which would have done even better?
@FilippoMomesso 7 years ago
You do it with gradient descent; it will be covered in the next videos. Try to watch the 3Blue1Brown playlist on neural networks if you don't want to wait (I think the third video is about gradient descent). It is very well explained.
@skywalkerdk01 6 years ago
Awesome video. Thank you for this! First time I understand backpropagation. +1
@masterstroggo 6 years ago
Daniel Shiffman, I'm following along with these tutorials and re-creating the neural network in Processing 3. I know you're a bit of a wiz on that topic so I thought I'd ask you about it. I got a working prototype of the neural network up and running in Processing, even though I had to do some workarounds and compromises. One issue I've run into, however, is that I cannot seem to figure out how to use static variables or methods in Processing. Is this not implemented? None of the standard Java ways of doing it work in the Processing environment. I've tried the same code snippets in other, more standard Java environments and they work there.
@masterstroggo 6 years ago
It's kind of solved. I found out that Processing wraps all code in a class, which means that all user-created classes are treated like inner classes, and from what I understand Java does not support static inner classes unless the parent class (the whole Processing sketch in this case) is static. I've found workarounds for that, but I thought I'd share my findings.
@TheCodingTrain 6 years ago
Yes, this is indeed the case! Let me know how I can help otherwise.
@OskarNendes 2 years ago
If backpropagation really works, why do we need to apply it more than once, and why do we need to test the neural network again if the error can be calculated? How do we know we are not just doing a random search in the vicinity?
@OneShot_cest_mieux 7 years ago
Thank you so much ^^ I have a question: we divide by the sum of weights, but what if the sum of weights is equal to zero? And are your weights between 0 and 1, or between -1 and 1?
@oooBASTIooo 6 years ago
gabriel dhimoila a weight of 0 would mean that there is no edge between the vertices. So if the sum were 0, the output vertex wouldn't be connected to the graph at all and you couldn't measure anything there... What he does is assign every edge its portion of the area by using the arithmetic mean.
@OneShot_cest_mieux 6 years ago
I don't understand what you mean by "graph", "edge" and "area", but if the weights are between -1 and 1, or if the weights are initialized to 0, it's probable that the program has to do a division by 0. Sorry for my bad English, I am French
@nagesh007 1 year ago
Awesome tutorial
@unnikked 7 years ago
Let me tell you that you are an amazing teacher! ;)
@jagdeepakrawat6028 5 years ago
In your single-output-neuron situation, why do you assume that 1/3 of the error is from h2 and 2/3 from h1? Since you really don't know their inputs, or the activations they produce, how can you just compare weights and decide that one is more responsible? It's not always true, is it?
@TheRainHarvester 2 years ago
Yeah you need to consider input.
@samsricatjaidee405 2 years ago
Thank you. This is very clear.
@teamsalvation 6 years ago
Well done, a bit of rambling here and there, but overall well done. Now that we have the error, what do we do with it? How do we tweak the weights? -- How do we know to apply 33% of the error to W2 and 67% of the error to W1?
@TheCodingTrain 6 years ago
Only a bit of rambling? That's kind of you to say. Hah. I think I address this if you keep watching parts 2 etc.
@teamsalvation 6 years ago
The Coding Train You did, and it still made no sense ;-) - hahaha - at first. I spent the last 3 days jumping around web sites and other YouTube videos and... after breaking down and using multi-color pen and paper (old school), things finally started to come together, dots connected, and I have a working neural network in C++ :-) As I believe you mentioned, it was the backward propagation that was tricky, and even then, it's the math that threw me off. If I ignored the math for a moment and only focused on the formula I had to apply, as-is, it all made sense. I then went back and reviewed all the math to understand why it works. So, for all you intro peeps, or just plain thick-headed peeps like me: ignore the math (for a moment). Simply take the formula and code it. At the very least you'll have working code, with hopes that it keeps the momentum going so you then go back and understand the math.
@GurpreetSingh-th1di 6 years ago
The kind of video I want, thanks
@lucaslopesf 7 years ago
I finally understand! It's so simple
@andrefurlan 3 years ago
More than 3 years since this video was uploaded and I still wonder what happened at 3:04
@sz7063 6 years ago
It is amazing! When will you teach us about recurrent neural networks and LSTMs?? Looking forward to that!!
@IgorSantarek 6 years ago
You're doing great job! Keep it up!
@iagosoriano3734 5 years ago
Why not add w0 in the sum that goes in the denominator?
@raonioliveira8758 6 years ago
So you are saying that eh1 should be its part of e1 + its part of e2. It seems counter-intuitive to me, since I feel like sometimes e1 can contain a bit of e2 in itself. So summing those different instances of error measurements shouldn't be an effective way of computing eh1 or eh2. I hope I made myself clear, sorry if my English sucks.
@esu7116 5 years ago
Are the cost function and the error the same thing?
@kosmic000 6 years ago
amazing vid as always dan very informative
@annidamf 2 years ago
thank you very much for your videos. incredibly helpful!! :D
@sumitzanje5207 5 years ago
While watching this video I realised I have to stop and first write a comment : "You are awesome!!" Thank you.
@affandibenardi548 4 years ago
DATA = model + error
model = data - error
supervised model = data - (tuning error)
@centuryfreud 5 years ago
“y_hat” would be a better term to differentiate it from ground truth “y”.