CS231n Winter 2016: Lecture 4: Backpropagation, Neural Networks 1

302,948 views

Andrej Karpathy


Comments: 110
@charleswetherell9436 7 years ago
This is a terrific lecture on a difficult subject. Andrej strips all the complication away, gets to the essentials, and avoids jargon and "academese". The slides are great and the presentation engaging and easy to understand. The kind of teacher we wish were teaching every class!
@pe2roti 6 years ago
Having suffered a lot and struggled through Andrew Ng's CS229, I can certify that this lecture is far more accessible to the least math-oriented souls out there. The way Andrej explains concepts *in English* and connects ideas to each other is one of the reasons I did not give up studying these topics.
@vil9386 11 months ago
Can't thank Andrej, the cs231n team, and Stanford enough. I thoroughly enjoy your lectures. Knowledge is one form of addiction and pleasure, and thank you so much for providing it freely. I hope you all enjoy giving it as much as we enjoy receiving it.
@nausheenfatma 7 years ago
Thank you so much for making these lectures public!! I had been struggling to understand backpropagation for a long time. You have made it all so simple in this lecture. I am planning to watch the entire series.
@UAslak 8 years ago
This was gold to me! "... and this process is called Back Propagation: it's a way of computing, through recursive application of the chain rule in a computational graph, the influence of every single intermediate value in that graph on the final loss function." (14:48)
@portebella1 7 years ago
This is by far the most intuitive neural network video I have seen. Thanks Andrej!
@WahranRai 6 years ago
Are you waiting for a first date?
@madmax-zz3tt 5 years ago
@@WahranRai hahaha online pervert
@tomashorych394 3 years ago
Exactly my thoughts. I've seen quite a few backprop videos, but this lecture truly shows what a simple yet elegant way of computing this is.
@essamal-mansouri2689 6 years ago
Andrej Karpathy's lectures and articles are basically the reason I know anything about anything. The webpage he uses near the end of the lecture to show the decision boundary is quite literally one of the reasons I got interested in machine learning at all.
@cryptobling 4 years ago
Finally, a lecture that lifts the curtain of mystery on backward propagation! Excellent delivery of core concepts in neural networks. This is the best lecture I have watched on the topic so far (including the most popular ones that are heavily promoted on public domains!)
@leixun 4 years ago
*My takeaways:*
1. Computational graph and backpropagation examples 2:55
2. Implementation: forward/backward API 32:37
3. Vectorized implementation 43:26
4. Summary so far 50:50
5. Neural Networks 52:35
6. Summary 1:16:05
@BrandonRohrer 8 years ago
Excellent explanation Andrej. Thanks for making the rest of us a little smarter.
@dashingrahulable 7 years ago
You do quite the same thing with your videos as well, Brandon. Find them really helpful, thanks.
@bayesianlee6447 6 years ago
Thanks to Andrej, and to Brandon as well. Brandon, your work also gave me a great way of interpreting these ideas. Thanks to you both :)
@proGrammar 8 years ago
You are a genius of explication, Andrej! Can't tell you how much this helped. Thank you!
@avinashpanigrahi3232 4 years ago
After seeing a dozen videos on backpropagation (that includes CSxxx courses too), this is by far the best explanation given by anybody. Thank you Andrej, you have made life a bit simpler during COVID.
@mostinho7 4 years ago
Starts at 6:30
13:40 the local gradient is computed on the forward pass; this node, while doing the forward pass, can immediately know what dz/dx and dz/dy are because it knows what function it's computing
31:30 when two branches come into a node, when doing backprop we add their gradients
34:15 example of backward pass code
37:00 the gradient of the loss wrt x is NOT just "y" but y*z, because this z is dL/dz and we want dL/dx: dL/dx = dL/dz * dz/dx, where z (the incoming gradient) is dL/dz and y = dz/dx
43:30 why are x, y, z vectors? Because, recalling the notation for forward propagation, we do forward prop for more than one input at a time. Each input can be represented as a column vector or a row vector in the input matrix X?
1:10:00 decision boundaries formed by a neural network
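A small self-contained sketch of what those notes describe (my own illustration, not code from the lecture): during the forward pass each node can record its local gradients, and the backward pass chains them with the upstream gradient.

```python
# Forward/backward through f(x, y, z) = (x + y) * z, done by hand.
# Illustrative only; the variable names and numbers are mine.
x, y, z = -2.0, 5.0, -4.0

# Forward pass: each node also "knows" its local gradients.
q = x + y            # add node; local grads: dq/dx = 1, dq/dy = 1
f = q * z            # mul node; local grads: df/dq = z, df/dz = q

# Backward pass: chain rule, starting from df/df = 1.
df = 1.0
dq = z * df          # df/dq * upstream
dz = q * df          # df/dz * upstream
dx = 1.0 * dq        # dq/dx * upstream  -> equals z
dy = 1.0 * dq        # dq/dy * upstream  -> equals z

print(f, dx, dy, dz)  # -12.0, -4.0, -4.0, 3.0
```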
@nguyenthanhdat93 8 years ago
Great backprop explanation. Thanks!
@krajkumar6 6 years ago
Most intuitive explanation I've seen for backprop. Thank you Andrej!! Great resource for deep learning enthusiasts
@mishrpun 6 years ago
One of the best backprop videos on the internet.
@reachmouli 1 year ago
This is a beautiful lecture - it gave a very fundamental understanding of backward propagation and its concepts. I see backward propagation correlates to demultiplexing and forward prop corresponds to multiplexing, where we are multiplexing the input.
@GermanGarcia_Hi 7 years ago
This course is pure gold.
@shahainmanujith2109 1 month ago
MINDBLOWING SIMPLICITY!!! Well explained!
@AG-dt7we 2 years ago
At 25:00: add distributes the upstream gradient, max "routes" the upstream gradient, and mul switches the upstream gradient.
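A tiny illustration of those three rules (an editorial sketch; the function names are made up): for two scalar inputs a and b and upstream gradient dout,

```python
# Backward rules for the three gates mentioned above (illustrative sketch).
def add_backward(a, b, dout):
    # add distributes: both inputs receive the upstream gradient unchanged
    return dout, dout

def max_backward(a, b, dout):
    # max routes: only the larger input receives the upstream gradient
    return (dout, 0.0) if a > b else (0.0, dout)

def mul_backward(a, b, dout):
    # mul switches: each input gets the *other* input times the upstream gradient
    return b * dout, a * dout
```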
@adityapasari2548 7 years ago
This is the best introductory explanation for NNs. Thanks a lot
@Alley00Cat 7 years ago
At 12:33, I love the switch from Canadian to American: "compute on zed...ahum...on zee". ;)
@dhanushc5744 5 years ago
That's how we pronounce in India as well "Zed".
@IchHeiseAndre 6 years ago
Fun fact: The lake in the first slide is the "Seealpsee" at the Nebelhorn (Oberstdorf, Germany)
@getrasa1 7 years ago
That was a god-like explanation. Thank you so much!
@erithion 6 years ago
Excellently structured lecture with brilliant intuitive visualization. Thanks a lot, Andrej!
@anetakufova7180 4 years ago
Best explanation about backprop and neural networks in general I've ever seen. Also good questions from the students (I like how interactive they are)
@vocdex 3 years ago
This was a very satisfying piece of information. Especially the follow-up questions helped me clear up my doubts about many details. Thank you so much
@tomatocc4565 4 years ago
This is the best explanation for backpropagation. Thank you!
@kevinxu5203 3 years ago
35:10, I thought dz/dx = y: for each unit of change in x, z changes by y. Why does the equation say dx = self.y * dz?
@mumbaicarnaticmusic2021 3 years ago
There, dx means dL/dx and dz means dL/dz, so dx = self.y * dz reads as dL/dx = y * dL/dz, i.e. dL/dx = dz/dx * dL/dz
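A minimal sketch of a multiply gate in that forward/backward style (illustrative only; not the lecture's exact class, but dx = self.y * dz plays the same role):

```python
class MultiplyGate:
    """Illustrative forward/backward API sketch for z = x * y."""
    def forward(self, x, y):
        # Cache the inputs; they are the local gradients (dz/dx = y, dz/dy = x).
        self.x, self.y = x, y
        return x * y

    def backward(self, dz):
        # dz is dL/dz (the upstream gradient). Apply the chain rule:
        dx = self.y * dz   # dL/dx = dz/dx * dL/dz = y * dL/dz
        dy = self.x * dz   # dL/dy = dz/dy * dL/dz = x * dL/dz
        return dx, dy
```

Chaining many such gates, each backward call just multiplies its cached local gradient by whatever upstream gradient it receives, which is the recursive chain-rule application described around 14:48.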
@arnabmukherjee 4 years ago
Excellent explanation of backpropagation through computational graphs
@sherifbadawy8188 1 year ago
This is the course where I finally understood it. Thank you so so much!!!!
@budiardjo6610 1 year ago
I came here from his blog, where he says he put a lot of effort into this class.
@y-revar 9 years ago
Great explanation and intuition on back propagation. Thank you!
@cp_journey 5 years ago
1:00:31 why are the updates added instead of subtracted (+= instead of -=)?
@SAtIshKumar-qt3wx 4 years ago
Great explanation, it helps me understand backpropagation better. Thanks a lot.
@jmeliu 3 years ago
This video made me cry😿. This guy did his best and enthusiastically tried to make the students understand backprop. Stanford students are so lucky. It's not that you are smarter, but that other former children, such as me, didn't have such a good teacher.
@nikolahuang1919 7 years ago
This lecture is enlightening and interesting.
@rishangprashnani4386 5 years ago
Suppose the output of (1/x) is F and of (+1) is Q. Then the local gradient on the F gate should be (dF/dQ). Why is that not the case? Here the local gradient over F is (dF/dx).. please explain? @ 18:20
@rishangprashnani4386 5 years ago
never mind, got it. Q is x.
@ThuongNgC 8 years ago
Great explanation, very intuitive lecture.
@shanif 8 years ago
Echoing everyone here, great explanation, thanks!
@ArthurGarreau 7 years ago
Great lecture, this is really valuable
@ThienPham-hv8kx 3 years ago
In summary: we calculate forward to get the loss value, calculate backward to get the gradient values, and then update the weights for the next step. And so on, until we're happy with the loss value.
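A bare-bones sketch of that loop (an illustrative example with one linear layer and a squared loss; the data and hyperparameters are made up):

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(100, 3)                 # toy inputs
y = X @ np.array([2.0, -1.0, 0.5]) + 0.3    # toy targets
W = np.random.randn(3)
b = 0.0
lr = 0.05

for step in range(200):
    # forward: compute predictions and the loss
    pred = X @ W + b
    loss = np.mean((pred - y) ** 2)
    # backward: gradients of the loss w.r.t. W and b via the chain rule
    dpred = 2 * (pred - y) / len(y)
    dW = X.T @ dpred
    db = dpred.sum()
    # update: take a small step downhill, then repeat until the loss looks good
    W -= lr * dW
    b -= lr * db
```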
@LouisChiaki 7 years ago
Amazing class about the back propagation and the demo!!!
@maxk7171 8 years ago
These videos are helping so much
@inaamilahi5094 6 years ago
Excellent Explanation
@ServetEdu 8 years ago
That back prop explanation!
@miheymik7984 5 years ago
w1x1 - "this weight represents how this neuron likes... that neuron". Bravo!
@serdarbaykan2327 6 years ago
(27:33) "..because calculus works" :D
@saymashammi7695 4 years ago
Where can I find the lecture slides? The lectures are too good!
@sergeybaramzin6030 9 years ago
great, looking forward to the next part)
@vineetberlia3470 5 years ago
Awesome Explanation Andrej!
@letsl2 8 years ago
Great explanation of the backprop algorithm! Thanks a lot!
@reubenthomas1033 2 years ago
Are the homework assignments accessible? If yes, could someone link me to them? I can't seem to find them.
@theempire00 8 years ago
Around 24:00, shouldn't the gradients be switched? I.e.:
x0: [-1] x [0.2] = -0.2
w0: [2] x [0.2] = 0.4
Oh wait, never mind, I see it's explained a couple of slides further!
@vorushin 8 years ago
No.
dL/dx0 = df/dx0 * dL/d(w0 * x0)
df/dx0 (local gradient) = 2.0 (because the value of x0 is multiplied by w0 = 2.0; every time we increase the value of x0 by h, the value of w0 * (x0 + h) increases by h * w0)
dL/d(w0 * x0) is already calculated as 0.20
2.0 * 0.20 = 0.40
The same applies to the calculation of dL/dw0
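A quick numerical sanity check of that calculation (an illustrative sketch using the values quoted above, treating everything past the w0*x0 node as locally linear with slope 0.2):

```python
# Check dL/dx0 = w0 * upstream with a central finite difference.
x0, w0, upstream = -1.0, 2.0, 0.2
h = 1e-5

def downstream(v):
    # Local piece of the graph: the product w0 * v feeds a node whose gradient
    # is `upstream`; the rest of the graph is approximated by that slope.
    return upstream * (w0 * v)

numeric = (downstream(x0 + h) - downstream(x0 - h)) / (2 * h)
analytic = w0 * upstream
print(numeric, analytic)   # both ~0.4
```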
@proGrammar 8 years ago
So glad someone else saw this! I thought I was trippin'. Looking forward to the reveal! : )
@questforprogramming 4 years ago
@@proGrammar have you understood it? It's because of the mul gate logic
@rvopenl6402 6 years ago
Thank you Andrej, Interesting lecture.
@bobythomas4427 6 years ago
Great video. One question regarding the computation of dx at around 35 minutes. How does dx become y * dz? Why is it not dz / y?
@hmm7458 4 years ago
Chain rule.
@Nur-vr6vb 5 years ago
ConvnetJS, Neural Network demo around 1:10:00 cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
@guanjuexiang5656 8 years ago
It's so clear and helpful. Thank you!
@olegkravchuk5751 7 years ago
Just one question. Why here ( kzbin.info/www/bejne/n2qXgKmPl5uhpdE ) do you take the input value to the gate for its own derivative, while in all the other parts you take the already computed value ( kzbin.info/www/bejne/n2qXgKmPl5uhpdE )? It's kinda confusing. I'd expect that in the beginning we take just 1, not the input to the gate (1/x).
@vatsaldin 3 years ago
Even when such an excellent piece of knowledge is shared by Andrej, there are still some idiots who have disliked this great piece.
@bhavinmoriya9216 1 year ago
Awesome as always! Are the notes available to the general public? If so, where do I find them?
@smyk1975 5 years ago
This entire course is a triumph
@cinderellaman7534 3 years ago
Thank you Andrej.
@sumitvaise5452 5 years ago
Does the output layer have a bias? I read on StackOverflow that if we use a sigmoid function then we have to use a bias.
@420lomo 4 years ago
The output layer has to match the dimensions of the output, so the dimensions are fixed. Also, since the output is not used for computation, don't think about it in terms of weights and biases. Output layers have no weights and biases, it's just the output. If you're using a sigmoid layer at the end, there is no bias coming in from the previous layer, since that would skew your probability estimates. All the hidden layers can have biases, though
@Pranavissmart 9 years ago
It's a bit weird for me to see him in the video. I was used to Badmephisto's voice only.
@WorldMusicCafe 9 years ago
+Profilic Locker LOL, Thank God, it's not only me. He picked up a pretty accent too. Well, his recent years in US did him good. These 8 years changed Badmephisto's voice a lot.
@saurabh7337 3 years ago
Thanks for the great lecture.
@TomershalevMan 6 years ago
amazing tutor
@jpzhang8290 7 years ago
What are the add gate, max gate and mul gate?
@cyrilfurtado 8 years ago
Really good lectures & assignments. I could not figure out how to get dW; I worked out the other parts of the assignments. Now that Assignments 1 & 2 are graded, can someone please tell me? Thanks
@snippletrap 5 years ago
Maybe "mux" instead of "router"? On the gate analogy.
@gauravpawar1416 7 years ago
I cannot thank you enough :)
@tyfoodsforthought 4 years ago
Extremely helpful. Thank you!
@brucelau6929 7 years ago
I love this lecture.
@slashpl8800 7 years ago
Fantastic lecture, thanks!
@vaishali_ramsri 8 years ago
Excellent Lecture !
@mark_larsen 3 years ago
Wow actually so helpful!
@uVeVentrue 6 years ago
I feel like a backprop ninja.
@M0481 6 years ago
Loved this, thanks a lot!
@rohan1427 5 years ago
Great lecture :)
@quishzhu 2 years ago
Thanks!
@shobhitashah1929 4 years ago
Nice video Sir
@eduardtsuranov712 2 years ago
ty very much!!!
@owaisiqbal4160 4 years ago
Link for the ConvNets demo: cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
@memojedi 6 years ago
bravo!
@jiminxiao1779 9 years ago
quite cool.
@jklasfjkl 3 years ago
amazing
@alexanderskusnov5119 3 years ago
It's a bad decision to use a lowercase L in names like l1 etc. In a bad font it looks like a capital I or the digit 1.
@xuanbole5906 7 years ago
Thank you!
@SiD-hq2fo 6 months ago
hey internet, i was here
@HolmesPatrick 9 years ago
awesome
@pravachanpatra4012 1 year ago
28:00
@StanislavOssovsky 5 years ago
Someone please help. I am a pianist, so my math is weak, I know I am stupid, but... The derivative of (x+y)... If I do an implicit differentiation, it turns out:
(x+y) = 3 -> d(x+y)/dx = d3/dx -> d(x+y)/dx = 0 -> 1 + 1*(dy/dx) = 0 -> dy/dx = -1
And the result must be a positive 1! Where is my mistake? Please someone point it out.
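For what it's worth, the mix-up above is between a partial derivative and implicit differentiation: in backprop x and y are independent inputs, so y is held fixed when differentiating with respect to x, whereas implicit differentiation forces y to move with x so that x + y stays equal to 3, which answers a different question. A short side-by-side (an editorial note, not from the thread):

```latex
% Partial derivative used in backprop: y is held fixed, independent of x
\frac{\partial}{\partial x}(x + y) = 1

% Implicit differentiation of the constraint x + y = 3: here y is forced to
% move with x so the sum stays 3; this gives the slope along the constraint,
% not the gradient that backprop needs
x + y = 3 \;\Rightarrow\; 1 + \frac{dy}{dx} = 0 \;\Rightarrow\; \frac{dy}{dx} = -1
```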
@channelforwhat 3 months ago
@14:47
@theVSORIGINALs 3 years ago
Is he speaking too fast, or is the video fast-forwarded?
@frodestr 7 years ago
Good lecture, but not a good idea to teach the next generation of data scientists that (in this case, Python) code should have no comments and maximum terseness.