Lecture 6: Backpropagation

79,013 views

Michigan Online

3 years ago

Lecture 6 discusses the backpropagation algorithm for efficiently computing gradients of complex functions. We discuss the idea of a computational graph as a data structure for organizing computation, and show how the backpropagation algorithm allows us to compute gradients by walking the graph backward and performing local computation at each graph node. We show examples of implementing backpropagation in code using both flat and modular implementations. We generalize the notion of backpropagation from scalar-valued functions to vector- and tensor-valued functions, and work through the concrete example of backpropagating through a matrix multiply operation. We conclude by briefly mentioning some extensions to the basic backpropagation algorithm including forward-mode automatic differentiation and computing higher-order derivatives.
Slides: myumi.ch/R58q1
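The lecture's flat and modular implementations are only shown on the slides, so here is a minimal sketch of the modular (node-based) style in the spirit of the lecture; the Multiply/Add classes and the tiny example function are illustrative assumptions, not the course's actual code:

```python
# Minimal modular-backprop sketch: each node caches what it needs in the
# forward pass and returns (local gradient x upstream gradient) in backward.

class Multiply:
    def forward(self, x, y):
        self.x, self.y = x, y           # cache inputs for the backward pass
        return x * y

    def backward(self, grad_out):
        # chain rule: downstream = local gradient * upstream gradient
        return grad_out * self.y, grad_out * self.x

class Add:
    def forward(self, x, y):
        return x + y

    def backward(self, grad_out):
        # addition just routes the upstream gradient to both inputs
        return grad_out, grad_out

# Forward pass through f(w0, x0, w1) = w0 * x0 + w1
mul, add = Multiply(), Add()
a = mul.forward(2.0, -3.0)     # w0 * x0
out = add.forward(a, 4.0)      # + w1

# Backward pass: start from dL/dout = 1 and walk the graph in reverse.
da, dw1 = add.backward(1.0)
dw0, dx0 = mul.backward(da)
print(dw0, dx0, dw1)           # -3.0 2.0 1.0
```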
_________________________________________________________________________________________________
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification and object detection. Recent developments in neural network approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of neural-network based deep learning methods for computer vision. During this course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. We will cover learning algorithms, neural network architectures, and practical engineering tricks for training and fine-tuning networks for visual recognition tasks.
Course Website: myumi.ch/Bo9Ng
Instructor: Justin Johnson myumi.ch/QA8Pg

Comments: 48
@quanduong8917 3 years ago
This lecture is an example of a perfect technical lecture.
@vardeep277 3 years ago
Dr. JJ, you sly son of a gun. This is one of the best things ever. 47:39, the way he asks if it is clear. It is damn clear, man. Well done!
@ritvikkhandelwal1462 3 years ago
Amazing! One of the best backprop explanations out there!
@sachinpaul2111 2 years ago
Prof...stop ...stop...it's already dead! Oh BP you thought you were this tough complex thing and then you met Prof. Justin Johnson who ended you once and for all! The internet is 99.99% garbage but content like this makes me so glad that it exists. What a masterclass! What a man!
@minhlong1920 1 year ago
Such an awesome and intuitive explanation!
@dbzrz1048 2 years ago
Finally, some coverage of backprop with tensors.
@rookie2641 2 years ago
The best lecture ever on the math of backpropagation.
@akramsystems 2 years ago
Beautifully done!
@user-we8fo6ep5y 3 years ago
THANK YOU SO MUCH! Finally an explanation that is excellent and not shallow.
@VikasKM 3 years ago
Wooooowww... what a superb lecture on backpropagation. Simply amazing.
@piotrkoodziej4336 2 years ago
Sir, you are amazing! I've wasted hours reading and watching internet gurus on this topic, and they could not explain it at all, but your lecture worked!
@AndyLee-xq8wq 1 year ago
Amazing courses!
@user-lt8lp5fx6h 10 months ago
This is extremely hard, but it is a great lecture for sure. You are awesome, Mr. Johnson.
@ryliur 2 years ago
Future reference for anybody, but I think there's a typo @ 50:24. It should be dz/dx * dL/dz when using chain rule to find dL/dx
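For reference, the relation the comment is pointing at is just the chain rule written with the lecture's downstream = local × upstream convention (a standard identity, restated here for convenience):

$$\frac{\partial L}{\partial x} \;=\; \frac{\partial z}{\partial x}\,\frac{\partial L}{\partial z}$$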
@NIHIT555 3 years ago
Great video. Thanks for posting it.
@tomashaddad 3 years ago
I don't get how backpropagation tutorials by 3B1B, StatQuest, etc., get so much praise, when none of them is as succinct as you were in those first two examples. Fuck, that was simple.
@smitdumore1064 7 months ago
Top notch content
@user-nw8kk6vh2b 7 months ago
Great lecture, thank you. I have a question; it would be great if anyone could clarify. When you first introduce vector-valued backpropagation, you have the example showing 2 inputs to the node, each input being a vector of a DIFFERENT dimension - when would this be the case in a real scenario? I thought the vector formulation was so that we could compute the gradient for a batch of data (e.g. 100 training points) rather than running backprop 100x. In that case the input and output vectors would always be of the same dimension (100). Thanks!
@user-me1ry6lg6d 2 months ago
You earned a like, a comment, and a subscriber... what an explanation.
@mohamedgamal-gi5ws 3 years ago
The good thing about these lectures is that Dr. Johnson finally has more time to speak compared to CS231n!
@shoumikchow 3 years ago
10:02: Dr. Johnson means "right to left," not "left to right."
@tornjak096 8 months ago
1:03:00 Should the dimension of grad x3 / x2 be D2 x D3?
@jungjason4473 3 years ago
Can anyone explain 1:08:05? dL/dx1 should be next to dL/dL, not L, when it is fed into f2'. Otherwise backpropagation cannot connect the f's and the f''s.
@dmitrii-petukhov 3 years ago
Awesome explanation of Backpropagation! Amazing slides! Much better than CS231n.
@anupriyochakrabarty4822 1 year ago
How come you are getting the value for the e^x node as -0.20? Could you explain?
@artcellCTRL 1 year ago
22:22 The local gradient should be "[1 - sigma(1.00)] * sigma(1.00)", where 1.00 is the input to the sigmoid-fcn block.
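For anyone checking that number: the sigmoid's derivative can be written in terms of its output, and with the input 1.00 quoted above it comes out to roughly 0.20 (standard identity, values rounded):

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)$$

$$\sigma(1.00) \approx 0.73, \qquad \sigma'(1.00) \approx 0.73 \times 0.27 \approx 0.20$$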
@haowang5274 2 years ago
Thanks, good god, best wishes to you.
@qingqiqiu 1 year ago
Can anyone clarify the computation of the Hessian matrix in detail?
@apivovarov2 5 months ago
@49:44 - Mistake in the dL/dx formula - the 2nd operand should be dL/dz (not dL/dx).
@matthewsocoollike 3 months ago
19:00 Where did w2 come from?
@arisioz 1 year ago
At around 18:20 shouldn't the original equation have a w_2 term that gets added to w_0*x_0+w_1*x_1?
@sainikihil9785 6 months ago
w2 is the bias term.
@aoliveira_ 1 year ago
Why is he calculating derivatives with respect to the inputs?
@zainbaloch5541 1 year ago
19:14 Can someone explain computing the local gradient of the exponential function? I mean, how does the result -0.2 come about? I'm lost there!!!
@beaverknight5011 1 year ago
Our upstream gradient was -0.53, right? And now we need the local gradient of e^-x, which is -e^-x, and -e^-(-1) = -0.36. So the upstream grad (-0.53) multiplied by the local grad (-0.36) is 0.1949, which is approximately 0.2. So 0.2 is not the local grad; it is the local grad multiplied by the upstream.
@zainbaloch5541 1 year ago
@@beaverknight5011 got it, thank you so much!
@beaverknight5011 1 year ago
@@zainbaloch5541 you are welcome, good luck with your work
@Valdrinooo 1 year ago
I don't think beaver's answer is quite right. The upstream gradient is -0.53, but the local gradient comes from the function e^x, not e^-x. The derivative of e^x is e^x. Now we plug in the input, which is -1, and we get e^-1 as the local gradient. This is approximately 0.37. Now that we have the local gradient, we just multiply it by the upstream gradient -0.53, which results in approximately -0.20.
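A quick numeric check of this explanation, using only the numbers quoted in the thread:

$$\frac{d}{dx} e^{x} = e^{x}, \qquad e^{-1} \approx 0.37, \qquad \frac{\partial L}{\partial x} \approx 0.37 \times (-0.53) \approx -0.20$$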
@genericperson8238 1 year ago
46:16, shouldn't dL/dx be 4, 0, 5, 9 instead of 4, 0, 5, 0?
@kevalpipalia5280 1 year ago
No, the operation is not ReLU; it's the calculation of the downstream gradient. Since the last row of the Jacobian is 0, changes in that value do not affect the output, so the entry is 0.
@kevalpipalia5280 1 year ago
To decide whether to pass or kill a value of the upstream gradient, you look at the input, which here is [1, -2, 3, -1]. The last entry, -1, is negative, so we kill the corresponding value from the upstream gradient and it becomes 0.
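A small sketch of the pass-or-kill rule described above, using the input quoted in the thread. The upstream gradient is partly an assumption (only the entries 4, 5, and 9 are mentioned; the second entry is a placeholder and gets zeroed either way), so treat this as an illustration rather than the exact slide values:

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0, -1.0])        # input to the ReLU node (from the thread)
upstream = np.array([4.0, -1.0, 5.0, 9.0])  # assumed dL/dy flowing in from above

# ReLU's Jacobian is diagonal: 1 where x > 0, 0 elsewhere. So instead of
# building the matrix, just copy the upstream gradient and kill the entries
# whose corresponding input was negative.
downstream = upstream * (x > 0)
print(downstream)                            # [4. 0. 5. 0.]
```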
@bibiworm 2 years ago
45:00 The Jacobian matrix does not have to be diagonal, right?
@blakerichey2425 2 years ago
Correct. That was unique to the ReLU function. The "local gradient slices" in his discussion at 53:00 are slices of a more complex Jacobian.
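For the matrix-multiply case those slices come from, the lecture's point is that the full Jacobian is never formed explicitly. A minimal sketch of the resulting implicit rule (dL/dX = dL/dY · Wᵀ and dL/dW = Xᵀ · dL/dY), with made-up shapes and a numerical spot check:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 3))     # inputs  (N x D); shapes are made up
W = rng.standard_normal((3, 4))     # weights (D x M)
Y = X @ W                           # forward pass: (N x M)

dY = rng.standard_normal(Y.shape)   # pretend upstream gradient dL/dY

# Backward through the matmul without building the (N*M) x (N*D) Jacobian:
dX = dY @ W.T                       # dL/dX, same shape as X
dW = X.T @ dY                       # dL/dW, same shape as W

# Spot-check one entry of dX with central differences on L = sum(Y * dY).
eps = 1e-6
Xp, Xm = X.copy(), X.copy()
Xp[0, 0] += eps
Xm[0, 0] -= eps
numeric = (((Xp @ W) * dY).sum() - ((Xm @ W) * dY).sum()) / (2 * eps)
print(np.isclose(dX[0, 0], numeric))  # True
```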
@jorgeanicama8625 11 months ago
It is actually much simpler than the way he explained it. I believe he was redundant and used too many symbols, which hides the beauty of the underlying reasoning of the algorithm and the math behind it. It all could have been explained in less time.
@maxbardelang6097 2 years ago
54:51 When my CD player gets stuck on an old Eminem track.
@benmansourmahdi9097 10 months ago
Terrible sound quality!
@Hedonioresilano 2 years ago
it seems the coughing guy got the china virus at that time
@arisioz 1 year ago
I'm pretty sure you'd be called out as racist back in the days of your comment. Now that it's almost proven to be a china virus...