After going through a lot of articles, this is the video through which I finally understood the backpropagation logic clearly. Thanks a lot.
@StarProduction369 · 8 years ago
I've been studying this subject for quite a while but cannot understand it that well. My interest is to get a program and play with it. Can you steer me in the right direction on which programs to use, so I can play with neural network programs and develop the ultimate software? Thanks
@ShravanKumar147 · 7 years ago
Hi, can you provide the slides here? Thank you.
@jeffthom3155 · 8 years ago
Hi teacher. Great explanation. You've gained another subscriber! Greetings from Brasil.
@konstantinburlachenko2843 · 9 years ago
Hi. Thanks for the video. I will be very, very glad if you answer my question here. It can be shown via differential calculus that (1) the gradient times -1 is the direction in the function's domain in which the function decreases fastest, (2) a convex function has one local minimum, which is also the global minimum, (3) a linear function is convex, and (4) the composition of a convex non-decreasing function with a convex function is convex. My point is that gradient descent is the wrong method for this task, because 1/(1+exp(-x)) is not convex. Am I wrong? This is the question I most want answered.
@AlexanderIhler · 9 years ago
+Konstantin Burlachenko Thanks for the interest. I'm not sure I understand your question -- but: (1) the logistic function, 1/(1+exp(-x)), is convex; hence logistic regression (a 1-layer NN with "logistic" or "cross entropy" loss) is convex and relatively easy to optimize (2) a multi-layer NN is not convex in its parameters (3) gradient descent, as an optimization strategy, is very useful on both convex and non-convex objective functions. For non-convex objectives, we generally have no guarantees on the quality of the local minimum found (compared to the global minimum), although in *some* problems such claims can be made. However, for general non-convex problems there are typically few guarantees for any optimization approach. So I don't see that as a drawback of gradient descent. (4) As a more advanced method, Newton or pseudo-Newton methods may be preferable, as they can take adaptive steps depending on the curvature (2nd derivative) of the loss function. Typically these are used to replace batch gradient descent, when the data set size is not too large.
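For concreteness, here is a minimal sketch (my own illustration, not from the lecture) of batch gradient descent on the convex logistic regression loss mentioned in point (1); the toy data, step size, and iteration count are all made up:

# Minimal sketch: batch gradient descent for logistic regression.
# The toy data, step size, and iteration count are illustrative only.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: first column of X is a constant 1 for the bias term; y in {0, 1}.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])

w = np.zeros(X.shape[1])  # parameters to learn
lr = 0.5                  # step size

for _ in range(1000):
    p = sigmoid(X @ w)             # predicted probabilities
    grad = X.T @ (p - y) / len(y)  # gradient of the mean cross-entropy loss
    w -= lr * grad                 # step opposite the gradient

print(w)

Because this objective is convex, the loop converges toward the single global minimum regardless of the zero initialization; for a multi-layer network the same loop applies, but as noted in point (3) there is generally no guarantee about the quality of the solution found.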
@konstantinburlachenko2843 · 9 years ago
+Alexander Ihler Thanks. (1) But to be honest I don't understand why it is convex...

(A)

#!/usr/bin/env python
import math

def sigmoid(x):
    # return 1.0 / (1 + math.exp(-x))
    return x * x  # stand-in convex function for comparison

# Numerical convexity check: for a convex f,
# f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2) for all a in [0, 1].
alpha = 0.0
x1 = -10.0
x2 = 10.0
while alpha < 1.0:
    if sigmoid(alpha * x1 + (1 - alpha) * x2) > alpha * sigmoid(x1) + (1 - alpha) * sigmoid(x2):
        print("PROBLEM")
    alpha += 0.01

(B) Visually it seems that the sigmoid's second derivative changes sign, i.e. its curvature changes.
@AlexanderIhler · 9 years ago
+Konstantin Burlachenko It does change its curvature -- but its curvature is always positive, hence it is convex.
@konstantinburlachenko2843 · 9 years ago
+Alexander Ihler I disagree with you.

(C)

#!/usr/bin/env python
import math

def sigmoid(x):
    # return x * x
    return 1.0 / (1 + math.exp(-x))

def sigmoidSecondDerivative(x):
    # central finite-difference estimate of the second derivative
    h = 0.01
    return (sigmoid(x - h) - 2 * sigmoid(x) + sigmoid(x + h)) / (h * h)

print(sigmoidSecondDerivative(-0.5))  # positive
print(sigmoidSecondDerivative(0.5))   # negative

(D) f'' changes sign at the zero point: mathworld.wolfram.com/SigmoidFunction.html
@AlexanderIhler · 9 years ago
+Konstantin Burlachenko I'm sorry -- are you arguing that the sigmoid function is non-convex (true), or that its log is non-convex (false)? For logistic regression we require that it be log-convex, due to the form of the log-likelihood loss function; this is what I was referring to. (Sorry for any confusion.) The convexity / non-convexity of the non-linearities are not so important, except as they contribute to non-convexity of the loss function.
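To connect this back to the finite-difference snippets above, here is a quick numeric sketch (my own addition, not from the original exchange): the sigmoid's second derivative changes sign at 0, but -log(sigmoid(x)), the term that actually appears in the logistic log-likelihood loss, has a positive second derivative everywhere, which is the convexity that matters for optimization.

# Sketch: -log(sigmoid(x)) is convex even though sigmoid(x) is not.
# Uses the same central finite-difference estimate as the snippets above.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def negLogSigmoid(x):
    return -math.log(sigmoid(x))

def secondDerivative(f, x, h=0.01):
    return (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h)

for x in [-5.0, -0.5, 0.0, 0.5, 5.0]:
    # sigmoid'' flips sign at 0; (-log sigmoid)'' stays positive
    print(x, secondDerivative(sigmoid, x), secondDerivative(negLogSigmoid, x))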
@Varuu · 8 years ago
wow thank you a lot!
@konstantinburlachenko2843 · 9 years ago
Thanks
@zsh3969 · 5 years ago
Hi Alexander, very nice work. I wrote the code in MATLAB but the results are not promising. Could you help me confirm whether the code is right or whether I am missing something? If you are willing to help, please send me your email and I will send you my code to check. Thanks