The derivative of log(x) (base 10) is 1/(ln(10)*x). Since ln(10) is just a constant factor, it doesn't spoil gradient descent; it only rescales the gradient, which the learning rate can absorb.
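A quick numerical check of that claim (a minimal sketch, assuming log here means log base 10; the value of x is arbitrary):

```python
import numpy as np

x = 2.5
h = 1e-6
# central-difference estimate of d/dx log10(x)
numeric = (np.log10(x + h) - np.log10(x - h)) / (2 * h)
# analytic derivative: 1 / (x * ln 10)
analytic = 1 / (x * np.log(10))
print(numeric, analytic)  # both ~0.1737; the 1/ln(10) factor only rescales the gradient
```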
@patite3103 (3 years ago)
You've done amazing work! I love your videos! I wish you could do some more math videos, for example about neural network backpropagation, the perceptron convergence theorem, Gaussian mixture models... At 8:39, shouldn't we add the negative sign to the NLL function?
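On the sign question, the general relationship (my notation, not necessarily the video's) is that maximizing the log-likelihood is the same problem as minimizing the negative log-likelihood; the minus sign just turns the max into a min:

```latex
\hat{\theta}
  = \arg\max_{\theta} \sum_{i=1}^{n} \log p(y_i \mid x_i, \theta)
  = \arg\min_{\theta} \Big( -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) \Big)
  \quad \text{(the right-hand objective is the NLL)}
```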
@DarshanSenTheComposer (5 years ago)
Having watched a couple of Prof. Andrew Ng's ML videos on Coursera, watching this video is pure satisfaction. I really liked the fact that you walked through the vectorized approaches as well. Isn't ML just a glorified form of Numerical Methods? Thank you. :)
@northsand (3 years ago)
This video saved my life
@kingkong792 (1 year ago)
love your content, thank you for taking the time to share your skills!
@danielrosas2240 (4 years ago)
I like this series of videos. Thank you so much!
@saminchowdhury7995 (5 years ago)
This is amazing, brother. You made me understand the maths; you're a G.
@mmm777ization (3 years ago)
As the probability of the data increases, the loss decreases @4:20
@dvdsct (3 years ago)
dude, AMAZING video, thanks!
@kennethleung4487 (3 years ago)
Fantastic work! Keep it up!
@forcedlevy (3 years ago)
Thank you Tobias Funke
@forcedlevy (3 years ago)
11:46 Shouldn't the update term of gradient descent be "theta = theta - alpha*gradient"? You seem to use addition instead of subtraction.
@kestonsmith1354 (3 years ago)
The gradient there is the gradient of the log-likelihood, which he's maximizing, so alpha*gradient is added (gradient ascent). Subtracting alpha times the gradient of the negative log-likelihood gives exactly the same update.
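A minimal sketch of the equivalence this exchange is about, assuming a logistic-regression setup like the one discussed (the toy data, variable names, and step size here are mine; the video's exact model and notation may differ): adding alpha times the log-likelihood gradient is the same update as subtracting alpha times the NLL gradient.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                        # toy design matrix
true_theta = np.array([1.0, -2.0, 0.5])
y = (rng.random(100) < sigmoid(X @ true_theta)).astype(float)

theta = np.zeros(3)
alpha = 0.1
for _ in range(2000):
    p = sigmoid(X @ theta)
    grad_ll = X.T @ (y - p) / len(y)                 # gradient of the (mean) log-likelihood
    theta = theta + alpha * grad_ll                  # gradient ASCENT: addition
    # equivalent descent form: theta = theta - alpha * (-grad_ll)

print(theta)  # roughly recovers true_theta
```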
@NajmehMousavi (3 years ago)
amazing and clear explanation, great work :)
@imed_rahmani (3 years ago)
Thank you so much, sir. I really appreciate your videos
@amarnathjagatap2339 (5 years ago)
Sir, make more videos on machine learning.
@95Bloulou (4 years ago)
Thank you! Great video in my opinion!
@rahuldeora5815 (5 years ago)
I had a question which has been troubling me for quite some time. When we take the derivative of the weight matrix times the output of its previous layer, we get a transposed matrix. I never understood how the transpose appeared. I searched a lot but could not find a clear explanation anywhere. Could you guide me towards where I can find it explained simply?
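Not the video's derivation, but one concrete way to see where the transpose comes from (a sketch for a single linear layer z = W x, with shapes and names chosen here for illustration): the chain rule dL/dx_j = sum_i (dL/dz_i) * W[i, j] is exactly a multiplication by W transpose.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3))    # layer weights, maps R^3 -> R^4
x = rng.normal(size=(3,))      # output of the previous layer
z = W @ x                      # forward pass, shape (4,)

delta = rng.normal(size=(4,))  # upstream gradient dL/dz, shape (4,)
dL_dx = W.T @ delta            # chain rule: sum over i of delta_i * W[i, j] -> W transpose
dL_dW = np.outer(delta, x)     # dL/dW[i, j] = delta_i * x_j

# finite-difference check of dL/dx for the toy loss L = delta . (W x)
eps = 1e-6
j = 0
x_pert = x.copy(); x_pert[j] += eps
numeric = (delta @ (W @ x_pert) - delta @ (W @ x)) / eps
print(numeric, dL_dx[j])       # these agree, confirming the transpose
```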
@ripsirwin1 (3 years ago)
It kinda blows my mind that the loss function is independent of the model. Is this correct?
@danishnawaz7869 (5 years ago)
Challenging you to make me understand the maths of gradient boosting algorithm. 😝
You said to add a negative sign to the l function, and then proceeded to leave the functional expression exactly the same. I do not see where you added the negative sign.
@KoKi-nx1xo (4 years ago)
Why is theta transposed?
@animeshsharma7332 (3 years ago)
Just to make the two compatible for matrix multiplication. Theta has shape (1 x d) and X also has shape (1 x d), so they can't be multiplied as they are; transposing theta changes its shape to (d x 1), which makes the product of X (1 x d) and theta^T (d x 1) well defined, giving a (1 x 1) scalar, i.e. the dot product.
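A quick shape check of that answer (a sketch using the row-vector convention from the reply above; the video may instead use column vectors and write theta.T @ x, but the idea is the same):

```python
import numpy as np

d = 4
theta = np.ones((1, d))                        # parameters as a (1 x d) row vector
x = np.arange(d, dtype=float).reshape(1, d)    # one example as a (1 x d) row vector

# theta @ x would fail: (1 x d) @ (1 x d) inner dimensions don't match
z = x @ theta.T                                # (1 x d) @ (d x 1) -> (1 x 1), the dot product
print(z.shape, z.item())                       # (1, 1) 6.0
```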