There is a textbook named Applied Logistic Regression. Mean squared error is actually one of several applicable cost functions for logistic regression. In this video, it is true that the cost of a wrong classification goes from 0.81 to 2.302. But if you observe that 0.81 / 0.0025 = 324 while 2.302 / 0.051 ≈ 45, the MSE actually shows a much larger gap between correct and wrong predictions.
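For anyone who wants to verify those numbers, here is a minimal sketch in Python, assuming y = 1 with a predicted probability of 0.95 for the correct case and 0.10 for the wrong case (which is what the four costs above imply):

    import numpy as np

    y = 1.0
    p_correct, p_wrong = 0.95, 0.10     # assumed predicted probabilities

    mse_correct = (y - p_correct) ** 2  # 0.0025
    mse_wrong = (y - p_wrong) ** 2      # 0.81
    ll_correct = -np.log(p_correct)     # ~0.051
    ll_wrong = -np.log(p_wrong)         # ~2.302

    print(mse_wrong / mse_correct)      # 324.0
    print(ll_wrong / ll_correct)        # ~44.9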
@rajeshwarsehdev2318 4 years ago
Why does logistic regression give probability values between (0, 1)?
@atahankocak4811 4 years ago
Thank you for this video. You made this simple. I have a couple of questions. There is something that does not make sense to me around 7:00. You said the log loss function penalizes misclassifications more than MSE does. Are you just comparing the misclassification costs of MSE and log loss? I may be wrong, but shouldn't we account for correct classifications as well? In other words, shouldn't we compare the ratio of misclassification cost to correct-classification cost in each method, instead of comparing the two misclassification costs to each other directly?
@scottaspeaks9531 5 years ago
This was really clear. I needed this explanation.
@dhananjaykansal8097 5 years ago
Thank you Bhaveshbhai. Keep it up!
@widowsson8192 5 years ago
Great video. At 6:55, do you sum the costs from each y(predicted) to get the final cost?
@bhattbhavesh91 5 years ago
Yes
@widowsson8192 5 years ago
So when doing regularization with an L2 penalty, is it the loss function + (1/C)(ŵ²)?
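For reference, the L2-regularized objective is commonly written as below, where J(w) is the unregularized log loss; the exact scaling of the penalty (for instance an extra factor of 1/2, or whether C multiplies the loss term instead) is a library convention such as scikit-learn's, so treat this as a sketch:

    J_{\text{reg}}(w) = J(w) + \frac{1}{C}\lVert w \rVert_2^2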
@dhananjaykansal8097 5 years ago
How did you arrive at 0.051 and 2.302? I didn't understand. Sorry, I'm a noob.
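Those two numbers drop out of the y = 1 branch of the cost, assuming the video's predicted probabilities of 0.95 (correct) and 0.10 (wrong):

    \text{cost}(\hat{y}) = -\ln(\hat{y}), \qquad -\ln(0.95) \approx 0.051, \qquad -\ln(0.10) \approx 2.302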
@gauravgtv5873 5 years ago
This is the exact explanation I was looking for. Thanks, bro.
@DipakDas-be6nl 7 months ago
Loved the video, easy explanation.
@shaharrefaelshoshany9442 4 years ago
You are a real Rockstar my friend, amazing stuff :))
@bhattbhavesh91 4 years ago
Thank you so much 😀
@kmnm9463 3 years ago
Hi Bhavesh, excellent video on Log Loss. I have a request. Computing log loss for binary classification (like 1 and 0 here) looks quite simple, but I guess most videos do not touch on log loss for multi-class classification. You would surely know of the Fashion MNIST dataset with its 10-class classification. I want to know how to derive the cost function for the Fashion MNIST dataset. Here the actual values will be 0 to 9, and for each observation our model (with softmax activation) predicts the output by providing a probability value for each of the classes 0-9; the model is supposed to predict the class with the maximum of those 10 probability values. So for an example observation, how do we calculate the loss? Please help. Regards, Krish
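The binary idea carries over directly to the multi-class case: with softmax outputs, the loss for one observation is the negative log of the probability the model assigns to the true class. A minimal sketch, with a made-up probability vector for illustration:

    import numpy as np

    # Hypothetical softmax output for one Fashion-MNIST image (10 classes, sums to 1)
    probs = np.array([0.02, 0.05, 0.01, 0.60, 0.08, 0.04, 0.10, 0.03, 0.05, 0.02])
    true_class = 3  # the actual label for this observation

    # Categorical cross-entropy for this single observation
    loss = -np.log(probs[true_class])
    print(loss)  # ~0.51; averaging this over all observations gives the cost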
@rajeshsomasundaram7299 5 years ago
This is amazing Bhavesh
@randyj3706 2 years ago
Thanks man great video 👍
@rajatmishra3572 4 years ago
Very Clear Explanation!!!
@bhattbhavesh91 4 years ago
Glad it was helpful!
@krishnab6444 2 years ago
Nicely explained, sir.
@swathys7818 5 years ago
Awesome. Thank you for the very clear explanation!!
@sakaar-lok9109 2 years ago
Great brother
@techbyteslearning5476 4 years ago
No surprise... again you proved yourself a gem.
@akshayjoshi3837 1 year ago
Hi Bhavesh, excellent video. I appreciate your efforts in making it. It's very helpful and clarified why we use the log loss function as the cost function for logistic regression. Can you also make a video on regularization in logistic regression?
@bhattbhavesh91 1 year ago
It's my pleasure
@pmanojkumar5260 5 years ago
Great work, bro. Subscribed!
@rvg296 4 years ago
Can you help me understand how you came up with the cost function equation?
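For reference, the binary cross-entropy cost the video builds up to is usually written as follows, where m is the number of training points and \hat{y}_i is the predicted probability; the y_i = 1 term penalizes small \hat{y}_i and the y_i = 0 term penalizes large \hat{y}_i:

    J(w) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right]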
@ashforwin 4 years ago
This is a very neat and clean explanation. Thanks for sharing.
@bhattbhavesh91 4 years ago
Thanks for watching!
@arindambhadra1461 4 years ago
This is a really great explanation, bro.
@bhattbhavesh91 4 years ago
Glad you liked it
@abhinabaghose3380 5 years ago
Which deep learning book did you mention for the explanation?
@楓善慶 5 years ago
Excellent explanation👍
@bhuwanjoshi3164 3 years ago
Thank you so much for this, bro.
@bhattbhavesh91 3 years ago
My pleasure!
@Pedritox0953 3 years ago
Great explanation!
@bhattbhavesh91 3 years ago
Glad you liked it!
@abhijeetpatil6634 5 years ago
Thanks a lot. Could you please make a series on SVM?
@VarunKumar-pz5si 4 years ago
Bro, how do you find the global minimum using gradient descent? Please make a video on it.
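As a sketch of the idea, here is plain gradient descent on the log loss for a tiny made-up one-feature dataset (the data and learning rate are illustrative):

    import numpy as np

    X = np.array([-2.0, -1.0, 0.5, 1.5, 2.0])  # toy feature values
    y = np.array([0, 0, 1, 1, 1])              # toy binary labels

    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(1000):
        p = 1 / (1 + np.exp(-(w * X + b)))  # sigmoid predictions
        w -= lr * np.mean((p - y) * X)      # gradient of mean log loss w.r.t. w
        b -= lr * np.mean(p - y)            # gradient of mean log loss w.r.t. b

    print(w, b)

Because the log loss is convex in w and b, the minimum that gradient descent settles into here is also the global one; there are no bad local minima to get stuck in.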
@akashkewar 5 years ago
Brilliant explanation.
@snay6869 1 year ago
Well explained.
@bhattbhavesh91 1 year ago
Thank you
@jongcheulkim7284 3 years ago
Thank you so much^^.
@bhattbhavesh91 3 years ago
You're welcome 😊
@tahsinchowdhury4807 5 years ago
Thanks for the explanation, really helped. :)
@reetikagour1203 5 years ago
Hi Bhavesh, I am using the log loss function like this: -np.mean(y * np.log(y_pred) + (1-y) * np.log(1-y_pred)), where I have 50k data points in X_train. What should be the shape of the output when I print the log loss? I'm confused because of the mean I used here. Could you please help?
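In case it helps: np.mean averages over all 50,000 points, so the printed log loss is a single scalar (shape ()). A runnable version of the same line, with made-up data and the usual clipping to avoid log(0):

    import numpy as np

    y = np.random.randint(0, 2, size=50_000)           # toy labels
    y_pred = np.random.uniform(0.0, 1.0, size=50_000)  # toy predicted probabilities

    eps = 1e-15
    y_pred = np.clip(y_pred, eps, 1 - eps)             # guard against log(0)
    log_loss = -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

    print(np.shape(log_loss))  # () -- one scalar for the whole dataset
    print(log_loss)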
@brijesh0808 4 years ago
Nice tips. I just wish you had used the pages more efficiently. Lots of paper wasted.
Great 👍 Can you suggest a book on mathematics for ML?
@SATISHKUMAR-qk2wq 5 years ago
Ptsp
@pranitbhisade3174 5 years ago
Thank you, sir.
@kakarnyori5457 4 years ago
Thank you!
@bhattbhavesh91 4 years ago
You're welcome!
@yodaiam5044 4 years ago
Like it, Yoda does. You, Yoda Thanks. :]
@kleymenovarkadiy2419 2 years ago
That same Indian guy from YouTube :)
@hobbeldibbel 4 years ago
Excellent explanation of the cost function, but you did not explain cross-entropy.
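For completeness, the cross-entropy between a true distribution p and a predicted distribution q is defined as below; the binary log loss in the video is exactly this with p and q taken as Bernoulli distributions over {0, 1}:

    H(p, q) = -\sum_{k} p_k \log q_k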
@gardeninglessons3949 3 years ago
Thank you.
@bhattbhavesh91 3 years ago
You are welcome :)
@JustMe-pt7bc 5 years ago
I didn't understand the Python part.
@Claymore_2904 5 years ago
Hey, for the cost function we take -log(h_θ(x)) if y = 1, but you have taken ln instead of log. Please confirm, because as per the cost function formula, log is correct.
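One quick way to settle the base question: the cost 2.302 quoted from the video only comes out with the natural log, which is the usual convention when ML texts write log. A two-line check:

    import numpy as np

    print(-np.log(0.10))    # 2.302... -> natural log matches the video
    print(-np.log10(0.10))  # 1.0     -> base-10 log does not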
@Thiyagadhayalan 5 years ago
No big animations, but still good.
@purelogic4533 5 years ago
Absolutely off from the real reason why cross-entropy is used as the loss function for logistic regression. Read up on the maximum likelihood estimator and you will have a better understanding and can remake your video. Good effort to educate. Keep it up.
@bhattbhavesh91 5 years ago
Hello! Thanks for your valuable input. The resources I referred to pointed out that MSE with a sigmoid gives a non-convex function, which is why we go for the cross-entropy/log loss function. This is the link that double-checks the above statement: stats.stackexchange.com/questions/251496/why-sum-of-squared-errors-for-logistic-regression-not-used-and-instead-maximum-l But yes, I am open to remaking the video if you can share a link!
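A minimal numeric sketch of that convexity claim, assuming a single training point with y = 1 and sweeping a one-dimensional weight (a convex curve never has a negative second difference):

    import numpy as np

    w = np.linspace(-6, 6, 201)        # weight values to sweep
    p = 1 / (1 + np.exp(-w))           # sigmoid prediction
    mse = (1 - p) ** 2                 # squared error for y = 1
    logloss = -np.log(p)               # log loss for y = 1

    print(np.any(np.diff(mse, 2) < -1e-9))      # True  -> MSE curve is non-convex
    print(np.any(np.diff(logloss, 2) < -1e-9))  # False -> log loss curve is convex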
@purelogic4533 5 years ago
@@bhattbhavesh91 The MSE is simply the cross-entropy between the actual distribution and a Gaussian distribution. Written as a negative log-likelihood, you can easily see that it is equivalent to MSE when you take the log of the Gaussian density. As such it is limiting: if you have any probabilistic model making predictions that are in general non-Gaussian (any non-linear regression), like logistic regression or a neural network function approximator, you should not use MSE, as it assumes the predicted distribution is Gaussian, which is wrong. So generally we compute the cross-entropy of the actual distribution on the predicted distribution and minimize it during optimization to calibrate the predicted distribution to optimally fit the actual data. This is the method of maximum likelihood estimation. Hope this explanation makes sense.
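To make that Gaussian connection concrete: taking the negative log of a Gaussian likelihood with fixed variance leaves squared error plus a constant, which is why minimizing MSE implicitly assumes a Gaussian predicted distribution:

    -\log \mathcal{N}(y \mid \hat{y}, \sigma^2) = \frac{(y - \hat{y})^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2)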
@shaharrefaelshoshany9442 4 years ago
@@purelogic4533 Hmm, it is way too advanced for me, my friend. Can you share a good resource?
@vaibhavpatel2278 4 years ago
What's wrong with your mic? You should really speak more loudly.