There is a textbook named Applied Logistic Regression. Mean squared error is actually one of several applicable cost functions for logistic regression. In this video, it is true that the cost of a wrong classification goes from 0.81 to 2.302. But if you observe that 0.81 / 0.0025 = 324 while 2.302 / 0.051 ≈ 45, the MSE actually shows a much larger gap between correct and wrong predictions.
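For anyone who wants to verify those numbers, here is a minimal sketch in Python, assuming y = 1 with a predicted probability of 0.95 for the correct case and 0.10 for the wrong case (which is what the four costs above imply):

    import numpy as np

    y = 1.0
    p_correct, p_wrong = 0.95, 0.10     # assumed predicted probabilities

    mse_correct = (y - p_correct) ** 2  # 0.0025
    mse_wrong = (y - p_wrong) ** 2      # 0.81
    ll_correct = -np.log(p_correct)     # ~0.051
    ll_wrong = -np.log(p_wrong)         # ~2.302

    print(mse_wrong / mse_correct)      # 324.0
    print(ll_wrong / ll_correct)        # ~44.9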
@rajeshwarsehdev2318 4 years ago
Why does logistic regression give probability values between (0, 1)?
@atahankocak4811 4 years ago
Thank you for this video. You made this simple. I have a couple of questions. There is something that does not make sense to me around 7:00. You said the log loss function penalizes misclassifications more than MSE does. Are you just comparing the misclassification costs of MSE and log loss? I may be wrong, but shouldn't we account for correct classifications as well? In other words, shouldn't we compare the ratio of misclassification cost to correct-classification cost in each method, instead of comparing the two misclassification costs to each other directly?
@scottaspeaks9531 5 years ago
This was really clear. I needed this explanation.
@dhananjaykansal8097 5 years ago
Thank you Bhaveshbhai. Keep it up!
@widowsson8192 5 years ago
Great video. At 6:55, do you sum the costs from each y(predicted) to get the final cost?
@bhattbhavesh91 5 years ago
Yes
@widowsson8192 5 years ago
So when doing regularization with an L2 penalty, is it the loss function + (1/C)(ŵ²)?
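For reference, the L2-regularized objective is commonly written as below, where J(w) is the unregularized log loss; the exact scaling of the penalty (for instance an extra factor of 1/2, or whether C multiplies the loss term instead) is a library convention such as scikit-learn's, so treat this as a sketch:

    J_{\text{reg}}(w) = J(w) + \frac{1}{C}\lVert w \rVert_2^2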
@dhananjaykansal8097 5 years ago
How did you arrive at 0.051 and 2.302? I didn't understand. Sorry, I'm a noob.
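Those two numbers drop out of the y = 1 branch of the cost, assuming the video's predicted probabilities of 0.95 (correct) and 0.10 (wrong):

    \text{cost}(\hat{y}) = -\ln(\hat{y}), \qquad -\ln(0.95) \approx 0.051, \qquad -\ln(0.10) \approx 2.302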
@gauravgtv5873 5 years ago
This is the exact explanation I was looking for. Thanks, bro.
@DipakDas-be6nl 7 months ago
Loved the video, easy explanation.
@shaharrefaelshoshany9442 4 years ago
You are a real Rockstar my friend, amazing stuff :))
@bhattbhavesh91 4 years ago
Thank you so much 😀
@kmnm9463 3 years ago
Hi Bhavesh, excellent video on Log Loss. I have a request. Computing log loss for binary classification (like 1 and 0 here) looks quite simple, but I guess most videos do not touch on log loss for multi-class classification. You would surely know of the Fashion MNIST dataset with its 10-class classification. I want to know how to derive the cost function for the Fashion MNIST dataset. Here the actual values will be 0 to 9, and for each observation our model (with softmax activation) predicts the output by providing a probability value for each of the classes 0-9; the model is supposed to predict the class with the maximum of those 10 probability values. So for an example observation, how do we calculate the loss? Please help. Regards, Krish
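The binary idea carries over directly to the multi-class case: with softmax outputs, the loss for one observation is the negative log of the probability the model assigns to the true class. A minimal sketch, with a made-up probability vector for illustration:

    import numpy as np

    # Hypothetical softmax output for one Fashion-MNIST image (10 classes, sums to 1)
    probs = np.array([0.02, 0.05, 0.01, 0.60, 0.08, 0.04, 0.10, 0.03, 0.05, 0.02])
    true_class = 3  # the actual label for this observation

    # Categorical cross-entropy for this single observation
    loss = -np.log(probs[true_class])
    print(loss)  # ~0.51; averaging this over all observations gives the cost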
@rajeshsomasundaram7299 5 years ago
This is amazing Bhavesh
@randyj3706 2 years ago
Thanks man great video 👍
@rajatmishra3572 4 years ago
Very Clear Explanation!!!
@bhattbhavesh91 4 years ago
Glad it was helpful!
@krishnab6444 2 years ago
Nicely explained, sir.
@swathys7818 5 years ago
Awesome. Thank you for the very clear explanation!!
@sakaar-lok9109 2 years ago
Great brother
@techbyteslearning5476 4 years ago
No surprise... again you proved yourself a gem.
@akshayjoshi3837 1 year ago
Hi Bhavesh, excellent video. I appreciate your efforts in making it. It's very helpful and clarified why we use the log loss function as the cost function for logistic regression. Can you also make a video on regularization in logistic regression?
@bhattbhavesh91 1 year ago
It's my pleasure
@pmanojkumar5260 5 years ago
Great work, bro. Subscribed!
@rvg296 4 years ago
Can you help me understand how you came up with the cost function equation?
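For reference, the binary cross-entropy cost the video builds up to is usually written as follows, where m is the number of training points and \hat{y}_i is the predicted probability; the y_i = 1 term penalizes small \hat{y}_i and the y_i = 0 term penalizes large \hat{y}_i:

    J(w) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right]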
@ashforwin 4 years ago
This is a very neat and clean explanation. Thanks for sharing.
@bhattbhavesh91 4 years ago
Thanks for watching!
@arindambhadra1461 4 years ago
This is a really great explanation, bro.
@bhattbhavesh91 4 years ago
Glad you liked it
@abhinabaghose3380 5 years ago
Which deep learning book did you mention for the explanation?
@楓善慶 5 years ago
Excellent explanation👍
@bhuwanjoshi3164 3 years ago
Thank you so much for this, bro.
@bhattbhavesh91 3 years ago
My pleasure!
@Pedritox0953 3 years ago
Great explanation!
@bhattbhavesh91 3 years ago
Glad you liked it!
@abhijeetpatil6634 5 years ago
Thanks a lot. Could you please make a series on SVM?
@VarunKumar-pz5si 4 years ago
Bro, how do you find the global minimum using gradient descent? Please make a video on it.
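As a sketch of the idea, here is plain gradient descent on the log loss for a tiny made-up one-feature dataset (the data and learning rate are illustrative):

    import numpy as np

    X = np.array([-2.0, -1.0, 0.5, 1.5, 2.0])  # toy feature values
    y = np.array([0, 0, 1, 1, 1])              # toy binary labels

    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(1000):
        p = 1 / (1 + np.exp(-(w * X + b)))  # sigmoid predictions
        w -= lr * np.mean((p - y) * X)      # gradient of mean log loss w.r.t. w
        b -= lr * np.mean(p - y)            # gradient of mean log loss w.r.t. b

    print(w, b)

Because the log loss is convex in w and b, the minimum that gradient descent settles into here is also the global one; there are no bad local minima to get stuck in.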
@akashkewar 5 years ago
Brilliant explanation.
@snay6869 1 year ago
Well explained.
@bhattbhavesh91 1 year ago
Thank you
@jongcheulkim7284 3 years ago
Thank you so much^^.
@bhattbhavesh91 3 years ago
You're welcome 😊
@tahsinchowdhury4807 5 years ago
Thanks for the explanation, really helped. :)
@reetikagour1203 5 years ago
Hi Bhavesh, I am using the log loss function like this: -np.mean(y * np.log(y_pred) + (1-y) * np.log(1-y_pred)), where I have 50k data points in X_train. What should be the shape of the output when I print the log loss? I'm confused because of the mean I used here. Could you please help?
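In case it helps: np.mean averages over all 50,000 points, so the printed log loss is a single scalar (shape ()). A runnable version of the same line, with made-up data and the usual clipping to avoid log(0):

    import numpy as np

    y = np.random.randint(0, 2, size=50_000)           # toy labels
    y_pred = np.random.uniform(0.0, 1.0, size=50_000)  # toy predicted probabilities

    eps = 1e-15
    y_pred = np.clip(y_pred, eps, 1 - eps)             # guard against log(0)
    log_loss = -np.mean(y * np.log(y_pred) + (1 - y) * np.log(1 - y_pred))

    print(np.shape(log_loss))  # () -- one scalar for the whole dataset
    print(log_loss)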
@brijesh0808 4 years ago
Nice tips. I just wish you had used the pages more efficiently. Lots of paper wasted.
Great 👍 Can you suggest a book on mathematics for ML?
@SATISHKUMAR-qk2wq 5 years ago
Ptsp
@pranitbhisade3174 5 years ago
Thank you, sir.
@kakarnyori5457 4 years ago
Thank you!
@bhattbhavesh91 4 years ago
You're welcome!
@yodaiam5044 4 years ago
Like it, Yoda does. You, Yoda Thanks. :]
@kleymenovarkadiy2419 2 years ago
That same Indian guy from YouTube :)
@hobbeldibbel 4 years ago
Excellent explanation of the cost function, but you did not explain cross-entropy.
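For completeness, the cross-entropy between a true distribution p and a predicted distribution q is defined as below; the binary log loss in the video is exactly this with p and q taken as Bernoulli distributions over {0, 1}:

    H(p, q) = -\sum_{k} p_k \log q_k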
@gardeninglessons3949 3 years ago
Thank you.
@bhattbhavesh91 3 years ago
You are welcome :)
@JustMe-pt7bc 5 years ago
I didn't understand the Python part.
@Claymore_2904 5 years ago
Hey, for the cost function we take -log(h_θ(x)) if y = 1, but you have taken ln instead of log. Please confirm, because as per the cost function formula, log is correct.
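One quick way to settle the base question: the cost 2.302 quoted from the video only comes out with the natural log, which is the usual convention when ML texts write log. A two-line check:

    import numpy as np

    print(-np.log(0.10))    # 2.302... -> natural log matches the video
    print(-np.log10(0.10))  # 1.0     -> base-10 log does not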
@Thiyagadhayalan 5 years ago
No big animations, but still good.
@purelogic4533 5 years ago
Absolutely off from the real reason why cross-entropy is used as the loss function for logistic regression. Read up on the maximum likelihood estimator and you will have a better understanding and can remake your video. Good effort to educate. Keep it up.
@bhattbhavesh91 5 years ago
Hello! Thanks for your valuable input. The resources I referred to pointed out that MSE with a sigmoid gives a non-convex function, which is why we go for the cross-entropy/log loss function. This is the link that double-checks the above statement: stats.stackexchange.com/questions/251496/why-sum-of-squared-errors-for-logistic-regression-not-used-and-instead-maximum-l But yes, I am open to remaking the video if you can share a link!
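A minimal numeric sketch of that convexity claim, assuming a single training point with y = 1 and sweeping a one-dimensional weight (a convex curve never has a negative second difference):

    import numpy as np

    w = np.linspace(-6, 6, 201)        # weight values to sweep
    p = 1 / (1 + np.exp(-w))           # sigmoid prediction
    mse = (1 - p) ** 2                 # squared error for y = 1
    logloss = -np.log(p)               # log loss for y = 1

    print(np.any(np.diff(mse, 2) < -1e-9))      # True  -> MSE curve is non-convex
    print(np.any(np.diff(logloss, 2) < -1e-9))  # False -> log loss curve is convex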
@purelogic4533 5 years ago
@@bhattbhavesh91 The MSE is simply the cross-entropy between the actual distribution and a Gaussian distribution. Written as a negative log-likelihood, you can easily see that it is equivalent to MSE when you take the log of the Gaussian density. As such it is limiting: if you have any probabilistic model making predictions that are in general non-Gaussian (any non-linear regression), like logistic regression or a neural network function approximator, you should not use MSE, as it assumes the predicted distribution is Gaussian, which is wrong. So generally we compute the cross-entropy of the actual distribution on the predicted distribution and minimize it during optimization to calibrate the predicted distribution to optimally fit the actual data. This is the method of maximum likelihood estimation. Hope this explanation makes sense.
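To make that Gaussian connection concrete: taking the negative log of a Gaussian likelihood with fixed variance leaves squared error plus a constant, which is why minimizing MSE implicitly assumes a Gaussian predicted distribution:

    -\log \mathcal{N}(y \mid \hat{y}, \sigma^2) = \frac{(y - \hat{y})^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2)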
@shaharrefaelshoshany9442 4 years ago
@@purelogic4533 Hmm, it is way too advanced for me, my friend. Can you share a good resource?
@vaibhavpatel2278 4 years ago
What's wrong with your mic? You should really speak more loudly.