Neural Networks Part 6: Cross Entropy

225,110 views

StatQuest with Josh Starmer

1 day ago

When a Neural Network is used for classification, we usually evaluate how well it fits the data with Cross Entropy. This StatQuest gives you an overview of how to calculate Cross Entropy and Total Cross Entropy (a short code sketch of the calculation follows the chapter timestamps below).
NOTE: This StatQuest assumes that you are already familiar with...
The main ideas behind neural networks: • The Essential Main Ide...
The main ideas behind backpropagation: • Neural Networks Pt. 2:...
Neural networks with multiple inputs and outputs: • Neural Networks Pt. 4:...
ArgMax and SoftMax: • Neural Networks Part 5...
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying my book, The StatQuest Illustrated Guide to Machine Learning:
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
KZbin Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
0:00 Awesome song and introduction
1:48 Cross Entropy defined
2:50 General equation for Cross Entropy
4:11 Calculating Total Cross Entropy
5:41 Why Cross Entropy and not SSR?
#StatQuest #NeuralNetworks #CrossEntropy
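For anyone who wants to follow the calculation in code, here is a minimal sketch. It assumes the natural log (as most implementations use) and borrows the example SoftMax probabilities discussed in the comments below (0.57, 0.58 and 0.52 for the three training rows); the NumPy usage and variable names are just illustrative, not anything taken from the video itself.

```python
import numpy as np

# SoftMax output for the first training row (a Setosa sample):
# [p(Setosa), p(Versicolor), p(Virginica)]
predicted = np.array([0.57, 0.20, 0.23])
observed = np.array([1.0, 0.0, 0.0])  # one-hot: we know this row really is Setosa

# Cross Entropy for one row: -sum(observed * log(predicted)).
# With a one-hot observed vector this collapses to -log(p of the true class).
row_cross_entropy = -np.sum(observed * np.log(predicted))
print(row_cross_entropy)  # ~0.56

# Total Cross Entropy: add up the per-row values. Here 0.57, 0.58 and 0.52 are the
# probabilities the network assigned to the correct species of each training row.
total_cross_entropy = sum(-np.log(p) for p in [0.57, 0.58, 0.52])
print(total_cross_entropy)  # ~1.76
```

In practice, deep learning libraries usually fold the SoftMax and the log into a single, numerically safer loss function, so you rarely write this by hand.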

Comments: 261
@statquest
@statquest 2 жыл бұрын
The full Neural Networks playlist, from the basics to deep learning, is here: kzbin.info/www/bejne/eaKyl5xqZrGZetk Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
@NStewF
@NStewF 3 жыл бұрын
Josh once again demonstrates his amazing ability to simplify complicated topics into elemental concepts that can be easily understood. BAM!
@statquest
@statquest 3 жыл бұрын
Thanks Neil!
@gregk.2466
@gregk.2466 11 ай бұрын
@@statquest it is amazing ! thank you SQ!
@meeseeks1489
@meeseeks1489 9 ай бұрын
What is this guy made of??? what does he eat??? Are you a God?? An alien?? You are so smart and dope man!!! How do you do all this? He should be a lecturer at MIT! SO underrated content💞💞💞💞💞💞
@statquest
@statquest 9 ай бұрын
Wow, thanks!
@akaabdullahmateen9983
@akaabdullahmateen9983 3 ай бұрын
@@statquest OMEGA BAAM!
@AlexandrSarioglo
@AlexandrSarioglo 5 ай бұрын
I’ve been asking about the reasons behind using the logarithm to calculate the loss for so long, and no one could explain it well enough to develop intuition about it. This did it. Thank You!
@statquest
@statquest 5 ай бұрын
Thanks!
@NJCLM
@NJCLM 12 күн бұрын
Your videos are the best for fundamental knowledge of ML/AI. I've been working on transformers for 4 months, and I come back very often for the fundamentals. THANKS Josh !!
@statquest
@statquest 11 күн бұрын
Thank you very much!
@sattanathasiva8080
@sattanathasiva8080 2 жыл бұрын
Happy teacher's day, from India. It's teacher's day today in India. Thanks for all your teaching
@statquest
@statquest 2 жыл бұрын
Thank you very much!
@josyulaprashanth2976
@josyulaprashanth2976 10 ай бұрын
I just can’t believe how you opened my eyes. How can you be so awesome 👌👌. Sharing this knowledge for free is amazing.
@statquest
@statquest 10 ай бұрын
Happy to help!
@somanshbudhwar
@somanshbudhwar 3 жыл бұрын
This is a life saver! Thank you so much again and again. Love your simple and elegant explanations.
@statquest
@statquest 3 жыл бұрын
Thank you very much! :)
@mukulbarai1441
@mukulbarai1441 Жыл бұрын
I was sure I would need to watch several videos to grasp this concept. OMG!! You have explained it so intuitively. Thanks a lot for saving my time and energy.
@statquest
@statquest Жыл бұрын
Thank you! :)
@c3realpt
@c3realpt 2 жыл бұрын
So refreshing and so different from the mathematical riddles that are used in university to teach us this stuff. Thank you!
@statquest
@statquest 2 жыл бұрын
You're very welcome!
@vusalaalakbarova7378
@vusalaalakbarova7378 2 жыл бұрын
Another fantastic video. You make these topics so straightforward to understand, while most lecturers overcomplicate them by writing down unnecessarily long formulas and just showing off their knowledge. Thanks a lot!
@statquest
@statquest 2 жыл бұрын
Glad it was helpful!
@vijaykumarlokhande1607
@vijaykumarlokhande1607 3 жыл бұрын
Whenever I 'wonder' while watching statquest, josh tells me the solution just after:)
@statquest
@statquest 3 жыл бұрын
bam! :)
@shivangisingh1190
@shivangisingh1190 2 жыл бұрын
Love each and every video by StatQuest. Thank you Josh and team for providing such clear, easy-to-digest concepts with a bonus of fun and entertainment. Quadruple BAM!!!
@statquest
@statquest 2 жыл бұрын
Thank you!
@RaynerGS
@RaynerGS 2 жыл бұрын
I admire this professor a lot. I hope one day to be a good teacher like you. Salute from Brazil. In my classes, I try to do the same: take a subject and make it as easy as possible.
@statquest
@statquest 2 жыл бұрын
Muito obrigado!
@Noah-zp2fn
@Noah-zp2fn 2 жыл бұрын
This really clarified so many of the concepts in this topic! I always wondered what the purpose of cross entropy is when we can use other loss functions like mean squared error! Thank you so much!
@statquest
@statquest 2 жыл бұрын
Awesome! I'm glad the video was helpful.
@sanchibharti858
@sanchibharti858 Жыл бұрын
What an amazing video. Never found any content or video better than this one anywhere on this topic. Thank you so much.
@statquest
@statquest Жыл бұрын
Glad it was helpful!
@neoklismimidis6403
@neoklismimidis6403 2 жыл бұрын
Hello Josh! I have to say WOW!! I love every single of your videos!! They are so educational. I recently started studying ML for my master's degree and from the moment I found your channel ALL my questions that I wonder get answered! Also, I noticed that u reply to every post in the comment section. I am astonished.. no words. A true professor! Thanks for everything! Thank you for being a wonderful teacher.
@statquest
@statquest 2 жыл бұрын
Thank you very much! :)
@sandeepganage9717
@sandeepganage9717 2 жыл бұрын
This is the shortest and the easiest explanation. Excellent job Josh!
@statquest
@statquest 2 жыл бұрын
Awesome, thank you!
@SushilKumar-dr9rj
@SushilKumar-dr9rj 2 жыл бұрын
Josh, you are a savior, man. I cannot emphasize this enough. I would have given up on understanding these concepts long ago had you not made these videos.
@statquest
@statquest 2 жыл бұрын
Glad they are helpful! :)
@fndpires
@fndpires 2 жыл бұрын
He just cant stop getting better, THANK YOU MA MAN!
@statquest
@statquest 2 жыл бұрын
Thanks!
@scarlaticious
@scarlaticious 4 ай бұрын
What an awesome channel. I've been learning and using ML for years and still these videos help me build intuition around basic concepts that I realize I never had. Also, love the songs and the BAMs. Thank you!
@statquest
@statquest 4 ай бұрын
Thank you very much!
@LH-wc5vz
@LH-wc5vz 2 жыл бұрын
This video helps a lot! The explanation is brief and clear.
@statquest
@statquest 2 жыл бұрын
Thank you!
@shandou5276
@shandou5276 3 жыл бұрын
The softmax fuzzy bear cracks me up so much :D Fantastic video Josh!
@statquest
@statquest 3 жыл бұрын
Thanks! 😀
@supersnowva6717
@supersnowva6717 9 ай бұрын
I would not be able to understand how neural networks fundamentally work without this series. Thank you so much Josh! Amazing and clear explanations!
@statquest
@statquest 9 ай бұрын
Happy to help!
@tiago9617
@tiago9617 2 жыл бұрын
The hero we wanted, and the hero we needed, StatQuest...
@statquest
@statquest 2 жыл бұрын
Thanks!
@MLLearner-sb1ds
@MLLearner-sb1ds 4 ай бұрын
Thank you sooo much I have a masters degree in CS and this is substantially better than anything I learnt in college, I understand it at an intuitive level. Thank you sooo much!!
@statquest
@statquest 4 ай бұрын
Thank you!
@pabloxdark
@pabloxdark 8 ай бұрын
I am currently starting my bachelor's thesis on particle physics and I was told that a big part of it consists of running a neural network with PyTorch. Your videos are really, really useful, and thanks to you I at least have a vague idea of how a NN works. Looking forward to watching the rest of your Neural Networks videos!! TRIPLE BAM!!
@statquest
@statquest 8 ай бұрын
Awesome! And good luck with your thesis!
@jiesun31
@jiesun31 2 жыл бұрын
Love your videos, they are so intuitive!
@statquest
@statquest 2 жыл бұрын
Thanks!
@gwinnifer4609
@gwinnifer4609 Жыл бұрын
Hey Josh! the way you teach is incredible. THANKS A LOT!❤
@statquest
@statquest Жыл бұрын
Thank you! 😃
@salutoitoi
@salutoitoi 3 жыл бұрын
It is on time. I am actually using tensorflow for an image classifier, thank you for your video :)
@statquest
@statquest 3 жыл бұрын
Glad it was helpful!
@iliasaarab7922
@iliasaarab7922 3 жыл бұрын
Amazing explanation as always! BAM!
@statquest
@statquest 3 жыл бұрын
:)
@andreasapostolatos6624
@andreasapostolatos6624 2 жыл бұрын
That is an outstanding teaching video, thank you tons!
@statquest
@statquest 2 жыл бұрын
BAM! :)
@gbchrs
@gbchrs 2 жыл бұрын
amazing work, can't wait to start on the book once I finish all your videos
@statquest
@statquest 2 жыл бұрын
BAM! :)
@AhmedKhaliet
@AhmedKhaliet Жыл бұрын
Wow, I feel that saying thank you is nothing compared with what you do! Very impressive❤❤
@statquest
@statquest Жыл бұрын
Thank you!
@user-xn7ot7ij7d
@user-xn7ot7ij7d Жыл бұрын
Your video makes my mind triple BAM!!
@statquest
@statquest Жыл бұрын
HOORAY! :)
@elielberra2867
@elielberra2867 Жыл бұрын
Thank you so much! Your videos are always very clear and easy to follow :)
@statquest
@statquest Жыл бұрын
Glad you like them!
@WIFI-nf4tg
@WIFI-nf4tg 3 жыл бұрын
Thank you for saving so much of my time. There are so many blogs on NN that I have wasted so many hours and days on across various topics, then I found your channel. Thank God for that.
@statquest
@statquest 3 жыл бұрын
Glad I could help!
@amitanand7534
@amitanand7534 3 жыл бұрын
That wasn't shameless self-promotion. That was selfless giving.
@statquest
@statquest 3 жыл бұрын
Thank you! :)
@shima_ce
@shima_ce 21 күн бұрын
Great Explanation! Thank you!
@statquest
@statquest 21 күн бұрын
Thank you!
@nathanzorndorf8214
@nathanzorndorf8214 Жыл бұрын
Thank you. Another complicated topic made simple. !!!!
@statquest
@statquest Жыл бұрын
bam! :)
@petercourt
@petercourt Жыл бұрын
Really well explained! Thanks Josh :)
@statquest
@statquest Жыл бұрын
Thank you!
@hashbrwn1339
@hashbrwn1339 Жыл бұрын
Really good explanation. The difference between squared error and cross entropy is very well explained.
@statquest
@statquest Жыл бұрын
Thank you very much!
@alberteinstein4547
@alberteinstein4547 3 жыл бұрын
This video is godlike. Thank you.
@statquest
@statquest 3 жыл бұрын
:)
@satyamgupta4808
@satyamgupta4808 9 ай бұрын
very very intuitive and very great explanation
@statquest
@statquest 9 ай бұрын
Glad you liked it!
@hadizand3753
@hadizand3753 3 жыл бұрын
Excellent teaching skill
@statquest
@statquest 3 жыл бұрын
Thank you! :)
@dengzhonghan5125
@dengzhonghan5125 3 жыл бұрын
great video as always.
@statquest
@statquest 3 жыл бұрын
Thank you!
@phattailam9814
@phattailam9814 Жыл бұрын
thank you so much for your explanation!
@statquest
@statquest Жыл бұрын
Glad it was helpful!
@lindaaa3299
@lindaaa3299 2 жыл бұрын
I love this way of learning!
@statquest
@statquest 2 жыл бұрын
Hooray! :)
@derbaur2330
@derbaur2330 3 жыл бұрын
this is just the best channel in youtube!
@statquest
@statquest 3 жыл бұрын
Thank you very much! :)
@Pedritox0953
@Pedritox0953 2 жыл бұрын
Wonderful explanation
@statquest
@statquest 2 жыл бұрын
Thank you! :)
@xxxiu13
@xxxiu13 7 ай бұрын
Great explanation in an entertaining way. Bam!
@statquest
@statquest 7 ай бұрын
Glad you liked it!
@skippy1234459
@skippy1234459 2 жыл бұрын
This is great! Thank you!!!
@statquest
@statquest 2 жыл бұрын
Glad you liked it!
@datasciencefreeedu4066
@datasciencefreeedu4066 3 жыл бұрын
Josh, your videos are amazing, great work 👏 ❤
@statquest
@statquest 3 жыл бұрын
Thank you so much 😀
@grabbenskarve3518
@grabbenskarve3518 2 жыл бұрын
I found my treasure. So great!
@statquest
@statquest 2 жыл бұрын
Thanks!
@user-se8id1xv6x
@user-se8id1xv6x 4 ай бұрын
love your songs!
@statquest
@statquest 4 ай бұрын
Thank you!
@jacobcrowley8207
@jacobcrowley8207 6 ай бұрын
Thank you, this makes sense now.
@statquest
@statquest 6 ай бұрын
Happy to help!
@Piccadilly_
@Piccadilly_ 10 ай бұрын
Thank you for this video! It and others helped me pass my exam! :D
@statquest
@statquest 10 ай бұрын
TRIPLE BAM!!! Glad it helped!
@PrasadHonavar
@PrasadHonavar Жыл бұрын
This is an amazing video
@statquest
@statquest Жыл бұрын
Thanks!
@user-sd4fi5em1g
@user-sd4fi5em1g Жыл бұрын
Just wow, thumbs up, great explanation sir
@statquest
@statquest Жыл бұрын
Thanks!
@user-sd4fi5em1g
@user-sd4fi5em1g Жыл бұрын
@@statquest Just bought the PDF version of your book 'The StatQuest Illustrated Guide to Machine Learning'....really excited
@statquest
@statquest Жыл бұрын
@@user-sd4fi5em1g Hooray!!! Thank you so much for your support! I hope you enjoy it!
@user-sd4fi5em1g
@user-sd4fi5em1g Жыл бұрын
@@statquest you are welcome, I am doing PhD so yes I hope it helps me build a good understanding of machine learning
@pauledam2174
@pauledam2174 4 ай бұрын
as always a truly wonderful presentation. It could be good to do the KL Divergence first and then explain that minimizing the KL divergence results in minimizing the cross entropy.
@statquest
@statquest 4 ай бұрын
I'll keep that in mind.
@user-bz8nm6eb6g
@user-bz8nm6eb6g 3 жыл бұрын
Thank you so much~!
@statquest
@statquest 3 жыл бұрын
:)
@CaptainBravo87
@CaptainBravo87 2 жыл бұрын
Pure Brilliance
@statquest
@statquest 2 жыл бұрын
Thanks!
@palsshin
@palsshin 2 жыл бұрын
you nailed it!
@statquest
@statquest 2 жыл бұрын
Thanks!
@adilahnafmugdho8132
@adilahnafmugdho8132 2 жыл бұрын
Thank you sir
@statquest
@statquest 2 жыл бұрын
You're welcome! :)
@ericdarcy
@ericdarcy 3 жыл бұрын
Thanks a lot
@statquest
@statquest 3 жыл бұрын
:)
@muhammadumarsotvoldiev8768
@muhammadumarsotvoldiev8768 Жыл бұрын
It's amazing !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@statquest
@statquest Жыл бұрын
Thanks!
@ferminbereciartua6432
@ferminbereciartua6432 6 ай бұрын
you rock josh!!
@statquest
@statquest 6 ай бұрын
Thanks!
@shishi1826
@shishi1826 7 ай бұрын
I'm in a stats PhD program and we had a guest speaker last week. During lunch the speaker asked us which course we liked most at our school, and one of my classmates said, actually, he likes the StatQuest "course" the most. And I was nodding my head 100 times per minute. We discussed why US universities hire professors who are good at research but not professors who are good at teaching, and why there are no tenure-track teaching positions... The US education system really needs to change
@statquest
@statquest 7 ай бұрын
Thank you! And that's a good question. One day I'd like to teach a real course somewhere. I love making videos, and want to do it forever, but it's also very lonely, and maybe teaching in person would change that.
@exoticcoder5365
@exoticcoder5365 7 ай бұрын
@@statquest I hope you don't feel lonely, as I value your feelings and I am very willing to give you feedbacks, I think video is a very gorgeous format, we can pause and ponder, we can revisit when we forget, for example now, It's my 2nd time to rewatch this video after months to solidify my memory ! Thank you Josh ! I love you so much !
@statquest
@statquest 7 ай бұрын
@@exoticcoder5365 Thank you!
@leozheng370
@leozheng370 2 жыл бұрын
I can imagine it takes a lot of time to make these videos. Thanks for all the effort! The cross entropy function is convex while the squared error is not, due to the logit function in softmax.
@statquest
@statquest 2 жыл бұрын
Thank you! I'm not sure it is correct to say that the squared error is not convex - it's that just over the range, from 0 to 1, it doesn't do much.
@leozheng370
@leozheng370 2 жыл бұрын
@@statquest kzbin.info/www/bejne/fnq0naB-ra5pY5I
@leozheng370
@leozheng370 2 жыл бұрын
@@statquest in Applied Logistic Regression, actually MSE is one of several applicable cost function for LR.
@dsazz801
@dsazz801 Жыл бұрын
2:30 wtf just happened 😂 I enjoy watching your videos. Thank you for the great explanations!!
@statquest
@statquest Жыл бұрын
:)
@satya8411
@satya8411 2 жыл бұрын
Kudos!!!! 🙌🏻 BAM!!!!!
@statquest
@statquest 2 жыл бұрын
Thanks!
@tonirbaena1
@tonirbaena1 3 жыл бұрын
Thanks! One question, in 5:40 you measure the total cross entropy as the sum of the 'train' set cross entropy. Could it be bias in unbalanced datasets? Do you recommend this method in these datasets? Thanks again
@statquest
@statquest 3 жыл бұрын
I'm not sure I understand your question. Can you clarify it?
@p-niddy
@p-niddy 2 жыл бұрын
My friend, the news is good.
@statquest
@statquest 2 жыл бұрын
bam!
@Noah_R98292
@Noah_R98292 2 ай бұрын
I am a student at Merton College, Oxford University. Please consider visiting our university sometime. Thank you for your absolutely brilliant content.
@statquest
@statquest 2 ай бұрын
I would love to! Feel free to put me in touch with anyone who could make it happen.
@yoonchaena671
@yoonchaena671 2 жыл бұрын
Baam!!!! Thank you~!!!!!!!! It is so clear~!
@statquest
@statquest 2 жыл бұрын
bam!
@tamtran2274
@tamtran2274 2 жыл бұрын
love your song :))
@statquest
@statquest 2 жыл бұрын
:)
@nguyenkhoituan9291
@nguyenkhoituan9291 3 жыл бұрын
Hi Josh, just want to let you know that the link for "Neural networks with multiple inputs and outputs" in the description is broken (though I was able to find the video in your Neural Network playlist).
@statquest
@statquest 3 жыл бұрын
Thanks for the note! I've fixed the link.
@jiaqint961
@jiaqint961 Жыл бұрын
Gold!
@statquest
@statquest Жыл бұрын
Thank you! :)
@amnont8724
@amnont8724 Жыл бұрын
Hey Josh, I saw one of your videos about entropy in general - which is a way to measure uncertainty or surprise. Regarding Cross Entropy, the idea is the same - but now it's for the SoftMax outputs for the predicted Neural Network values?
@statquest
@statquest Жыл бұрын
To be honest, I have no idea how this (cross entropy) is related to the general concept of entropy for data science.
@amnont8724
@amnont8724 Жыл бұрын
@@statquest They both seem to be calculated in a similar way, I assumed they measured similar things
@statquest
@statquest Жыл бұрын
@@amnont8724 They probably do. It's just not something I've thought about before or know about.
@knowledgedistiller
@knowledgedistiller 2 жыл бұрын
What happens if we want to use soft labels? Would the cross entropy loss still be a good loss function that would help the neural net converge to a good predictor of the soft labels?
@statquest
@statquest 2 жыл бұрын
I don't know off the top of my head.
@pauldevereaux5537
@pauldevereaux5537 9 ай бұрын
Hi Josh, it is me again. Thank you for all these amazing videos. Currently upgrading my own programmed NN to support classification. However, my own softmax function gives slightly different values around 1:30, namely .68, .11 and .21. So, is this just due to rounding, or is there something wrong with my function? Many thanks in advance!
@statquest
@statquest 9 ай бұрын
It looks like it is probably due to rounding.
@pauldevereaux5537
@pauldevereaux5537 9 ай бұрын
@@statquest ah okay thank you :) Good to know I am not going insane :)
@muhammedyavas5765
@muhammedyavas5765 Жыл бұрын
Thank you for the videos. They are really helpful. I have a question about the softmax process at 5:13. All the outputs for the Versicolor species were already smaller than 1. Even when all the outputs are smaller than 1, do we have to continue with the softmax process? And can't we use the raw outputs for cross entropy? Thank you again.
@statquest
@statquest Жыл бұрын
Not only should probabilities be between 0 and 1, but when we add up all possible options (Setosa, Versicolor and Virginica) they should add up to 1. Using the softmax function ensures that both of those are true.
@muhammedyavas5765
@muhammedyavas5765 Жыл бұрын
@@statquest Oh I see. Thanks. BAM! :)))
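For reference, the SoftMax function Josh mentions here turns the raw output values $z_1, \dots, z_K$ into

$$\text{SoftMax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}},$$

and because every term is positive and the denominator is the sum of all the numerators, each output lands strictly between 0 and 1 and the outputs add up to exactly 1. Raw output values can happen to be smaller than 1 without satisfying either property, which is why they are not used directly as the probabilities in Cross Entropy.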
@r0cketRacoon
@r0cketRacoon 3 ай бұрын
Have you made a video to prove the formula for cross-entropy? I noticed that the formula for cross-entropy is kind of different from that of entropy (which you demonstrated very well in another video).
@statquest
@statquest 3 ай бұрын
I'm not sure I understand what you mean by "prove". However, I believe that cross entropy is simply defined the way it is.
@r0cketRacoon
@r0cketRacoon 3 ай бұрын
@@statquest I mean where the formula comes from and what the meaning behind it is. I just figured out that the formula for cross entropy is derived from the KL divergence. It's used to quantify how different one probability distribution, P_true, is from a second, reference probability distribution, P_predicted. That's my opinion after doing some digging.
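For readers chasing the same connection, the standard identity behind this comment (and another comment further up about the KL divergence) is

$$D_{\text{KL}}(p \,\|\, q) = H(p, q) - H(p) = -\sum_x p(x)\log q(x) + \sum_x p(x)\log p(x),$$

where $p$ is the observed distribution and $q$ is the predicted one. Since the entropy $H(p)$ does not depend on the network's predictions, minimizing the Cross Entropy $H(p, q)$ and minimizing the KL divergence select exactly the same weights and biases.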
@mahdimohammadalipour3077
@mahdimohammadalipour3077 2 жыл бұрын
Hello Josh. Could you explain to me how you used the entropy function here? I've seen the entropy StatQuest for data science and now I'm wondering why you used the observed probability outside of the log and the predicted inside it, and not the opposite (observed inside and predicted outside). I know that results in log(0), which is undefined, but I'm seeking the exact intuition. Thank you in advance :)
@statquest
@statquest 2 жыл бұрын
See 2:57.
@mahdimohammadalipour3077
@mahdimohammadalipour3077 2 жыл бұрын
@@statquest Thank you. I watched it again; let me express my question differently. We know that generally the entropy equation is: entropy = -sum(p*log2(p)), and in this equation we have two p's: one multiplies the log and the other is its argument. Now, in cross entropy we have two probabilities: one we know from the training set and the second one is calculated by the NN (i.e., the first is observed and the second is predicted). I want to know why the cross entropy equation is -sum(Pobserved*log(Ppredicted)) and not -sum(Ppredicted*log(Pobserved)).
@statquest
@statquest 2 жыл бұрын
@@mahdimohammadalipour3077 For details, see: en.wikipedia.org/wiki/Cross_entropy (by the way, I plan on covering the Kullback-Leibler divergence soon). Anyway the key part of that article is in the section titled "Motivation". It says... "Therefore, cross-entropy can be interpreted as the expected message-length per datum when a wrong distribution q is assumed while the data actually follows a distribution p. That is why the expectation is taken over the true probability distribution p and not q."
@mahdimohammadalipour3077
@mahdimohammadalipour3077 2 жыл бұрын
@@statquest I really appreciate it. Thank you :))))
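Spelling out the definition Josh quotes from Wikipedia (nothing here is specific to the video):

$$H(p, q) = -\sum_{x} p(x)\,\log q(x),$$

with the observed distribution $p$ outside the log and the predicted distribution $q$ inside it, because the expectation is taken over the distribution the data actually follow. When $p$ is a one-hot training label, the sum collapses to $-\log q(\text{true class})$, which is exactly the per-row quantity summed up in the video.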
@Ram-oj4gn
@Ram-oj4gn 7 ай бұрын
As you said in the current video, if the cross-entropy function helps us more in the gradient descent process than the sum of squares function does, why don't we use cross-entropy for the optimization of linear models, such as linear regression, as well? Why do we use SS there and not entropy? Thank you for the wonderful videos that help us understand the math and the functions.
@statquest
@statquest 7 ай бұрын
For linear regression we use SSR because it works better when the range of y-axis values is not limited.
@technojos
@technojos 3 жыл бұрын
Hi Josh. You have prepared an amazing video again. First of all, thanks a lot for this. I have a question. You said that the sum of the predicted probabilities must be equal to 1, but the sum of the probabilities in the video differs from 1. Are they made-up values for the sake of the video, or am I wrong? Please clarify this issue. Thanks a lot in advance :) Bam !!
@statquest
@statquest 3 жыл бұрын
What time point, minutes and seconds, are you asking about?
@technojos
@technojos 3 жыл бұрын
@@statquest At 05:49 we can see it in the "p" column (0.57 + 0.58 + 0.52 = 1.67, not 1).
@technojos
@technojos 3 жыл бұрын
I found my mistake :). Bamm!!!
@statquest
@statquest 3 жыл бұрын
@@technojos The column of numbers for "p" is not supposed to add up to 1 because each row is a value taken from a completely different set of SoftMax values (however, each set of SoftMax values does, in fact, add up to 1). Let me explain: At 2:14 we are running the first row of the training data through the neural network, and at 2:22 we are applying SoftMax to the raw Output values. The corresponding probabilities for Setosa, Versicolor and Virginica are 0.57, 0.20 and 0.23. If we add those up, we get 1. Bam. However, because the first row of training data is for Setosa, we select the probability for Setosa, 0.57, and add it to the table (and then use that value to calculate the cross entropy). We then do the same thing for the second row of training data, which is for Virginica. In this case, the SoftMax values at 4:32 are 0.22, 0.22 and 0.58, which add up to 1.02 instead of 1 because of rounding errors, but the idea is the same. We then select the one probability associated with Virginica (because the training data is for Virginica), 0.58, and add it to the table (and we use that to calculate cross entropy). Thus, the column in the table, "p", refers to the specific probability from a set of probabilities calculated for that row, and thus, there is no need for the column itself to also add up to 1. Does that make sense?
@technojos
@technojos 3 жыл бұрын
@@statquest thanks Josh. I understood. You are my ideal.Thanks for everything :).
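A tiny numeric sketch of what Josh describes above, using the SoftMax values he quotes from 2:22 and 4:32 (the NumPy usage is just for illustration):

```python
import numpy as np

# SoftMax output for each training row: columns are [Setosa, Versicolor, Virginica]
row1 = np.array([0.57, 0.20, 0.23])  # first row of training data (a Setosa sample)
row2 = np.array([0.22, 0.22, 0.58])  # second row (a Virginica sample); 1.02 is rounding

print(row1.sum(), row2.sum())  # each row of SoftMax values adds up to (about) 1

# The "p" column in the table keeps only the probability of the known species
# for each row, so there is no reason for that column itself to add up to 1.
p_column = [row1[0], row2[2]]  # 0.57 for the Setosa row, 0.58 for the Virginica row
print(sum(p_column))           # 1.15 -- not 1, and that's expected
```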
@iandanforth
@iandanforth 9 ай бұрын
Please note that the image for -log("p") at 7:07 is incorrect. Both -log(x) and -ln(x) are 0 at x=1. The image also makes it look like the function asymptotically approaches 0, but that is not the case. It is actually a much steeper descent than pictured.
@statquest
@statquest 9 ай бұрын
I agree that the y-axis is confusing, but I wouldn't say it is incorrect since the y-axis values are not labeled.
@bobsmithy3103
@bobsmithy3103 2 жыл бұрын
3:28 Please correct me if I'm wrong, but the negative sign is before the summation sign, so shouldn't it be - ObservedSetosa*log(PredictedSetosa) + ObservedVersicolor*log(PredictedVersicolor) + ObservedVirginica*log(PredictedVirginica)? Like shouldn't the negative signs be positive instead?
@statquest
@statquest 2 жыл бұрын
Just like -1 * (a + b + c) = -a - b - c, the minus sign outside of the summation carries through and turns the addition into subtraction.
@InsiderMiner
@InsiderMiner 4 ай бұрын
Hi. Can I ask where the formula for Cross Entropy is defined? It appears around minute 2:22. Is that the definition of it? It looks like the definition of Entropy, although at minute 2:22 it doesn't have the sigma sign at the beginning. In Wikipedia, I see a definition, but it is not exactly this one; it is -E sub p of log q. I didn't see a definition of Cross Entropy in this video, though. Is there another video where Josh defines cross entropy? I saw his supremely wonderful video on Entropy but I don't see any more. I more or less understand the argument that the observed probabilities that the data come from virginica and versicolor are zero. Any help would be greatly appreciated! BAM
@statquest
@statquest 4 ай бұрын
See 2:51
@Noor.kareem6
@Noor.kareem6 Жыл бұрын
Are ArgMax and SoftMax used only with classification, or could we also use them with regression? And the same question for cross entropy: is it used for classification only? Thank you
@statquest
@statquest Жыл бұрын
I'm pretty sure they are only used for classification.
@Noor.kareem6
@Noor.kareem6 Жыл бұрын
@@statquest what about the accuracy for regression? MSE ?
@statquest
@statquest Жыл бұрын
@@Noor.kareem6 SSR or MSE are used for regression.
@kaanzt
@kaanzt 10 ай бұрын
Josh, thanks for such an explanatory video. But I couldn't understand why the residual^2 graph looks linear at 7:31.
@statquest
@statquest 10 ай бұрын
It's because the range of values for cross entropy is much larger than the range of values for the residual^2. In other words, the scale of the y-axis makes the residual^2 look flat.
@kaanzt
@kaanzt 10 ай бұрын
@@statquest ooh, thank you for responding. Now it makes sense. BAM!
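To put a few numbers on Josh's point about the two y-axis scales (natural log assumed; the probabilities below are just illustrative):

```python
import numpy as np

# Compare the two losses for a few predicted probabilities of the correct class.
for p in [0.9, 0.5, 0.1, 0.01]:
    cross_entropy = -np.log(p)       # grows without bound as p approaches 0
    squared_residual = (1 - p) ** 2  # can never be larger than 1
    print(f"p={p:<5}  cross entropy={cross_entropy:5.2f}  residual^2={squared_residual:4.2f}")
```

On a y-axis tall enough to show -log(0.01) ≈ 4.6, a curve that never rises above 1 looks almost flat, and the much larger slope for confidently wrong predictions is the point of the "Why Cross Entropy and not SSR?" section at 5:41 in the video.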
@BalajiChegu
@BalajiChegu Ай бұрын
Thanks a lot for the simple and elegant explanation. Can you please provide a download link for the slides of this video?
@statquest
@statquest Ай бұрын
This will be in my next book.
@BalajiChegu
@BalajiChegu Ай бұрын
@@statquest Thanks a lot for all your vids. You are the only one on this planet making calculus as simple as playing candy crush game.
@sarbajitg
@sarbajitg 2 жыл бұрын
@4:49 Is it because we know the data is from Virginica that we put its probability [0.58] there, or is it because 0.58 is the maximum that we take the observation as Virginica? What if the probabilities were [Setosa, Virginica, Versicolor] = [0.5, 0.2, 0.3]? In that case, would we still take 0.2, or take it as Setosa?
@statquest
@statquest 2 жыл бұрын
We save 0.58 as the probability, and then use it for cross entropy, because we know the data is from Viriginica. If we knew it was from Setosa or Versicolor, we would have used 0.22.
@sarbajitg
@sarbajitg 2 жыл бұрын
@@statquest Thank you, Sir, it clears my doubt.
@mathrisk
@mathrisk 3 жыл бұрын
Thanks \m/
@statquest
@statquest 3 жыл бұрын
:)
@beshosamir8978
@beshosamir8978 Жыл бұрын
But in a regression problem we still use SSR, right? So what would happen if we still used SSR in a classification problem and, after backpropagation finishes its work, checked for the maximum output? Is it that, if we have an output = 1.64 and the observed value = 1, SSR also tends to decrease that distance, so we needed to invent a function to control the min and max values, in our case 0 and 1?
@statquest
@statquest Жыл бұрын
For a regression problem we would still use SSR. Cross Entropy is only for classification, and I believe it makes training easier.
@beshosamir8978
@beshosamir8978 Жыл бұрын
@@statquest Thank u josh for making Everything easy for us
@bastinjerry2843
@bastinjerry2843 2 жыл бұрын
please do a video on RNN network
@statquest
@statquest 2 жыл бұрын
I'm working on one.
@bastinjerry2843
@bastinjerry2843 2 жыл бұрын
@@statquest thank you :)
@oldcowbb
@oldcowbb 10 ай бұрын
Still trying to wrap my head around how this is related to entropy. If entropy is the expected surprise, it's like we are using one distribution for the surprise and a different one for the expected value.
@statquest
@statquest 10 ай бұрын
It's easier to see the relationship when you focus on the full equation at 2:57 For more details, see: en.wikipedia.org/wiki/Cross-entropy
@franciscoruiz6269
@franciscoruiz6269 2 жыл бұрын
My friend! I'm on part 6. How can I learn the differences between LSTM, Bi-LSTM and Recursive Neural Networks?
@statquest
@statquest 2 жыл бұрын
I'm working on those videos. Hopefully they'll be available soon.
@franciscoruiz6269
@franciscoruiz6269 2 жыл бұрын
@@statquest i'm looking forward to watch them!
@statquest
@statquest 2 жыл бұрын
@@franciscoruiz6269 bam! :)
@Xayuap
@Xayuap Жыл бұрын
🎉
@statquest
@statquest Жыл бұрын
:)
@onebyzero-4608
@onebyzero-4608 10 ай бұрын
Why is the observed probability 1 in the case of Setosa? It's still kind of confusing.
@statquest
@statquest 10 ай бұрын
Because we know it is setosa and it can't be anything else.