Intuitively Understanding the Cross Entropy Loss

95,920 views

Adian Liusie


This video discusses the Cross Entropy Loss and provides an intuitive interpretation of the loss function through a simple classification setup. It draws the connection between the KL divergence and the cross entropy loss, and touches on some practical considerations.
Twitter: / adianliusie
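For readers skimming the comments below, here is a minimal sketch of the loss being discussed, written with NumPy. This is my own illustration rather than code from the video, and the names `cross_entropy`, `p_true`, and `q_pred` are made up for the example:

```python
import numpy as np

def cross_entropy(p_true, q_pred, eps=1e-12):
    """Cross entropy H(p, q) = -sum_i p_i * log(q_i), in nats."""
    q_pred = np.clip(q_pred, eps, 1.0)   # guard against log(0)
    return -np.sum(p_true * np.log(q_pred))

# One-hot "true" distribution (class 1 of 3) and a model's predicted distribution.
p = np.array([0.0, 1.0, 0.0])
q = np.array([0.2, 0.7, 0.1])
print(cross_entropy(p, q))   # = -log(0.7) ≈ 0.357
```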

Comments
@leoxu9673 2 years ago
This is the only video that's made the connection between KL Divergence and Cross Entropy Loss intuitive for me. Thank you so much!
@jasonpittman7853 2 years ago
This subject has confused me greatly for nearly a year now; your video and the KL-divergence video made it clear as day. You taught it so well I feel like a toddler could understand this subject.
@nirmalyamisra4317 3 years ago
Great video. It is always good to dive into the math to understand why we use what we use. Loved it!
@ananthakrishnank3208 1 year ago
Excellent expositions on KL divergence and Cross Entropy loss within 15 mins! Really intuitive. Thanks for sharing.
@kvnptl4400 1 year ago
This one I would say is a very nice explanation of Cross Entropy Loss.
@bo3053 2 years ago
Super useful and insightful video which easily connects KL-divergence and Cross Entropy Loss. Brilliant! Thank you!
@alirezamogharabi8733 2 years ago
The best explanation I have ever seen about Cross Entropy Loss. Thank you so much 💖
@yingjiawan2514 6 months ago
This is so well explained. Thank you so much!!! Now I know how to understand KL divergence, cross entropy, logits, normalization, and softmax.
@shubhamomprakashpatil1939 2 years ago
This is an amazing explanatory video on Cross-Entropy loss. Thank you
@viktorhansen3331 2 years ago
I have no background in ML, and this plus your other video completely explained everything I needed to know. Thanks!
@Micha-ku2hu 8 months ago
What a great and simple explanation of the topic! Great work 👏
@ssshukla26 3 years ago
And no one told me that (minimizing the KL divergence is almost equivalent to minimizing the cross entropy loss) in two years studying at a university... Oh man... thank you so much...
@DHAiRYA2801 1 year ago
KL = Cross Entropy - Entropy.
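A quick numerical check of this identity, as a sketch of my own (assuming NumPy; the distributions are made-up examples):

```python
import numpy as np

p = np.array([0.7, 0.2, 0.1])   # "true" distribution
q = np.array([0.5, 0.3, 0.2])   # model's distribution

cross_entropy = -np.sum(p * np.log(q))
entropy = -np.sum(p * np.log(p))
kl = np.sum(p * np.log(p / q))

# KL(p || q) == H(p, q) - H(p), up to floating-point error
print(np.isclose(kl, cross_entropy - entropy))   # True
```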
@TheVDicer 2 years ago
Fantastic video and explanation. I just learned about the KL divergence and the cross entropy loss finally makes sense to me.
@hansenmarc 2 years ago
Great explanation! I’m enjoying all of your “intuitively understanding” videos.
@matiassandacz9145 3 years ago
This video was amazing. Very clear! Please post more on ML / Probability topics. :D Cheers from Argentina.
@yfd487 1 year ago
I love this video!! So clear and informative!
@allanchan339 3 years ago
It is an excellent explanation, and it makes good use of the previous KL divergence video.
@hasankaynak2253 2 years ago
The clearest explanation. Thank you.
@whowto6136 3 years ago
Thanks a lot! Really helps me understand Cross Entropy, Softmax and the relation between them.
@franco-ruggeri-old 2 years ago
Great and clear explanation!
@newbie8051 3 months ago
Oh wow, this was simple and amazing, thanks!
@chunheichau7947 6 months ago
I wish more professors could hit all the insights that you mentioned in the video.
@LiHongxuan-ee7qs 10 months ago
Such a clear explanation! Thanks!
@HaykTarkhanyan 7 months ago
Great video, thank you!
@blakete 2 years ago
Thank you. You should have more subscribers.
@shchen16 2 years ago
Thanks for this video
@yassine20909 2 years ago
Nice explanation, thank you.
@kevon217 2 years ago
Simple and helpful!
@dirtyharry7280 2 years ago
This is so good, thanks so much!
@mikejason3822 2 years ago
Great video!
@lebronjames193 3 years ago
Really superb video, you should record more!
@shahulrahman2516 2 years ago
Thank you
@yegounkim1840 1 year ago
You're the best!
@sushilkhadka8069 1 year ago
This is so neat.
@Darkev77 3 years ago
Brilliant and simple! Could you make a video about soft/smooth labels instead of hard ones and how that makes it better (math behind it)?
@SA-by2xg 2 years ago
Intuitively, information is lost whenever a continuous quantity is discretized. Said another way, a predicted class probability of 0.51 and one of 0.99 carry very different information, but a hard label treats them identically. Downstream, soft targets allow for more precise gradient updates.
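To make the point concrete, a small sketch of my own (assuming NumPy; the soft-label values are invented for illustration):

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    return -np.sum(p * np.log(np.clip(q, eps, 1.0)))

q = np.array([0.6, 0.3, 0.1])        # model prediction

hard = np.array([1.0, 0.0, 0.0])     # one-hot label
soft = np.array([0.8, 0.15, 0.05])   # soft label, e.g. from a teacher model

print(cross_entropy(hard, q))   # -log(0.6): only the correct-class term contributes
print(cross_entropy(soft, q))   # every class contributes to the loss and gradient
```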
@omkarghadge8432 3 years ago
Great! Keep it up.
@mixuaquela123 2 years ago
Might be a stupid question but where do we get the "true" class distribution?
@patrickadu-amankwah1660 1 year ago
Real world data bro, from annotated samples.
@飛鴻-q1c 1 year ago
Humans are the criterion for everything in so-called AI.
@AnonymousIguana 9 months ago
In a classification task, the true distribution has a value of 1 for the correct class and 0 for the other classes. So that's it, that's the true distribution, and we know it if the data is labelled correctly. By the way, a distribution over discrete classes like this is called a probability mass function.
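In symbols (my restatement of the comment above): if the true distribution p is one-hot with p_c = 1 for the correct class c, the cross entropy collapses to the negative log probability the model assigns to that class:

```latex
H(p, q) = -\sum_i p_i \log q_i = -\log q_c ,
\quad \text{since } p_i = 0 \text{ for } i \neq c \text{ and } p_c = 1 .
```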
@kutilkol 3 years ago
superb!
@starriet 3 years ago
essence, short, great.
@jakobmiesner3995 1 month ago
Thanks
@MrPejotah 1 year ago
Great video, but only really clear if you know what the KL divergence is. I'd hammer that point to the viewer.
@jiwoni523 1 year ago
Please make more videos, you are awesome.
@PoojaKumawat-z7i 6 months ago
How does the use of soft label distributions, instead of one-hot encoding hard labels, impact the choice of loss function in training models? Specifically, can cross-entropy loss still be effectively utilized, or should Kullback-Leibler (KL) divergence be preferred?
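One way to see the answer (my note, not a reply from the thread): for a fixed target distribution, the cross entropy and the KL divergence differ only by the target's entropy, which does not depend on the model, so minimizing either gives the same gradients and the same optimum. A sketch with invented numbers:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

p = np.array([0.8, 0.15, 0.05])   # soft target distribution
z = np.array([1.0, 0.2, -0.5])    # model logits
q = softmax(z)

ce = -np.sum(p * np.log(q))       # cross entropy loss
kl = np.sum(p * np.log(p / q))    # KL(p || q)

# ce - kl equals H(p), which is constant with respect to the logits z.
print(ce - kl, -np.sum(p * np.log(p)))   # the two printed numbers match
```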
@vandana2410 2 years ago
Thanks for the great video. One question though: what happens if we swap the true and predicted probabilities in the formula?
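Swapping the two gives the "reverse" KL divergence, which is a genuinely different objective because KL is not symmetric; with a one-hot true label the swapped version even involves log(0) terms. A small numerical illustration of my own:

```python
import numpy as np

p = np.array([0.9, 0.05, 0.05])   # "true" distribution
q = np.array([0.5, 0.4, 0.1])     # predicted distribution

kl_pq = np.sum(p * np.log(p / q))   # forward KL, the one behind the usual loss
kl_qp = np.sum(q * np.log(q / p))   # reverse KL, from swapping the roles

print(kl_pq, kl_qp)   # different values: KL(p||q) != KL(q||p)
```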
@thinkbigwithai 1 year ago
At 3:25, why don't we model it as argmax Sum P* log P (without the minus sign)?
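For what it's worth (my note, not a reply from the author): the two formulations are equivalent; the minus sign simply turns the maximization into the minimization of a non-negative loss:

```latex
\arg\max_q \sum_i p^*_i \log q_i \;=\; \arg\min_q \Bigl( -\sum_i p^*_i \log q_i \Bigr)
```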
@fVNzO 4 months ago
I skipped through the video, but I don't think you managed to explain how the formula itself deals with the infinities that are created when inputting log(0). That's what I don't understand.
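A note of my own on this point: in practice the predicted distribution comes from a softmax, so no predicted probability is exactly zero, and implementations work in log space with the log-sum-exp trick; the zero entries of a one-hot target simply remove their log terms from the sum. A sketch:

```python
import numpy as np

def log_softmax(z):
    # Log-sum-exp trick: subtract the max so exp() cannot overflow,
    # and never take the log of an exact zero.
    z = z - z.max()
    return z - np.log(np.sum(np.exp(z)))

z = np.array([8.0, -5.0, 1.0])   # raw logits
target = 1                       # index of the correct class

loss = -log_softmax(z)[target]   # finite, even though softmax(z)[1] ≈ 2e-6
print(loss)
```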
@sukursukur3617 2 years ago
Why don't we just use the mean of (p-q)^2 instead of p*log(p/q) to measure the dissimilarity of pdfs?
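Partly it is a modelling choice: minimizing the cross entropy (equivalently the KL divergence) corresponds to maximum likelihood, and the KL blows up when the model puts vanishing probability on events that actually occur, whereas a squared difference stays bounded. A quick comparison of my own with invented numbers:

```python
import numpy as np

p = np.array([0.5, 0.5])

q1 = np.array([0.4, 0.6])       # mildly wrong
q2 = np.array([0.001, 0.999])   # puts almost no mass on a likely event

for q in (q1, q2):
    mse = np.mean((p - q) ** 2)
    kl = np.sum(p * np.log(p / q))
    print(f"mse={mse:.4f}  kl={kl:.4f}")
```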
@quantumjun 2 years ago
Will the quantity at 4:12 be negative if you use information entropy or KL divergence? Are they both > 0?
@yassine20909 2 years ago
As explained in the video, the KL divergence is a measure of "distance", so it has to be ≥ 0. There are other prerequisites for a function to be a true distance metric, like symmetry and a couple of other things I forget about (which the KL divergence does not actually satisfy).
@madarahuchiha1133 9 months ago
What is the true class distribution?
@elenagolovach384 8 months ago
The frequency of occurrence of a particular class depends on the characteristics of the objects.
@genkidama7385 8 months ago
distribution
@phafid 2 years ago
3:24. The aha moment when you realize what the purpose of the negative sign in cross entropy is.
@phafid 2 years ago
4:24. Do you know how golden that statement is?
@tanvirtanvir6435 1 year ago
0:08, 3:30: P* is the true probability.
@zhaobryan4441 2 years ago
Hello handsome, could you share the clear slides?
@zingg7203 2 years ago
The volume is low.
@ajitzote6103 11 months ago
Not really a great explanation; so many terms were thrown in. That's not a good way to explain something.
@commonsense126 1 year ago
Speak more slowly, please.
@Oliver-2103 1 year ago
Your name is commonsense and you still don't use your common sense, lol. In every KZbin application there is an option to slow a video down to 75%, 50% or even 25% speed. If you have trouble understanding his language, you should just select the 0.75 speed option.
@commonsense126 1 year ago
@Oliver-2103 Visually impaired people have problems seeing some of the adjustments one can make on a phone, even when they know that they exist.
The Key Equation Behind Probability
26:24
Artem Kirsanov
172K views
Tips Tricks 15 - Understanding Binary Cross-Entropy loss
18:29
DigitalSreeni
23K views
Intuitively Understanding the KL Divergence
5:13
Adian Liusie
95K views
Entropy (for data science) Clearly Explained!!!
16:35
StatQuest with Josh Starmer
660K views
The Most Important Algorithm in Machine Learning
40:08
Artem Kirsanov
597K views
A Short Introduction to Entropy, Cross-Entropy and KL-Divergence
10:41
Aurélien Géron
361K views
Why Does Diffusion Work Better than Auto-Regression?
20:18
Algorithmic Simplicity
452K views
Transformers (how LLMs work) explained visually | DL5
27:14
3Blue1Brown
4.7M views
Cross Entropy Loss Error Function - ML for beginners!
11:15
Python Simplified
41K views
Intuitively Understanding the Shannon Entropy
8:03
Adian Liusie
110K views