What is Kurtosis? (+ the "peakedness" controversy!)

114,827 views

zedstatistics

1 day ago

Comments: 96
@maryanncarrick7903 4 years ago
Once again, you make a potentially confusing topic crystal clear! Thank you.
@anantmulchandani709 4 years ago
Thanks a lot for clearing this controversy! I'd been continuously scratching my head over the past few hours.
@_hhbk2128 1 day ago
My textbook had it all wrong. Thanks for clarifying.
@hilarytaber6138 4 years ago
Thank you so much for doing these videos! I'm taking a statistics class for my master's degree, and I desperately needed a review of the terms! Your videos explain the terms simply and clearly.
@peterwestfall6924 5 years ago
Actually, this issue should not be called a "controversy." The word "controversy" suggests that there are valid points on either side. But since there is no logical argument connecting Pearson kurtosis to "peakedness," and copious logical arguments (mathematical theorems, in fact) connecting kurtosis to tail weight, it is settled fact, rather than controversy: Kurtosis measures tail weight, period.
@peterwestfall6924 5 years ago
Yes, DeCarlo has some good points inside the paper. But I believe he (and especially the journal Psychological Methods since they published it) have done more harm than good, considering the first sentence of the abstract, which simply parrots the incorrect "peakedness/flatness" meme. I have seen many cites to, and direct quotes of, that first sentence in other published works by researchers worldwide in diverse disciplines. After all, when you Google a paper, that is what you see - the abstract. You might not get access to the paper, but you can typically see the abstract. And given the inflated stature of the journal Psychological Methods, people naturally assume that they publish things that are true and useful. Sadly, this does not seem to be the case. DeCarlo never supported that first sentence at all in the article. You would think that a respectable journal would require the statements in the abstract (particularly the first sentence of the abstract!) to be supported in the article.
@peterwestfall6924 5 years ago
Sure! @Britton Leggett
@amitabhawasthi5082 4 years ago
First time it has ever happened that I learnt something in Mathematics and the person who published it is available on comments. I would be lying if I don't admit that this certainly sparks more interest and curiosity :)
@sivasu07 4 years ago
@Britton Leggett can u share to my mail id pls
@shikhapathak2906 3 years ago
@Britton Leggett can i get this paper?
@peterwestfall6924 6 years ago
Nice. I might suggest writing the formula as kurtosis = (1/n) Sum(z_i^4), though, because the formula is much simpler. It is also much easier to explain what is going on in terms of z-values. In particular, the comment about .5^4 being small now refers to a number that is .5 standard deviations above or below the mean. Also, it is helpful to have a visual image of what is going on with kurtosis. Here is a good one: Take all the z-values for your data, and raise each one to the fourth power. Now, plot them on a number line from 0 to infinity. If there are repeats, just stack them on top of each other, like a dot plot. Considering each dot on the plot as a physical object with a common mass (like 1 gram), the number line balances at the mean, which is the kurtosis. Now, place a fulcrum at 3.0 on the line, which is the kurtosis of the normal distribution. If the number line falls to the right, then the kurtosis is more than the normal distribution. If it falls to the left, then the kurtosis is less than the normal distribution. Now, what causes the plot to fall to the right? The "peakedness" (dots near 0), or the tails (dots far to the right)? This representation shows the complete illogicality of the "peakedness" interpretation, and also the complete logic of the "tailedness" interpretation. It also dispels the more correct, but still incorrect statement that higher kurtosis means more mass in the tail: As the fulcrum example shows, it is not necessarily more mass, it is also the *placement* of the mass that causes the number line to tip to the right. You can have very little mass very far away that will cause high kurtosis. As Archimedes said, "give me a place to stand, and I shall move the Earth." Numerous counterexamples to the silly "peakedness" notion abound. Take the beta(.5,1) distribution, for example. It has kurtosis less than a normal distribution, and is therefore supposed to be "less peaked" than the normal distribution. 
But the beta(.5,1) distribution is infinitely peaked! It is strange that people ignored Kaplansky; thanks for reminding everyone. I think the real problem is that RA Fisher repeated Pearson's erroneous interpretations through all revisions (through 1964) of his classic text, despite Kaplansky. And who was going to argue with Fisher? More recently, papers have tried to somehow sneak the peak in, seemingly not wanting to contradict Fisher. For example, the very first sentence of the abstract of DeCarlo's (1997) paper in the esteemed journal "Psychological Methods" is simply wrong in the same "peakedness" regard. Strangely, DeCarlo never supports the first sentence of the abstract in his paper! But often, people only read abstracts, and take them as correct and justified in the article, especially when the journal is held in such high regard as Psychological Methods. Psychological Methods should publish an erratum, because that paper is doing more harm than good.
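A minimal Python sketch of the z-score formulation and the balance-point picture described in the comment above. The data set is made up for illustration (nineteen values near the centre plus one far-out value), and the function names are hypothetical; the population standard deviation (divide by n) is assumed so that the result matches kurtosis = (1/n) Sum(z_i^4).

```python
from statistics import fmean, pstdev

def z_fourth_powers(data):
    """Return z_i^4 for every observation, using the population SD (divide by n)."""
    mu = fmean(data)
    sigma = pstdev(data)
    return [((x - mu) / sigma) ** 4 for x in data]

def kurtosis(data):
    """Pearson kurtosis: the mean (balance point) of the z^4 values."""
    return fmean(z_fourth_powers(data))

# Hypothetical data: values clustered near the centre plus one far-out value.
data = [-1, 0, 1] * 6 + [0, 12]

z4 = z_fourth_powers(data)
# Fraction of the z^4 total contributed by the single largest dot on the number line.
outlier_share = max(z4) / sum(z4)

print(f"kurtosis = {kurtosis(data):.2f}")
print(f"share of the z^4 total due to the far-out value = {outlier_share:.3f}")
```

In this made-up sample almost all of the z^4 total comes from the one distant value, which is the "placement of the mass" point: the dots near zero barely move the balance point.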
@kumarapillay3122 2 years ago
excellent analogy, thanks a lot!
@Undercover_FE 1 year ago
Well done! I am still an undergraduate student, but the center of mass analogy is absolutely correct for mean calculation and has been useful in my geometric interpretation of these concepts. Your torque analysis, that is, compensating tau via length, is very interesting. It seems that defining kurtosis as the density of the tails is not entirely correct. I look forward to learning more. It's interesting since the mathematics here is quite nice, so I'm surprised by the lack of consensus.
@nicolehyden5039 3 years ago
I'm taking statistics for psychology (PSY 3010) right now at Southern Utah University and was REALLY struggling to understand this concept until I watched your video. Thank you SOOOO much!
@AlokPratapSingh4001 2 years ago
Me too 😀
@juliecongress6278 2 years ago
Justin is my "spirit animal". I love any description of a huge equation that includes "gross"! Thank goodness for Justin and Zedstatistics! I would never get through my class without it!
@christopherbarrett9900 9 months ago
Thanks for keeping it high level. The concept itself is a great place to start before diving in.
@SashaSkay 1 year ago
Thank you! I couldn't get it until I saw your video. The "controversy" part helped a lot because that was the thing that confused me.
@samuelilozumba7459 2 years ago
I can't thank you enough for making these videos. God bless you.
@John5ive 3 years ago
excellent explanation. I think visualizing the graph of y= x^4 and how fast the points further away from zero grow helps describe this effect.
@peterwestfall6924 3 years ago
Yes, more specifically of the standardized score. See stats.stackexchange.com/a/481022/102879 for elaboration.
@htsaul1569 5 years ago
9:22 So why can Excel's kurtosis formula (=KURT) return values less than -2?
@laymansmaths7538 4 years ago
A negative kurtosis means that your distribution is flatter than a normal curve with the same mean and standard deviation
@suspense_shorts 4 years ago
@laymansmaths7538 He is talking about the possible range of excess kurtosis as explained in the video. It lies in [-2, ∞).
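One likely answer to the question about Excel, sketched in Python. The assumption here is that `=KURT` applies the bias-adjusted sample formula given in Microsoft's documentation (with the n-1 standard deviation), rather than the population formula whose range is [-2, ∞). The function names are hypothetical and the four-point data set is made up to show the gap.

```python
from statistics import fmean, pstdev, stdev

def excess_kurtosis_population(data):
    """Population excess kurtosis: mean of z^4 (population SD) minus 3."""
    mu, sigma = fmean(data), pstdev(data)
    return fmean(((x - mu) / sigma) ** 4 for x in data) - 3

def excel_style_kurt(data):
    """Bias-adjusted sample excess kurtosis (assumed to match Excel's KURT).

    Uses the sample SD (divide by n - 1) and requires at least 4 values.
    """
    n = len(data)
    mu, s = fmean(data), stdev(data)
    z4_sum = sum(((x - mu) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    offset = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4_sum - offset

data = [0, 0, 1, 1]  # made-up two-valued sample
print(excess_kurtosis_population(data))  # sits at the lower bound of -2
print(excel_style_kurt(data))            # the adjusted statistic falls below -2
```

The small-sample adjustment is what pushes the statistic outside [-2, ∞); the bound itself applies to the unadjusted (population) definition.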
@peterwestfall6924 3 years ago
@laymansmaths7538 The point of the "peakedness" controversy is that your explanation is incorrect. You can have an infinitely peaked distribution with negative kurtosis. (See beta(.5,1) for example.) Negative kurtosis means less tailweight than the normal distribution. It tells you nothing about the shape of the peak.
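The beta(.5,1) counterexample can be checked directly. The sketch below (function names are hypothetical) computes the kurtosis from the raw moments of a Beta(a, b) distribution and evaluates the Beta(0.5, 1) density, f(x) = 0.5 / sqrt(x), close to zero to show that it grows without bound.

```python
def beta_raw_moment(a, b, k):
    """E[X^k] for X ~ Beta(a, b): product of (a + r) / (a + b + r) for r = 0..k-1."""
    moment = 1.0
    for r in range(k):
        moment *= (a + r) / (a + b + r)
    return moment

def beta_kurtosis(a, b):
    """Pearson kurtosis (not excess) of Beta(a, b), built from raw moments."""
    m1, m2, m3, m4 = (beta_raw_moment(a, b, k) for k in range(1, 5))
    variance = m2 - m1 ** 2
    fourth_central = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return fourth_central / variance ** 2

def beta_half_one_pdf(x):
    """Density of Beta(0.5, 1) on (0, 1): 0.5 * x**-0.5, unbounded as x -> 0."""
    return 0.5 * x ** -0.5

print(f"kurtosis of Beta(0.5, 1) = {beta_kurtosis(0.5, 1):.4f}")  # below the normal value of 3
for x in (1e-2, 1e-6, 1e-12):
    print(f"pdf({x:g}) = {beta_half_one_pdf(x):g}")
```

The kurtosis works out to 15/7, so the excess kurtosis is negative even though the density has an infinite peak at zero.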
@quillaja 3 years ago
One odd thing is that the R package "moments" seems to use the population version of kurtosis instead of one of the sample versions. The "e1071" package lets you choose from 3 types: g1, b1, and G1 for skewness and g2, b2, and G2 for kurtosis. I was writing some of these functions in Go for a little project, checking my implementation against R, and found the oddball result from "moments" quite confusing.
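For anyone making the same comparison, here is a sketch of the three excess-kurtosis variants, following the definitions usually attributed to Joanes and Gill (1998), which are assumed here to be what e1071's three types correspond to. The function names and the sample are hypothetical, and the code is Python rather than R or Go.

```python
from statistics import fmean

def central_moment(data, k):
    """k-th central moment with divisor n."""
    mu = fmean(data)
    return fmean((x - mu) ** k for x in data)

def kurtosis_variants(data):
    """Return the three excess-kurtosis estimates g2, G2 and b2 (needs n >= 4)."""
    n = len(data)
    # g2: plain moment ratio minus 3 (the "population" form).
    g2 = central_moment(data, 4) / central_moment(data, 2) ** 2 - 3
    # G2: small-sample adjusted version.
    G2 = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
    # b2: fourth moment divided by the squared (n - 1) variance, minus 3.
    b2 = (g2 + 3) * (1 - 1 / n) ** 2 - 3
    return {"g2": g2, "G2": G2, "b2": b2}

sample = [2.0, 3.5, 3.9, 4.1, 4.4, 5.0, 9.8]  # made-up values
for name, value in kurtosis_variants(sample).items():
    print(f"{name} = {value:.4f}")
```

If "moments" reports the unadjusted ratio without subtracting 3, as the comment suggests, its output would correspond to g2 + 3 here, which would explain the mismatch with the sample versions.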
@quantaali543 5 years ago
The best explanation of Kurtosis ....
@utpalmohane1416 2 years ago
Justin, I have a request. Which books did you study to gain this in-depth knowledge of the fundamentals of statistics? If there is a single book that teaches the way you do, kindly suggest it.
@eduardoa.rebollartellez6578 4 years ago
Thank you so much. A very clear explanation. You are a star!
@zhuwenhao4852 1 year ago
8:08. so the three distributions here have the same mean and variance, right?
@tvvt005 1 year ago
Is it right to consider kurtosis as a comparison of modes? If it’s the peakedness of data, the peakedness is the mode right? In a regular frequency distribution that is
@peterwestfall6924 1 year ago
A main point of the video is that kurtosis is *not* peakedness, thus kurtosis doesn't tell you anything about the modes, either.
@mrvy 4 years ago
Hey Justin - quick question , what statistic is then used more to necessarily describe the peaks as opposed to the tails?
@peterwestfall6924 4 years ago
For some distributions, you could (1) find the density of the standardized variable, and (2) evaluate the second derivative of this density at a central position of the distribution (mean, median, mode etc.) This would give you a measure of curvature at the central position, a measure of peakedness.
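A rough numerical sketch of the two-step recipe above: take a density that is already standardized (mean 0, variance 1) and estimate its second derivative at the centre with a central finite difference. The two densities (normal and logistic) and the step size are illustrative choices, not part of the comment, and the function names are hypothetical.

```python
from math import exp, pi, sqrt

def normal_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2 * pi)

def standardized_logistic_pdf(z):
    """Logistic density rescaled to variance 1 (scale s = sqrt(3) / pi)."""
    s = sqrt(3) / pi
    e = exp(-z / s)
    return e / (s * (1 + e) ** 2)

def curvature_at(pdf, x0=0.0, h=1e-3):
    """Central-difference estimate of the second derivative of pdf at x0."""
    return (pdf(x0 + h) - 2 * pdf(x0) + pdf(x0 - h)) / h ** 2

print(f"normal:   {curvature_at(normal_pdf):.4f}")
print(f"logistic: {curvature_at(standardized_logistic_pdf):.4f}")
```

A more negative value means a more sharply curved peak. This is a statement about the centre of the density only; it is a separate quantity from kurtosis.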
@pabitrapradhan721 4 years ago
What is low kurtosis, and what is its value?
@Mohan-social1 3 years ago
Simple and good explanation. Thanks!
@abcdxx1059 4 years ago
you are wrong its not call controversy or controversy its called controversy
@zedstatistics 4 years ago
This is in my top 10 comments for sure.
@sairajpatki133 2 years ago
What happens in the scenario where outliers are eliminated from the data and then kurtosis is calculated? Considering the outliers were the ones causing the alteration in the tails
@peterwestfall6924 2 years ago
If you delete the outliers, then the kurtosis will be smaller. The best way to understand this is to compare for yourself using a real data set. Now, this does not mean you should delete outliers. Outliers, if really part of the data-generating process under consideration, provide valuable information about that process. They might be the most important thing. For example, if a new drug causes a rare but extreme life-threatening response that manifests in an outlier, you certainly do not want to sweep it under the rug. High kurtosis simply alerts you to the presence of such outliers.
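The suggested comparison takes only a few lines. This sketch uses made-up numbers rather than a real data set, with one value placed far from the rest, and computes the population kurtosis with and without it; the variable and function names are hypothetical.

```python
from statistics import fmean, pstdev

def kurtosis(data):
    """Pearson kurtosis: mean of the fourth powers of the z-scores (population SD)."""
    mu, sigma = fmean(data), pstdev(data)
    return fmean(((x - mu) / sigma) ** 4 for x in data)

# Made-up measurements; the last value is the outlier.
with_outlier = [4.1, 4.8, 5.0, 5.2, 5.3, 5.6, 5.9, 6.1, 14.0]
without_outlier = with_outlier[:-1]

print(f"with outlier:    {kurtosis(with_outlier):.2f}")
print(f"without outlier: {kurtosis(without_outlier):.2f}")
```

The drop in kurtosis after deletion shows what the statistic was reacting to; it says nothing about whether deleting the value is justified.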
@sairajpatki8448 2 years ago
@peterwestfall6924 Thanks Peter! Yes, I completely agree that removing outliers might not be an appropriate strategy for many studies.
@AnshumanPant 7 months ago
5:35 I am not able to find the derivation for bias correction of sample mean and sample standard deviation in estimation of sample kurtosis. Can anyone suggest where to look?
@melam4522 1 month ago
I’m looking for it too. If you’ve found anything please share it🙏
@sunitgautam7547 4 years ago
Thank you for the amazing explanation!
@user-zy8vk2ip8o 2 years ago
Why are other videos giving values of -7 to +7, and even -3 to +3 for normal range for Kurtosis??? Where do those numbers come from?
@peterwestfall6924 2 years ago
It is amazing how easily bad information proliferates. Excess kurtosis cannot possibly be lower than -2, and ordinary kurtosis cannot possibly be less than 1. As far as upper "bounds", there can be no magic numbers. Just as for effect sizes, it is a question of degree: larger kurtosis values indicate greater tail extension (or weight).
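The two lower bounds follow from a one-line argument. A sketch, writing Z for the standardized variable (X minus its mean, divided by its standard deviation); this notation is introduced here and is not in the comment:

```latex
\[
\operatorname{Var}(Z^2) = E[Z^4] - \bigl(E[Z^2]\bigr)^2 \ge 0
\quad\Longrightarrow\quad
\text{kurtosis} = E[Z^4] \ge \bigl(E[Z^2]\bigr)^2 = 1,
\qquad
\text{excess kurtosis} = E[Z^4] - 3 \ge -2 .
\]
```

Equality requires Z^2 to be constant, which happens only when Z takes the values -1 and +1 with probability 1/2 each. No comparable argument produces a finite upper bound.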
@user-zy8vk2ip8o 2 years ago
@peterwestfall6924 Thanks a lot 😊😊
@youreconomicsguide8488 4 years ago
Love from Kerala 💝💛🇮🇳
@frenchmarty7446 2 years ago
Ok, but... If kurtosis has increased by increasing the frequency of outliers *without* changing the variance, isn't it required to have a peaked distribution to cancel that out? If I increase the number of outliers without changing anything else the variance will necessarily increase.
@peterwestfall6924 2 years ago
No, kurtosis is more about extent than frequency of outliers. Outliers refer to rare extreme values. If you add a lot of them, then they are no longer rare or extreme, and kurtosis might even decrease. You can find examples online in the Cross Validated (Stack Exchange) posts of families of distributions where the distributions become flatter as kurtosis increases. Thus, there is no mathematical connection between kurtosis and peak. On the other hand, when you draw a histogram of data with high kurtosis, the outlier(s) stretch the horizontal axis greatly, making most of the data occupy a narrow vertical band. This phenomenon helps the erroneous "peakedness" interpretation to persist. But it is all caused by the outlier(s), not the kurtosis, as I stated in my 2014 TAS paper. Note also that the beta(.5,1) distribution is infinitely peaked but has very low kurtosis. Again, kurtosis is not related to the peak.
@frenchmarty7446 2 years ago
@peterwestfall6924 You didn't actually address my question... Take a distribution with a mean of zero and two outliers (-10 and +10). Now double the magnitude of the outliers (-20, +20). Kurtosis and standard deviation necessarily increase. What further change can I make to the distribution to decrease the standard deviation *without* making the distribution more peaked? A beta distribution can have >1 peak and skewness, so it's not a very good example. Increasing skew necessarily increases kurtosis; you're swapping moderate outliers on both sides with extreme outliers on one side. I would argue you are making the distribution more peaked, just on one side. And yeah, of course you can have a peaky distribution with low kurtosis; that's not the point (no pun intended). It's helpful to think of kurtosis and peakedness going together because it prevents you from confusing variation (magnitude) and kurtosis (shape/convexity).
@peterwestfall6924 2 years ago
Right, your example shows that it is about extension more than frequency. As far as beta goes, just reflect it around the x=0 axis and you have an infinitely peaked distribution with low kurtosis. So skewness is irrelevant to the point. And if high kurtosis is supposed to indicate peakedness, then it certainly is not beside the point that low kurtosis distributions can be infinitely peaked (and, by the way, infinite kurtosis distributions can be perfectly flat over 99.99999999% of the observable data as well). It is not helpful to think of kurtosis and peakedness as going together when they are not connected. Why promote falsehoods?
@frenchmarty7446 2 years ago
@peterwestfall6924 "Right, your example shows that it is about extension more than frequency" That is a matter of perspective. I could also say I increased the frequency of extreme outliers and decreased the frequency of moderate outliers. A distribution is characterized by both; it is literally a graph of frequency vs extension. "you can have an infinitely peaked distribution with low kurtosis." Ok. But so what? That's not how causality works. Peakedness has multiple causes, and kurtosis is one of them. A peaked distribution with low kurtosis doesn't actually prove anything any more than a non-smoker who dies of cancer... "You can have infinite kurtosis and the distribution will be perfectly flat for >99.99% of the observable area" That is the very meaning of peakedness. A tight cluster of data. You're literally describing a highly peaked distribution. "Why promote falsehoods?" Peakedness explains the distinction between kurtosis and deviation. It's easier to remember that peakedness has multiple causes than to somehow understand kurtosis without distinguishing it from the other cause of extreme values.
@peterwestfall6924 2 years ago
It sounds like you are saying "we can often associate kurtosis with peakedness" or something like that. The annoying fact remains that there is no mathematical connection between the shape of the peak and kurtosis, while there are many mathematical theorems connecting kurtosis to tailweight. Do you have a theorem? Please state it. I am not sure that people would generally agree that a distribution that is perfectly flat over 99.999999% of the observable data is "peaked". Sure, the data occupies a narrow band relative to potential data, but the cause of the appearance is the odd extreme outlier. So again, the outlier (tail) is the cause. Some have argued that kurtosis measures peakedness because you can increase kurtosis by spiking the center. Two problems with this: first, the spike increases the extreme z-scores, and those are what cause high kurtosis (just look at the definition: small z-scores contribute virtually nothing). Second, to generalize from "adding a spike implies high kurtosis" to "high kurtosis implies a spike" is simply a logical fallacy, similar to generalizing from "all bears are mammals" to "all mammals are bears." But you say that kurtosis is a cause of peakedness, and that sounds like math. Please, if you think you are right, provide a mathematical statement of the form "higher kurtosis implies ______ and lower kurtosis implies _______." Know, however, that there are already many very useful, enlightening mathematical results of this type that are readily accessible, and maybe you should read them first before stating something.
@lakhansinghsajwan 4 years ago
Nice explanation cheers :)
@m.c.degroffdavis9885 4 years ago
All about the base, it's all about the base.
@mohitthareja6608 2 years ago
No peakedness.
@AJ-et3vf 2 years ago
Awesome video! Thank you!
@yourswimpal 3 years ago
Is it possible to tell just from the value of kurtosis whether it is a bimodal or normal distribution ?
@proskhok1763 4 years ago
What is the relationship between Skewness and Kurtosis? How to explain this relationship?
@couragelawson4737 4 years ago
Great explanation. THANKS
@olowoakequwam5454 2 years ago
About the sample excess kurtosis you mentioned in this video: what is the difference between the formula you gave and this one, (n-1)/((n-2)(n-3)) * ((n+1)*ekurt + 6)? Please 🙏
@Competent00 3 years ago
This was really very helpful
@iyersri1053 3 years ago
Thanks. Very well explained. Understood that the numerator is actually largely influenced by the outliers- but when we say the observations near the mean do not impact the numerator as much- would it not actually lead to the higher peak and is that the reason that higher kurtosis would lead to a higher peak? I am just working on some risk topics and I am not a pro at this. It may be not right but was something that came to my mind.
@peterwestfall6924 3 years ago
Repeating the logic: "Would it [the fact that the observations near the mean do not impact ...] not actually lead to a higher peak ..."? This statement seems to contradict itself. In any event, the fact that higher kurtosis does not imply a higher peak has been known since a paper that was published by Kaplansky in 1945: Kaplansky, I. (1945). A Common Error Concerning Kurtosis, Journal of the American Statistical Association, 40, 259.
@connorlaplante9318 3 years ago
Okay, so now what if I took the 4th root of kurtosis? Because if you take the sqrt of the variance you get the stdev.
@peterwestfall6924 3 years ago
Here is one fact about that. Kurtosis is the average of the z-values, each taken to the fourth power; hence it is the "point of balance" of the histogram of the z^4 values. Thus, the fourth root defines where this point of balance is, in terms of the raw data. For example if the kurtosis is 81 (= 3^4), then the distribution of the z^4 values balances at 81. In terms of the raw data, this point of balance occurs where the data are 3 (= 4th root of kurtosis) standard deviations away from the mean.
@hansisbrucker813 4 years ago
Why not describe the distribution curve with a formula and further describe things using its derivatives?
@arayafesseha3818 4 years ago
Great lesson
@mouradmadouni8277 3 years ago
Thank you very much!
@harshakarricinema3295 5 months ago
I'm from Bharat. Thank you for your lesson.
@claudio_rosa.consultoria5074 4 years ago
Very nice video!
@oscardelarosa3324 9 months ago
You are the best 🎉🎉
@ankitgupta8797 3 years ago
Wouldn't the term "longer tails" be more meaningful than "fatter tails"?
@zedstatistics 3 years ago
Good question. I guess since the tails go off to infinity in all of these distributions, there is no real distinction on "length".
@peterwestfall6924 3 years ago
Good point. Fat tails are often misinterpreted as "a lot of data in the tails." Actually, even with fat tails, there is little data in the tails. Rather, the tails effectively extend farther. For example, a value that is ten standard deviations from the mean might be observed when you have fat tails. The reason for the term "fat" is not because of a lot of data; rather, it is because the density is higher than that of the corresponding normal distribution density. But both densities are extremely close to zero, thus "fat tails" are not easily observed in a density plot. In a histogram, "fat tails" are manifested by an outlier or two. Such outliers stretch the range of the horizontal axis, making the bulk of the data occupy a narrow range, and this is what caused the erroneous "peakedness" interpretation to persist. It is not peakedness, it is outliers. In other words, tail extension, or heavy (fat) tails. A much better graph for visualizing heavy tails is the normal q-q plot. Bottom line: "tail extension" is indeed a good alternative phrase to characterize the "heavy tails" concept.
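To see why the normal q-q plot is the better picture, here is a text-only sketch that pairs normal quantiles with the quantiles of a heavy-tailed distribution at the same probabilities. The Laplace distribution, rescaled to variance 1, is an illustrative choice not mentioned in the comment, and the function name is hypothetical. `statistics.NormalDist` needs Python 3.8 or later.

```python
from math import log, sqrt
from statistics import NormalDist

def standardized_laplace_quantile(p):
    """Quantile of a Laplace distribution with mean 0 and variance 1."""
    b = 1 / sqrt(2)  # the Laplace variance is 2 * b**2
    if p < 0.5:
        return b * log(2 * p)
    return -b * log(2 * (1 - p))

probs = [0.001, 0.01, 0.1, 0.9, 0.99, 0.999]
# Each pair is one point of the q-q plot: (normal quantile, heavy-tailed quantile).
qq_points = [(NormalDist().inv_cdf(p), standardized_laplace_quantile(p)) for p in probs]

print(" normal  laplace")
for normal_q, laplace_q in qq_points:
    print(f"{normal_q:7.3f}  {laplace_q:7.3f}")
```

Near the middle the heavy-tailed quantiles sit slightly inside the normal ones, while at the extremes they reach well beyond them. On a q-q plot this shows up as the points bending away from the straight line at both ends, which is the tail extension described above.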
@ThuanNguyen-yk7jx 1 year ago
thank you
@villwang8005 4 years ago
Can kurtosis be exactly 0?
@asinegaasinega 3 years ago
Min kurtosis is 1, not 0 (I think; a symmetric two-point distribution has kurtosis 1, while the uniform distribution has kurtosis 1.8).
@faisaltasleemft 3 years ago
Beneficial video 👍
@Uwindow 6 years ago
Thank you for your fantastic videos. Can you please do a video on probability distributions and how to check if a distribution is normal, mesokurtic, platykurtic and leptokurtic (if possible on STATA) and what they mean for data?
@JJ-fb2lp 4 years ago
lol at sigma squared squared haha..good job explaining this though!
@spencerantoniomarlen-starr3069 4 years ago
I understood the end only because you didn't try to "prove" it haha
@DuyNguyen-quantitative 11 days ago
Nice
@arcticwolf6402 3 years ago
Hi. Quick tip. :) 99% of the time, when people search for this kind of stuff, and they see the relevant video that is 16 minutes long, they are NOT going to click on it. :) They want something that is as short and right to the point as humanly possible.
@searchmeinyoutube 3 years ago
Even Academics have controversies.