Review and intuition why we divide by n-1 for the unbiased sample | Khan Academy

350,412 views

Khan Academy


11 years ago

Courses on Khan Academy are always 100% free. Start practicing-and saving your progress-now: www.khanacademy.org/math/ap-s...
Reviewing the population mean, sample mean, population variance, sample variance and building an intuition for why we divide by n-1 for the unbiased sample variance
Practice this lesson yourself on KhanAcademy.org right now:
www.khanacademy.org/math/prob...
Watch the next lesson: www.khanacademy.org/math/prob...
Missed the previous lesson?
www.khanacademy.org/math/prob...
Probability and statistics on Khan Academy: We dare you to go through a day in which you never consider or use probability. Did you check the weather forecast? Busted! Did you decide to go through the drive through lane vs walk in? Busted again! We are constantly creating hypotheses, making predictions, testing, and analyzing. Our lives are full of probabilities! Statistics is related to probability because much of the data we use when determining probable outcomes comes from our understanding of statistics. In these tutorials, we will cover a range of topics, some which include: independent events, dependent probability, combinatorics, hypothesis testing, descriptive statistics, random variables, probability distributions, regression, and inferential statistics. So buckle up and hop on for a wild ride. We bet you're going to be challenged AND love it!
About Khan Academy: Khan Academy offers practice exercises, instructional videos, and a personalized learning dashboard that empower learners to study at their own pace in and outside of the classroom. We tackle math, science, computer programming, history, art history, economics, and more. Our math missions guide learners from kindergarten to calculus using state-of-the-art, adaptive technology that identifies strengths and learning gaps. We've also partnered with institutions like NASA, The Museum of Modern Art, The California Academy of Sciences, and MIT to offer specialized content.
For free. For everyone. Forever. #YouCanLearnAnything
Subscribe to KhanAcademy’s Probability and Statistics channel:
/ @khanacademyprobabilit...
Subscribe to KhanAcademy: kzbin.info_...

Comments: 114
@albertorivero1753 · 8 years ago
starts at 5:05
@thesevenkg · 3 years ago
ur a hero
@danielroa748 · 1 year ago
Who is this man? He knows so much and explains so majestically. I wonder why he doesn't have a statue in the main square of my city? He deserves a few.
@adarshtiwari7395 · 1 year ago
After 3 videos, I finally understood this n-1. Basically, when we take a sample from our population and calculate its mean, it may or may not be close to the overall population mean (which is the mean that matters), so to lower the effect of a highly distinct sample mean/variance we divide by n-1 to get at least near the population value...
@ordiv12345 · 10 months ago
What if you take the 3 highest values?
@baterwottle2400 · 10 months ago
@ordiv12345 In the sum the terms are squared, so it actually doesn't matter, because a negative squared is positive and a positive squared is positive. Take (1-5)^2 and (9-5)^2 for example; they give the same value.
@alkalait · 11 years ago
Thanks for this video, Sal. Though intuitive and true, some viewers might find this approach (to dealing with the "bias" in the estimator) heuristic. For instance, one might argue "why not n-2 and so on...". If you decide to invest a bit more in this stats playlist, I hope you'll get to deeper concepts like degrees of freedom of estimators, which lie at the heart of the concept of this video. Please don't take this as criticism; the video is in the right direction :)
@nocturnalvisionmusic · 8 months ago
11 years later on 11/27/23, but thank you for introducing me to *heuristic.* Really neat definition. 😌💓🖋️
@alkalait · 11 years ago
By the way, for anyone curious: the "degrees of freedom" of some statistic, say a sum across the x's, is n, because this number has n ways or parameters (the x's themselves) by which it can vary. Using this simple notion of "freedom", you can state the d.o.f. of any statistic that is written in terms of some data points. As another example, the sum of the x's squared also has n d.o.f.
@markbro2 · 11 years ago
The most common question seems to be why n-1 and not n-2 or n-3424342 (any other number). The way I understand it, it comes from the definition of unbiased estimators (look it up on Wikipedia): in a nutshell, an unbiased estimator is one whose expected value equals the value it is estimating. n-1 is known as Bessel's correction (also on Wikipedia). There you can see that E[S^2] = sigma^2, hence it is unbiased. This makes sense; if you take enough samples and average their variances, you get the true population value.
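The "take enough samples and average them" claim above is easy to check by simulation. The sketch below (my own illustration, not from the video or the thread) draws many small samples from a fixed population and averages the variance estimate under both divisors; only the n-1 divisor averages out to the population variance.

```python
import random

random.seed(0)

# A fixed "population" to sample from (with replacement).
population = [random.gauss(0, 1) for _ in range(10_000)]
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)

n, trials = 3, 100_000
avg_biased = 0.0   # running average of (sum of squares) / n
avg_bessel = 0.0   # running average of (sum of squares) / (n - 1)
for _ in range(trials):
    sample = [random.choice(population) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    avg_biased += ss / n / trials
    avg_bessel += ss / (n - 1) / trials

# avg_bessel lands on pop_var; avg_biased lands near (n-1)/n * pop_var.
```

With n = 3 the biased estimator averages to roughly two-thirds of the true variance, which is exactly the (n-1)/n shrinkage Bessel's correction undoes.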
@EntropicalNature · 11 years ago
It has to do with the fact that an interval with N points contains N-1 smallest subintervals. Consider for example the interval [1,4] on the natural number line, in which case N=4. You can subdivide it only into [1,2], [2,3], [3,4], which is 3, not 4, smallest subintervals.
@vowcap5338 · 2 years ago
Yeah, but why are we dividing by the number of subintervals instead of the number of data points? Dividing by the number of data points would make more sense.
@voodooman08 · 11 years ago
Being biased or unbiased can be formally proven. For the sample variance to be unbiased means that its mathematical expectation should equal sigma squared. If we divide just by n, that condition isn't met; if we divide by (n-1), the mathematical expectation of our sample variance is exactly sigma squared (the variance of the population). For more details and the mathematical proof, look up Bessel's correction.
@flotars · 9 years ago
So why minus 1? Why not minus 2? Or minus 6.345%? This is still not an explanation of the n - 1 :-(.
@mesplin3 · 9 years ago
The reason one divides by n-1 instead of n comes from the definition of bias. An unbiased estimate has the property that the expected value (average) of the estimate equals the parameter being estimated. Assuming the population has variance q^2, the expected value of (1/(n-1)) * (sum from i=1 to n of (X_i - X bar)^2) = E(S^2) [the unbiased variance] equals (1/(n-1)) * (n*q^2 - n*(q^2/n)) = q^2. If it were not n-1, the math wouldn't work out. ...that's all. It's hard to explain without LaTeX, sorry. Source: My textbook.
@mesplin3 · 9 years ago
Or maybe just think of partitioning things into groups. For example, there are n groups, so I need n-1 dividers. ... Not really a proof, but maybe it's more intuitive.
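For readers who want the algebra written out in full, here is the standard expectation computation behind the comments above (a reconstruction using E[X_i] = μ and Var(X_i) = σ², not a quote from the video):

```latex
\begin{aligned}
\mathbb{E}\!\left[\sum_{i=1}^{n}(X_i-\bar X)^2\right]
&= \sum_{i=1}^{n}\mathbb{E}[X_i^2] \;-\; n\,\mathbb{E}[\bar X^2] \\
&= n\,(\sigma^2+\mu^2) \;-\; n\!\left(\frac{\sigma^2}{n}+\mu^2\right) \\
&= (n-1)\,\sigma^2 .
\end{aligned}
```

So the sum of squared deviations from the sample mean has expectation (n-1)σ², and dividing by n-1 (not n, n-2, or anything else) is what makes E[S²] = σ² exactly.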
@rlayne96 · 9 years ago
Michael Esplin I think you use n-1 because you lose one degree of freedom by estimating the mean (x-bar).
@rainshark94 · 8 years ago
+f lotars Watch video D1hgiAla3KI
@foglite4335 · 6 years ago
This explains why it's (n-1): kzbin.info/www/bejne/eZ-TpJ6Df5d5acU
@shdi7312 · 1 year ago
A very interesting and important discussion. I took a break in the middle and thought about it myself. I have a rather short explanation: if the sample size n is very small, such as 3, the variance calculated for the sample has more chance of being very different from the actual variance. The smaller n is, the more effect this '-1' has on the result. As for why we use '-1' and not some other value like '-2', I think it is just a tradition. For the smallest sample size of 2, this unbiased variance can still be calculated. However, it is not really purely 'unbiased', just relatively 'unbiased'.
@user-jk1gb7wm6z · 1 year ago
What happens if we pick samples from the far right side in the above example? n + 1?
@manmoon9648 · 1 year ago
@user-jk1gb7wm6z Honestly, I only understood this through the mathematical proof.
@clancym1 · 8 years ago
This does not give an explanation for why it is exactly n-1.
@galahadthreepwood · 8 years ago
Yeah, why not n-2?
@23561avatar · 8 years ago
It never claimed to give a mathematical proof. The title says it's a video to give you an intuitive understanding of n-1, so it's going to be superficial. If you want proof, see the video "Proof that the Sample Variance is an Unbiased Estimator of the Population Variance" on the jbstatistics channel. Anyway, it's true that sometimes n-2, or n-3, or n-4, etc. are better than n-1. However, ON AVERAGE, n-1 is the best estimator of the population variance. The following simulation is a good demonstration of this: www.khanacademy.org/computer-programming/will-it-converge-towards-1/1167579097
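The linked simulation can be reproduced in a few lines. This is a hypothetical sketch of my own (not Khan Academy's program): it averages the variance estimate under the candidate divisors n-2, n-1, and n over many samples and compares each to the true variance.

```python
import random

random.seed(1)

true_var = 4.0                                # variance of the sampled distribution
n, trials = 5, 100_000
avg = {d: 0.0 for d in (n - 2, n - 1, n)}     # candidate divisors

for _ in range(trials):
    sample = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    for d in avg:
        avg[d] += ss / d / trials

# Only the n-1 divisor is centered on true_var:
# dividing by n underestimates, dividing by n-2 overcorrects.
```

Any single sample's estimate can land above or below the truth; the claim is only about the average over many samples.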
@pl4t1n00b · 3 years ago
The relevant chi-squared statistic, (n-1)S^2/sigma^2, has mean n-1.
@ordiv12345 · 10 months ago
Why do we not use |Xi - x̄| instead of (Xi - x̄)²?
@R9dyhQ.RDPD99yF0gDmSZqC · 6 years ago
I had the intuition that overestimation and underestimation would compensate for each other. Why is that not the case?
@AJ-et3vf · 2 years ago
Awesome video! Thank you!
@darktealglasses · 2 years ago
Much better than what my school teacher taught me.
@Nigelfarij · 10 years ago
If you want a more technical explanation/proof, look up Bessel's correction on Wikipedia. This video has some good intuition, though.
@se0271 · 1 year ago
So instead of the sample lying somewhere much lower than the true population mean, what if it's lying much higher? Would it be correct to use n+1 instead of n-1 in order to deliberately make the sample variance smaller?
@michaelfarrell4376 · 1 year ago
No, because the population mean is still outside of your sample (it doesn't matter whether it's above or below), so your sample variance would still be smaller than your population variance, so you'd still use n-1.
@davidpastor2130 · 6 years ago
What if all the samples you took were greater than the mean? Then you would be overestimating even more if you divide by n-1.
@scienceblossom6197 · 5 years ago
(I had the same question, and I've copy-pasted this from someone else, which answers it:) That's not true, because you "square" the difference between the sample mean and the values to get the sample variance. For this reason, the measurements are on average closer to the sample mean than to the real mean; that's what he wants to explain, and why the denominator is decreased from n to n-1 to compensate for the too-small numerator. (Also, here's the empirical evidence for why it's "n - 1" in the denominator:) kzbin.info/www/bejne/eZ-TpJ6Df5d5acU
@thetruereality2 · 3 years ago
yes exactly....
@hubert155 · 2 years ago
Not really, that's the point of variance: since the deviations are squared, it's the same whether you take negative or positive numbers.
@KevinsMathClass · 10 months ago
6:10 Why we divide by n - 1 in variance
@anglaismoyen · 11 years ago
Say there's a population with a known population mean, and you take N random values from it. Is there a way to calculate a probability density for the deviation of the sample mean from the population mean? I hope that was a coherent question.
@file4318 · 1 year ago
Thank you very much for your video; it was very good at explaining. But I have one more question: if descriptive statistics do not try to generalize to a population (since there is no uncertainty in descriptive statistics), then why does the sample standard deviation try to best estimate the population standard deviation? Yet it is still considered a descriptive statistic.
@vicofour · 11 years ago
I don't get it... What if your sample sits all the way to the right? The unbiased variance would then be even further off...
@eloy618 · 3 months ago
The proof from 8:11 onwards does not seem rigorous. Correct me if I'm wrong. The sum of squared distances of the 3 samples from the pop. mean, divided by 3, could potentially be an over-estimate of the pop. variance, because you could have picked 3 samples that are all far from the pop. mean, whereas the population contains points much closer to the pop. mean. Which means: even if the sum of squared distances of the 3 samples from the sample mean, divided by 3, is always smaller than the sum of squared distances from the pop. mean, divided by 3, that is still not enough to show that it is always smaller than the pop. variance.
@johnhendrickson2175 · 5 years ago
I would like to know why we use the square of the difference between x and x-bar, and not the absolute value of the difference.
@busydying · 6 years ago
I can't understand why we would underestimate the variance in general this way. Let's take the population [0, 10, 20] and its sample [0, 20]. They have the same mean, 10; the variance of the population is (100 + 100 + 0) / 3, while the variance of the sample is (100 + 100) / 2, so we overestimate the variance.
@Nathan-cz8uk · 5 years ago
That's true; the idea is that ~on average~ dividing by n-1 will give the same value as the population variance. (And for that to happen, as you point out, some samples must give a larger sample variance.) Using your example, let's consider all possible samples and all their sample variances. Note that if we use n-1, for a sample of size 2 we actually divide by n-1 = 1, not 2. All possible samples are: {0, 0}, {0, 10}, {0, 20}, {10, 0}, {10, 10}, {10, 20}, {20, 0}, {20, 10}, {20, 20}. (Note that for the formula with the n-1 to work, the first and second numbers chosen have to be independent; as such, we need to allow the possibility of choosing the same number twice, hence {0, 0}, etc., as possible samples. In most real-world sampling scenarios we don't allow this, but in such cases it's generally not an issue, since the population is so large [unlike this example] that the probability of selecting the same element twice is very small anyway, meaning the math isn't significantly affected.) If we compute s^2 = ((x1-xbar)^2 + (x2-xbar)^2) / 1 for each of these samples, we get the following values: {0, 0} -> 0; {0, 10} -> 50; {0, 20} -> 200; {10, 0} -> 50; {10, 10} -> 0; {10, 20} -> 50; {20, 0} -> 200; {20, 10} -> 50; {20, 20} -> 0. Some of these sample variances are greater than the population's (which is 200/3 = 66.7) while some are less, but the key point, and the definition of an unbiased estimator, is that their average is exactly 200/3. (We can compute this: (0 + 50 + 200 + 50 + 0 + 50 + 200 + 50 + 0)/9 = 600/9 = 66.7.)
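The enumeration in the reply above can be turned into a complete check. This small script (my own, not from the thread) lists every size-2 sample drawn with replacement from [0, 10, 20]: the individual sample variances scatter above and below the truth, but their average hits the population variance exactly.

```python
from itertools import product

population = [0, 10, 20]
pop_mean = sum(population) / len(population)
pop_var = sum((x - pop_mean) ** 2 for x in population) / len(population)  # 200/3

n = 2
sample_vars = []
for sample in product(population, repeat=n):  # all 9 samples, with replacement
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # Bessel's correction
    sample_vars.append(s2)

avg_s2 = sum(sample_vars) / len(sample_vars)
# sample_vars mixes 0, 50, and 200; avg_s2 equals pop_var (≈ 66.67)
```

This is the definition of unbiasedness in miniature: no single sample is guaranteed to be close, but the estimator is correct in expectation.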
@manupandit7645 · 2 years ago
Hi. How is this S² sample variance different from the sigma²/n formula (population variance / n), which is also said to be the sample variance? Thanks.
@thegoonist · 9 years ago
Still don't get it. Yes, you would be underestimating it if you take the sample cluster below the mean. But if the cluster is above the mean? You would be overestimating it! Seems arbitrary to me.
@TomBaumeister · 8 years ago
+john smith That's not true, because you square the difference between the sample mean and the values to get the sample variance. For this reason, the measurements are on average closer to the sample mean than to the real mean; that's what he wants to explain, and why the denominator is decreased. The only thing I still don't really get is why it's (n - 1)...
@MrArtbyart · 2 years ago
Because deviations are measured from the sample's own mean, which sits in the middle of the sample, samples are biased to look less spread out than they are relative to the population mean.
@HelterMcSkelter · 11 years ago
I GET IT! I had to work out the proof and think about it really hard, but I get it! I have an intuition for why n-1 makes sense! Message me with your questions, because I don't think I can explain it easily in the comment boxes.
@Twas-RightHere · 6 years ago
Why should I expect the random selection of points to be an underestimate, when it's just as likely to be an overestimate?
@glavgad · 11 years ago
I don't get it. Yes, the error will be smaller, but why don't we divide by n-2, or n-3, or n-4, etc.?
@yeast4529 · 4 years ago
What if the sample mean is far greater than the population mean? Then would you not divide by n+1 so that your sample variance is not an overestimate?
@frankieli98 · 2 years ago
It would still be n-1, since the differences are always positive after squaring.
@jayraldbasan5354 · 6 years ago
So does this mean that the n-1 in the sample variance equation was just an arbitrarily chosen value because it's empirically closer to the actual population variance? Or is there an equation or a logical path for deriving the n-1? I kind of see that it's the former, but I feel there might be a theory that explains why n-1 is the most appropriate value and not any other, and that it's just a natural consequence of our math. If anyone has one, please tell me! Thank you for the video, Khan Academy! It was very informative!
@LegoEddy · 5 years ago
There is a reason why exactly n-1 gives the unbiased estimate; you can get it if you take a more theoretical look at the expectation value of the sample variance. If you do that, you can mathematically show why it has to be n-1.
@VGF80 · 4 years ago
Let's say a report comes out that mentions a standard deviation. How are we supposed to know which formula was used to calculate it?
@shivayshakti6575 · 2 years ago
But couldn't the same happen at the other end, where we would overestimate it?
@luciabreccia1901 · 7 years ago
Why isn't this video in the statistics playlist?
@pyguy9915 · 2 years ago
Is n-1 mathematically derived? Could we justify doing something else, e.g. using "0.85n" to build in conservativeness even for large n?
@Quercuspalustris50 · 11 years ago
I don't know either, but I bet if you examine all possible sets of samples, for a given sample size and population, and compare their variances against the population's variance, then the 'n-1' estimate has a higher chance of being close to the true variance than the plain 'n' estimate. Further, the 'n-2' estimate would probably have an even smaller chance than 'n-1'. There's probably a cutoff point too, where, for example, if your sample contains 99% of your population, then n is better....
@frankieli98 · 2 years ago
How's life?
@MrCannabitz · 9 years ago
I love you. Fuck the rest of the explanations on the internet; this made me understand.
@kanishkvishwakarma1057 · 4 years ago
8:40 I think you should not represent the true variance and the sample variance on the same number line you drew for the population points. The consequence of putting them together is that you end up visualizing the distance between the sample variance and the population variance on that same number line, leading to the conclusion that because the sample points are far from the population mean, the variances are far apart too. Ponder it and you'll see. Love your lectures, BTW 😃
@kumudsharma007 · 1 year ago
Good point. I had the same one; you framed it into words better.
@arvinpillai681 · 5 years ago
Starts at 5:00
@carlneedsajob · 1 year ago
thank you sal :4)
@glavgad · 11 years ago
We can't. But we divide by n-1 even if we have 10,000 samples; what difference will n-1 make then?
@scottlomagistro9550 · 3 years ago
I get the math... What I don't get is how you're able to write with the drawing/annotation feature so freakin' nicely?! Either you missed your calling as a steady-handed microsurgeon or there is some sort of stabilization assistance in the program you're using.
@varunahlawat9013 · 2 years ago
That doesn't explain the point, Sal. The sample could have been drawn only from values higher than mu; in that case this would be completely opposite, and we should have divided by n+1 then.
@Euroliite · 11 years ago
You said that the biased variance was an underestimate, so is it possible to overestimate?
@hardikvegad3508 · 3 years ago
n-2
@HelterMcSkelter · 11 years ago
As n approaches N, s_n approaches sigma, but s_(n-1) approaches something that is not sigma. So what gives?
@Euroliite · 11 years ago
Perhaps that tends to be overdoing it?
@Twas-RightHere · 6 years ago
9:08 - You are just as likely to be overestimating; you just chose to pick the bottom points rather than the top ones. This offers literally NO explanation, let alone an intuitive one, as to why I should expect there to be a downward bias.
@MrVpassenheim · 5 years ago
I think we've all had teaching experiences where we've dropped the ball. It's just strange to see Khan Academy put it to video, lol.
@jayadenuja3796 · 5 years ago
Whether it's the top set or the bottom set, the VARIANCE will be an underestimate, not the MEAN. I realize this may not be perfectly clear in the video, but in order to get a true (unbiased) representation of the variance, you would need values that are very far apart, and even then it wouldn't be an OVERestimate of the variance.
@beamboy14526 · 4 years ago
Variance measures the distance between each point and the mean. It does not matter whether the mean of the sample is above or below the mean of the population. Still, I do not understand why it is n-1 rather than n-2 or n-4 and so on.
@blakeobeans · 11 years ago
Tackle chance variability first.
@ParabhjyotSingh · 3 years ago
By this logic it could be n+1 as well, I guess.
@liushaomin7004 · 4 years ago
The analogy you're using is probably not convincing/intuitive enough, because there's also a likelihood that the sample over-estimates the population mean, so why don't we divide by n+1?
@origamigek · 6 years ago
So I guess the biased variance is better if your sample is close to the entire population.
@hehehehhe2011 · 5 years ago
This is terrible. Still no explanation of why it is unbiased if using n-1.
@WGBraves24 · 11 years ago
n-1 D:
@CaseyVanBuren · 9 years ago
Didn't say anything about n-1; misleading title.
@Piecakesman · 11 years ago
thank you so fucking much for this!!!
@MrLullumbonum · 11 years ago
FIRST!
@affan150 · 11 years ago
...
@benwearne542 · 1 year ago
N-1 is "better", but it is still very flawed.
@MrVpassenheim · 5 years ago
NOT one of Khan Academy's shining moments. Your other video (thanks Dhiraj Budhrani) is MUCH better (with the simulation & a mathematical explanation!).
@posthocprior · 1 year ago
That was unclear.
@Trubripes · 3 months ago
But you haven't explained it! LOL
@swapnamaysen8937 · 3 years ago
What bogus logic... Khan Academy is a jack of all trades, master of none.
@antygona-iq8ew · 2 years ago
This is not explained at all.
Why Dividing By N Underestimates the Variance
17:15
StatQuest with Josh Starmer
124K views
Degrees of Freedom and Effect Sizes: Crash Course Statistics #28
13:30
Dividing By n-1 Explained
14:18
PsychExamReview
4K views
Variance and Standard Deviation: Why divide by n-1?
13:47
zedstatistics
268K views
Chi-squared Test
11:53
Bozeman Science
2.3M views
Poisson Distribution EXPLAINED in UNDER 15 MINUTES!
14:24
zedstatistics
304K views
The Sampling Distribution of the Sample Mean
11:40
jbstatistics
312K views