Haven't seen someone nail it better. Thanks a lot.
@PsychExamReview · 9 months ago
You're welcome!
@aloistavier1171 · 3 months ago
This is by far the best explanation I have found on this topic. Most other content is either too mathematical or gives the feeling its authors are trying to explain the idea without quite getting it themselves. Thanks a lot for this!
@PsychExamReview · 3 months ago
@@aloistavier1171 Glad you found it helpful!
@GabriEla-eb1sn · 11 months ago
Your explanation is very clear! Thank you so much for your effort and willingness to share your knowledge with us!💚🙏
@PsychExamReview · 11 months ago
My pleasure, glad it was clear!
@augierakow · 1 year ago
Best explanation I’ve found so far. Thank you sir.
@PsychExamReview · 1 year ago
You're welcome!
@NotEasyDude · 10 months ago
Hands down, the best explanation.
@CircadianFire · 1 month ago
You have an incredible talent for teaching. I watched several hours of videos and looked at lots of forums before this and didn't understand it very well (I thought I understood it, but I knew I didn't). I wish more people could teach like this. Thank you so much!
@PsychExamReview · 1 month ago
@@CircadianFire Glad it was helpful, and thanks for the compliments!
@samircanoiriarte7646 · 11 months ago
Finally!!! This is the best explanation on the topic. Thank you so much
@PsychExamReview · 11 months ago
You're welcome, glad you liked it!
@stuck-in-a-catch-22 · 11 months ago
Great explanation, concise and easy to follow. It fills in the missing gaps in an intuitive manner (which helps significantly if you're a curious/inquisitive individual lol).
@PsychExamReview · 11 months ago
Glad you liked it!
@robomatt1600 · 1 year ago
Thanks! This is one of the best explanations I have found so far for Bessel's Correction.
@PsychExamReview · 1 year ago
Thanks, glad you liked it!
@sharan9993 · 6 months ago
The underestimation explanation was the best intuitive guide to why we need to adjust. Thank you very much. I hope I can learn other statistics lessons from your channel.
@PsychExamReview · 6 months ago
You're welcome, glad to hear it was helpful!
@qkloh6804 · 5 months ago
This should be the standard explanation in all classes.
@brazhell · 5 months ago
I want to thank you for that information, clear and precise.
@PsychExamReview · 5 months ago
You're welcome!
@aviroopmitra5353 · 11 months ago
Very clear explanation! I got the logic, thank you very much.
@PsychExamReview · 11 months ago
You're welcome, glad to hear that!
@faresmhaya · 9 months ago
I don't know what to say. I've been looking for an intuitive understanding of this, and all the data science channels can do is say "degrees of freedom" without elaborating any further. I didn't expect a psychology channel, of all things, to give me a better answer, but here we are! I don't know if I understand everything perfectly; bias = sigma²/n wasn't fully clear to me, but I read your long-ass answer in a comment below and it made things a bit clearer, and I have nothing but respect for you for actually taking the time to write that answer. I have to ask, though: why do we assume that n would go to infinity (taking the bias to zero) if we know that any population has to have a finite number of elements? Edit: I guess some populations would be represented by continuous functions, so technically they can have an infinite number of elements, but that's not always the case. Secondly, if you can make another video, I'm curious to know why x-bar is an unbiased estimator even though x-bar will never be exactly equal to mu.
@PsychExamReview · 9 months ago
Glad it was helpful and that I was able to explain it more clearly than some other places. I'm also glad that taking the time to write that long-ass explanation of the average bias was helpful to another person 😅 For n going to infinity, it probably would have been better for me to say "as the sample n gets as large as possible"; you're right that often the population would be finite. Once we have the full population N, the bias would be zero and we would no longer need to make Bessel's correction, because in that case x-bar would be equal to mu. I do plan to make a video about the sampling distribution of the mean, which relates to your question. X-bar is an unbiased estimator for mu because the distribution of x-bar is centered around mu. So even though each individual sample's x-bar may not equal mu, the average of all possible x-bars from all samples will be equal to mu. Hope this makes sense!
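This last point, that the x-bars from all samples average out to mu, is easy to see in a quick simulation (a hypothetical Python sketch; the population parameters and sizes are made up for illustration):

```python
# Sketch of the claim above: each individual x-bar misses mu, but the
# average of many x-bars converges on mu (so x-bar is unbiased).
# All numbers here are illustrative assumptions.
import random

random.seed(0)
mu, sigma = 100, 15            # assumed population mean and SD
n, trials = 10, 20_000         # sample size, number of samples drawn

sample_means = []
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(sum(sample) / n)

grand_mean = sum(sample_means) / trials
print(round(grand_mean, 2))    # lands very close to mu = 100
```

Individual x-bars in the run miss mu, often by several points, yet their grand mean settles within a small fraction of a point of it.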
@faresmhaya · 9 months ago
@@PsychExamReview Ah, that's referring to the notion of the "expected value," I assume. Yeah, I can see how that works. Although, if I'm being honest, I'm not sure I intuitively understand why the same reasoning doesn't work for all other estimators (variance included). Why doesn't the average of all possible estimated variances from all samples equal sigma²? Don't get me wrong, I get it with your explanation, but I don't know how to debunk the expected-value reasoning without referring to your explanation, and it feels like I should. Is it because estimated variances are inherently built on x-bar, which is already an estimator, and that plagues them with an error that the average of all estimated variances cannot mitigate? If my assumption is correct, can we conclude that any estimator that is built on top of other estimators is inherently biased?
@PsychExamReview · 9 months ago
@@faresmhaya That's exactly right. It's an unbiased estimator because the expected value is always equal to mu, even though individual samples may not be, and this is true even if the population the samples are drawn from is not normally distributed, due to the central limit theorem. But the distribution of the sample variance doesn't follow the CLT; it isn't normal and it depends on the population, so we can't really know the expected value without making assumptions (such as assuming the population is normally distributed). Basically, the problem is that the amount of bias in our estimate of the variance depends on how much our x-bar differs from mu, but we don't know mu, so our estimate of the bias due to x-bar also involves using x-bar. So we make some assumptions and use n-1 to try to get our expected value for the sample variance closer to the population variance, but it's not guaranteed that it will be. I'll try to cover this in more detail in future videos, but generally you're right that whenever we have to use an estimated value in a calculation in place of a parameter (like using x-bar for mu here), this introduces bias or uncertainty into the result that we will need to correct for, usually by reducing degrees of freedom (just as we did here, replacing n with n-1). Hope this helps!
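The average behavior described here can be illustrated with a simulation sketch (hypothetical Python; a standard-normal population and a sample size of 5 are assumptions for illustration):

```python
# Sketch of the average bias: dividing the sum of squared deviations from
# x-bar by n undershoots sigma^2 by the factor (n-1)/n on average, while
# dividing by n-1 (Bessel's correction) centers on sigma^2.
import random

random.seed(1)
sigma = 1.0                    # assumed standard-normal population
n, trials = 5, 40_000

biased_sum = corrected_sum = 0.0
for _ in range(trials):
    x = [random.gauss(0.0, sigma) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((v - xbar) ** 2 for v in x)
    biased_sum += ss / n               # divide by n
    corrected_sum += ss / (n - 1)      # divide by n-1

print(biased_sum / trials)     # near (n-1)/n * sigma^2 = 0.8
print(corrected_sum / trials)  # near sigma^2 = 1.0
```

With n = 5 the uncorrected average sits around 80% of sigma², exactly the (n-1)/n shortfall the reply describes, while the n-1 version averages out at sigma².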
@statjackson · 5 months ago
Great proof. Thank you for the perspective.
@julioaurelio · 27 days ago
Excellent explanation.
@PsychExamReview · 27 days ago
@@julioaurelio Thanks!
@MayankBhagya · 23 days ago
Loved it. Many thanks!
@sksahil4997 · 8 months ago
Great intuitive explanation. Thanks.
@PsychExamReview · 8 months ago
Glad you liked it!
@thrashed5732 · 2 months ago
Great explanation, I'm gonna check out your other videos on stats.
@PsychExamReview · 2 months ago
@@thrashed5732 Hope they're helpful, more coming soon!
@365days_of_reminding · 7 months ago
Best explanation 🙌🏻🙌🏻
@preetiprajapat536 · 9 months ago
Wow, that is a clear answer to this. Amazing 💯
@garykeyvan662 · 7 months ago
Almost magical. Thanks a lot, you saved my day, good sire.
@PsychExamReview · 7 months ago
Glad to hear that!
@CliffSedge-nu5fv · 5 months ago
This did in 14 minutes what other videos try to do in 30 or more.
@mohammedhafiz8522 · 1 month ago
Thank you very much. I understood it and it made sense.
@PsychExamReview · 1 month ago
Glad it was helpful!
@gopalijagude4786 · 4 months ago
Very nice explanation.
@PsychExamReview · 4 months ago
@@gopalijagude4786 Thanks!
@forniteguruji9409 · 8 months ago
Thank you so much!!!! You get right to the point!
@PsychExamReview · 8 months ago
You're welcome!
@Titurel · 10 months ago
Even though the equations are very clear, I don't understand intuitively why taking two samples on average cuts the bias in half, or why taking 3 cuts it to a third and taking 10 cuts it to a tenth. Please clarify.
@PsychExamReview · 10 months ago
Sure, I'll try. We can think of this bias as relating to how bad our estimate of mu (and thus our estimate of deviations) can possibly be. If we think about using 1 score to estimate mu, it could be anywhere on the full range of x in the population. Now we can ask how much adding a 2nd score would improve our estimate of mu and our deviations. If we assume a normal distribution, the probability of any one score being above or below mu is 50%, but the probability of both scores falling below mu (or both above it) is only 25%. And the lower the first score is, the greater the probability the second score will be above that score and improve the estimate (though it could over-correct in the other direction). If our first score happened to be the lowest possible value of x, then any 2nd score could only move our estimate of x-bar closer to the true population mean or keep it the same. We can't be any more wrong in our estimate of mu; getting the same extreme low value again wouldn't change x-bar, and even if the 2nd score were the highest possible value of x, this would just bring x-bar to the population mean because the distribution is symmetrical. So a 2nd score could improve our estimate up to being exactly correct, with 0% chance of over-estimating mu. But as the first score falls closer to the true population mean, the probability that a 2nd score will improve the estimate of x-bar decreases, and the possibility of worsening the estimate starts to increase (because it's more and more likely to get values below the first score, and high values could now over-correct, giving us estimates that are farther above mu than the first score was below it). But the maximum amount it could possibly worsen the estimate is cut in half, because the 2nd score will only move x-bar half its distance from the 1st score.
So if the first score happened to be the true population mean (which we wouldn't actually know), a second score could only keep this the same or worsen this a maximum of 50% in either direction compared to how bad the estimate could possibly be with only one score by itself. So then if we imagine all the different possible combinations of 2 scores and the probabilities of their pairings, having a second score will improve some estimates and worsen others (depending on how close the first score was to mu) but cut the occurrence of extremes in half. The extreme estimates are still possible (2 extremely low or high scores together) but these are half as likely to occur as they would be if we had only picked one score to estimate mu. This continues as n increases. If the current x-bar differs from mu, then the probability an additional score will shift it in the correct direction is always greater than 50% (because more than half of scores in the population will be above or below that estimate) while the amount it might over-correct gets smaller and smaller (the 10th score can only pull the mean a maximum of 10% of its distance from the estimate using 9 scores, etc.). To give a concrete example; if I had 2 people, one with an IQ of 100 and one with 140, my estimate of mu (assume it's 100 in the population) could be off by 40 using just the 2nd score, but only off by 20 using both scores. If I had an average of 100 from 9 people and added a 10th at 140, my estimate would only move to 104, compared to possibly being off by 40 if I had used the single 10th score by itself. And as we get to higher sample sizes, the possibility of drawing more extreme scores is very low because there just aren't that many scores there. If we have a sample size of 10,000 we won't get all extreme high/low values on one side not just because it's unlikely but because there just aren't 10,000 scores there in the population, so very extreme values for mu start to become impossible to select. 
Hope this helps!
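The IQ arithmetic in the example above can be checked directly (a Python sketch using the same made-up numbers; mu = 100 is the assumed, normally unknowable, true mean):

```python
# The reply's IQ example: one extreme score (140) pulls a lone estimate of
# mu 40 points off, a 2-score average 20 points, and a 10-score average
# only 4 points. All values are the hypothetical ones from the reply.
mu = 100

lone = 140                     # estimate from the extreme score alone
pair = (100 + 140) / 2         # average of 2 scores
ten = (9 * 100 + 140) / 10     # average of 9 typical scores plus the 140

print(lone - mu, pair - mu, ten - mu)   # errors of 40, 20, and 4
```

Each added score dilutes how far a single extreme value can drag the mean, which is the 1/n shrinkage the question asked about.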
@Titurel · 10 months ago
@@PsychExamReview Thank you for your wonderfully clear and thorough explanation. I was thinking it had to do with probability density but was having a hard time intuiting how each additional N would change sigma^2. You saved me a sleepless night. Ten points to Ravenclaw!
@PsychExamReview · 10 months ago
@@Titurel Yeah, it's hard to intuit or imagine all possible cases, so I find thinking of the extreme cases and recognizing that the probability will fall between these on average can make it a little easier. Glad I could help, sleep is important 😂
@ct8veylm3kzj68 · 5 months ago
Perfectly clear. Thanks. Thank you!
@PsychExamReview · 5 months ago
Glad it helped!
@PauloPereira-wl7qf · 10 months ago
Dang, that was really helpful! Thank you very much!!!!!
@PsychExamReview · 10 months ago
You're welcome, glad I can help!
@sarthak-salunke · 8 months ago
Brother, I'm a beginner, please help me understand the concept. What I know and am thinking: an underestimate is a value less than the actual value and an overestimate is a value above it. In the video we underestimated and overestimated the deviations from the sample mean, so when we use values close to the sample mean we get a lower variance than the actual one we'd get from considering the entire dataset. And when we look at the bell curve, for a specific sample the underestimates are the most likely case. Is this what you wanted to convey? Does "actual value" refer to the population mean?
@PsychExamReview · 8 months ago
Yes, we want to know the population variance, but we can't calculate this value directly because we don't know the true population mean and we usually can't measure the entire population in order to find it. So we can only use x-bar, or the mean of our sample. Using x-bar tends to give us an underestimate of the population variance so we increase this estimate slightly by dividing by n-1 instead of n. This doesn't guarantee a better estimate for all possible samples, but if we have a population that follows a normal distribution using n-1 will tend to give us a better estimate on average.
@adityapandey8319 · 4 months ago
Why did we cut our bias by half when considering a sample size of 2?
@PsychExamReview · 4 months ago
@adityapandey8319 Great question, I didn't fully explain this in detail here as it has to do with probability. Essentially, the lower the first score, the greater the probability the second score will fall above that point, leading to a less extreme average. By taking the average of 2 scores, extreme scores are still possible but half as likely to occur. For a more detailed explanation and examples, you can see some of my other replies to comments here. Hope this helps!
@zikrifisehaye323 · 2 months ago
THANK YOU!
@PsychExamReview · 2 months ago
@@zikrifisehaye323 You're welcome!
@abdulkadirguven1173 · 3 months ago
Thanks a lot.
@PsychExamReview · 3 months ago
@@abdulkadirguven1173 You're welcome!
@alaminpriyodu · 5 months ago
Love you MAN!!!❤
@PsychExamReview · 5 months ago
Thanks!
@rohitdas475 · 2 months ago
Thanks
@PsychExamReview · 2 months ago
@@rohitdas475 You're welcome!
@guillermootuama2616 · 10 months ago
Great video, subscribing!😁
@PsychExamReview · 10 months ago
Thanks, glad you liked it!
@theresaurus9820 · 4 months ago
5:00 The assertion that the divisor n-1 is justified by there being only n-1 independent deviations from the sample mean has never made sense to me, because the n-th deviation is still included in the numerator. Also, the decomposition of the variance into the estimated variance and the variance of the population mean (the bias in this case) doesn't give an intuition for Bessel's correction either, in my view; why should the estimated variance s² be the difference between the population variance with one score and the population variance with n scores? Nevertheless, thank you very much for this presentation.
@PsychExamReview · 4 months ago
Great questions. The reason we include the nth deviation in the numerator but use n-1 in the denominator is that we need to increase our underestimate. If we didn't include the deviation in the numerator and then used n-1 it would be just like having a sample size 1 smaller and the result would still be an underestimate. As for the example of a sample size of 1 (with 0 variance), this is just to demonstrate the maximum amount of error since it would cause us to estimate the variance to be 0. It's not possible to have a variance below 0, and we also can't have a biased sample variance that's greater than the population variance (because the population includes all scores that could possibly be in the sample), so the maximum amount of error we could have for our estimate would be sigma squared, and this maximum bias will then decrease as n increases. Hope this is clearer, thanks for commenting!
@markkennedy9767 · 3 months ago
The only thing I don't understand is why the bias is cut by a factor of n when taking samples of size n. What is the proof of this?
@PsychExamReview · 3 months ago
This has to do with the probability of randomly selected scores in a normal distribution (which is why I didn't explain it fully in this video). If we think about the probability of a randomly selected score falling anywhere above the mean, it would be 50%. But for two scores to both fall above the mean, the probability would be 50% x 50%, or 25%. All of the possible combinations of 2 scores will have different changes in probability when going from 1 score to 2 scores, but on average our bias will be reduced by half by adding a second score and using the average instead of using 1 randomly selected score. If we think of a single extreme score in the top 5% of the curve, for a score like this to occur twice (in a sample of 2) the probability would be .05 x .05, or .0025 (0.25%). So the probability of getting an average this extreme is much lower. For less extreme scores the probability wouldn't change as drastically (for a score within one standard deviation above the mean, the probability would change from about 34% for one randomly selected score to about 11.5% for both scores, 0.34 x 0.34). And scores on opposite sides of the mean would balance out and lead to an average closer to the center of the distribution. The more extreme the first score is, the greater the probability that the 2nd score will be less extreme and pull the average closer to the center, reducing the bias in our deviations. So if the first score is in the top 5% of the distribution, the probability the 2nd score is somewhere below that point would be 95%. So the probabilities for where the average of 2 scores will fall in relation to the mean will differ from the probabilities for one score. Our average of 2 scores would be less likely to fall at the extremes and more likely to fall closer to the center of the distribution, reducing the bias in our deviations. This same logic would also apply to larger and larger samples (where extreme averages would become very unlikely to occur).
I've also responded to a few versions of this question in the comments with some other ways of thinking about this that might be helpful to read. I hope this helps, thanks for commenting!
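The probability claims at the start of this reply can be spot-checked with a small Monte Carlo sketch (hypothetical Python; standard-normal scores and the sizes are assumptions for illustration):

```python
# Monte Carlo check of two claims above: for 2 independent standard-normal
# scores, P(both above the mean) is about 0.25, and P(both in the top 5%)
# is about 0.05 * 0.05 = 0.0025.
import random

random.seed(2)
trials = 200_000
cutoff = 1.6449                # approximate z-score cutting off the top 5%

both_above = both_top5 = 0
for _ in range(trials):
    a, b = random.gauss(0, 1), random.gauss(0, 1)
    both_above += (a > 0 and b > 0)
    both_top5 += (a > cutoff and b > cutoff)

print(both_above / trials)     # near 0.25
print(both_top5 / trials)      # near 0.0025
```

Averaging two scores makes the extremes quadratically rarer, which is the intuition behind the bias shrinking as n grows.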
@bonob0123 · 1 month ago
nice
@mansoor8228 · 2 months ago
❤❤❤❤
@hannahminerva9439 · 11 months ago
I could not understand why the sample variance equals the population variance minus the bias. Is it because the mean underestimates the sample variance, so we have to add the bias to the sample variance in order to reach the population variance?
@PsychExamReview · 11 months ago
Yes, since the deviations tend to be underestimates when using the sample mean instead of mu, the formula for the biased sample variance will likely be too low. So we could say the biased sample variance + the bias = the population variance; I've just started with a rearranged version of that equation.
@hannahminerva9439 · 11 months ago
@@PsychExamReview thank you very much!! This video and your reply helped me a lot :))
@benlee3545 · 6 months ago
Hi sir, at 2:27 I wonder if you can elaborate on the overestimates and underestimates when x-bar is below or even higher than the population mean. I'm not an expert in this, so what you said may carry some implicit knowledge of the issue, since you breezed through without much explanation.
@PsychExamReview · 6 months ago
I probably should have given a concrete example to help make this clear. Let's imagine that mu, our population mean, is 10 (even though we can't really know this) and x-bar for our sample is 9. In this case, when we compare each score to x-bar to calculate the variance, any scores that are below 9.5 will be underestimates compared to the population mean, because they will be closer to 9 than 10. So a score of 8 is only 1 point away from x-bar, but if we were able to compare it to the population mean (10) it would be 2 points away. So the deviation from x-bar is smaller than from mu, meaning we have an underestimate. Scores above 9.5 would cause overestimates, because they are farther from 9 than from 10. But because x-bar is 9, more of the scores in that sample must be below 9.5, so we'll have more underestimates than overestimates when we add up all the deviations. The same situation occurs in the opposite direction if x-bar is greater than mu. If mu were 10 and our x-bar is 11, now scores above 10.5 would underestimate the deviation (closer to 11 than 10) and scores below 10.5 would overestimate (farther from 11 than 10). But again, for x-bar to equal 11 there must be more scores above 10.5, meaning more underestimates. The only time we won't have underestimates is when x-bar is exactly equal to mu, but since we can't actually determine mu we won't know when this has happened. It's rare, so we generally assume it hasn't happened and that we probably have an underestimate of the deviations. Hope this helps, thanks for commenting!
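A tiny numeric sketch of this reply's example (Python, with the same made-up values: mu = 10 and a sample whose x-bar is 9):

```python
# The reply's numbers: mu = 10 (unknowable in practice) and a sample whose
# x-bar is 9. Squared deviations measured from x-bar sum to less than the
# same deviations measured from mu, i.e. an underestimate.
mu = 10
sample = [8, 9, 10]                             # x-bar = 9
xbar = sum(sample) / len(sample)

ss_xbar = sum((x - xbar) ** 2 for x in sample)  # 1 + 0 + 1 = 2
ss_mu = sum((x - mu) ** 2 for x in sample)      # 4 + 1 + 0 = 5

print(ss_xbar, ss_mu)          # the x-bar version is the underestimate
```

The score of 8 sits 1 point from x-bar but 2 points from mu, exactly the underestimated deviation described above.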
@benlee3545 · 6 months ago
@@PsychExamReview Thank you, sir. I think I get what you mean, but I'm not sure whether you can advise. Since in some cases we will not know the population mean, when should we use n-1? Based on what you said, it seems that for sampling, the standard deviation calculated from the sample will likely be smaller than the population's, hence n-1 is used to adjust. But when we do not know the population mean, when shall we use n-1?
@PsychExamReview · 6 months ago
@@benlee3545 Yes, in almost all cases we can't measure the entire population to know the population mean and we can only measure a sample, so that means that we will almost always use n-1 when calculating the variance.
@benlee3545 · 6 months ago
@@PsychExamReview Thank you, sir.
@eloy618 · 7 months ago
Could someone explain why, when there are 2 sample scores, you divide the bias by 2?
@PsychExamReview · 7 months ago
I've written a more detailed explanation and examples in a reply to Titurel's comment below, but the basic idea is that sampling 2 scores reduces the probability of extreme deviations from mu, because both scores have to be extreme on the same side; otherwise they balance out and their average is pulled toward mu. The probability of one score being below mu is .5, but the probability of both scores being below mu is only .25. The more extreme one score is, the greater the probability the second score will move their average closer to mu, reducing bias in the estimate, and this continues as n increases because each single score's ability to pull the average away from mu is reduced. Hope this helps!
@laurenceroche8457 · 1 month ago
Why does x-bar underestimate the population mean?
@PsychExamReview · 1 month ago
@@laurenceroche8457 X-bar doesn't necessarily underestimate the population mean; it might be below, above, or equal to the population mean. But any time it differs from the population mean, the sum of squared deviations using x-bar will be an underestimate compared to the sum of squared deviations using the true population mean.
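One way to see why the sum around x-bar is always the smaller one: x-bar minimizes the sum of squared deviations for its own sample, so any other reference point, including the true mu, yields a larger sum. A sketch with made-up numbers (Python):

```python
# x-bar minimizes the sum of squared deviations for its own sample, so any
# other center, including the true mu, gives a larger sum.
# The sample values are illustrative assumptions.
sample = [4, 7, 9]
xbar = sum(sample) / len(sample)       # about 6.67

def ss(center):
    """Sum of squared deviations of the sample from a chosen center."""
    return sum((x - center) ** 2 for x in sample)

for candidate in (5, 6, xbar, 7, 8):
    print(round(candidate, 2), round(ss(candidate), 2))
# the smallest sum appears on the x-bar row
```

Whatever mu actually is, unless it happens to equal x-bar, the sum of squared deviations from x-bar understates the sum of squared deviations from mu.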
@xcalibur070 · 1 year ago
Can you do a video on probability distributions?
@PsychExamReview · 1 year ago
Yes, I will get to this in the future!
@sukursukur3617 · 1 month ago
Does the degrees-of-freedom argument really explain the case? Then when you calculate kurtosis from a sample, do you divide by n minus 1 again? No!! Dividing by n minus 1 only gives a better solution than dividing by n. But the term 1 is unclear; there is no obstruction to the 1 being 0.5 or 2 instead. I guess there is a probabilistic relation among the sample size, the population size and the underestimation error.
@user-zx7vh6mp8n · 7 months ago
I'm sorry, but I didn't understand what mu and x-bar are. I understand x-bar is the mean. What is mu?
@PsychExamReview · 7 months ago
Mu is the population mean, or the true average of everyone in the population. But since we'd have to measure the entire population to know this, we usually don't know it and we only have an estimate from a sample of the population, which is x-bar.
@foyzulhaque6773 · 7 months ago
Just wow
@xcalibur070 · 1 year ago
nicee
@amanmishra9891 · 6 months ago
I think you completely swapped the sample and population variance and proved the wrong equation, i.e. you proved that the population variance is divided by n-1, which is not the case. Can you explain it clearly?
@PsychExamReview · 6 months ago
The equation shows that the sample variance (which is probably an underestimate) + the bias (the average amount of underestimate) is equal to the population variance. I probably should have started with this form of the equation instead of starting with sample variance = population variance - the bias. This is the same equation but perhaps not as clear of a starting point. So it's showing that the biased sample variance * (n/n-1) = the population variance, which simplifies to the sum of squared deviations / n-1. This means that dividing the sample deviations from x-bar by n-1 will tend to equal the population variance, though we have to remember that this is based on estimating the average bias as sigma squared / n, which it may not actually be for a given sample, so it's only true on average, not in every case. Hopefully this makes things clearer.
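The n/(n-1) rescaling mentioned in this reply is a pure algebraic identity, easy to verify numerically (a Python sketch with a made-up sample):

```python
# The scaling described above is an algebraic identity:
# (SS / n) * (n / (n - 1)) equals SS / (n - 1) for any sample.
# Sample values are illustrative assumptions.
sample = [2, 4, 9]
n = len(sample)
xbar = sum(sample) / n                      # 5.0
ss = sum((x - xbar) ** 2 for x in sample)   # 9 + 1 + 16 = 26

biased = ss / n                             # divides by n
rescaled = biased * n / (n - 1)             # multiply by n/(n-1)

print(rescaled, ss / (n - 1))               # equal up to rounding
```

So multiplying the biased sample variance by n/(n-1) and dividing the sum of squared deviations by n-1 are the same correction, just written two ways.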