To be even more practical, I would simply say: - Mean: you only need 1 value to estimate it (the mean is the value itself). - Variance: you need at least 2 values to estimate it. Indeed, the variance measures the spread between values (the more variance, the more spread out around the mean the data are). It is impossible to capture this spread with only one value. For me this is sufficient to explain practically why it is n for the mean and n-1 for the variance.
@chonky_ollie 3 years ago
Best and shortest example I've ever seen. What a gigachad
@YusufRaul 4 years ago
Great video, now I understand why I failed that test years ago 😅
@venkatnetha8382 4 years ago
payhip.com/b/ndY6
@jamiewalker329 4 years ago
How I think about it: suppose you have n data points: x1, x2, x3, x4, ..., xn. We don't really know the population mean, so let's just pick the data point on our list which is closest to the sample mean, and use this to approximate the population mean. Say this is xi. We can then code the data by subtracting xi from each element - but this doesn't affect any measure of spread (including the variance). After coding we will have a list x1', x2', ..., xn', but the i'th position will be 0. Then only the other n-1 data points will contribute to the spread around the mean, so we should take the average of these n-1 squared deviations.
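A quick numeric check of the "coding" step in this comment (the sample values are my own, not from the video): subtracting a constant from every data point leaves the variance unchanged.

```python
import statistics

data = [4.0, 9.0, 6.0, 1.0]       # arbitrary sample; its mean is 5.0
xi = 6.0                           # a data point close to the sample mean
coded = [x - xi for x in data]     # "code" the data by subtracting xi

# Shifting every point by the same constant changes no measure of spread:
# the sample variances of the raw and coded data are identical.
print(statistics.variance(data), statistics.variance(coded))
```

The coded list is [-2.0, 3.0, 0.0, -5.0], with a 0 where xi used to be, exactly as the comment describes.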
@gfmsantos 4 years ago
I guess only the other n-1 data points will contribute to the spread around zero, not the mean... I got lost.
@jamiewalker329 4 years ago
@@gfmsantos 0 is the mean of the coded data.
@gfmsantos 4 years ago
@@jamiewalker329 Yes, but you didn't know the mean before you chose the point. As far as I understood, you've just picked a point that might be close to the sample mean, haven't you?
@jamiewalker329 4 years ago
@@gfmsantos Yes, the sample mean. It's not supposed to be rigorous, just a way of thinking that given any data point as a reference point, there are n-1 independent deviations from that point. One data point gives zero indication of spread. With 2 data points, only the 1 distance between them would give an indication of spread, and so on...
@gfmsantos 4 years ago
@@jamiewalker329 I see. Good. Thanks
@Physicsnerd1 3 years ago
Best explanation I've seen on YouTube. Excellent!
@ritvikmath 3 years ago
Wow, thanks!
@Matthew-ez4ze 1 year ago
I am reading a book on Jim Simons, who ran the Medallion fund. I’ve gone down the rabbit hole of Markov chains and this is an excellent tutorial. Thank you.
@ritvikmath 1 year ago
Wonderful!
@abderrahmaneisntthatenough6905 4 years ago
I wish you would cover all the math related to ML and data science
@699ashi 4 years ago
I believe this is the best channel I have discovered in a long time. Thanks man.
@stelun56 3 years ago
The lucidity of this explanation is commendable.
@junechu9701 1 year ago
Thanks!! I love the way of saying "boost the variance."
@ritvikmath 1 year ago
Any time!
@DistortedV12 4 years ago
I watch all your vids in my free time. Thanks for sharing!
@venkatnetha8382 4 years ago
For a 1200-page question bank of real-world scenarios to make you think like a data scientist, please visit: payhip.com/b/ndY6 You can download the sample pages to see the quality of the content.
@cadence_is_a_penguin 1 year ago
been trying to understand this for weeks now, this video cleared it all up. THANK YOU :))
@neelabhchoudhary2063 1 year ago
dude. this is amazingly clear
@vvalk2vvalk 4 years ago
What about n-2 or n-p? How come the more estimators we have, the more we adjust? How exactly does that transfer into the calculation, and what is the logic behind it?
@Ni999 4 years ago
That last blue equation looks more straightforward to me as E[s²] = [n/(n-1)][σ² - σ²/n] = [nσ²/(n-1)][1 - 1/n] = σ²[(n-1)/(n-1)] = σ² ... but that's entirely my problem. :D Anyway, great video, well done, many thanks! PS - On the job we used to say that σ² comes from the whole population, n, but s² comes from n-1 because we lost a degree of freedom when we sampled it. Not accurate, but a good way to socialize the explanation.
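The identity E[s²] = σ² discussed here can also be checked by simulation; a minimal sketch using only the standard library (the parameter values are arbitrary choices of mine):

```python
import random
import statistics

random.seed(0)
sigma2 = 4.0    # true population variance (arbitrary)
n = 5           # small samples make the bias of dividing by n easy to see
trials = 40_000

biased_total = 0.0
unbiased_total = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    biased_total += statistics.pvariance(sample)   # divides by n
    unbiased_total += statistics.variance(sample)  # divides by n-1

# Dividing by n averages near sigma2*(n-1)/n = 3.2; dividing by n-1 near 4.0.
print(biased_total / trials, unbiased_total / trials)
```

The n-denominator estimate comes out low by the factor (n-1)/n, which is exactly what the correction undoes.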
@kvs123100 3 years ago
Thanks for the great explanation! But one question: why minus 1? Why not 2? I know the DoF concept comes in here, but all the explanations I have gone through fix the value of the mean so as to make the last sample not independent. But in reality, as we take samples, the mean is not fixed! It is itself dependent on the values of the samples, so the DoF would be the number of samples itself!
@musevanced 4 years ago
Great video. But anyone else feel unsatisfied with the intuitive explanation? I've read a better one. When calculating the variance, the values we are using are x_i from 1 to n and x_bar. Supposedly, each of these values represents some important information that we want to include in our calculations. But suppose we forget the value x_n and consider JUST the values x_i from 1 to (n-1) and x_bar. It turns out we actually haven't lost any information! This is because we know that x_bar is the average of x_i from 1 to n. We know all the data points except one, and we know the average of ALL of the data points, so we can easily recalculate the value of the lost data point. This logic applies not just to x_n. You can "forget" any individual data point and recalculate it if you know the average. Note that if you forget more than one data point, you can no longer recalculate them and you have indeed lost information. The takeaway is that when you have some values x_i from 1 to n and their average x_bar, exactly one of those values (whether it's x_1 or x_50 or x_n or x_bar) is redundant. The point of dividing by (n-1) is that instead of averaging over every data point, we want to average over every piece of new information. And finally, what if we were somehow aware of the true population mean, μ, and decided to use μ instead of x_bar in our calculations? In this case, we would divide by n instead of (n-1), as there would be no redundancy in our values.
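The redundancy argument in this comment is easy to demonstrate; a small sketch with made-up values:

```python
# With n values and their mean in hand, any single "forgotten" value can
# be recovered, so one of the n+1 numbers is redundant.
data = [3.0, 7.0, 2.0, 8.0]       # made-up data
x_bar = sum(data) / len(data)     # 5.0

known = data[:-1]                 # "forget" the last point but keep x_bar
recovered = len(data) * x_bar - sum(known)
print(recovered)                  # 8.0 -- the forgotten point comes back
```

Forget two points instead, and no such recovery is possible, matching the comment's caveat.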
@cuchulainkailen 4 years ago
Right. The phraseology is this: the system has only n-1 degrees of freedom when you use xbar... xbar has "taken it away".
@richardchabu4254 3 years ago
Well explained, very clear and easy to understand.
@DonLeKouT 4 years ago
Try explaining the above ideas using the degrees of freedom.
@cuchulainkailen 4 years ago
Correct.
@tyronefrielinghaus3467 1 year ago
Good intuitive explanation... thanks
@AbrarAhmed-ox2fd 3 years ago
Exactly what I have been looking for.
@陳冠熏-m3d 8 months ago
The last section is so helpful, thank you!
@ritvikmath 8 months ago
Glad it was helpful!
@yassine20909 2 years ago
Now it makes total sense. Thank you 👏👍
@Set_Get 4 years ago
Thank you. Could you please do a clip on expected value, its rules, and how to derive some results?
@ChakravarthyDSK 2 years ago
Please do one lesson on the concept of ESTIMATORs. It would be good if the basics of these ESTIMATORs were understood before getting into the concept of being BIASED or not. Anyway, you are doing extremely well and your way of explaining is simply superb. Clap.. clap..
@amittksingh 1 month ago
great explanation!
@braineater351 4 years ago
I wanted to ask a question. For E(x bar), x bar is calculated using a sample of size n, so is E(x bar) the average value of x bar over all samples of size n? Other than that, I think this has been one of the more informative videos on this topic. Additionally, many times people tie the concept of degrees of freedom into this, but usually they show why you have n-1 degrees of freedom and then just say "that's why we divide by n-1". I understand why it's n-1 degrees of freedom, but not how that justifies dividing by n-1. I was wondering if you had any input on this?
@subhankarghosh1233 9 months ago
Marvelous... Loved it...❤
@ritvikmath 9 months ago
Thanks a lot 😊
@martinw.9786 2 years ago
Great explanation! Love your videos.
@missghani8646 3 years ago
This is how we should understand stats, not by just throwing numbers at students.
@alexandersmith6140 1 year ago
Hi @ritvikmath, I want to understand those derivations in the red brackets. Do you have a good set of sources that will explain to me why those three expected values return their respective formulae?
@nguyenkimquang0201 1 year ago
Thank you for the great content!!!❤❤❤
@ritvikmath 1 year ago
You are so welcome!
@chinmaybhalerao5062 2 years ago
I guess the second approach for the n-1 explanation is only right when the population and the sample follow the same distribution, which is a very rare case.
@yitongchen75 4 years ago
Is that because we lose 1 degree of freedom when we use the estimated mean to calculate the estimated variance?
@cuchulainkailen 4 years ago
Correct. It's NOT, as the author states, that the variance is boosted.
@prof.g5140 2 years ago
Incorrect intuition. This is more accurate: ideally the sample mean equals the population mean, but the actual sample mean is rarely ideal and there's an error amount. If the sample is concentrated on lower values, then the sample mean will be lower than the population mean. Since the sample is concentrated on lower values and the sample mean is also lower, the differences between the samples and the sample mean will mostly be smaller than the differences between the samples and the population mean, thus lowering the sample variance. If the sample is instead concentrated on higher values, then the sample mean will be higher than the population mean. Since the samples are concentrated on higher values and the sample mean is higher, the differences between the samples and the sample mean will again mostly be smaller than the differences between the samples and the population mean, thus lowering the sample variance. Whether the sample is concentrated on lower or higher values (not being concentrated is unlikely for small sample sizes), the sample variance (using n as the denominator) will probably be lower than the population variance. Therefore, we need to add a correction factor.
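A simulation version of this intuition (my own sketch; the parameters are arbitrary): squared deviations measured from the sample mean come out systematically smaller than squared deviations measured from the true population mean.

```python
import random

random.seed(1)
mu, sigma = 10.0, 2.0   # known population mean and sd for the simulation
n = 4
trials = 50_000

about_sample_mean = 0.0
about_pop_mean = 0.0
for _ in range(trials):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    about_sample_mean += sum((x - x_bar) ** 2 for x in xs) / n
    about_pop_mean += sum((x - mu) ** 2 for x in xs) / n

# Deviations about x_bar average near sigma^2 * (n-1)/n = 3.0;
# deviations about mu average near sigma^2 = 4.0.
print(about_sample_mean / trials, about_pop_mean / trials)
```

The sample mean chases whichever way the sample leans, which is exactly why the n-denominator variance comes up short.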
@GauravSharma-ui4yd 4 years ago
Amazing...
@plttji2615 2 years ago
Thank you for the video. Can you help me prove that it is unbiased in this question? Question: Compare the average height of employees at Google with the average height in the United States; do you think it is an unbiased estimate? If not, how do you prove it is not matched?
@user-or7ji5hv8y 3 years ago
Great video, but I'm still not convinced by the intuition. How do you know that the adjustment compensates for the missing tail in sampling? And if so, why not n-2, etc.? I guess if data were missing anywhere, it would be in the tail.
@yezenbraick6598 2 years ago
Yes, why not n-2? Jamie Walker's comment explains it another way; check that out.
@jingsixu4665 3 years ago
Thanks for the explanation from this perspective. Can you talk more about why "n-1"? I remember it has something to do with degrees of freedom, but I never fully understood that when I was learning it.
@samtan6304 3 years ago
I also had this confusion when I first learned it. Say you have a sample with values 1, 2, 3. Now, you calculate the sample variance. The numerator will be [(1 - 2)² + (2 - 2)² + (3 - 2)²]. Notice in this calculation, you are implicitly saying the sample mean must be 2, because you are subtracting every value by 2. Using this implicit information, you will realize that one term in the numerator cannot vary given the other two terms.
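The constraint in this 1, 2, 3 example can be checked in a few lines (a minimal sketch):

```python
data = [1.0, 2.0, 3.0]
x_bar = sum(data) / len(data)            # 2.0
deviations = [x - x_bar for x in data]   # [-1.0, 0.0, 1.0]

# Deviations from the sample mean always sum to zero, so fixing any two
# of them forces the third: only n-1 deviations are free to vary.
print(sum(deviations))                   # 0.0
third = -(deviations[0] + deviations[1]) # the forced third deviation
print(third)                             # 1.0
```

That zero-sum constraint is the degree of freedom lost by using x_bar.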
@soumikdey1456 2 years ago
just wow!
@mm_ww_2 3 years ago
Thanks, great explanation
@jeffbezos4474 2 years ago
you're hired!
@nelsonk1341 2 years ago
you are GREAT
@AmineChM21 4 years ago
Quality video, keep it up!
@EkShunya 1 year ago
good one
@pranavjain9799 1 year ago
You are awesome
@ritvikmath 1 year ago
Thanks, you too!
@Titurel 10 months ago
4:38 You really should give links to the derivation, otherwise it still feels hand-wavy.
@asifshikari 1 year ago
Why n-1? Couldn't we adjust even better by doing n-2?
@mohammadreza9910 10 months ago
useful
@jtm1283 9 months ago
Two criticisms (of an otherwise very nice video): 1. all the real work in the proof is done by the formulae in black on the right, for which you provided no explanation; and 2. to talk about the sample SD without mentioning degrees of freedom seems incomplete. WRT the latter, just look inside the summation and ask "how many of these are there?" For the mean, there are n different things (the x-sub-i values), so you divide by n. For the sample SD there are n things (the x-sub-i values) minus 1 thing (x-bar), so it's n-1.
@thomaskim5394 4 years ago
You still haven't made clear, intuitively, why we use n-1 instead of n in the sample variance.
@jamiewalker329 4 years ago
See my comment.
@thomaskim5394 4 years ago
@@jamiewalker329 I have already seen a similar argument like yours.
@cuchulainkailen 4 years ago
@@jamiewalker329 It's convoluted. The answer is what I posted: the number of degrees of freedom is reduced to n-1 by the use of xbar.
@gianlucalepiscopia3123 3 years ago
Never understood why "data science" and not "statistics".
@yepitsodex 1 year ago
The "we need it to be slightly smaller to make up for it being a sample and not the population" argument isn't needed or realistic. Having n-1, regardless of the size of the sample, would make the 1 seem completely arbitrary, a tweak by the smallest amount. In reality, when you go from the population space to the sample space, you lose exactly one degree of freedom. That's why it's n-1 and not n-2 or something else. If you had all of the sample values except one of them, the value of the last one would be fixed, because they have to average out to the sample mean. Since it can't be just anything, that is a loss of a degree of freedom, which justifies the use of n-1.
@alexcombei8853 3 years ago
@tooirrational 4 years ago
Bias is not the only factor used to decide the best estimate; it's mean squared error. n-1 is used because the error is low, not just because it's unbiased.
@rhke6789 1 year ago
Ah, learning is in the details. You just skipped over the "not interesting" steps that permit the logic to flow. Not good. Even mentioning the names of the quoted formulas you used, without explaining them, would be helpful: the variance decomposition formula or the squared-deviation formula.