Confidence Intervals, Clearly Explained!!!

360,240 views

StatQuest with Josh Starmer

Comments
@statquest 2 years ago
Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/
@joarvat 5 years ago
Guys, I can't believe you are doing all this. I am trying to break into the field of data science, and your videos are really great because you are doing it in such an entertaining way. Big thank you!
@statquest 5 years ago
Thanks and good luck with Data Science! :)
@Dupamine 4 years ago
how is it going man
@joarvat 4 years ago
@@Dupamine I am still just in the beginning, but I have just started in my first analyst job
@applepeel1662 4 years ago
@@joarvat hey I'm 20 rn and learning all bout statistics for data science. Is it worth it?
@joarvat 4 years ago
@@applepeel1662 I got my first analyst job, and then I decided to go through the Data Science track with dataquest.io. It's a great program.
@kylebecker5083 3 years ago
Josh, when I finish the StatQuest Statistics Fundamentals playlist, will you send me a BAM certificate? I want to be BAM certified.
@statquest 3 years ago
BAM!!! One day I will make certificates. :)
@reetanshukumar1865 1 year ago
@@statquest can I make one for you 👻
@Wahkyascene 8 months ago
Oh! He will get you BAM certified
@Krishna_0911 14 days ago
Why can't others explain things as clearly and easily as you do? Simply amazing
@statquest 14 days ago
Thank you!
@yonatansegal1615 2 years ago
I am a medical student with a Bachelor's in Science, and this is possibly the only stats tutorial I have EVER been able to understand!!! Thank you
@statquest 2 years ago
bam! :)
@kurosakishusuke1748 3 years ago
Months ago, for my thesis, I had the idea of studying how machine learning predictions vary with different sample orders in the train and test sets under bootstrapping. But I had spent too much time thinking about how to clearly communicate the CI results to my supervisor, so I jumped to this video, and it is exactly what I had been searching for. You have my deep gratitude!
@statquest 3 years ago
bam!
@Небокекисвета 3 years ago
You know, I am a data scientist and have worked in banking for 3 years. I noticed your videos in my KZbin recommendation section and was like: "Easily explained? Yeah, yeah, haha, just another video for those who hope to easily learn ML and statistics. Well, let me watch it during my breakfast". And I was shocked. I realized that the use of Python had made me completely blind to some connections between measurements. Sometimes I run tests without any true understanding. For example, those last 10 seconds about "when we should run a t-test" were completely new to me! And that can be said about a lot of your videos. There is always a tiny detail that makes me say "oh, wow, that was something I've never noticed". You should definitely run a course on Coursera...
@statquest 3 years ago
Wow!!! Thank you very much!!! :)
@annel5546 4 years ago
I'm currently writing my bachelor thesis and this video helped me a lot, thank you! What I like most is that it's not too long and it's on point. Moreover, I'm not a native English speaker, but the video was very clear and easily understandable.
@statquest 4 years ago
Thank you! :)
@siyuguo3300 4 years ago
I studied stats for 6 years, and this is the best tutorial about CIs. Thank you very much.
@statquest 4 years ago
Hooray! :)
@vincentlin9926 3 years ago
You are a true life saver for a person like myself who needs such knowledge but never had a chance to get educated in school…thank you.
@statquest 3 years ago
I'm glad I can help!
@06Amruta 3 years ago
Respect and gratitude to you!! Your videos are in my interview prep playlist! Thanks so much for making math understandable!!
@statquest 3 years ago
Good luck with your interviews! Let me know how they go.
@rrrprogram8667 6 years ago
I am trying to complete all ur videos
@statquest 6 years ago
Nice! You're making great progress! :)
@samuelkellerhals5942 5 years ago
me too! this stuff is gold
@cvic7143 3 years ago
@@samuelkellerhals5942 agree, it is good!
@dunjianxiao4105 3 years ago
@@statquest double BAM!!
@healingmyselfalone 1 year ago
I usually don't comment on YT videos, but I'm eternally grateful that you are posting such incredibly useful videos. Thank you very much!!! God Bless!!!!
@statquest 1 year ago
Wow, thank you!
@CE-wg5gn 3 years ago
If I ever finish my PhD, I will probably need to credit you for all the knowledge I have about statistics. And I actually learned this stuff beforehand.
@statquest 3 years ago
Good luck finishing your PhD! You can do it!!! :)
@zayaanwho 1 month ago
hey, did you finish it? how's life
@leonardogoes683 5 years ago
This video helped me to better understand the p-value besides the confidence intervals.
@6789uiop 3 years ago
That _was_ good, very clearly related. The introduction of the term 't-test' threw me however.
@SergeySenigov 1 year ago
Josh, you are a genius. I finally got the idea of how t-tests are made and how confidence intervals and p-values relate to each other! And why one can simply check whether the "0" statistic belongs to the confidence interval!
@statquest 1 year ago
bam!
@kuanjuchiu9450 6 years ago
This saves the world, thank you so much
@statquest 6 years ago
Thank you! I'm glad I could save the world! I thought only Spider-Man could do that. ;)
@marshalljordan2416 3 years ago
Thanks for such a clear explanation of bootstrapping and confidence intervals. The two concepts do go together so that understanding bootstrapping makes confidence intervals and their interpretation easy to understand.
@statquest 3 years ago
Thank you very much! :)
@DreamCodeLove 4 years ago
One of the best tutorials I have watched on the net, paid or otherwise...
@statquest 4 years ago
BAM! :)
@z8709 4 years ago
I am also a fan and I highly recommend the videos from StatQuest to the students in my class.
@statquest 4 years ago
Awesome! Thank you very much! :)
@jorgevalero4819 1 year ago
Thanks so much. I have been working in hydrology for many years and I finally understand this valuable concept.
@statquest 1 year ago
Glad it was helpful!
@redcat7467 3 years ago
Mr. Josh Starmer's singing abilities have significantly advanced since 2015.
@statquest 3 years ago
Bam!
@siddhft3001 4 years ago
This is by far one of the best videos I've seen. Thank you so much!
@statquest 4 years ago
Thank you! :)
@vahidnajafzadeh4137 3 years ago
I consider myself one of the stupidest people on earth when it comes to learning stats, and yet here I understood the CI concept very well. A big fat thank you to you 😊.
@statquest 3 years ago
Bam! :)
@Majso11 3 years ago
I'm gonna pass my stat exam thanks to you, you explain it so well :'))))))
@statquest 3 years ago
Good luck!!
@nicolethm2002 1 year ago
This was great. I’m taking stats and finding it complicated, but this broke down the basics the way it was supposed to. Thanks 🙏🏾
@statquest 1 year ago
Glad it was helpful!
@xnoreq 5 years ago
Your example also shows how backwards confidence intervals and p-values are. You already assume a mean of ~26. But you end up calculating a p-value to make a probabilistic statement about the mean being lower than ~21 ... given samples from a distribution with a fixed mean of ~26.
@baay81 1 year ago
Thanks for sharing. I will use this example with my students for sure, and I will link to the video, of course.
@statquest 1 year ago
Thanks!
@DriveInPsychopath 3 years ago
Thank you for explaining this like a normal person and not like you're teaching people who already know how to do it.
@statquest 3 years ago
Thanks! :)
@luchan1638 2 years ago
Your videos are a godsend
@statquest 2 years ago
Thanks!
@manuelargos 3 years ago
YOU ARE THE BEST OUT THERE!
@statquest 3 years ago
Thank you! :)
@Hersh0828 2 years ago
You are a godsend Josh!
@statquest 2 years ago
Thank you!
@apnp6787 3 years ago
Dude, beautifully and simply explained!
@statquest 3 years ago
Glad you liked it!
@brienwashington4019 2 years ago
This is so simple and eloquent.
@statquest 2 years ago
Thanks!
@bnv8514 4 years ago
"A 95% confidence interval is just an interval that covers 95% of the means." 😁
@statquest 4 years ago
:)
@saudzaman1243 3 years ago
So if p value for a sample is < 0.05, does that imply that the sample is not a good representative of the population?
@statquest 3 years ago
It suggests that the sample may come from a different population than the one you think you are collecting it from.
@greatvedas 4 months ago
Very good explanation. I liked your examples very much. Thank you, sir!
@statquest 4 months ago
Thanks!
@marioestrada2233 3 years ago
Thanks, confidence intervals seemed so tricky! Till now!!!
@statquest 3 years ago
bam!
@visionarynjy5491 1 year ago
At 5:02, shouldn't the p-value be 0.025, given that the confidence interval extends in both directions?
@statquest 1 year ago
Sure. The point, however, is that 0.05 is the usual threshold for making a decision about the hypothesis. So as long as we are < 0.05, we will reject the hypothesis.
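A minimal sketch of the two-tail point in this exchange, assuming a percentile bootstrap interval and made-up mouse weights (the video's actual measurements are not given in this thread). It counts how many bootstrapped means fall on each side of a central 95% interval: at most 2.5% per side, so at most 5% in total.

```python
import random
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

rng = random.Random(0)
n_boot = 10_000
means = sorted(
    statistics.fmean(rng.choices(weights, k=len(weights))) for _ in range(n_boot)
)

# Central 95% interval: cut 2.5% of the bootstrapped means off each end.
cut = n_boot * 25 // 1000  # 250 means per tail
low, high = means[cut], means[n_boot - 1 - cut]

below = sum(m < low for m in means) / n_boot   # share of means under the interval
above = sum(m > high for m in means) / n_boot  # share of means over the interval
print(f"interval: {low:.2f} to {high:.2f}")
print(f"below: {below:.3f}, above: {above:.3f}, outside in total: {below + above:.3f}")
```

A value outside the interval on one side therefore sits in a tail holding no more than 2.5% of the means, which is also below the usual 0.05 threshold.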
@SabrinaXe 11 months ago
At 3:11: the lower the confidence level (e.g., 80% rather than 90%), the more meaningful our inference about the population?
@statquest 11 months ago
It depends on the data and what you want from it. There's no one specific set of rules that fits all datasets.
@dewinmoonl 3 years ago
another exciting quest complete!
@statquest 3 years ago
bam!
@Denise_lili 3 years ago
Hi Josh, thank you for the nice video! One quick question: I learnt that the interpretation of a 95% confidence interval is that 95% of confidence intervals will contain the true mean (i.e., if we have n=100 random samples of size 5, about 95 of the resulting confidence intervals will contain the true mean). It seems different from your explanation here?
@statquest 3 years ago
It's the same. However, we are arriving at the confidence interval differently and we need to make sure we don't confuse a bootstrapped mean for a population mean. The interval that contains 95% of the bootstrapped means is a 95% CI, and thus, if we repeated the process a bunch of times, 95% of the intervals calculated that way will contain the population mean.
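The procedure described in this reply can be sketched as follows. This is only an illustration under stated assumptions: the weights are invented, and `bootstrap_means` and `percentile_interval` are hypothetical helper names, not anything from the video.

```python
import random
import statistics

def bootstrap_means(sample, n_boot=10_000, seed=0):
    """Means of n_boot resamples drawn with replacement, each the size of the original sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]

def percentile_interval(means, level=0.95):
    """Central interval that covers `level` of the bootstrapped means."""
    ordered = sorted(means)
    cut = round((1.0 - level) / 2.0 * len(ordered))  # means trimmed from each end
    return ordered[cut], ordered[len(ordered) - 1 - cut]

# Hypothetical weights (grams) for 12 female mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

means = bootstrap_means(weights)
low, high = percentile_interval(means)
covered = sum(low <= m <= high for m in means) / len(means)
print(f"95% bootstrap CI: {low:.2f} to {high:.2f} (covers {covered:.1%} of the means)")
```

The interval is built from the bootstrapped means only; the claim that 95% of such intervals contain the population mean is about repeating the whole experiment, which this snippet does not do.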
@henriqueazank5254 4 years ago
I'm currently speedrunning all your videos
@statquest 4 years ago
Go for it! :)
@rajkamalingle9144 2 years ago
Total number of distinct samples we can take: 23C12 = 1352078. Out of these, we are asked to find the sample means of around 10,000 samples. Now we can define the confidence interval: a 95% confidence interval is just an interval that covers 95% of the sample means calculated above.
@statquest 2 years ago
:)
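The count quoted above can be checked directly. The sketch assumes all 12 measurements are distinct and that resamples are counted without regard to order, so the number of possible resamples is the number of multisets of size n drawn from n items, C(2n - 1, n).

```python
import math

n = 12  # sample size in the video's example

# Multisets of size n from n distinct values: C(n + n - 1, n) = C(23, 12).
distinct_resamples = math.comb(2 * n - 1, n)
print(distinct_resamples)  # 1352078
```

Drawing around 10,000 bootstrap samples therefore visits only a small fraction of the possible resamples.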
@pupface 1 year ago
Thank you. It's crazy how nobody else seems able to explain this clearly
@statquest 1 year ago
Thanks!
@foedeer 1 year ago
You mean the world to me man.
@statquest 1 year ago
Thanks!
@Birdsneverfly 5 years ago
Wonderful series. Thank you for sharing your knowledge.
@stevequan7306 4 years ago
At 3:22, should the confidence interval bar have some constraints? (E.g., should it be the shortest bar that covers 95% of the means? Otherwise I can shift and lengthen it to cover 95%.)
@statquest 4 years ago
Surprisingly, it can be any bar that covers 95% of the means.
@stevequan7306 4 years ago
@@statquest Ah, I see. Here we are talking about the CI itself, not the CI we use with a normal distribution, calculated from the "std". (Even with distributions, the CI can shift and change.) Thank you!
@statquest 4 years ago
@@stevequan7306 That's exactly right. There are different ways to calculate the confidence interval, and you end up with different intervals, but as long as you are covering 95% of the means (or at least doing that in theory), then you have a 95% CI.
@karannchew2534 3 years ago
@@statquest Hi Josh, Hi Steve, I had the same question as Steve. But I don't understand the answer/explanation. Why can the CI be any range? Shouldn't it be the shortest range that covers 95% of the mean values?
@statquest 3 years ago
@@karannchew2534 Why should it be the shortest?
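To make the point of this thread concrete, here is a small sketch (hypothetical weights, percentile bootstrap assumed) that builds four different intervals from the same bootstrapped means. Each spans 9,500 of the 10,000 sorted means, so each covers at least 95% of them; they differ only in position and width.

```python
import random
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

rng = random.Random(1)
n_boot = 10_000
means = sorted(
    statistics.fmean(rng.choices(weights, k=len(weights))) for _ in range(n_boot)
)
keep = n_boot * 95 // 100  # 9,500 means must fall inside each interval

def window(start):
    """Interval spanning `keep` consecutive sorted means, beginning at index `start`."""
    return means[start], means[start + keep - 1]

def coverage(interval):
    """Fraction of bootstrapped means that lie inside the interval."""
    lo, hi = interval
    return sum(lo <= m <= hi for m in means) / n_boot

candidates = {
    "central": window((n_boot - keep) // 2),
    "lowest 95%": window(0),
    "highest 95%": window(n_boot - keep),
    "shortest": min(
        (window(s) for s in range(n_boot - keep + 1)),
        key=lambda iv: iv[1] - iv[0],
    ),
}
for name, iv in candidates.items():
    print(f"{name:>12}: {iv[0]:.2f} to {iv[1]:.2f}, "
          f"width {iv[1] - iv[0]:.2f}, covers {coverage(iv):.1%}")
```

The shortest window is one reasonable convention and the central one is another; the definition used in the video only requires the 95% coverage.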
@ritika.upadhyay 5 years ago
Hi Josh! Great videos (I'm currently on a StatQuest marathon and it has been incredibly helpful!) I have a question though. Could you explain the bit about p-value being less than 0.05 in case of the weights of female and male mice? Instinctively I understand that there's a statistical difference between the true means of these two but I'm struggling to relate it to the idea of p-value. Thank you!
@funny__bean 4 years ago
Same thing occurred to me!! If in a 95% confidence interval, the remaining 5% do not cover means right? If so, then how come its p-value is significantly different?
@busyshah 4 years ago
By definition p-value denotes a probability of (something other which is equally rare + something rarer than null hypothesis) happening. Here the null hypothesis is that the means for both male and female mice are from same population. But we already know that 95% confidence intervals of both male and female means don't coincide. So there is only one possibility left that less than 5% cases will have the possibility of their means coinciding. Which is why p-value is
@destinnguon4877 1 year ago
Excellent, thank you so much!
@statquest 1 year ago
Glad it was helpful!
@liamhoward2208 2 years ago
Hello Josh, @ 3:22 is it correct to say that 95% of all confidence intervals will contain the population mean? I am having a hard time understanding if this interpretation is the same as yours. Also, I am a bit confused about the bootstrapping. How do we construct the interval? How do we adjust the interval for different levels of alpha? Thanks again bro!
@statquest 2 years ago
At 3:22 I say that when using bootstrapping a 95% CI is an interval that covers 95% of the (bootstrapped) means. Now, if we made a lot of 95% CIs using this method, then 95% of them would contain the population mean. For more details on bootstrapping, see: kzbin.info/www/bejne/jquTqWBuYsial7M
@liamhoward2208 2 years ago
@@statquest Thank you for the quick reply. It really separates you from the rest.
@matthewdong9368 6 years ago
In the example where you want to get the p-value for true mean less than 20, and the result is less than 0.05. Does that mean it's very unlikely that the true mean is less than 20? Thanks!
@statquest 6 years ago
Yes!
@LittleMonsterswiftie 3 years ago
Very clear! thank you
@statquest 3 years ago
Glad it was helpful!
@Artersa 3 months ago
Insanely useful
@statquest 3 months ago
Glad you think so!
@mycotina6438 1 year ago
I'm wondering if there's a correlation between this method and central limit theorem? Because if my understanding is correct, we can also construct a confidence interval using the latter.
@statquest 1 year ago
The central limit theorem makes it possible to create confidence intervals for the estimate of the mean, but only the mean. In contrast, bootstrapping allows us to create confidence intervals for any statistic we want.
@mycotina6438 1 year ago
@@statquest Thanks! It makes so much sense
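As a sketch of the "any statistic" remark in this thread, the same resampling loop can take the statistic as a parameter. The helper name `bootstrap_interval` and the weights are assumptions made for illustration only.

```python
import random
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

def bootstrap_interval(sample, statistic, level=0.95, n_boot=10_000, seed=0):
    """Central percentile interval for any statistic computed on bootstrap resamples."""
    rng = random.Random(seed)
    values = sorted(
        statistic(rng.choices(sample, k=len(sample))) for _ in range(n_boot)
    )
    cut = round((1 - level) / 2 * n_boot)  # values trimmed from each end
    return values[cut], values[n_boot - 1 - cut]

mean_ci = bootstrap_interval(weights, statistics.fmean)
median_ci = bootstrap_interval(weights, statistics.median)
stdev_ci = bootstrap_interval(weights, statistics.stdev)
print("mean  :", mean_ci)
print("median:", median_ci)
print("stdev :", stdev_ci)
```

Only the `statistic` argument changes between the three calls; no separate formula is needed for the median or the standard deviation.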
@nikolenarepousi8189 1 year ago
Amazing! Thank youuuu
@statquest 1 year ago
No problem 😊!
@venkilfc 3 years ago
Thank you so much Josh, I just watched your videos of standard error and confidence interval. Could you please verify if I understood it correctly? 95% confidence interval = mean of means ± 2 Standard Error
@statquest 3 years ago
It depends on how you calculate it. If you are using bootstrapping, then your method is correct. If you are using a formula to approximate bootstrapping (so you are not using bootstrapping), then you have to appeal to the t-distribution (instead of the standard error). This is because for small sample sizes, the t-distribution is a little wider than a normal distribution, and that compensates for the fact that a small sample size means we have very limited knowledge of what is going on.
@venkilfc 3 years ago
@@statquest You're a godsend, Josh. Thank you 😄
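A rough numerical illustration of the t-distribution remark in this thread, under stated assumptions: the weights are invented, and the critical value 2.201 (97.5th percentile of the t-distribution with 11 degrees of freedom) is copied from a standard t-table rather than computed.

```python
import math
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

n = len(weights)
mean = statistics.fmean(weights)
sem = statistics.stdev(weights) / math.sqrt(n)  # standard error of the mean

# Assumption: t-table value for 97.5% with n - 1 = 11 degrees of freedom.
# For large samples this approaches the normal value of about 1.96.
T_CRIT_DF11 = 2.201

two_se = (mean - 2 * sem, mean + 2 * sem)
t_based = (mean - T_CRIT_DF11 * sem, mean + T_CRIT_DF11 * sem)
print(f"mean +/- 2 SE : {two_se[0]:.2f} to {two_se[1]:.2f}")
print(f"t-based 95% CI: {t_based[0]:.2f} to {t_based[1]:.2f}")
```

With only 12 measurements the t-based interval comes out slightly wider than mean ± 2 SE, which is the compensation for small samples mentioned in the reply.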
@isaacbarbozavilchez6773 3 years ago
If all teachers could explain like Josh, more people would love statistics
@statquest 3 years ago
Thanks!
@ashimay4722 4 years ago
Amazing explanation..!!
@statquest 4 years ago
Thank you! :)
@viktorsemenov7208 9 months ago
it is brilliant. thanks!
@statquest 9 months ago
Thanks!
@response2u 2 years ago
Thank you sir!
@statquest 2 years ago
You are welcome!
@AcademiaDados 3 years ago
Beautiful.
@statquest 3 years ago
Thanks!
@alaknandaagarwal8431 4 years ago
A 95% confidence interval is just an interval that covers 95% of the means. This has cleared confidence intervals for me, without talking about the standard error and all other statistical jargon that's so confusing. Quick question: What is the difference between confidence intervals and confidence levels? Thank you !
@statquest 4 years ago
Keep in mind that the 95% confidence interval is just the interval that covers 95% of the means created with bootstrapping. And I believe that confidence levels just refers to the "95%" or "99%" - 95 and 99 are the levels that can then be converted into intervals.
@alaknandaagarwal8431 4 years ago
@@statquest yes that's it. Thank you.
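The distinction drawn in this thread, level as an input and interval as an output, might be sketched like this. The weights are hypothetical and the percentile bootstrap is assumed.

```python
import random
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [22.1, 24.5, 25.0, 25.8, 26.3, 26.9, 27.4, 27.9, 28.6, 29.2, 30.1, 31.7]

rng = random.Random(0)
n_boot = 10_000
means = sorted(
    statistics.fmean(rng.choices(weights, k=len(weights))) for _ in range(n_boot)
)

widths = []
for level in (0.90, 0.95, 0.99):           # confidence levels
    cut = round((1 - level) / 2 * n_boot)   # means trimmed from each end
    low, high = means[cut], means[n_boot - 1 - cut]  # the resulting interval
    widths.append(high - low)
    print(f"{level:.0%} level -> interval {low:.2f} to {high:.2f}")
```

Raising the level trims fewer means from each end, so the interval can only stay the same or get wider.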
@emelyannett 5 years ago
This is so helpful. Thank you
@statquest 5 years ago
Thanks! :)
@ananyaagarwal7108 2 years ago
You are simply Awesome !! I have a doubt here though, Let's say the point estimate is the sample mean. We can repeatedly keep taking the sample means and then plot all these sample means in a histogram and we would observe a normal distribution called the sampling distribution of the sample means. The mean of this distribution would be a better estimate of the population mean and its standard deviation, called standard error would be the population standard deviation/sqrt (number of points in a sample). Won't the confidence interval(say 95%) range be (sampling distribution mean - 2 SE,sampling distribution mean + 2 SE) instead of (point estimate - 2 SE,point estimate + 2 SE)? Why would we use the sample mean(point estimate) in calculating the confidence interval range? What if that particular sample mean was like an outlier in the sampling distribution of the mean? In that case, doing +/- 2*SE wouldn't be a good judge to measure population mean right?
@statquest 2 years ago
The technical definition of a 95% confidence interval is that if we repeat the process a lot of times, calculating 95% CIs each time, 95% of the CIs we calculate will cover the true (population) mean. So, sure, sometimes we get outliers, and our CI is bad, but that is expected about 5% of the time we calculate a 95% CI.
@ananyaagarwal7108 2 years ago
@@statquest Thanks for responding, Josh ! Does this mean that while calculating the 95% CI, we are assuming that our point estimate (sample mean) is always 1.96 SD away from the population mean(mean of the sampling distribution) ?
@statquest 2 years ago
no
@ananyaagarwal7108 2 years ago
@@statquest Thanks for responding again :). I'm having a little bit of a tough time connecting all the dots, sorry for the long questions! If we are given the population SD and use z statistics to calculate a 95% CI for mu, we say z = (xbar - mu)/(sigma/sqrt(n)), where z = 1.96, xbar is the sample mean or point estimate, and the sample SD can also be computed. Based on the definition of the z score, does this not mean that xbar is 1.96 SD away from mu? In any case, what is the intuition behind using this formula? Thanks in advance!
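One way to build intuition for the formula in this thread is a simulation. The sketch below assumes a normal population with a known mean and SD (the values 26.0 and 3.0 are invented) and uses the z-interval xbar ± 1.96·sigma/sqrt(n). It does not assume that any single xbar sits exactly 1.96 standard errors from mu; it only counts how often the interval happens to contain mu.

```python
import math
import random
import statistics

rng = random.Random(42)
MU, SIGMA, N = 26.0, 3.0, 12   # hypothetical population mean, SD, and sample size
Z = 1.96                        # 97.5th percentile of the standard normal
half_width = Z * SIGMA / math.sqrt(N)

trials = 10_000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    xbar = statistics.fmean(sample)
    # The interval is centered on this sample's mean, not on mu.
    if xbar - half_width <= MU <= xbar + half_width:
        hits += 1

coverage = hits / trials
print(f"{coverage:.1%} of the intervals contain mu")
```

Each xbar lands at a different distance from mu; the 1.96 only sets the width so that, across many repetitions, roughly 95% of the intervals reach far enough to include mu.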
@liaoweien 1 month ago
Hi, thanks a lot for the video. for step, how large is the original sample size? much much much bigger than 12? thanks in advance.
@statquest 1 month ago
I think it needs to be 8 or more for bootstrapping to work.
@dharam8060 5 years ago
How do we calculate 95% cover from the Bootstrap means?
@BibleSamurai 2 years ago
the humor is great
@statquest 2 years ago
Thanks!
@harithagayathri7185 5 years ago
Hi Josh, a little confused about the p-value here as, if less than 0.05 is considered as less likely to reoccur then why are we considering variables with less than 0.05 as highly significant variables in the regression models?
@statquest 5 years ago
A p-value < 0.05 means, in general terms, that the result is probably not due to random chance. Thus, when we do linear regression, a small p-value tells us that the relationship between the independent and dependent variable is probably not due to random chance.
@harithagayathri7185 5 years ago
@@statquest Thanks Josh 😊😊
@adirozeri7162 2 years ago
Thank you so much for the explanation! I have one question though: could there be more than a single 95% interval for the example above, and does it matter? How do you construct it? Thanks!
@statquest 2 years ago
Any interval that covers 95% of the bootstrapped means qualifies, but usually you select the one that is centered on the original mean.
@Reonsi 1 year ago
Why does the 95% CI select some means/values and not others? Does it need to be in the center? If so, how? I would suppose that if you force the mean of the interval to be the mean of all means, it would give you a CI similar to the ones you showed in the video.
@statquest 1 year ago
Traditionally, we center the 95% CI over 95% of the means, but you don't have to do it that way. You just need to cover 95% of the means.
@minhaoling3056 3 years ago
Hi Sir, can you make a series of videos on Bayesian inference and Bayesian credible intervals?
@statquest 3 years ago
I hope to do that in the spring.
@shubhamtalks9718 4 years ago
95% C.I. is the area where 95% of the means are present but from where should I start drawing the line of C.I.( from 1st mean or 2nd or what)?
@statquest 4 years ago
When you use bootstrapping, the 95% CI is any line that covers 95% of the bootstrapped means. So, take your pick! However, usually people center it over the estimated mean.
@mmk34 3 years ago
Josh, I love the line diagrams you use in your illustrations, how do you put these together?
@statquest 3 years ago
For details on how I create my videos, see: kzbin.info/www/bejne/maOviX19Yqp0ns0
@TheTessatje123 1 year ago
Thanks for the video! Are confidence intervals always defined for distributions of the sample means (i.e. means obtained by bootstrapping)? Or can you also calculate them for one single experiment? Or the means of multiple experiments without bootstrapping?
@statquest 1 year ago
Because of the central limit theorem, means are approximately normally distributed (given a large enough sample), so there is a closed-form equation for confidence intervals of the mean based on that, and you don't need bootstrapping. In other words, you can calculate the CI with a single set of measurements. However, I believe the concept is easier to understand with bootstrapping.
@TheTessatje123 1 year ago
@@statquest I see, thanks!
@shilupangrak1593 4 years ago
great!! statquest apps will be a good platform
@statquest 4 years ago
I think so too!
@culturismoproculturismo8251 3 years ago
But I don't understand, maybe I'm wrong. Perhaps you are giving a Bayesian approach? The frequentist approach would tell us that it is the frequency with which 95% confidence intervals calculated from many studies, would contain the true effect. The true value would be in the 95% confidence intervals. Thank you in advance
@statquest 3 years ago
This is the bootstrap approach to the frequentist CI. Since we are using bootstrapping, the interval covers 95% of the bootstrapped means and is equivalent to an interval that, if we repeated the experiment a lot of times and calculated the CI the same way, 95% of those CIs would cover the true mean.
@culturismoproculturismo8251 3 years ago
@@statquest ❤️❤️❤️❤️❤️❤️. Thank you very much
@koustubhmuktibodh4901 6 months ago
Sir, I am looking for a Calculus series. Because I'm going for M.S. in Business Analytics.
@statquest 6 months ago
3Blue1Brown has an excellent series on calculus. I also believe Khan academy has some good stuff.
@1tsvaishnav 4 years ago
Can you give some intuition of prediction interval? What is the difference between confidence interval and prediction interval?
@SergeySenigov 1 year ago
A prediction interval (PI) estimates the range of a random variable (RV) or of some statistic of RVs. Its limits are not random (they are not drawn from samples). For example, for a normal distribution the ~99% PI for an RV is mu ± 3*sigma, and we must _know_ mu and sigma. In contrast, a CI estimates the range for a value that is not random but fixed: a population parameter (mu, for example) whose exact value we don't know. Its limits are computed from a different sample each time, so they are random.
@renatamirra5294 1 year ago
Any chance you could do a video on frequentist confidence intervals, based on the central limit theorem? Also, with the bootstrap method, is the interpretation that you're 95% confident that the population mean is contained in the interval still valid? Thank you.
@statquest 1 year ago
Confidence intervals always have the same interpretation. If we repeated the procedure to calculate the CI a bunch of times, 95% of them would overlap the population mean.
@Sofyee_Official_ 3 years ago
02:12 I cannot understand why we keep taking samples from the original sample and calculating the mean. Shouldn't we take samples from the original data, not the sample data? Can anyone reply, please?
@statquest 3 years ago
Here's another way to look at it: kzbin.info/www/bejne/jquTqWBuYsial7M
@SergeySenigov 2 years ago
In real world you almost never know the exact population mean. You can sample the original data say 100 times more than Josh, but I can always oppose - why not 1000 times more - that would be more precise. And so on. So there is no solution until you sample the whole population - but this is impossible. That why the goal is - not getting the most precise assessment of population mean (it's impossible) but getting to know the boundaries where it lies with some % of confidence. And this is _possible_.
@viktormaximiliandistaturus7660 1 year ago
you deserve a like
@statquest 1 year ago
Thanks!
@alexandergarcia6479 3 years ago
Hi joshua, thanks for the video, can you tell me what do you use to make those sample plots? I don't find that tool in python, thank you so much
@statquest 3 years ago
For details on how I create the images, see: kzbin.info/www/bejne/maOviX19Yqp0ns0
@jasonsj 2 years ago
Can someone please help me understand 5:23 why p-value
@statquest 2 years ago
First, if p-value < 0.025, then it is also < 0.05. Now, the reason I used 0.05 is because that is the most commonly used cut off for significance. In other words, we know that the p-value will be at least as small as it needs to be in order to obtain statistical significance.
@jasonsj 2 years ago
@@statquest thank you Josh for replying! Love your videos, currently I’m preparing for Data Scientist interview, your videos help a lot!!
@statquest 2 years ago
@@jasonsj BAM! Good luck!
@theforester_ 3 years ago
I lost it when you said you didn't weigh every single female mouse on the planet, just twelve... hahaha, thanks anyway
@statquest 3 years ago
:)
@lilyha2470 5 years ago
Why is it that nobody ever explained the CI to me like this before? I bet the people who taught us statistics in med school knew this, or did they not?
@statquest 5 years ago
Thanks! :)
@deuteros 3 years ago
Thank you, Josh. Great video. However, I don't know how to calculate the confidence interval. Is it calculated through 2 times the standard deviation of the mean of the sample means?
@statquest 3 years ago
There are lots of formulas for calculating confidence intervals. Conceptually, the easiest one to remember is bootstrapping, however there are lots of other formulas you can use. For details, see: www.statisticshowto.com/probability-and-statistics/confidence-interval/
@deuteros 3 years ago
@@statquest Thanks! I will check that out. Cheers
@rahulmukherjee8060 1 year ago
How do I check hypothesis for individual distribution sampling?
@statquest 1 year ago
I'm not sure I understand your question. Are you asking about the distribution of the samples? (Like, are you asking about whether or not the data come from a normal distribution?)
@globalshooky5030 8 months ago
You're a life saver
@statquest 8 months ago
Thanks!
@yangtravis1182 1 year ago
One problem to discuss: in this video, you said the probability that the "true" mean is in this area has to be < 0.05. That was my definition of a confidence interval before, but after some theoretical statistics courses, my definition has been refreshed. My statistics teacher told me that in this sentence, since neither the "true" mean nor that area is a random variable, this is in fact not a random event, and giving it a probability is logically wrong. However, the theoretically correct definition (i.e., after hundreds of repetitions, 95% of 95% CIs will include the population mean) is usually not informative by itself for researchers, so the definition in the video ended up being more popular. I'm not sure whether you took all of this into consideration when you made the video; if not, it may be worth reviewing if you remake this video in the future! Please don't treat this as negative criticism. Because you have released so many helpful videos, I think it would be meaningful if we can help to make them better!
@statquest 1 year ago
At that time point, we're discussing calculating a p-value with the null hypothesis being H0 < 20. By looking at the confidence interval, we know that the p-value (which is a probability) will be < 0.05 and thus reject the null hypothesis.
@chucknor9708 3 years ago
Does the number of random selections to calculate the bootstrap mean, from the sample need to equal the sample size as it does in your example? i.e. could you have chosen 8 random samples from selection and calculated the mean and bootstrapped mean and repeated this 10000 times?
@statquest
@statquest 3 years ago
The bootstrap sample is always the same size as the original sample.
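A minimal sketch of the reply above, using only Python's standard library: each bootstrap resample is drawn with replacement and has the same size as the original sample. The `weights` values and the helper names `bootstrap_means` and `percentile_interval` are made up for illustration, and the percentile interval is just one simple way to turn bootstrapped means into a 95% interval.

```python
import random
import statistics

def bootstrap_means(sample, n_boot=10_000, seed=0):
    """Return n_boot bootstrapped means of `sample` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(sample)  # each resample has the same size as the original sample
    # random.choices samples WITH replacement, which is what bootstrapping needs.
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]

def percentile_interval(values, level=0.95):
    """Return the range covering the middle `level` share of the values."""
    ordered = sorted(values)
    tail = (1 - level) / 2
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1 - tail) * len(ordered)) - 1]
    return low, high

# Made-up weights, standing in for the measurements in the video.
weights = [18.2, 21.5, 19.9, 23.1, 17.4, 20.8, 22.3, 19.1, 24.0, 18.7, 21.0, 20.2]

means = bootstrap_means(weights)
low, high = percentile_interval(means)
print(f"sample mean = {statistics.fmean(weights):.2f}")
print(f"95% bootstrap interval = ({low:.2f}, {high:.2f})")
```

Drawing only 8 values per resample would describe the spread of means of 8 measurements, not of the 12 actually collected, which is why `k` is tied to `len(sample)` here.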
@熊飞-b5k
@熊飞-b5k 4 years ago
Thank you for your wonderful video. I have a question: when 95% CIs do not overlap, we can say there is a significant difference between the two sample sets. Can we also conclude there is a significant difference when the SDs of the two sets do not overlap? And how about the SEM? Hoping for your reply. :)
@statquest
@statquest 4 years ago
95% confidence intervals reflect the SEM, rather than the standard deviation of the raw data. For more details, see: kzbin.info/www/bejne/d2mVk6V8pcmlb8k
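To make the SEM versus SD distinction concrete, here is a rough sketch with simulated data. The population (mean 20, standard deviation 5) is an assumption for illustration, and the interval uses the normal-approximation multiplier 1.96, which is only approximate for small samples (a t-based multiplier would be wider).

```python
import math
import random
import statistics

def summarize(sample):
    """Return (SD, SEM, approximate 95% CI for the mean) for a sample."""
    sd = statistics.stdev(sample)        # spread of the raw data
    sem = sd / math.sqrt(len(sample))    # spread of the estimated mean
    mean = statistics.fmean(sample)
    # Normal-approximation interval; an assumption, not an exact method.
    return sd, sem, (mean - 1.96 * sem, mean + 1.96 * sem)

rng = random.Random(1)
# Two simulated samples from the same hypothetical population.
small = [rng.gauss(20, 5) for _ in range(10)]
large = [rng.gauss(20, 5) for _ in range(1000)]

sd_small, sem_small, ci_small = summarize(small)
sd_large, sem_large, ci_large = summarize(large)

print(f"n=10:   SD={sd_small:.2f}  SEM={sem_small:.2f}  CI={ci_small}")
print(f"n=1000: SD={sd_large:.2f}  SEM={sem_large:.2f}  CI={ci_large}")
```

With more data, the SD keeps estimating the same population spread, while the SEM and the confidence interval shrink. That is why non-overlapping SD ranges and non-overlapping confidence intervals are not interchangeable.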
@sushilchauhan2586
@sushilchauhan2586 5 years ago
Don't you think bootstrapping is similar to the central limit theorem?
@statquest
@statquest 5 years ago
Any time you are calculating a lot of means, then you are doing something related to the central limit theorem.
@sushilchauhan2586
@sushilchauhan2586 5 years ago
@@statquest I forgot to ask you a simple question. How did you choose 95%? Is it from the 68-95-99.7 rule of thumb? Also, Josh, do you have any videos on column standardization, and can you make a video on spearson correlation? Thank you.
@statquest
@statquest 5 years ago
@@sushilchauhan2586 95% is just the most commonly used threshold. However, different fields of study have different thresholds. For example, in highly controlled physics experiments, they sometimes use 99% as the threshold. In contrast, other fields might use 90% as the threshold. I also have videos on quantile normalization and Pearson's correlation. This link will take you to a complete index of all of my videos: statquest.org/video-index/
@edydev6775
@edydev6775 3 years ago
Josh, although you're crystal clear, I still don't get the following point: according to my interval, I have a range of weights that can be considered an estimate of the true mean. But now I get this weight of 20, and I know that it is outside my interval, so it's very unlikely that it represents a significant difference (it happened by chance). So what do I do next? Discard this sample and run another? What needs to occur for me to say that yes, this value of 20 really shows something I need to pay attention to? And you're saving my as...s with all this simply explained knowledge. I cannot thank you enough. Greetings from Brazil ;)
@statquest
@statquest 3 years ago
You might need to learn about hypothesis testing to understand the value of the confidence interval. Here's the link: kzbin.info/www/bejne/ZqDGZWx6rqZmnrc
@ai1888
@ai1888 6 years ago
Is it always true that you don't need any other statistical tests for two distributions with non-overlapping confidence intervals?
@ai1888
@ai1888 6 years ago
Thanks!
@jimmyxu1735
@jimmyxu1735 4 years ago
Hello Sir, great videos, thanks. One quick question: is this a one-tailed or two-tailed p-value? If two-tailed, then the p-value would be 0.025 given a 95% CI. Please clarify. Thanks a lot again, J
@statquest
@statquest 4 years ago
95% confidence intervals are not one- or two-tailed p-values; they are intervals. 95% of them will cover the true mean.
@amnont8724
@amnont8724 1 year ago
Hey Josh, so if I build a confidence interval for the mean at a 95% confidence level, 95% of the bootstrapped means will be in the confidence interval, and there's a 95% chance the mean will be in that confidence interval?
@statquest
@statquest 1 year ago
95% of the bootstrapped means will be in the interval, but that doesn't mean there's a 95% chance that the interval covers the true mean.
@amnont8724
@amnont8724 1 year ago
@@statquest Ok, thanks!
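One way to see the distinction in this exchange is a small simulation, sketched here under stated assumptions: the population (normal, mean 20, standard deviation 5), the sample size of 30, and the helper name `bootstrap_interval` are all hypothetical. Any single interval either covers the true mean or it does not; the "95%" describes how often the procedure succeeds over many repeated experiments.

```python
import random
import statistics

TRUE_MEAN = 20.0  # hypothetical population mean, known only because we simulate

def bootstrap_interval(sample, rng, n_boot=500, level=0.95):
    """Percentile interval built from bootstrapped means (hypothetical helper)."""
    n = len(sample)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    tail = (1 - level) / 2
    return means[int(tail * n_boot)], means[int((1 - tail) * n_boot) - 1]

rng = random.Random(2)
n_experiments = 400
covered = 0
for _ in range(n_experiments):
    # A fresh sample from the population for every repeated experiment.
    sample = [rng.gauss(TRUE_MEAN, 5.0) for _ in range(30)]
    low, high = bootstrap_interval(sample, rng)
    # This particular interval either contains the true mean or it does not.
    covered += low <= TRUE_MEAN <= high

coverage = covered / n_experiments
print(f"fraction of intervals covering the true mean: {coverage:.3f}")
```

The printed fraction is the long-run coverage of the procedure. With small samples, percentile bootstrap intervals can land somewhat below the nominal 95%, which is another reason not to read a single interval as a 95% probability statement about the true mean.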
@alexanderlewzey1102
@alexanderlewzey1102 5 years ago
I'm a little confused. Is it true that bootstrapping a sample will only tell you with what confidence you can state the mean of that particular sample? Wouldn't you need to know the population standard deviation to get the confidence intervals for a sample from the population?
@alexanderlewzey1102
@alexanderlewzey1102 5 years ago
I think I've worked it out. I wasn't adjusting the margin of error in accordance with the sample size, i.e., changing the t/z value; when your sample is really small, the confidence interval becomes massive to account for that. Either way, I would still like to hear your answer if you have time. Thanks.
@karthikvijayasarathi89
@karthikvijayasarathi89 4 years ago
Hi Josh. Sorry to bother you. I have a simple question that always confuses me. It may be a dumb question, but I want to get it clarified. It's about how probability is realised. In this case, it is said that 95% of the means lie in a certain confidence interval. That is speaking in terms of proportions, and I understand it. Now take another statement: at 3:47 in this video, it states, "We know that anything outside of it occurs less than 5% of the time." That is speaking in terms of probability, or percentage chance. So here is my big confusion: if 5% of the total points (here, means) lie outside the confidence interval, how or why can it be said that "anything outside of it occurs less than 5% of the time"? Could you please explain? Thanks in advance.
@statquest
@statquest 4 years ago
I'm not really sure how to rephrase it. In this case, proportions and probabilities are equal. If 95% of the means end up in a region, then there is 95% chance that if we randomly selected one of the means, it would be within that region.
@karthikvijayasarathi89
@karthikvijayasarathi89 4 years ago
@@statquest Thanks for your instant response. I hope you understood what my confusion is. Very kind of you.
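The link between proportions and probabilities in this thread can be checked with a short sketch. The list `means` below is simulated stand-in data, not the bootstrapped means from the video: if 95% of the values sit inside a region, then a value picked at random from the list falls inside that region about 95% of the time.

```python
import random

rng = random.Random(3)
# Stand-in for 10,000 bootstrapped means (simulated, for illustration only).
means = [rng.gauss(0, 1) for _ in range(10_000)]

# Region holding the middle 95% of the values: drop 250 from each end.
ordered = sorted(means)
low, high = ordered[250], ordered[9749]

# Proportion of the means that lie inside the region.
proportion = sum(low <= m <= high for m in means) / len(means)

# Frequency with which a randomly selected mean lands inside the region.
draws = 100_000
hits = sum(low <= rng.choice(means) <= high for _ in range(draws))
frequency = hits / draws

print(f"proportion inside = {proportion:.3f}")
print(f"frequency of random picks inside = {frequency:.3f}")
```

The two printed numbers agree up to sampling noise, which is the sense in which "95% of the means" and "a 95% chance for a randomly selected mean" say the same thing.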
@jjbotha6242
@jjbotha6242 2 years ago
Excellent!!
@statquest
@statquest 2 years ago
Thank you! :)