Using Bootstrapping to Calculate p-values!!!

107,071 views

StatQuest with Josh Starmer

Comments: 235

@statquest 2 years ago

Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

@insertacoin738 3 years ago

I have really no words to express how incredibly amazing, clear and enlightening your videos are, you transform the historically "hard and complex" concepts into kids' games, it is astoundingly magnificent, almost majestic. Thank you, really, from the bottom of my heart. You deserve a whole university named after you.

@statquest 3 years ago

Thank you so much 😀

@insertacoin738 3 years ago

@@statquest I just came across this, Josh. Do you have any explanation for why this happens? stats.stackexchange.com/questions/535343/bootstrapped-mean-always-almost-identical-to-sample-mean

@statquest 3 years ago

@@insertacoin738 Yes, I do. First, keep in mind that the person who posted that is calculating the mean of the bootstrapped means. And this mean of means is very similar to the original mean. In other words, the mean of the histogram that bootstrapping created is centered on the mean of the original data. That's to be expected. Bootstrapping works because the sample distribution is an estimate (not an exact copy) of the population distribution. This estimate gets better as the sample size increases.
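
The point about the mean of the bootstrapped means can be checked with a small simulation. This is only a sketch: the eight values below are hypothetical (they are not the data from the video), and `bootstrap_means` is a helper name made up for this illustration.

```python
import random

def bootstrap_means(data, n_resamples=10_000, seed=0):
    """Return the mean of each bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(data)
    return [sum(rng.choices(data, k=n)) / n for _ in range(n_resamples)]

# Hypothetical measurements chosen so that their mean is 0.5.
data = [-1.2, -0.3, 0.1, 0.4, 0.6, 1.0, 1.3, 2.1]
sample_mean = sum(data) / len(data)

means = bootstrap_means(data)
mean_of_means = sum(means) / len(means)

# The histogram of bootstrapped means is centered on the sample mean.
print(f"sample mean:             {sample_mean:.3f}")
print(f"mean of bootstrap means: {mean_of_means:.3f}")
```

The individual bootstrapped means still vary from resample to resample; it is only their average that lands close to the original mean.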

@KoLMiW 3 years ago

I started watching these videos to prepare for my Introduction to Machine Learning exam but now I just watch them because it's fun to learn about it when it is so well explained. Thank you for your effort!

@statquest 3 years ago

Awesome!!! BAM! :)

@maxyen9892 2 months ago

I really appreciate how you start off with a simple application example and then you build up from there with explanations and real time drawings. Lots of times when I read about concepts, they start more abstract or from theory, and that makes it less intuitive.

@statquest 2 months ago

Thank you! I'm glad you appreciate my style.

@rattaponinsawangwong5482 3 years ago

The way you explain the bootstrap is so good. It makes it simpler for everyone.

@statquest 3 years ago

Thank you! :)

@OwenMcKinley 3 years ago

Thank you! I've never really realized the power of bootstrapping until watching your 'Quests. Great stuff 👍👍

@statquest 3 years ago

Thank you very much! :)

@summerai8724 1 year ago

Thanks a lot for the explanation. I was confused about how to create a simulated distribution for calculating a p-value, and this video explains it really well. Shifting the data to a mean of zero before resampling is the key!

@statquest 1 year ago

Glad it was helpful!

@mattsmith4027 8 months ago

Literal black magic. Cheers so much for making this. I had some data that was a pain in the butt to get and I'm trying to pull all I can out of it; this really helped!

@statquest 8 months ago

Thanks!

@zivot6822 1 year ago

You just saved my work report, keep it up man.

@statquest 1 year ago

Glad I could help!

@ChanukaFernando12129 2 days ago

Simply awesome!

@statquest 2 days ago

Thank you!

@kanikabagree1084 3 years ago

This channel deserves at least a million subscribers!

@statquest 3 years ago

Thank you! :)

@saeidsas2113 3 months ago

I finally did it for my real problem case.

@statquest 3 months ago

bam! :)

@goonerrn 3 years ago

Josh just made part 2 so he could sing "part 2... calculate p value". This is a gem!

@statquest 3 years ago

@moali001 3 years ago

Damn, that's some good quality here! Hope to see more videos!

@statquest 3 years ago

Thanks!

@bLuemaNMKO 3 years ago

your work is amazing

@statquest 3 years ago

Thank you! :)

@ismailalkhalaf6061 2 months ago

Great video!! Thank you so much. 🌻🌻 Would you please make some other videos about wild bootstrapping?

@statquest 2 months ago

I'll keep that in mind.

@user-tn1nw8th6g 1 year ago

It's very easy to understand! Super explanation. Thank you!

@statquest 1 year ago

Thanks!

@XoCortanaXo 11 months ago

This is exactly what I was looking for, thank you!

@statquest 11 months ago

Great to hear!

@user-gp3ts9ib6f 3 months ago

Thank you for sharing your knowledge. This video is helpful for me.

@statquest 3 months ago

Glad it was helpful!

@PastryDonut 2 years ago

I'm just following your Fundamentals playlist in order. My first encounter with statistics ever. Thank you so much for putting it together!! Can you recommend any collection of beginner stat problems to practice on? It would help to learn tremendously.

@PastryDonut 2 years ago

Also thank you for stripping away most of the terminology! Can't imagine learning this from a regular lecture or a textbook, ugh.

@statquest 2 years ago

I'm glad you are enjoying the video. I have a few "beginner" stat problems here statquest.org/video-index/ (just search for "StatTest")

@PastryDonut 2 years ago

@@statquest Awesomeness, thank you!

@FedericoMerlo-tx2uq 2 months ago

Great video, as always. I so much admire your work, your knowledge and your ability to make concepts understandable. What if our interest is to compare some statistic between two different groups? For example, the mean difference between two groups:
- Calculate the difference of the two group means
- Bootstrap each group by itself
- Calculate the bootstrap mean difference and subtract the observed mean difference
- Repeat to obtain the bootstrap mean difference under the null hypothesis of no mean difference
Could this make sense? Thank you so much

@statquest 2 months ago

Here's a discussion on how to use the bootstrap to compare two means: stats.stackexchange.com/questions/92542/how-to-perform-a-bootstrap-test-to-compare-the-means-of-two-samples
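
For readers who want to try the procedure proposed in the question, here is a rough sketch of it. It follows the steps as the commenter listed them, which is not necessarily the approach described in the linked discussion. Both groups are hypothetical data, and `two_sample_bootstrap_p` is a name invented for this example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def two_sample_bootstrap_p(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided bootstrap p-value for 'no difference between the group means'."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    extreme = 0
    for _ in range(n_resamples):
        # Resample each group by itself, with replacement.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        # Subtracting the observed difference centers the distribution on 0,
        # which is what the null hypothesis of no difference predicts.
        null_diff = (mean(resample_a) - mean(resample_b)) - observed
        if abs(null_diff) >= abs(observed):
            extreme += 1
    return extreme / n_resamples

# Hypothetical measurements for two groups.
group_a = [2.1, 2.5, 1.9, 2.8, 2.4, 2.2, 2.7, 2.0]
group_b = [1.8, 2.0, 1.7, 2.3, 1.9, 2.1, 1.6, 2.2]

p = two_sample_bootstrap_p(group_a, group_b)
print(f"observed difference = {mean(group_a) - mean(group_b):.3f}, p-value = {p:.4f}")
```

Subtracting the observed difference from each bootstrapped difference plays the same role as shifting the data in the one-sample case.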

@petercourt 3 years ago

Awesome video Josh! Really well explained, as usual. I was curious as to how the data is shifted (e.g. what function is applied) so that you can get from your original mean, to a mean of zero. Otherwise I think I understood everything!

@statquest 3 years ago

BAM! :) We just subtract the original mean value from all of the original values to shift the data.
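
A minimal sketch of the whole procedure, assuming the statistic is the mean and the null hypothesis is "mean = 0". The data are hypothetical values chosen to have a mean of 0.5, and the function names are invented for this example.

```python
import random

def mean(values):
    return sum(values) / len(values)

def bootstrap_p_value(data, n_resamples=10_000, seed=0):
    """Two-sided bootstrap p-value for the null hypothesis that the mean is 0."""
    rng = random.Random(seed)
    observed = mean(data)
    # Shift the data by subtracting the original mean, so the shifted mean is 0.
    shifted = [x - observed for x in data]
    extreme = 0
    for _ in range(n_resamples):
        resample = rng.choices(shifted, k=len(data))  # sample with replacement
        # Count resampled means at least as far from 0 as the observed mean.
        # The tiny tolerance guards against floating-point ties.
        if abs(mean(resample)) >= abs(observed) - 1e-12:
            extreme += 1
    return extreme / n_resamples

# Hypothetical data with mean 0.5 (not the measurements from the video).
data = [-1.2, -0.3, 0.1, 0.4, 0.6, 1.0, 1.3, 2.1]
p_value = bootstrap_p_value(data)
print(f"observed mean = {mean(data):.2f}, bootstrap p-value = {p_value:.3f}")
```

Because these values are made up, the p-value printed here will not match the 0.63 from the video.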

@petercourt 3 years ago

@@statquest Haha, I should've thought of that! Thanks Josh!

@finanzassainz4013 1 year ago

OMG, these topics looked too complicated to learn; however, you make them so easy.

@statquest 1 year ago

BAM! :)

@tejasbhagwat877 3 years ago

Hi Josh, Big fan of your videos (and merchandise)! They are incredibly helpful :) Could you please also do a series on running models in Bayesian framework?

@statquest 3 years ago

Yes, that's a plan.

@tejasbhagwat877 3 years ago

@@statquest That would be a TRIPLE BAM! Looking forward :)

@rissalhedna5534 7 months ago

Amazing video as usual! I was just wondering why the value of 0.05 was used as a threshold for the p-value. Was it just arbitrarily set or did we assume that it was meaningful for our experiment with the drug?

@statquest 7 months ago

I explain p-value thresholds here: kzbin.info/www/bejne/rJbQi6d7gptmfbs

@usmanazhar7073 3 years ago

Really informative, thank you so much for uploading

@statquest 3 years ago

Thanks!

@marcoventura9451 3 years ago

I wish I had more time for your videos. Not only are they high-standard pieces of higher education, but they are also a moment to relax and enjoy the day.

@statquest 3 years ago

bam!

@KirillBezzubkine 3 years ago

God bless you, mister.

@statquest 3 years ago

Thanks!

@l.josephineandresen610 3 years ago

Thanks so much! These videos are really great. I was wondering if you will make one on Mixed ANOVAs? :-) Your explanations really help to understand the concepts quickly.

@statquest 3 years ago

One day I hope to.

@PsyK0man 3 years ago

Clarification needed: if we fail to reject the hypothesis that the drug has 0 effect, does it mean that we don't reject the null hypothesis and that the experiment is not statistically significant? Does this therefore mean that we cannot conclude whether the drug is effective or not, or that the drug is not effective?

@v0ldelord 3 years ago

It means that we do not have enough evidence to exclude that the drug has no effect. Or in other words we can't conclude that the drug is effective.

@statquest 3 years ago

@@v0ldelord BAM! :)

@statquest 3 years ago

To learn more about hypothesis testing, check out kzbin.info/www/bejne/ZqDGZWx6rqZmnrc

@mikelmenaba 1 year ago

Great video mate

@statquest 1 year ago

Thanks!

@citron2725 24 days ago

Hey, really nice video! I am wondering why the p-value is calculated by adding up the proportions of values that are farther out than the observed value on both sides instead of on one side (in the case of an observed mean of 0.5, just the values > 0.5)?

@statquest 23 days ago

It's because two-sided p-values are almost always better than 1-sided p-values. To understand why, see: kzbin.info/www/bejne/gILGZKyuZZKEb6c

@jamesstrickland833 2 years ago

Must we always consider both tails when calculating a p-value from bootstrapping? Had we looked at the medians and only considered the right tail, that would have been significant (at 0.05) to reject H0. Or did we assume that Ha was "not equal to zero" and therefore use a two-tailed test?

@statquest 2 years ago

You don't always need to use two-tailed p-values. However, I think it is almost always a mistake to not use two-tailed p-values. Not once in my career as a biostatistician did I use a single sided test. If you want to know why, see: kzbin.info/www/bejne/gILGZKyuZZKEb6c

@alikoushki6483 1 year ago

Great video, thanks

@statquest 1 year ago

Thank you! :)

@marcingrzebalski103 1 month ago

Thanks for your work again. Another question that comes to my mind is: 😊 Does testing the null hypothesis always involve centering the dataset so that the mean is zero? I mean, there are common cases in the real world where the mean (or whatever statistic we use) is not equal to 0 when there is no effect or no difference between means. Also, does the t-test function in R or Python also shift the dataset so that its mean equals 0 (centered on the null hypothesis) and repeat bootstrap-like simulations many times to calculate the p-value? Or does it use a different method?

@statquest 1 month ago

You don't have to center the data to get the p-value - it just makes it easier to visualize and interpret.

@marcingrzebalski103 1 month ago

@@statquest So technically, and generally in statistics as well as in R/Python functions, calculating a p-value (which of course involves null hypothesis testing) does not necessarily use this centering of the data on 0 "in its mechanism", and calculating it with the bootstrap is just one method that is not involved in the default R/Python functions for t-tests or linear models? Or does the classic p-value calculation also use the bootstrap?

@statquest 1 month ago

@@marcingrzebalski103 That's correct, and analytical methods do not use bootstrapping.

@PauloBuchsbaum 1 year ago

Great video and I understood your procedure perfectly. I just believe that shifting by 0.5 to the left and redoing the bootstrap with the mean at 0 is not strictly necessary (except for ease of understanding). I think that instead of redoing the shifted bootstrap, it would be enough to take, in the original bootstrap, the probability above 1.0 plus the probability below 0.0. In the original bootstrap this would correspond respectively to the probability above 0.5 and below -0.5 after shifting 0.5 to the left. Am I wrong? Another point is that at 4:11 the probability above 0.5 was 48%, but at 5:04 you used 47% to get the p-value.

@statquest 1 year ago

That is correct
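
The equivalence described in the question can be illustrated with a short sketch, assuming the statistic is the mean. The data are simulated, the function names are invented for this example, and both functions use the same random seed so that they draw the same resamples.

```python
import random

def mean(values):
    return sum(values) / len(values)

def p_shift_first(data, n_resamples=10_000, seed=0):
    """Shift the data to mean 0, then bootstrap the shifted data."""
    rng = random.Random(seed)
    observed = mean(data)
    shifted = [x - observed for x in data]
    hits = 0
    for _ in range(n_resamples):
        if abs(mean(rng.choices(shifted, k=len(data)))) >= abs(observed):
            hits += 1
    return hits / n_resamples

def p_without_shift(data, n_resamples=10_000, seed=0):
    """Bootstrap the original data and move the cutoffs instead."""
    rng = random.Random(seed)
    observed = mean(data)
    hits = 0
    for _ in range(n_resamples):
        m = mean(rng.choices(data, k=len(data)))
        # With an observed mean of 0.5 this counts means >= 1.0 or <= 0.0.
        if abs(m - observed) >= abs(observed):
            hits += 1
    return hits / n_resamples

# Simulated data (an assumption for illustration only).
data_rng = random.Random(1)
data = [data_rng.gauss(0.5, 1.0) for _ in range(8)]

p_a = p_shift_first(data)
p_b = p_without_shift(data)
print(f"shift first: {p_a:.4f}   shifted cutoffs: {p_b:.4f}")
```

For the mean, shifting every value by a constant shifts every resampled mean by the same constant, so the two counts agree up to floating-point rounding.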

@streetsmart5033 3 years ago

Sir, please explain convolutional neural networks. I'm eagerly waiting for your way of explaining them.

@statquest 3 years ago

I've already done that, see: kzbin.info/www/bejne/eaKyl5xqZrGZetk For a complete list of all of my videos, see: statquest.org/video-index/

@streetsmart5033 3 years ago

@@statquest Yes sir, thank you for the reply, but in that playlist there is no CNN or RNN.

@lbb2rfarangkiinok 3 years ago

the jingles are off the chain

@statquest 3 years ago

bam!!!

@PeteHwang 4 months ago

Hi Josh, thank you for the great video. I had a question at 4:57. Why do you look at the probabilities of observing means ≤ or ≥ ±0.5 in the bootstrap distribution?

@statquest 4 months ago

Are you already familiar with p-values? If not, check out these two videos: kzbin.info/www/bejne/rJbQi6d7gptmfbs and kzbin.info/www/bejne/gILGZKyuZZKEb6c I believe those will answer your question.

@thbdf3879 3 years ago

I wish I could have seen this video before my exam.

@statquest 3 years ago

bam!

@Daniel88santos 1 year ago

Great video! Are these the working principles of "Particle Filters"/"Sequential Monte Carlo"?

@statquest 1 year ago

I have no idea. I've never heard of those things before. :(

@rayman2704 3 years ago

Thank you soooooooooooooo much!

@statquest 3 years ago

BAM! :)

@alexvass 1 year ago

Thanks

@statquest 1 year ago

BAM! Thank you so much for contributing to StatQuest!!!

@thegimel 3 years ago

It sounds like calculating p-values from bootstrapping can lend itself to p-hacking, if you find "the right" statistic that does lead to rejecting the null hypothesis because of some reason (e.g. being more or less sensitive to outliers). What do you think?

@statquest 3 years ago

That's why for everything in statistics, you plan what you are going to do (what metric you are going to use etc.) before collecting data.

@shirleygui6533 2 years ago

awesome!

@statquest 2 years ago

Thank you!

@juanete69 2 years ago

I don't understand why you use the shifted data to perform the bootstrap. What if you don't "know" the null hypothesis but just your sample?

@statquest 2 years ago

You don't have to shift the data, it just makes the math easier.

@juanete69 7 months ago

How do you use bootstrapping when you have several variables? For example, for a regression model. How would you use it to test the standard deviation?

@statquest 7 months ago

See: www.sciencedirect.com/science/article/abs/pii/S0167715217303450

@caroldanvers6306 1 year ago

Great video and helpful examples! What do you do when you're testing the median (with HO: median = 0; HA: median not 0), and the observed median is 0? As there is no shift, I'm thinking the p-value is 1.000 (as all of the bootstrapped medians are either >=0 or =

@statquest 1 year ago

Yes, exactly.

@EdoardoMarcora 3 years ago

Wouldn't shifting the bootstrap distribution that was obtained from the original sample data be basically equivalent (for the purpose of calculating a p-value) to the bootstrap null distribution?

@statquest 3 years ago

Sure, either way.

@alinaastakhova8412 3 months ago

Thank you for the amazing explanation; still, I am a little confused. At 4:10 of the video you have the probability for a mean >= 0.5 as 0.48, and at 5:02 the probability for a mean >= 0.5 becomes 0.47... How is that? And for the median, how do you get the probability for a median >= 1.8 as 0.01? How is that calculated when the bootstrapped distribution of medians does not go beyond ~0.5 units? Isn't the calculated probability simply the portion of the distribution beyond the given value (like 1.8 for the median in our example)? What am I missing?

@statquest 3 months ago

1) That's just a minor typo. 2) We count the number of bootstrapped generated medians >= 1.8 and divide by the total number of bootstrapped generated medians.
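
The counting described in 2) can be sketched as follows. This assumes the data are shifted so that their median is 0 before resampling. The values are hypothetical (chosen so that the median is about 1.8, as in the video), and `bootstrap_median_p` is a name invented for this example.

```python
import random
import statistics

def bootstrap_median_p(data, n_resamples=10_000, seed=0):
    """Return (lower tail, upper tail, two-sided p-value) for 'median = 0'."""
    rng = random.Random(seed)
    observed = statistics.median(data)
    # Shift the data so that the median is 0 (the null hypothesis).
    shifted = [x - observed for x in data]
    medians = [
        statistics.median(rng.choices(shifted, k=len(data)))
        for _ in range(n_resamples)
    ]
    cutoff = abs(observed)
    # Each probability is a count divided by the total number of medians.
    upper = sum(m >= cutoff for m in medians) / n_resamples
    lower = sum(m <= -cutoff for m in medians) / n_resamples
    two_sided = sum(abs(m) >= cutoff for m in medians) / n_resamples
    return lower, upper, two_sided

# Hypothetical data whose median is about 1.8.
data = [-2.5, -1.0, 0.3, 1.7, 1.9, 2.2, 2.6, 3.0]
lower, upper, p_value = bootstrap_median_p(data)
print(f"P(median <= -1.8) = {lower:.3f}, P(median >= 1.8) = {upper:.3f}, p = {p_value:.3f}")
```

When the observed median is exactly 0, every bootstrapped median is at least as extreme as it, so the two-sided p-value from this sketch is 1.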

@alinaastakhova8412 3 months ago

@@statquest Thanks!

@LittleLightCZ 2 years ago

The main question is, how is it different from simply running a t.test to see if the mean equals 0 or not? Is there anything that bootstrapping adds to it? Originally I thought that bootstrapping might help, for example, to get tighter confidence intervals without the need to take more sample data in the field, but according to the tests I made with the boot library, the confidence intervals from the bootstrapped data are basically the same as the ones computed from the original data. Well, when I call boot.ci() they tend to be a little bit tighter, but I think it's because the t.test computation is probably a little more conservative (I guess).

@statquest 2 years ago

The purpose of bootstrapping isn't to replace a t-test, or any other known statistical test. Those known tests will always perform better because they make assumptions about the data that bootstrapping does not, and that results in them having an edge. However, the magic with bootstrapping is that it can be used to calculate p-values or confidence intervals in any situation - including those that are not appropriate for t-tests or any other known test. For example, with bootstrapping we can compare medians or modes instead of means, and you can't do that with a t-test.

@DrThalesAlexandre 4 months ago

Amazing video! Any ideas on how to make bootstrapping run faster in Python? It starts lagging once you are doing > 10^5 trials with large sample sizes.

@statquest 4 months ago

Good question...I'm not really sure, but with a large sample size, you might be able to get away with doing less bootstrapping.

@DrThalesAlexandre 2 months ago

@@statquest Thanks! There is probably some library that does this efficiently. I was just curious about how one could be implemented, but it is something that can be learned at another point in time.

@DeepROde 2 years ago

Hey, your videos are a treasure! I had a doubt, at 6:18, the histogram of median doesn't look bell-shaped. This made me wonder whether the distribution of medians would be Normal (like distribution of means) or not, could you please let us know?

@statquest 2 years ago

The distribution of medians is not normally normal.

@SunSan1989 1 year ago

Perhaps because of the different ways of thinking between East and West, as an Asian I find it easier to understand without switching to a mean of zero, treating "the drug has no effect" as -0.5, but doing so is somewhat inconsistent with the null hypothesis method. Good tutorial. There is another problem: in the example, the probability of 0.36, the probability of less than -0.5 being 0.16 and the probability of greater than 0.5 being 0.47 seem a bit contradictory for bootstrapping on the basis of the null hypothesis. If the number of bootstrap resamples is large enough, shouldn't the probability of less than -0.5 and greater than 0.5 be equal?

@statquest 1 year ago

What time point, minutes and seconds, are you asking about?

@SunSan1989 1 year ago

@@statquest Dear Josh, the time point is 4:07: the probability of less than or equal to -0.5 is 0.16, and greater than or equal to 0.5 is 0.48 at time point 4:10. Is this probability a reasonable example? If bootstrapping is done enough times, shouldn't 0.16 be equal to 0.48? In addition, why can't the paper version of the book be sent to China? I bought it in Japan and transferred it from Japan to China.

@statquest 1 year ago

@@SunSan1989 My guess is that they will probably meet in the middle. As for my book, there should be a Chinese version (and translation) available in the next year. People are working on it.

@SunSan1989 1 year ago

@@statquest Sorry, since my English is not very good, I want to confirm my understanding: should 0.16 be replaced with the same value as 0.48? Is this understanding correct?

@statquest 1 year ago

@@SunSan1989 No, I'm not sure what the value will be, but the sum will probably still add up to something close to 0.63

@jiangshaowen1149 3 years ago

Hi Josh, may I know why the p-value is calculated as two-sided?

@statquest 3 years ago

Because 99 times out of 100 you want a two-sided p-value. For details, see: kzbin.info/www/bejne/gILGZKyuZZKEb6c

@ModernTolkien143 2 years ago

Hey Josh, thanks for this awesome video!! Do you know of any reference (paper, handbook chapter etc.) that shows the asymptotic validity of the approach you are using? Best, Sebastian

@statquest 2 years ago

Here's a great place to start if you want to learn more details: en.wikipedia.org/wiki/Bootstrapping_(statistics)

@bobiq 1 year ago

We fail to reject the hypothesis that the drug makes no difference. - a triple negation in one sentence is what makes statistics such a mind-bending exercise. Why can't this be expressed more easily?

@statquest 1 year ago

Good point! Yes, classical statistics lends itself to a lot of awkward wording. Bayesian statistics attempts to make the language easier - and one of the ideas in this video, using computers to generate a lot of data, is a big step towards getting there.

@themoan 3 years ago

Hi Josh, do you have to make assumptions about normality of the data? Or does bootstrapping work for parametric and non parametric cases (because of the central limit theorem)? Thank you for another informative video!

@statquest 3 years ago

Bootstrapping makes no assumptions about the data.

@zerocoll20 3 years ago

Is there any way to know how good this method is? I mean, comparing resampling with actual known statistics?

@statquest 3 years ago

Yes, the same theory that we use to trust "normal" statistics (like t-tests and what not) also applies to bootstrapping. In other words, the theory that allows you to put trust into a t-test also suggests we should put trust in bootstrapping.

@xinlu82 2 years ago

Thanks a lot. Really nice video. I have a question about the number of replicates when doing the bootstrapping. Is this related to the sample size?

@statquest 2 years ago

In a small way it is dependent on the sample size (if the sample size is small, there are only so many different bootstrapped samples you can create).
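
To put a number on "only so many different bootstrapped samples": assuming all n original values are distinct and the order within a resample is ignored, the number of different resamples is the number of multisets of size n drawn from n items, C(2n - 1, n). The function name below is invented for this example.

```python
from math import comb

def distinct_bootstrap_samples(n):
    """Number of distinct resamples of size n from n distinct values (order ignored)."""
    return comb(2 * n - 1, n)

for n in (3, 8, 20):
    print(f"n = {n:2d}: {distinct_bootstrap_samples(n):,} distinct bootstrap samples")
```

With 8 observations there are only 6,435 distinct resamples, so drawing many more than that mostly revisits combinations that have already appeared.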

@unlearningcommunism4742 1 year ago

I gave it a try today. It's still not working / returning what I want it to return.

@statquest 1 year ago

Noted

@user-on7vj1em3k 2 years ago

Thank you! Why do you calculate +-0.5 in the histogram and not only 0-0.5?

@statquest 2 years ago

What time point in the video, minutes and seconds, are you asking about?

@PuneetMehra 2 months ago

1:54 - Since the 95% CI includes 0, we can't reject the null hypothesis (drug not working). Why? What does the inclusion of 0 in the CI have to do with rejecting the null hypothesis? I am confused. PS: I have studied all the previous videos.

@statquest 2 months ago

When the confidence interval contains 0, then we can't be confident that the true value is not 0, even though our estimate is not 0. In other words, there is enough variation in the data that we can't have a lot of confidence in the estimate we made with it.
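
A rough sketch of that check, assuming a simple percentile interval built from bootstrapped means. The data are hypothetical and `bootstrap_ci` is a name invented for this example.

```python
import random

def bootstrap_ci(data, level=0.95, n_resamples=10_000, seed=0):
    """Percentile confidence interval for the mean from bootstrapped means."""
    rng = random.Random(seed)
    n = len(data)
    means = sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_resamples))
    tail = (1 - level) / 2
    lower = means[round(tail * n_resamples)]
    upper = means[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical data with mean 0.5 (not the measurements from the video).
data = [-1.2, -0.3, 0.1, 0.4, 0.6, 1.0, 1.3, 2.1]
lower, upper = bootstrap_ci(data)
contains_zero = lower <= 0 <= upper

print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print("fail to reject 'mean = 0'" if contains_zero else "reject 'mean = 0'")
```

For this made-up dataset the interval covers 0, which matches the situation discussed in the question.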

@Julsten3107 3 years ago

Hey Josh, thanks for this comprehensive explanation! I'm a bit confused why you need to add values greater than and equal to 0.5 but also values less than and equal to -0.5 for the p-value? Why can't I just look at values >=0.5?

@HankGussman 3 years ago

It is 0.05 actually. To reject the null hypothesis, observed results must be rare. Such that the probability of observing such results is

@statquest 3 years ago

In this video we calculate a two-sided p-value and I describe these, and the reasons for them, extensively in this StatQuest on p-values: kzbin.info/www/bejne/gILGZKyuZZKEb6c

@frashertseng9426 3 years ago

Thank you for the awesome video. 1) How does this apply to comparing means from two different groups (ctrl/test)? 2) What if my measure is a proportion (%)? How can we apply this method?

@statquest 3 years ago

1) see: stats.stackexchange.com/questions/128694/bootstrap-two-sample-t-test 2) see: online.stat.psu.edu/stat200/lesson/4/4.3/4.3.1

@frashertseng9426 3 years ago

@@statquest Thank you Josh!!

@alputkuiyidilli 2 years ago

1) Make a bootstrapped Dataset 2) Calculate a statistic 3)??? 4) Profit.

@statquest 2 years ago

@joeguerriero3841 6 months ago

but how would you do this for a test statistic (like a correlation coefficient), where creating a "null data set" from which to resample is not as straightforward as just mean-centering the data?

@statquest 6 months ago

See: www.sciencedirect.com/science/article/abs/pii/S0167715217303450

@joeguerriero3841 6 months ago

@@statquest TRIPLE BAM!!

@AkashSiddabattula 8 months ago

Please reply!! When you were calculating the p-value, I think we were supposed to find the p-value supporting the null hypothesis, and if that value is less than 0.05 we can reject the null hypothesis. But here you were calculating the p-value of observing a mean value of 0.5 or something more extreme, and I think this is not supposed to be the null hypothesis. Then if we get a p-value greater than 0.05 for observing a mean >= 0.5, that means we will often get a mean >= 0.5, which means the drug is having some effect. This is what I understood. Can you explain?

@statquest 8 months ago

In this video, the null hypothesis is that, on average, the drug has no effect (average effect = 0). We then use bootstrapping to calculate a p-value for this null hypothesis and we get 0.63, so we fail to reject the null hypothesis that the drug has no effect. In other words, there's a high likelihood that any random set of 8 people that have the disease will have, on average, an effect = 0.5.
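
Written out with the numbers quoted in this thread, and assuming $\bar{x}^{*}$ denotes a mean computed from a bootstrap resample of the shifted data (the notation is an assumption made here, not taken from the video), the two-sided p-value is:

```latex
p = P(\bar{x}^{*} \le -0.5) + P(\bar{x}^{*} \ge 0.5) \approx 0.16 + 0.47 = 0.63 > 0.05
```

Since 0.63 is greater than the 0.05 threshold, the null hypothesis is not rejected.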

@AkashSiddabattula 8 months ago

Thank you so much

@rupiyaldekai6136 3 years ago

Can you do a PyTorch implementation for ANN and fuzzy systems, please, sir?

@statquest 3 years ago

I'll keep that in mind.

@cjh4467 3 years ago

Why don't people just use bootstrapping for everything instead of worrying about robust standard errors and other types of similar concerns?

@statquest 3 years ago

It's a good question. The answer, I believe, is "power". Bootstrapping works in all kinds of situations, but (I believe) it has less power than parametric methods.

@cjh4467 3 years ago

@@statquest Thank you!

@SunSan1989 1 year ago

@@statquest That's a really good question. Dear Josh, can you make a video about the differences in power? Thank you for the tutorial. I appreciate it very much.

@redcat7467 2 years ago

That was a bam with different statistics.

@statquest 2 years ago

@user-gj8vs1do9n 1 year ago

Hi Josh! How do we calculate critical value of statistic in this case?

@statquest 1 year ago

If, for example, alpha = 0.05, then you can incrementally add the tails of the histogram together until you get 0.05. The last parts of this histogram added define the critical values.
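
One way to read this in code, assuming the statistic is the mean and that alpha is split equally between the two tails. The data are hypothetical and `bootstrap_critical_values` is a name invented for this example.

```python
import random

def bootstrap_critical_values(data, alpha=0.05, n_resamples=10_000, seed=0):
    """Lower and upper critical values of the mean under 'mean = 0'."""
    rng = random.Random(seed)
    n = len(data)
    center = sum(data) / n
    shifted = [x - center for x in data]  # null hypothesis: mean is 0
    null_means = sorted(
        sum(rng.choices(shifted, k=n)) / n for _ in range(n_resamples)
    )
    # Number of bootstrapped means to place in each tail (alpha / 2 each).
    k = max(1, round(alpha / 2 * n_resamples))
    return null_means[k - 1], null_means[-k]

# Hypothetical data with mean 0.5 (not the measurements from the video).
data = [-1.2, -0.3, 0.1, 0.4, 0.6, 1.0, 1.3, 2.1]
lower_crit, upper_crit = bootstrap_critical_values(data)
print(f"reject 'mean = 0' if the observed mean is <= {lower_crit:.2f} or >= {upper_crit:.2f}")
```

An observed statistic that falls between the two critical values leads to failing to reject the null hypothesis at that alpha.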

@user-gj8vs1do9n 1 year ago

@@statquest Got it! Thank you!

@mikhaeldito 1 year ago

When to use permutation over bootstrap (and the other way around) to calculate P-values?

@statquest 1 year ago

If you have a relatively small dataset, you can use permutation. If it's relatively large, then you can use bootstrap.

@mikhaeldito 1 year ago

@@statquest BAM!

@willw234 2 years ago

Thanks for the very clear and informative description of this. I have a question - whenever the absolute value of the mean/median/statistic-of-interest of the original data is greater than the absolute value calculated from the shifted data, the p-value will be zero. I have a large set of tests to run and would like to do an FDR correction on the resultant set of p-values, but a not-insignificant number of them are zero. Is this still a legitimate thing to do?

@statquest 2 years ago

I'm not sure I understand your problem because each time you calculate a p-value you have to calculate the bootstrapped statistic. Are you saying that when the absolute value for every single bootstrapped statistic (and there should be > 10,000 of them) is > than the original statistic, the p-value is 0? Well... if that is the case, all 10,000 bootstrapped statistics are way far away from 0, then the p-value should be 0.

@willw234 2 years ago

@@statquest Sorry, I probably didn't explain very well. For the shifted data, the largest possible mean of a bootstrap resample is just the largest value in the shifted data (which happens when it is chosen for every element of a resample). When the mean of the original unshifted data is larger than this, the p-value will be zero, regardless of the number of bootstrap resamples carried out. But this does not distinguish between cases when it is just a little bit larger, or very much larger. So if I have a lot of tests on independent data sets, I am concerned that the 'zero p-value' ones will be treated identically by the FDR procedure, when perhaps they shouldn't be??

@statquest 2 years ago

@@willw234 Since you are just testing the mean, you might consider just using a one sample t-test. Then your p-values will be more spread out.

@willw234 2 years ago

@@statquest I will do that. I was just hoping to use the bootstrap so I could use the median instead of the mean. (btw I recently purchased your book on ML - very helpful, thank you!)

@statquest 2 years ago

@@willw234 Awesome! Thank you!

@joxa6119 2 years ago

So what happened exactly when we shift the data (so the mean will be 0)? Any formula for the data shift?

@statquest 2 years ago

value - mean

@jasd100 2 years ago

My brother thought I was watching Blue's Clues, but stats edition

@statquest 2 years ago

bam!

@engr.majidkaleem8810 1 year ago

Could you please upload 5 unavailable hidden videos?

@statquest 1 year ago

Which ones?

@HannahMeaney 1 year ago

I don't understand how you got the actual p-value number. For example, the p-value of 0.47: how was that calculated?

@statquest 1 year ago

First off, the p-value is not 0.47, so that might be part of the problem. At 3:29 we have a histogram that tells us what would happen if the null hypothesis was true. Then at 3:36 we can calculate the percentage of means that were between -0.5 and 0.5 (this is just the number of means that we calculated that fell between -0.5 and 0.5 divided by the total number of means). This percentage was 36%, which also tells us that the probability of observing a mean between -0.5 and 0.5 is 0.36. Likewise, we then calculate the probability of observing a mean = 0.5 + the probability of observing a mean

@jeffz7310 2 years ago

Where did the 0.05 come from at 5:33? Thank you.

@statquest 2 years ago

0.05 is the standard threshold for hypothesis testing. For details, see: kzbin.info/www/bejne/rJbQi6d7gptmfbs

@PunmasterSTP 5 months ago

Q: What's the significance of a urine test? A: The p-value!

@statquest 5 months ago

Ugh! ;)

@PunmasterSTP 5 months ago

@@statquest Q: What do claims adjusters use to estimate hail damage? A: Confi-dents intervals.

@DrMcZombie 3 years ago

Hi Josh and thanks for the overview. I have been using bootstrapping for quite some time now, but not to look at p-values for just one data set. What you describe is, more or less, a different kind of t-test, right? I am using bootstrapping for determining confidence intervals, but also to compare two datasets, e.g., I use two models to predict data and compare the models' performance with bootstrapping. For example, is the root-mean-squared prediction error (RMSE) larger in data set A in comparison to data set B? When repeating this (e.g.) 1000 times, each time comparing the RMSEs, I get a p-value from these comparisons. --> Model A performed better than model B in 990 of 1000 comparisons --> p = 0.99 (or 0.01) I hope this was understandable. What are your thoughts on this application of bootstrapping?

@statquest 3 years ago

This example is like a one-sample t-test (without having to refer to the t-distribution). Your experiment is a little confusing. You have data sets A and B and also models A and B, so I don't know what you are comparing.

@DrMcZombie 3 years ago

@@statquest Thanks, and I'll try to explain a bit more: I have data that I measured (in my case those are Speech Recognition Thresholds, i.e., the signal-to-noise ratio at which 50% of spoken words can be understood in a noisy environment; I hope this is not getting too abstract). I want to simulate this data with different models and I want to determine which model is better (e.g. model A and model B). To figure out which model is better, I create a bootstrapped data set of the measured data and calculate the RMSE for both model simulations. Let's say the RMSE for the bootstrapped data set is 1 for model A and 2 for model B. I compare these values and count how often the RMSE of model A was lower than the RMSE of model B: --> For this first comparison, I count 1. Second run: RMSE of model A is 1.5, RMSE of model B is 1.4 --> I do not count this (1 of 2 comparisons indicates that the RMSE of model A is lower than the RMSE of model B). When repeating this procedure 1000 times, 990 of the comparisons showed that model A has a lower RMSE, and in 10 comparisons model B had a lower RMSE. I consider this to yield a p-value of 0.99 (which is effectively a p-value of 0.01). I hope you find this interesting, and I would be happy to get your thoughts on this application of bootstrapping.

@statquest 3 years ago

@@DrMcZombie You've calculated a probability, which is part of a p-value, but not a p-value. A p-value is the probability of the observed result or data plus the probabilities of all results that are more extreme. For details, see: kzbin.info/www/bejne/gILGZKyuZZKEb6c

So, here's what you should do (or consider doing):
0) The null hypothesis is that there is no difference between models A and B. This means that we would expect the difference in RMSE to be 0 between models A and B.
1) Bootstrap your data, run it through your models and make a histogram of differences in RMSE.
2) Draw a 95% CI between the 2.5% quantile and the 97.5% quantile of that histogram.
3) Does that CI include 0? If so, fail to reject the hypothesis that models A and B are the same. If not, reject the hypothesis that models A and B are the same.
Bam.
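
The steps in this reply can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the measured values and the two sets of model predictions are made-up random numbers, and the names `rmse` and `bootstrap_rmse_diff` are hypothetical helpers, not anything from the video or the thread.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error between measurements and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def bootstrap_rmse_diff(y_true, pred_a, pred_b, n_boot=1000, seed=0):
    """Bootstrap the difference RMSE(A) - RMSE(B) and return a 95% percentile CI."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample measurement indices with replacement; both models are
        # scored on the same resampled measurements.
        idx = rng.integers(0, n, size=n)
        diffs[i] = rmse(y_true[idx], pred_a[idx]) - rmse(y_true[idx], pred_b[idx])
    ci_low, ci_high = np.quantile(diffs, [0.025, 0.975])
    return diffs, float(ci_low), float(ci_high)

# Made-up data (assumption): model A has smaller prediction errors than model B.
rng = np.random.default_rng(42)
y_true = rng.normal(size=200)
pred_a = y_true + rng.normal(scale=0.5, size=200)
pred_b = y_true + rng.normal(scale=1.0, size=200)

diffs, ci_low, ci_high = bootstrap_rmse_diff(y_true, pred_a, pred_b)
# Step 3: reject "no difference" only if 0 lies outside the 95% CI.
reject_null = not (ci_low <= 0.0 <= ci_high)
print(f"95% CI for RMSE(A) - RMSE(B): [{ci_low:.3f}, {ci_high:.3f}], reject: {reject_null}")
```

Scoring both models on the same resampled indices keeps the comparison paired, which seems to match the setup described in the question.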

@DrMcZombie 3 years ago

@StatQuest with Josh Starmer Thank you for your reply, and I see the point that you make. But just to clarify: wouldn't this boil down to the "counting the comparisons" approach (not with regard to the p-value, but just for failing to reject the null hypothesis)? When 10 of 1000 comparisons (1%) show that model A had a lower RMSE than model B, the 95% CI of the histogram of differences between the models would not include 0. The CI would include 0 when 25 or more of 1000 comparisons (i.e., 2.5% or more of the comparisons) show that model A has a lower RMSE than model B. Anyway, thanks, and I am looking forward to more of your great videos.

--> Octave code example (e.g., use octave-online.net/):

% Let's assume A and B are the RMSEs of two models.
% H1: A is significantly different from B (0 not in the 95% CI of the difference histogram)
% H0: A and B are the same (0 in the 95% CI)
A = randn(10000,1) + 3;   % random numbers, mean = 3, std = 1
B = randn(10000,1);       % same, but mean = 0
hist(A-B);                % draw histogram of the differences
comparisons = sum(B > A) / numel(B);   % fraction of comparisons where B is larger
CI = quantile(A-B,[0.025 0.975]);      % 95% percentile interval of the differences
printf('comparisons: %1.3f ; CI: [%1.3f %1.3f]\n', comparisons, CI);
% when the CI does not include 0 --> H0 rejected in favor of H1
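
For readers without Octave, here is a rough Python sketch of the same check. It assumes the same made-up normal random numbers as stand-ins for the two sets of RMSEs, and it compares the "counting" rule (fraction of comparisons below 2.5% or above 97.5%) with the rule "0 lies outside the 95% percentile interval". The variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for bootstrapped RMSEs of two models (made-up numbers).
rmse_a = rng.normal(loc=3.0, scale=1.0, size=10_000)
rmse_b = rng.normal(loc=0.0, scale=1.0, size=10_000)
diffs = rmse_a - rmse_b

# Counting approach: how often is B larger than A?
frac_b_larger = float(np.mean(rmse_b > rmse_a))

# Percentile approach: 95% interval of the differences.
ci_low, ci_high = (float(q) for q in np.quantile(diffs, [0.025, 0.975]))

zero_outside_ci = not (ci_low <= 0.0 <= ci_high)
counting_rule = frac_b_larger < 0.025 or frac_b_larger > 0.975

print(f"fraction B > A: {frac_b_larger:.3f}")
print(f"95% CI of A - B: [{ci_low:.3f}, {ci_high:.3f}]")
print(f"0 outside CI: {zero_outside_ci}, counting rule rejects: {counting_rule}")
```

In this sketch the two rules agree on whether to reject, but the counted fraction is a one-tailed proportion, so it should not be read as a two-sided p-value without further thought.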

@shivverma1459 3 years ago

Let's say we don't look at the p-value and instead see that the 95% confidence interval crosses 0 at 5:41. Can't we then say that the majority of means are crossing 0, and therefore the drug has been helping in the recovery instead of having no effect? I mean, from a confidence interval point of view.

@statquest 3 years ago

This example is not great for discussing CIs because we shifted the data to be centered on 0. If we wanted to calculate a CI, we would do this: kzbin.info/www/bejne/jquTqWBuYsial7M

@shivverma1459 3 years ago

@@statquest ohkk thanks bam!

@yongkailiu1448 1 year ago

Could you make another video about one-sided tests?

@statquest 1 year ago

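When the null distribution is roughly symmetric, the one-sided and two-sided p-values differ by a factor of 2: the two-sided p-value is about 2 times the one-sided p-value, so you can get a one-sided p-value by dividing the two-sided one by 2 (as long as the observed effect is in the direction you are testing).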

@accountname1047 3 years ago

ah the elusive triple bam

@statquest 3 years ago

@ilusoeseconomicas2371 1 year ago

There is no reason to subtract the mean of the distribution before bootstrapping and then adding it later. Just bootstrap the original data and see where the original mean is in the generated histogram.

@statquest 1 year ago

I shifted the data because the null hypothesis is that the "true mean" is 0 and it's helpful to see how the distribution would be distributed around 0 in that case.
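
The shifting idea in this reply can be sketched in Python roughly as follows. This is a minimal illustration, not the code behind the video: the eight data values are made up (chosen so that their mean is 0.5), and `bootstrap_p_value_mean` is a hypothetical helper name.

```python
import numpy as np

def bootstrap_p_value_mean(data, null_mean=0.0, n_boot=10_000, seed=0):
    """Two-sided bootstrap p-value for H0: the true mean equals null_mean."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    observed = data.mean()

    # Shift the data so that its mean equals the null hypothesis value.
    shifted = data - observed + null_mean

    # Resample the shifted data with replacement and record each mean.
    boot_means = rng.choice(shifted, size=(n_boot, data.size), replace=True).mean(axis=1)

    # Count bootstrapped means at least as far from the null as the observed mean,
    # in either direction.
    extreme = np.abs(boot_means - null_mean) >= abs(observed - null_mean)
    return float(extreme.mean())

# Made-up measurements with a mean of 0.5 (assumption for illustration).
sample = [0.9, 1.6, -0.3, 0.4, 1.1, -0.8, 0.7, 0.4]
p_value = bootstrap_p_value_mean(sample)
print(f"observed mean: {np.mean(sample):.2f}, bootstrap p-value: {p_value:.3f}")
```

Shifting leaves the spread of the data untouched and only moves its center, so the bootstrapped means show what sample means would look like if the null hypothesis were true.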

@mohamedsase7250 1 year ago

Can we use the bootstrap to calculate a confidence interval (%) for a conditional event element, like a cross-tab element, and if so, how? Thank you.

@statquest 1 year ago

Probably, but I don't know what a cross-tab element is so it would be better to get someone else to answer.

@mohamedsase7250 1 year ago

@@statquest A cross-tab is something SPSS users know. It is a cross table of two variables, such as gender and healthy (yes or no), so you end up with 4 groups. I want to know if I can treat each group as an independent group and calculate the CI as usual.

@mohamedsase7250 1 year ago

Note: I have been searching for the answer for months. Thank you a lot.
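
One possible way to approach the cross-tab question is sketched below. It is a hedged illustration, not an answer from the video: it assumes made-up gender and healthy values, resamples whole rows (people) with replacement rather than treating each of the 4 cells as an independent sample, and uses the hypothetical helper name `bootstrap_cell_ci`.

```python
import numpy as np

def bootstrap_cell_ci(group, outcome, group_value, n_boot=2000, seed=0):
    """95% percentile CI for the proportion of outcome == True within one group."""
    rng = np.random.default_rng(seed)
    group = np.asarray(group)
    outcome = np.asarray(outcome, dtype=bool)
    n = group.size
    props = []
    for _ in range(n_boot):
        # Resample whole rows so that group and outcome stay paired.
        idx = rng.integers(0, n, size=n)
        mask = group[idx] == group_value
        if mask.any():  # skip the rare resample that contains no one from the group
            props.append(outcome[idx][mask].mean())
    low, high = np.quantile(props, [0.025, 0.975])
    return float(low), float(high)

# Made-up survey data (assumption): 200 people, about 70% healthy.
rng = np.random.default_rng(1)
gender = rng.choice(["F", "M"], size=200)
healthy = rng.random(200) < 0.7

low, high = bootstrap_cell_ci(gender, healthy, "F")
print(f"95% CI for the proportion healthy among F: [{low:.3f}, {high:.3f}]")
```

Resampling rows lets the group sizes vary from one bootstrap sample to the next, which is one reason not to assume the cells are independent samples of fixed size.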

@gardaramadhito1650 2 years ago

Isn’t this just randomization inference and you’re testing the sharp null hypothesis?

@statquest 2 years ago

I believe they are different: jasonkerwin.com/nonparibus/2017/09/25/randomization-inference-vs-bootstrapping-p-values/

@chrislam1341 2 years ago

I cannot understand why we care about the region around -0.5. Given data with mean 0.5 and variance v, the question is: how likely am I to see this data if the true mean is 0? Let's assume the data come from a normal distribution, N:
p-value = P(mean >= 0.5 | N(0, v))
if p-value < 0.05: reject H0
if p-value > 0.05: it is likely that H0 is true => cannot reject H0
Where is the role of -0.5 here?

@statquest 2 years ago

I almost always use two-sided p-values, and I explain the reasons here: kzbin.info/www/bejne/gILGZKyuZZKEb6c
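
To make the role of the -0.5 region concrete, here is a small Python sketch that computes one-sided and two-sided p-values from the same null distribution. The null distribution is an assumption: normal random numbers centered on 0 stand in for the bootstrapped means of the shifted data, and the spread of 0.25 is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for bootstrapped means under the null hypothesis (made-up spread).
null_means = rng.normal(loc=0.0, scale=0.25, size=10_000)
observed_mean = 0.5

# One-sided: only means at or above +0.5 count as "as extreme or more extreme".
p_upper = float(np.mean(null_means >= observed_mean))

# The other tail: means at or below -0.5 are just as far from 0.
p_lower = float(np.mean(null_means <= -observed_mean))

# Two-sided: both tails count.
p_two_sided = float(np.mean(np.abs(null_means) >= observed_mean))

print(f"upper tail: {p_upper:.4f}, lower tail: {p_lower:.4f}, two-sided: {p_two_sided:.4f}")
```

The two-sided p-value is the sum of the two tails, so the region at or below -0.5 is exactly what a one-sided test leaves out.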

@acc3095 2 years ago

Is there a minimum sample size needed for bootstrap to be valid?

@statquest 2 years ago

I think 8 might be a good starting point.

@drachenschlachter6946 1 year ago

How do you shift the data?

@statquest 1 year ago

At 2:29 I say that we shift the data to the left by 0.5 units (where 0.5 is the mean of the data). That means we subtract 0.5 from each value in the dataset.

@drachenschlachter6946 1 year ago

@@statquest But why, Josh? If you have the bootstrap distribution and you calculate the 95% confidence interval, can't you already say whether the hypothesis can be rejected? If 0 is in the interval, then it can't be rejected. So why shift the data if it doesn't matter?

@statquest 1 year ago

@@drachenschlachter6946 Because this video is talking about how to calculate p-values, not confidence intervals. The first bootstrapping video describes confidence intervals (and does not require shifting the data): kzbin.info/www/bejne/jquTqWBuYsial7M

@saeidsas2113 4 months ago

Hi Josh, I have a question. How can I contact you to ask it?

@statquest 4 months ago

If you have a question about my videos, the best place to ask it is right here, in the comments.

@saeidsas2113 4 months ago

@@statquest Yes, but I need to write a bit of narrative to clarify my question related to Bootstrap but not particularly your nice video. I am a risk analyst working at a company and also doing my PhD in the field of actuarial science. We recently encountered an issue related to a model being used at the company.

@statquest 4 months ago

@@saeidsas2113 Unfortunately I don't have time to do much consulting work. :(

@saeidsas2113 4 months ago

@@statquest If you do not mind, I will post my question here :) To begin with, I am a model validator, and one of our tasks is to ensure that a model works as expected and is fit for business purposes. To do so, back-testing is typically performed to check the model's performance. In simple terms, we have the following problem:

A financial model generates thresholds at a confidence level of 99 percent. To check the model's performance, it is important to count the number of defects over a given period, which is usually 250 working days (i.e., one year). A defect occurs if the relative market movement over 10 days is greater than the threshold, in other words: log(P_{t+10} / P_{t}) > v_t, where t = 1, 2, ..., 240, P_{t} is the market price at time t, and v_t is the threshold that comes out of the model. Note that the market movements are obtained on a rolling basis, so we have overlapping intervals. If we believe that the model works well, then we expect the number of defects observed over the 240 observations to be about 2.4 (roughly 2 to 3 violations), because at the 99 percent confidence level there is only a 1 percent chance of observing a defect, i.e., 240*0.01 = 2.4.

Now let's consider the hypothesis test that needs to be done in order to back-test the model:
Null hypothesis: p = 0.01
Alternative hypothesis: p > 0.01
where p is the probability of a defect. Under the null hypothesis, the model works as expected because the probability of a defect is 1%, which is acceptable at the confidence level of 99 percent.

Here are the steps taken to back-test the model:
1) Compute the spread, which is the difference between the market movement and the threshold, i.e., Spread = log(P_{t+10} / P_{t}) - v_t.
2) Generate 1000 synthetic samples, each of size 240, from the original spreads while preserving the dependency structure; for example, the Maximum Entropy Bootstrap approach is applied at this stage.
3) Count the number of positive spreads (indicating defects) in each synthetically generated sample.
4) Obtain the defect ratio for each synthetically generated sample using (#defects)/240.
5) Use the distribution of the generated defect ratios (i.e., the probability of a defect) to find the p-value corresponding to the above hypothesis test. So, using p*_1, p*_2, ..., p*_1000, we calculate the following probability: p-value = P_H0( p > 0.01 ), which is approximated based on the distribution of p*_1, p*_2, ..., p*_1000.

My question: here the quantity under consideration is the probability of a defect, or we could consider the defect rate. If the observed defect rate in the original data set is greater or less than 0.01, then we need to apply a transformation, like what you did for the mean where you shifted the data to get a zero mean, so that the ratio equals 0.01, and then generate samples from spreads for which the defect ratio is 0.01 to compute the probability of being greater than 0.01 under the alternative hypothesis, right?
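
Steps 1 to 4 of this procedure could be sketched in Python roughly as follows. Several things here are assumptions and not what the commenter used: the spreads are made-up random numbers, a simple moving block bootstrap is swapped in for the Maximum Entropy Bootstrap to keep some of the dependency between neighboring days, and the helper names and the block length of 10 are hypothetical. The sketch does not address the open question of how to impose the null hypothesis on the data.

```python
import numpy as np

def moving_block_bootstrap(series, block_len, rng):
    """Resample a series by gluing together randomly chosen contiguous blocks."""
    n = series.size
    n_blocks = -(-n // block_len)  # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    blocks = [series[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n]  # trim to the original length

def bootstrap_defect_ratios(spreads, n_boot=1000, block_len=10, seed=0):
    """Defect ratio (share of positive spreads) in each block-bootstrapped sample."""
    rng = np.random.default_rng(seed)
    spreads = np.asarray(spreads, dtype=float)
    ratios = np.empty(n_boot)
    for i in range(n_boot):
        resampled = moving_block_bootstrap(spreads, block_len, rng)
        ratios[i] = np.mean(resampled > 0)  # a positive spread is a defect
    return ratios

# Made-up spreads (assumption): 240 values that are positive about 1% of the time.
rng = np.random.default_rng(7)
spreads = rng.normal(loc=-2.33, scale=1.0, size=240)

ratios = bootstrap_defect_ratios(spreads)
print(f"observed defect ratio: {np.mean(spreads > 0):.4f}")
print(f"mean bootstrapped defect ratio: {ratios.mean():.4f}")
```

The bootstrapped ratios are centered near the observed defect ratio, not near 0.01, which is why the question about transforming the data to match the null hypothesis matters.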

@saeidsas2113 4 months ago

@@statquest That is fine. However, I have already asked my question, and I think it is interesting enough to be considered. Feel free to answer it. Thank you for your time.