This video will talk about some of the basics of bootstrapping, which is a handy statistical tool, and how to do it in Stata.
Comments: 71
@I.amBago 4 months ago
First 3 minutes were exactly what I wanted to know. Thank you!
@aysecetinel 2 years ago
This was super helpful!!! I first bootstrapped using bsample and then ran a multi-level fixed effect using reghdfe in the loop. It works great! I noticed by default it sets the obs to be equal to the size of the dataset that you are sampling from. It also lets you oversample by setting an obs greater than the size of the dataset. I also tried bootstrapping by using the command bootstrap, reps(#): then reghdfe. This by default lets you specify the obs number to sample as equal to the number of clusters in the dataset. Thank you again for creating content and sharing! Looking forward to reading your book and hope that you'll have workshops tailored to grad students around the world.
@jessyjkn 3 years ago
Omg you literally SAVED MY LIFE!!!!! Thank you Thank you Thank you!!!!!!
@tarantula6649 1 year ago
Very helpful video! Thanks a lot!
@pablovelazquez1903 6 years ago
Thank you for this clear explanation.
@lifehappy217 3 years ago
Hi, Nick. Thank you so much for the nice video. I am doing panel regression, and wondering whether it is possible to use bootstrap to get the confidence intervals for the panel model using Stata (or R)?
@NickHuntingtonKlein 3 years ago
Yep! That's a different goal than in this video though. See www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/
@lifehappy217 3 years ago
@@NickHuntingtonKlein Thank you so much. This is what I want to learn. It is helpful.
@yasmindoghri9175 2 years ago
Thank you very much for this video!! I was wondering if I could use bootstrap with different samples. To construct an index, I merged in data from a different dataset (I extracted mean values per variable from the latter one since it is way larger than my sample). I would like to check if the final index measurement is influenced by the external sample dimension. So as the original dataset I considered my sample, and after preserve I inputted the external dataset, whereas in the loop I put the distance index formula. Yet, once I run it, it says already preserved. What am I getting wrong?
@NickHuntingtonKlein 2 years ago
There's a bit too much in here for me to follow it, but if you're getting an already-preserved error, that means that you tried to preserve twice in a row without a restore in between. So make sure each preserve is matched by a restore, or if you want to clear out your last preserve without restoring, use "restore, not"
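A minimal sketch of the preserve/restore pairing described in this reply. It uses Stata's bundled auto dataset as a stand-in, and the filters are illustrative assumptions rather than the commenter's actual data:

```stata
sysuse auto, clear

preserve                  // take a snapshot of the data in memory
keep if foreign == 1      // work on a modified copy
summarize mpg
restore                   // bring the snapshot back; preserve is allowed again

preserve
drop if missing(rep78)
restore, not              // cancel the snapshot and keep the modified data

preserve                  // no "already preserved" error here
restore
```

Inside a loop, the same rule applies: each pass needs its own restore (or restore, not) before the next preserve runs.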
@gabriellanocita4239 3 years ago
Thanks for the video! I'm wondering if bootstrapping can be used to run an MLM model with random effects predictors in Stata?
@NickHuntingtonKlein 3 years ago
It sounds like you're looking for bootstrapped standard errors, which is something a bit different than this video is about. But yes you can apply bootstrap SEs to any model in Stata, see www.stata.com/features/overview/bootstrap-sampling-and-estimation/
@mikecheng6010 4 years ago
Hi thank you so much Nick! If I wanna get the coefficient for each iteration, what should I do?
@NickHuntingtonKlein 4 years ago
If you are running a regression in your bootstrap you can pull a coefficient out and store it in a local (just like in the code in the video). The way to refer to a coefficient after running the regression is with _b[x], where x is the name of the variable you want the coefficient for
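A rough sketch of what pulling a coefficient out in each iteration could look like. The auto dataset, the variable name store_b, the tempfile, and the use of bsample for resampling are all assumptions for illustration; the loop in the video may differ in its details:

```stata
sysuse auto, clear
tempfile original
save `original'                       // keep a copy of the data to resample from

local boots = 200                     // number of bootstrap iterations
set seed 12345
clear
set obs `boots'
generate store_b = .                  // one row per iteration to hold the coefficient

quietly {
    forvalues i = 1/`boots' {
        preserve                      // set the results dataset aside
        use `original', clear
        bsample                       // resample rows with replacement
        regress mpg weight
        local b = _b[weight]          // coefficient on weight from this resample
        restore                       // bring the results dataset back
        replace store_b = `b' in `i'
    }
}
summarize store_b
```

After the loop, store_b holds one coefficient per iteration, so it can be summarized, plotted, or saved like any other variable.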
@mikecheng6010 4 years ago
@@NickHuntingtonKlein Got it thank you so much it works perfectly.
@aibannongspung1765 2 years ago
Hi Nick. Thank you so much for the insightful video. I have a question to ask you. I am running a regression model and have also added weights to it (e.g. I used aw=wt) since the survey data comes with a survey weight/multiplier. I need to bootstrap the model and report the standard errors thereafter. However, I cannot use the same weights while bootstrapping. Is there a way around this issue? Will the standard errors generated without weights after bootstrapping be significantly different from the standard errors of the regression model with the weight?
@NickHuntingtonKlein 2 years ago
The downloadable package bsweights will help you do this
@aibannongspung1765 2 years ago
@@NickHuntingtonKlein Thank you for the reply. I just want to mention that the survey data that I am using does not have replicate weights. From what I understand, bsweights are helpful when the survey data also includes replicate weights. Can bsweights be used to manually generate these replicate weights for survey data without them?
@NickHuntingtonKlein 2 years ago
@@aibannongspung1765 oh I see. Maybe look at svy bootstrap. The replicate weights refer to the weights you get from bootstrap www.stata.com/manuals/svysvybootstrap.pdf
@aibannongspung1765 2 years ago
@@NickHuntingtonKlein Thank you Nick. I will give it a try.
@kangkana1354 3 years ago
Thank you so much Nick. I have a query on whether bootstrapping can be used on a survey weighted data set, which uses a svy command before a regression. If yes, how can the codes be modified?
@NickHuntingtonKlein 3 years ago
If you're just trying to get bootstrapped SEs, look at the "svy bootstrap" help file
@kangkana1354 3 years ago
@@NickHuntingtonKlein Thank you so much. I am going through the file currently to clear the basics.
@ProfessorAliAhmed 4 years ago
I am using the Stata KCDF function and then putting the variable generated from this into my regression model. Since my variable is estimated, I have to bootstrap the process. I am able to do the looping and bootstrapping based on your method, but I am not able to use the generated bootstrapped variable in the model to get bootstrapped standard errors. Any suggestions would be very helpful. Thank you.
@NickHuntingtonKlein 4 years ago
Just take the standard deviation of your bootstrapped coefficient (for example, with the summarize command). That's the bootstrap standard error.
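As a hedged illustration of that point, the sketch below uses Stata's built-in bootstrap prefix as a stand-in for a hand-written loop, saves the replicate coefficients, and compares their standard deviation to the standard error that bootstrap reports. The auto dataset, the name b_weight, and the tempfile are assumptions for illustration:

```stata
sysuse auto, clear
tempfile reps

* Save each replicate's coefficient on weight under the name b_weight
bootstrap b_weight=_b[weight], reps(200) seed(1) saving(`reps'): regress mpg weight
scalar se_reported = _se[b_weight]    // the SE that bootstrap itself reports

use `reps', clear                     // one row per bootstrap replicate
summarize b_weight
display "SD of replicates: " r(sd) "   reported SE: " se_reported
```

The two numbers should agree up to storage precision. With a manual loop the same summarize step applies to whatever variable holds the stored coefficients.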
@ProfessorAliAhmed 4 years ago
@@NickHuntingtonKlein Thank you Nick!
@nandinimishra2149 1 year ago
Nice Job Nick 💓💓💓💓
@nandinimishra2149 1 year ago
May you share your ID so I can ask about a Stata-related problem?
@evahakobjanyan8528 5 years ago
Great video. I have a question. I did exactly what you show in the video, but without g x normal, because I already had data. But an error happens every time: "invalid obs no". What does it mean?
@NickHuntingtonKlein 5 years ago
The "set obs" command is for the purpose of creating the fake data, you don't need it if you already have data, and it will produce that error.
@evahakobjanyan8528 5 years ago
@@NickHuntingtonKlein Do I need the g store_means that you write before the word 'quietly'?
@NickHuntingtonKlein 5 years ago
@@evahakobjanyan8528 You need some sort of variable to store the results in, yes.
@ataliethompson6725 4 years ago
How does one get a bootstrap 95CI and p-value for the difference in two proportions, particularly in multilevel data? I have dataset where eyes are nested within subjects. I want to show that the proportion of var1 is significantly different from the proportion of var2, and since the data is multilevel I'm assuming bootstrap 95CI and p value would be the way to address this?
@NickHuntingtonKlein 4 years ago
For multilevel data you generally want to do bootstrap sampling by cluster. Once you do that, just store all the ratio estimates from all the bootstrap iterations. The 2.5th and 97.5th percentiles of the estimates are your confidence interval.
@ataliethompson6725 4 years ago
@@NickHuntingtonKlein How does one bootstrap for the difference in two proportions (as opposed to a mean)?
@NickHuntingtonKlein 4 years ago
@@ataliethompson6725 that's the beauty of bootstrap - just calculate whatever it is you want to calculate in each of the bootstrap samples. So calculate the difference in proportions
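A minimal sketch of the approach in this exchange: resample by cluster, compute the difference in proportions in each bootstrap sample, and read the 2.5th and 97.5th percentiles off the stored estimates. The data are simulated, and the names subject, var1, var2, and diff are hypothetical placeholders for the eyes-within-subjects setup:

```stata
clear
set seed 2024
set obs 100                           // 100 hypothetical subjects
generate subject = _n
generate u = rnormal()                // subject-level effect shared by both eyes
expand 2                              // two eyes per subject
generate var1 = (u + rnormal() > 0)   // made-up binary outcomes
generate var2 = (u + rnormal() > 0.5)
tempfile original
save `original'

local boots = 500
clear
set obs `boots'
generate diff = .                     // difference in proportions, one per iteration

quietly {
    forvalues i = 1/`boots' {
        preserve
        use `original', clear
        bsample, cluster(subject)     // draw whole subjects, keeping both eyes together
        summarize var1, meanonly
        local p1 = r(mean)
        summarize var2, meanonly
        local p2 = r(mean)
        restore
        replace diff = `p1' - `p2' in `i'
    }
}

_pctile diff, percentiles(2.5 97.5)
display "95% percentile CI: [" r(r1) ", " r(r2) "]"
```

Swapping in a different statistic only changes the lines between bsample and restore; the rest of the loop stays the same.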
@andreab2114 4 years ago
What if I have missing values or a multiply imputed dataset?
@NickHuntingtonKlein 4 years ago
Missing values you just keep using as normal. For multiple imputation you could bootstrap each imputation separately. There might even be a special MI bootstrap in Stata 16, I'm not sure, they added a bunch of MI stuff
@user-cr7hy7sr7s 3 years ago
Thank you so much for your wonderful video! I just registered this channel as my favorite. Thanks. I'm wondering if I could use this in the regression command. In each loop, I opened the original dataset, ran the regression command and obtained the coefficient. Then I aggregated the results of each resampling. (I mean I calculated the mean and sd of the coefficient.) Am I right?
@NickHuntingtonKlein 3 years ago
Yep, that works
@user-cr7hy7sr7s 3 years ago
@@NickHuntingtonKlein Thanks! Your videos went viral in my community!
@user-cr7hy7sr7s 3 years ago
@@NickHuntingtonKlein By the way, in Stata the bootstrap command can also work, but the coefficients do not change and only the standard errors change. I could not understand why.
sysuse auto, clear
regress mpg weight gear foreign
regress mpg weight gear foreign, vce(bootstrap, rep(1000))
In the second command, you can get the coefficient and SE. But the coef is actually the same as in the original model. What is the difference?
@NickHuntingtonKlein 3 years ago
@@user-cr7hy7sr7s The second command is estimating the coefficient by regular OLS and only the standard errors by bootstrap. This is actually a good idea if you plan to use them for hypothesis tests, as it helps any hypothesis tests done after the fact be sure they're comparing the right things.
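A small check of that behavior, sketched under the assumption that the auto dataset and 200 replications are good enough for illustration: the coefficient matches plain OLS while the standard error comes from the bootstrap.

```stata
sysuse auto, clear

regress mpg weight gear_ratio foreign
scalar b_ols  = _b[weight]            // OLS coefficient
scalar se_ols = _se[weight]           // conventional OLS standard error

regress mpg weight gear_ratio foreign, vce(bootstrap, reps(200) seed(1))
scalar b_bs  = _b[weight]             // same point estimate as OLS
scalar se_bs = _se[weight]            // standard error from the bootstrap replicates

display "coef: " b_ols " vs " b_bs
display "SE:   " se_ols " vs " se_bs
```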
@user-cr7hy7sr7s 3 years ago
@@NickHuntingtonKlein Thank you very much! Got it! Now I understand the mechanism. Much appreciate it. I am working on prediction model development and I wanted to learn how to perform internal validation using the bootstrap resampling method. I guess your program would work to calculate the optimism statistics to evaluate the prediction model based on the regression models. Aren't you going to make some video on this topic??
@QuynhNguyen-ij6fe 4 years ago
Can you guide using bootstrap with xtabond2? Thanks
@NickHuntingtonKlein 4 years ago
For bootstrap SEs? I'm not certain that the bootstrap standard error assumptions are justified in the Arellano-Bond case. But in any case you should be able to apply the guide on this page about bootstrapping in a panel/ts setting www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/
@YorgosEU 5 years ago
I am doing a cost-effectiveness analysis for costs and health benefit. From my data I calculated an average cost and an average effect per treatment arm in order to calculate the ICER. Then my supervisors told me that this is not enough and that I need to do bootstrapping... I know how, but I DO NOT HAVE A CLUE WHY I need to do this. Does anyone know? THANKS!!
@NickHuntingtonKlein 5 years ago
I would recommend posting this question in more detail on StackExchange
@YorgosEU 5 years ago
@@NickHuntingtonKlein thanks Nick
@alisadavtyan2133 5 years ago
What command should I change if I already have an existing variable? I mean this part: g X=rnormal(4)*2+4
@NickHuntingtonKlein 5 years ago
Bootstrapping over an existing variable? It should all work the same, you can just skip generating a new variable and use the old one.
@alisadavtyan2133 5 years ago
@@NickHuntingtonKlein And what about set obs 10000? Should I write my obs number?
@NickHuntingtonKlein 5 years ago
@@alisadavtyan2133 Everything before the "save originaldata.dta" line is just me creating the fake data, you don't need it. You can just open up your existing data instead.
@alisadavtyan2133 5 years ago
@@NickHuntingtonKlein And is local boots the number of my obs?
@NickHuntingtonKlein 5 years ago
@@alisadavtyan2133 That's the number of bootstrap iterations
@justalice5139 5 years ago
What if it shows "floor not found"?
@NickHuntingtonKlein 5 years ago
That suggests there's an error in the line with floor in it. Remember, floor is a function, not a variable. So floor() is correct, not floor () or floor*()
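For illustration, here is one way floor() can be used to draw random row numbers for resampling. This is a guess at the kind of line involved, not a copy of the video's code; the auto dataset and the names pick and mpg_boot are assumptions:

```stata
sysuse auto, clear
set seed 1
local n = _N

* floor() is called with its argument directly inside the parentheses, no space or * before them
generate pick = floor(runiform()*`n') + 1   // random row number between 1 and n
generate mpg_boot = mpg[pick]               // draw mpg with replacement using that row number

summarize pick mpg_boot
```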
@HE-gw2gr 10 months ago
How can I implement the Kónya (2006) bootstrap panel Granger causality approach in Stata? Please help me 😢
@NickHuntingtonKlein 10 months ago
No idea! Never heard of it. If it were me I'd Google for it.
@HE-gw2gr 10 months ago
@@NickHuntingtonKlein Thank you. Of course I searched, unfortunately I couldn't find it.
@diverdown0011 6 years ago
Could you provide the do file. I keep getting an error
@NickHuntingtonKlein 6 years ago
Walter Chin I'm afraid I didn't keep the do file. It's just the same code you can see in the video though.
@diverdown0011 6 years ago
Thanks for taking the time to reply. I figured it out. There was a minor issue in the code I entered. The boot code is working. Would you happen to know how this can be done for nested data? I have diving data with parameters of depths and bottom times (how long and how deep). These dives belong to a group of 17 small-scale fishermen divers. Each fisherman conducted 100-400 dives per year. My goal is to get a good understanding of their average depth and bottom time. The dives are nested within each fisherman. The averages per fisherman have a lot of variance. Anyway, any help is greatly appreciated.
@NickHuntingtonKlein 6 years ago
Walter Chin There are two ways to go about this depending on what you want to do with it. One uses the "strata" option of bsample, and the other uses the "cluster" option (see help bsample). Strata does a bootstrap such that you are resampling within fishermen (ie fisherman A did ten trips and B did 16, so you resample from A ten times and B 16 times). Cluster resamples at the fisherman level (ie it will resample from fisherman A and fisherman B, picking all the trips that fisherman goes on). If the problem is that there's a lot of noise within fishermen, you probably want the strata option, but I'd recommend looking closer at the help file for more details.
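A short sketch contrasting the two bsample options mentioned in this reply, on simulated data. The names fisher, ndives, and depth and all the numbers are made up for illustration:

```stata
clear
set seed 42
set obs 17                                      // 17 hypothetical fishermen
generate fisher = _n
generate ndives = 100 + floor(301*runiform())   // 100 to 400 dives each
expand ndives                                   // one row per dive
generate depth = 20 + fisher + rnormal(0, 5)    // made-up depths
local n_orig = _N

preserve
bsample, strata(fisher)       // resample dives within each fisherman
assert _N == `n_orig'         // each fisherman keeps his original number of dives
summarize depth
restore

preserve
bsample, cluster(fisher)      // resample whole fishermen, taking all of their dives
summarize depth
restore
```

Wrapping either bsample call in a loop and storing the mean each time gives the bootstrap distribution of the average depth under that resampling scheme.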
@ASMTowhid 6 years ago
Could you please help me? My code is not working. It's showing the following error: . set obs 'boots' ''' invalid It is not an integer or its value is too large.
@hamaybe 4 months ago
@@ASMTowhid the first apostrophe should be a backtick (next to the one) i.e. `boots'; it is an annoying feature of specifying locals
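A tiny example of the quoting rule in this reply, with the local name boots taken from the thread and the value 1000 chosen arbitrarily:

```stata
clear
local boots = 1000

* A local macro is opened with a backtick (`) and closed with a straight apostrophe (')
set obs `boots'

display "Created `boots' empty rows; _N = " _N
```

Typing a straight apostrophe on both sides of boots leaves the macro unexpanded, so set obs receives text instead of a number and fails.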
@afshanyounas4495 3 years ago
i am still confused....
@anusuyabiswas6687 4 years ago
complicated and confusing... Better to use original data