Interaction Terms in Stata

Views: 177,396

Econometrics, Causality, and Coding with Dr. HK

A day ago

This video will explain how to use Stata's inline syntax for interaction and polynomial terms, as well as a quick refresher on interpreting interaction terms.

Comments: 88

@timocallaghan4408 3 years ago

Finally I have a clue what I am looking at in this assignment. You have a statistically significant positive effect on many of our grades!

@AN-yr7nm 4 years ago

Dear Nick, big thanks, you just helped one more thesis! I really hope you get to enjoy the help you've been giving us!

@BB-se9bl 6 years ago

Dude you’re saving my thesis. Thanks!

@DiegoVillegasA 4 years ago

Saved mine too!!!

@abdullahrabaya9235 4 years ago

Great video. For me, this is the first time I have seen the analysis of a moderating variable explained in such a simple way. Thank you so much for your time and effort.

@abdullahrabaya9235 4 years ago

I have some questions. 1) What are the main differences between ## and #, and in which case can I interpret the results? To me, # seems like splitting the sample. 2) When you calculate the total effect of the moderator (b3 and b4), the p-values were not significant. So how can you do that? 3) Is there any published academic paper that uses this method? If you have one, could you please share it with me at abdullah.rabaya@siswa.ukm.edu.my? Thank you in advance.

@dehiole6463 8 months ago

Is it still true if I put white = 1 and not white = 2? 6:50

@NickHuntingtonKlein 8 months ago

Yep it will work for any value of white (although in this case white is binary so it can only be 0 or 1 anyway, but in a case with a variable with a wider range, yes)

@amyheadley6429 4 years ago

Hi Nick, thanks for the video. Just wondering how to interpret odds ratios following a regression using the logistic command? Is the odds ratio for the interaction term still an additional effect or is it interpreted differently? Many thanks

@bmaclin 1 year ago

I have the same question!

@cheerioz83 3 years ago

Wow. Just what I was looking for. Thank you!

@Garrettthethief 7 years ago

A really valuable video. THANK YOU NICK!

@elizabethdonkor3274 5 years ago

Hi Nick, thanks for the video. Quick question please: do we always have to include the IVs in the model if we are only interested in the interaction results? So, say, just having the control variables, interaction terms and DV only?

@NickHuntingtonKlein 5 years ago

It's pretty rare that you'd want to leave out the independent variables, as doing so makes the interaction terms very difficult to interpret and sometimes meaningless. Because they have no base independent variable coefficient to contrast to, the interactions lose the "how the effect of one variable changes with values of the other" interpretation

@elizabethdonkor3274 5 years ago

Thank you very much Nick. I understand it now.

@takesuretozooneyi4836 10 months ago

Will the interpretation of B4 remain the same if family income was a dummy?

@bapaknyaayadanfathan4686 2 years ago

Thank you, you really saved my dissertation.

@TinaTina-xn9on 2 years ago

Sir, what should I do about the high multicollinearity between the interaction term and one of its component variables?

@NickHuntingtonKlein 2 years ago

It's not a concern. In fact it will be correlated with both of its components by construction. If the correlation is extremely high, that may indicate that the other component doesn't have much variation, at worst. In general multicollinearity is way overblown as a concern.

@JaredForsyth 3 years ago

So in this example, the interaction term doesn't have a significant p-value -- doesn't that mean that we don't have evidence of an interaction? And would we still add the coefficient (b4) in that case? Thanks!

@NickHuntingtonKlein 3 years ago

In general you never want to pick which variables to include or exclude on the basis of significance. The interaction should be included or excluded on the basis of theory or which hypothesis you're trying to test (often, the interaction itself is what you're curious about so you'd need to keep it in). If you really want to determine whether to keep or remove it on a predictive basis, something like LASSO would be more principled than a significance test.

@Garrettthethief 7 years ago

Nick, two questions:
• If you use a fixed-effects regression instead, is the interpretation still the same?
• For a presentation about these interaction effects, would you present one interaction after another, or all interactions in one fixed-effects regression?

@NickHuntingtonKlein 7 years ago

Keep'nCalmProductions Yes, the interpretation remains the same, with the caveat that, like everything else with a fixed effects regression, you're looking only at within variation. And if you think there are multiple interactions, then you should be estimating the correct model with all the interactions included (unless your sample size is small enough that doing so runs you into power issues). Unfortunately that does also make the results harder to present.

@marinmes1722 4 years ago

Hey Nick! I have a question: is there a command in Stata that makes 1=highest rank (like you have for classrank)? I'm doing an analysis which includes ranking in a Dutch football league and am curious to know how you did it. I only figured it out by making an additional variable which is constructed by doing: 19-rank (there are 18 teams in the league I am researching). Let me know!

@NickHuntingtonKlein 4 years ago

If you have a rank variable to start with, doing 19-rank sounds like a pretty good way to do it. If you're constructing ranks from a raw variable you might look at the rank function in egen.
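A minimal sketch of the egen approach mentioned in this reply, using Stata's bundled auto data. The variable mpg is only a stand-in for a raw score such as league points, and the new variable names are made up for illustration.

```stata
* Illustration only: mpg stands in for a raw score such as league points
sysuse auto, clear

* Default egen rank(): the lowest value gets rank 1, ties share an averaged rank
egen rank_low = rank(mpg)

* field option: the highest value gets rank 1, ties share the same rank
egen rank_high = rank(mpg), field

* Alternative: rank the negated variable (highest value first, ties averaged)
egen rank_neg = rank(-mpg)

list make mpg rank_low rank_high rank_neg in 1/5
```

If a rank variable already exists, the 19-rank trick from the question does the same reversal with a single generate.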

@eddiele644 4 years ago

So when do we actually interact our variables? Is there a way to see if it is necessary or do we just do it and then see if the coefficient on the interaction term is statistically significant?

@NickHuntingtonKlein 4 years ago

Ideally, build your model using theory. Should the effect of one variable be moderated by another? But outside of that, yes, adding the interaction and checking for significance is commonly done.

@eddiele644 4 years ago

@NickHuntingtonKlein Thanks Nick

@obeydzomonda9821 4 years ago

Hi Nick. Thank you for the video. I wanted to ask if it's possible to test a moderation effect on 5 independent variables and one dependent variable using panel data in Stata.

@NickHuntingtonKlein 4 years ago

I don't see why that wouldn't be possible. Anything where you're testing that many independent variables for their effects at once is going to be a bit jumbled and likely difficult to identify though

@obeydzomonda9821 4 years ago

Thank you for your response. Well appreciated. Regards Obey

@abeypp 3 years ago

Hi Nick, may I ask if we can use the interaction term as a control variable in the same model? For example, age or ethnicity.

@NickHuntingtonKlein 3 years ago

Yes, this is the standard thing to do

@blessingsam7198 4 years ago

Very helpful. Can I generate a variable instead, though, like gen whitefeminc= 1.white*feminc? Would this count as an interaction term too? I want my table to look good, without the # in the variable name.

@NickHuntingtonKlein 4 years ago

That would work (without the 1.). It will stop functions like margins from working properly afterwards though. Generally if you want the table to look nice it's better to do it properly with the interaction in the regression, and then use a table making function like outreg2 to change the variable names
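To make the trade-off in this reply concrete, here is a rough sketch on Stata's bundled nlsw88 data. The variables collgrad and tenure are stand-ins for white and family income from the video, and coll_x_tenure is a made-up name.

```stata
* Illustration only: collgrad and tenure stand in for the video's variables
sysuse nlsw88, clear

* Hand-built interaction: the coefficients match the factor-variable version,
* but Stata no longer knows coll_x_tenure is built from collgrad and tenure
generate coll_x_tenure = collgrad * tenure
regress wage collgrad tenure coll_x_tenure

* Factor-variable version: margins can recover the tenure slope for each group
regress wage i.collgrad##c.tenure
margins collgrad, dydx(tenure)
```

After the first regression, margins has no way to tell that the three regressors move together, which is the reason given above for keeping the interaction inside the regression command.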

@blessingsam7198 4 years ago

@NickHuntingtonKlein thank you.

@JAY-ft1dw 4 years ago

Thank you for your video Nick, but do you also happen to know how to "outreg2" multiple regression outputs at the same time? (I basically want to outreg the regression output for each of a set of groups (300+ groups) and am hoping that I don't have to copy and paste that many times.)

@NickHuntingtonKlein 4 years ago

Do them one after the other. Do a regression, then do outreg2 with the "replace" option. Then do another regression and do outreg2 with the "append" option, and then do as many more append versions as you like
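For many groups, the replace-then-append pattern described here can be wrapped in a loop. This is only a sketch: it assumes the user-written outreg2 is installed (ssc install outreg2), uses race in the bundled nlsw88 data as a stand-in for the grouping variable, and the output file name is made up.

```stata
* Assumes outreg2 is installed: ssc install outreg2
sysuse nlsw88, clear

* The first pass writes a fresh file; every later pass appends a column
local action replace
levelsof race, local(groups)
foreach g of local groups {
    regress wage tenure i.collgrad if race == `g'
    outreg2 using results_by_race.doc, `action' ctitle(race `g')
    local action append
}
```

The local macro action flips from replace to append after the first regression, so no command has to be copied by hand.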

@ramachandrant2719 5 years ago

Dear Nick, thank you for the video. What is the interpretation if the interaction term involves a factor variable with three levels? For the third level, is it the difference from the omitted level or from the second level?

@NickHuntingtonKlein 5 years ago

Everything is always relative to the omitted level

@ramachandrant2719 5 years ago

Thank you for the prompt response

@andersengbergpetersen4437 5 years ago

Really great video! I'm looking at social origin effect on occupational attainment in three different countries and want to run a regression with an interaction between the social origin dummies and country. However, in the dataset (ESS) the country variable is a string variable (no numbers assigned). Is it possible to do an interaction when one of the variables is a string variable?

@NickHuntingtonKlein 5 years ago

Use encode to turn it into a numeric variable, then use i. with it.
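A small sketch of that workflow on the bundled auto data. Since that dataset has no string country variable, one is created first with decode; origin_str and origin_id are made-up names standing in for the ESS country variable.

```stata
sysuse auto, clear

* Build a string variable to play the role of the string country variable
decode foreign, generate(origin_str)

* encode maps each distinct string to an integer and keeps the text as value labels
encode origin_str, generate(origin_id)

* The encoded variable can now be used with i. and in interactions
regress price i.origin_id##c.weight
```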

@andersengbergpetersen4437 5 years ago

Fantastic, and yet so simple. Thanks a lot. You really saved my day!

@nikitapestov7602 3 years ago

The video sound is pretty good, beyond my imagination

@cate-pillar 5 years ago

Thank you Nick, this really helped!

@kselev1708 4 years ago

Super simple and interestingly explained. Thank you very much!!! Dear Nick, could you maybe create a set of videos about using Stata for econometric analysis E2E? That would be soooo great and helpful for many master's/PhD students!

@NickHuntingtonKlein 4 years ago

Glad you like the material! Unfortunately I'm not planning to create any new Stata material in the near future. Plenty of E2E material in R though!

@annedenhartog9389 6 years ago

Can you also use control variables?

@NickHuntingtonKlein 6 years ago

Yep, just add 'em in there.

@annedenhartog9389 6 years ago

Thank you! I want to interpret whether income differs between men and women within different education levels (middle and upper). So I get an interaction term of 0.2258 for middle if someone is male and 0.7203 for upper if someone is male. How do I interpret that?

@NickHuntingtonKlein 6 years ago

I'd recommend writing out your regression equation and plugging in values like in the video. I couldn't really say the interpretation without knowing your model.

@emilienneyameogo3525 4 years ago

Hi Nick, thank you very much. Is it possible to use interaction terms with the PMG model? Because when I try, it is telling me (note: lnKOF dropped because of collinearity//initial vector: extra parameter lnKOF found//specify skip option if necessary). How can I use all my variables in the same model with the interaction term without collinearity issues?

@NickHuntingtonKlein 4 years ago

Yes it is. That error generally means that you have perfect multicollinearity, i.e. you can perfectly predict lnKOF with a linear mix of your other variables. Try running the same model as a regression and you should get the same result. Your model is misspecified.

@emilienneyameogo3525 4 years ago

@NickHuntingtonKlein I will try that. Thank you.

@dataman1000 2 years ago

Nicely done thanks

@jekamito 2 years ago

thank you, crystal clear!!!

@antonio23merino1 6 years ago

Nick, one question: if you interact two categorical variables (sex and education), what is the interpretation? The first variable is 1=female, and the categories of the second are primary, secondary, and university education. Thank you so much!

@NickHuntingtonKlein 6 years ago

Same basic idea, but everything relative to the omitted category. So if primary is omitted, then the coefficient on secondary is "difference between secondary and primary for men" and the coefficient on secondary plus the coefficient on female*secondary is "difference between secondary and primary for women".
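The "coefficient plus interaction coefficient" sum in this reply can be computed with lincom. A sketch on the bundled nlsw88 data, which contains only women, so collgrad and race stand in for education and sex here.

```stata
sysuse nlsw88, clear

* race: 1 = white (omitted base), 2 = black, 3 = other; collgrad: 0/1
regress wage i.collgrad##i.race

* College gap for the base race level (white)
lincom 1.collgrad

* College gap for black respondents: main effect plus interaction coefficient
lincom 1.collgrad + 1.collgrad#2.race
```

lincom also reports a standard error and confidence interval for the sum, which adding the two coefficients by hand does not give.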

@antonio23merino1 6 years ago

thank you so much!

@AK-md1mr 5 years ago

Hey Nick, first of all thanks a lot for your video!! Can I use interaction terms with longitudinal data as well? For example, I want to look at how the effects of "female" and "r_white" on "classrank" changed over the last, let's say, 4 years ("year"). Would xtreg classrank female##year r_white##year give me any information about that?

@NickHuntingtonKlein 5 years ago

Yep, that works fine
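One detail worth checking in a command like the one in the question: without a c. prefix, ## treats year as categorical. A sketch using the nlswork example panel (downloaded with webuse, so it needs an internet connection), with race standing in for female and r_white.

```stata
* Example panel from Stata's website; race stands in for female / r_white
webuse nlswork, clear
xtset idcode year

* year as categorical: a separate race gap for each survey year,
* each measured relative to the base year
xtreg ln_wage i.race##i.year

* year as continuous: a single linear trend in the race gap
xtreg ln_wage i.race##c.year
```

Both calls use xtreg's default random-effects estimator, as in the question.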

@camillepouchucq1133 4 years ago

Hi Nick, thank you very much. Is it possible to use interaction terms with stepwise selection? When I try (for example: xi : stepwise, pr(.2) lockterm1 : logistic success (i.classage i.sex) i.surgery##i.sex), it answers "interactions not allowed". Thank you for your help.

@NickHuntingtonKlein 4 years ago

Sounds like it's not compatible with interactions (some commands aren't). You can usually use interactions anyway by creating them as their own variables, but in the case of stepwise regression I'm guessing that would make it work improperly since it wouldn't know what to drop. Maybe do a LASSO instead? That's usually better than stepwise anyway.

@selcenzorlu 4 years ago

Where can we download the data you are using?

@NickHuntingtonKlein 4 years ago

I don't think I have this data on hand any more, but it comes from the NLS Investigator

@amaranwar1249 4 years ago

It's the mean difference between the base category and the reported category.

@takesuretozooneyi4836 10 months ago

video of the century

@CoryBradley 5 years ago

This is tremendous. Does anyone know if a video like this exists for logistic regressions?

@ellenkileyfriends 1 year ago

Mike Crowson has a ton of great videos on logistic

@LenaGlitzert 6 years ago

This is just awesome. Thanks a lot!

@FemkeHuisman 3 years ago

Wait, but when should you use # and when ##?

@NickHuntingtonKlein 3 years ago

# is for interaction only, ## is for interaction plus each variable by itself
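A quick way to see the difference is to write the ## model out by hand. This sketch uses the bundled nlsw88 data; collgrad and tenure are stand-ins for the variables in the video.

```stata
sysuse nlsw88, clear

* ## expands to both main effects plus the interaction
regress wage i.collgrad##c.tenure

* The same model written out by hand
regress wage i.collgrad c.tenure i.collgrad#c.tenure

* # alone: a separate tenure slope for each collgrad group,
* but no collgrad main effect, so both groups share one intercept
regress wage i.collgrad#c.tenure
```

The first two commands give identical estimates. The third is a different, more restrictive model, which is why ## is usually the safer default.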

@TheFruitInvasion 4 years ago

You really helped me thanks!!

@jawadtariq88 5 years ago

I am so confused right now. Can anyone tell me whether moderation and interaction are the same?

@NickHuntingtonKlein 5 years ago

Roughly, if z mediates the relationship between x and y, then the coefficient on x will shrink when you include z as a control. If z moderates the relationship between the two, then the coefficient on the interaction between x and z will be nonzero.