An intuitive introduction to Difference-in-Differences

230,797 views

Doug McKee

9 years ago

Difference-in-Differences is one of the most widely applied methods for estimating causal effects of programs when the program was not implemented as a randomized controlled trial.
In this video I describe the situations where the method is applicable and give you the intuition behind it. I also explain how and why you might want to use regression to estimate diff-in-diff effects. Throughout, I talk about the key assumption required for the diff-in-diff estimate to be valid.
Intended audience: Folks who have had some exposure to linear regression models, but want to learn more statistical methods.

Comments: 106
@dianaadamczyk5273 6 years ago
Can't tell you how useful your videos are. Thanks for passing on the knowledge!
@anuvaagarwal3492 3 years ago
One of the best and most lucid explanations of the DID method. Thank you for this, Doug. Especially how you explain the intuition behind why the DID estimate calculated by hand is the same as the one estimated by the regression model. And the part where you elaborate on the simple benefits of using a regression for a DID model is great. Really appreciate you sharing your understanding here.
@pedrocolangelo2458 3 years ago
This is probably one of the best videos on this subject that I've ever seen. Thanks!!
@bl.l1506 4 years ago
Your videos have been vital for understanding the contents of my statistics course for me! So far, I've supplemented every new concept with your videos. Sometimes, I even watch your video first and then do the readings. Please keep doing these videos!
@lawrencecobb2107 2 years ago
This is such a clear and helpful video. I’m taking an exam in an hour and doing last minute double checks. This makes me feel more confident, thank you
@thefadingmoonlight 8 years ago
Thank you so much for uploading this! I had looked online at DID and was confused. This made it so easy to understand and apply.
@monicamu8013 4 years ago
When I watched the video for the first time, I was totally lost. The second time, I paused in between to allow myself more time to understand your super intelligent and super long sentences. It is so much clearer now. Thank you so much!
@SalehBabazadeh 8 years ago
Thank you so much Doug! I just wanted to encourage you to keep up this great job. Your videos are awesome, and I believe they are being used by different people in different fields.
@dougmckee673 8 years ago
+Saleh Babazadeh Thanks so much for the kind words! I really should post more of these!
@TaroQuispe 4 years ago
@@dougmckee673 thanks from my side too, very clear and easy to understand. Do consider posting similar vids on regression techniques and similar, cheers!
@brothermalcolm 3 years ago
Absolutely brilliant tutorial, first result returned, wish youtube was always this helpful!
@zaraazami4936 8 years ago
Thank you so much! This video was waaay more helpful than reading pages and pages on DD! Very clear and to the point! Thank you!!
@sharonie 3 years ago
Best Diff-in-diff course I have learned. Thanks!
@Josefk40 8 years ago
Excellent explanation in 12 minutes. Thank you
@Itachi0567 4 years ago
thanks a lot for this clear explanation, you don't know how much it helped me
@marben7062 8 years ago
Thank you very much Doug. It helped me to analyse my data (pooled cross section).
@yading9202 5 years ago
Very clear, easy to understand. Great job!
@anglofranses8205 3 years ago
This is pure gold. Thanks!
@xb2856 1 year ago
way more intuitive than previously thought, well put thanks
@braddoremus588 6 years ago
Thank you - very good explanation. Helped clear a lot up for me.
@techierealestate 5 years ago
Clear and right to the point. I always wondered why the multiplication coefficient is the DD coeff, Now I know :D
@fritzlouw8434 8 years ago
Much appreciated. Keep it up man!
@wisuraweerathunga2188 4 years ago
Thanks for this one ! You made it clear !
@rheabanerjee4938 5 years ago
I wish you would post more, you're great!
@zhouchen7682 8 years ago
Very useful, wait for more.
@Non-disjunction 3 years ago
You are such a legend mister McKee
@DavidLihm 8 years ago
Thank you so much, this has been really useful!
@VikramSingh-sf1ev 2 years ago
Very clear to the point
@xingu7561 5 years ago
It is really helpful! This video is easy to understand for new learners like me! I really appreciate your help! If I can survive my PhD program, I hope I can make videos like this in the future!
@tuhinurrahmanchowdhury9705 3 years ago
Great video. It saved me!
@sembilanbereguler2602 8 years ago
Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis (that is, for saying that the effect of the lunch program is statistically significant)?
@digray6732 2 years ago
Thank you for this! I didn't quite understand the very last point, i.e. the difference between the points made for when DD is 'ok' (appropriate) and 'not ok'
@oldtree700 6 years ago
Hi, Doug! Thank you so much for your great video. I have a quick question. At the end of the video you mentioned the example for the case where DiD is not ok. If the free lunch program has already been implemented in the control group, is there any way I can still use it as a control group? Can semiparametric DiD be used?
@leopan54321 2 years ago
Dude. This saved me thanks :)
@huekim589 2 years ago
Very good and funny videos bring a great sense of entertainment!
@vedantss 2 years ago
Very useful!
@kevinvandenbrink8214 9 years ago
Thanks for the video, really helped me in my finance research. Just one thing: when you talk about the dummy variable Dtr, I think it takes 1 if the person is in the treatment group and 0 if the person is in the control group.
@dougmckee673 9 years ago
Kevin van den Brink You're exactly right--When (if) I re-record this I'll fix that. Thanks!
@libbyalthea3061 7 years ago
Hello! Thank you for a great video! Do you have any advice for estimating the necessary sample size before implementing treatment? Thanks!
@Non-disjunction 3 years ago
Amazing video
@eiinre 7 years ago
Hi Doug, how do I add additional controls (i.e. X) into the model? I am using SPSS to do the DiD. Do I just add the control variable and regard it as an independent variable?
@saraly2 2 years ago
Thank you!
@thej1091 2 years ago
Thank you kind sir! :)
@Ytremz 8 years ago
Brilliant
@homayoungerami4176 4 years ago
thanks, it was easy to digest
@GradualReportSerbia 3 years ago
Abrupt ending, good video
@ursulapulyer916 8 years ago
thank you!
@sjhoenen 9 years ago
Thanks!
@bright1402 5 years ago
Thank you so much for your video! But in the last slide, I could not understand the Not OK case...
@sarapluviano410 7 years ago
Hi, thanks for the video. In the beginning you say that DID is useful for estimating causal effects of programs when the program is not implemented as a randomized controlled trial. So, in a randomized controlled trial, is DID not necessary? Thanks!
@GoonieFridkin 8 years ago
Hi. Thanks so much for this! Quick question though. I've just run a DD regression on my data. The DD beta score isn't significant, but the group (test vs control) beta is. What does this mean?
@dougmckee673 8 years ago
The insignificant DD beta means you cannot detect a statistically significant effect of the treatment. The significant group beta means you have significant pre-treatment differences between the groups.
@tarpinianmt 9 years ago
Thank you so much for this, I had never heard of difference in differences until a reading I had for economic development. I'm actually planning to reference this video in a paper; do you have anything you'd want me to include for a citation? Thanks again.
@dougmckee673 9 years ago
Matthew Tarpinian I'm really glad you've found the video helpful, but it's probably not appropriate for a citation in your paper. If you want a good reference for the method, I suggest using Angrist and Pischke's _Mostly Harmless Econometrics_ instead.
@sembilanbereguler2602 8 years ago
Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis?
@emeraldwei6672 1 year ago
Thank you! I would like to know, if there isn't a comparable group, like Rio, then how can one figure out the effect of this programme?
@lauramendezcarvajal5149 7 years ago
Douglas, thanks for this amazing video, it helped me so much! I just have a question: why does (y) have only one test score? I am a little bit confused about the pre-test and post-test information. If I have the test scores before the implementation and the scores after, how do I compute them? Thanks
@dougmckee673 7 years ago
The key is to have (or be able to compute) the average test score of both groups before AND after the intervention.
@inferno9004 8 years ago
Great video Doug!!! If there is just 1 treatment group and 1 control group with pre vs post time data, and we want to include many control variables, say 5, how do we fit a model with 5 control variables? What does the regression equation look like?
@dougmckee673 8 years ago
+inferno9004 It looks just like the regression model shown in the video with the addition of your control variables.
@chocolateyum678 6 years ago
thank . you!!!!!!!!
@josephdover6822 8 years ago
Hi Doug! Thank you so much for your video. I just wanted to ask you a small question: I am also planning to use the difference-in-differences model. I am looking at the impact of the EURO (introduced in 1998 and in circulation in 2002) on trade flows between countries in Europe, and I am new to Stata, hence I am not too sure how to proceed. I did the following regression:
regress Tradeflow Governmenteffectiveness1 Unemployment1 GDPpercapita1 Populationsize1 Governmenteffectiveness2 Unemployment2 GDPpercapita2 Populationsize2 Distance1-2
But I am not sure what I should do next. Any help would be very much appreciated! :) Best, Joseph
@dougmckee673 8 years ago
+joseph dover To apply a difference in difference, you'll need to divide your trade flows into some set that might be affected by the introduction of the Euro (treatment) and another set that definitely would not be (control). You will also need to reshape your data so you have observations of each trade flow before and after the Euro was introduced. Then you should be able to apply the regression method shown in the video. Good luck!
@bright1402 5 years ago
Thank you for your video! But at the time 8:06, what is the difference between \beta_0 and \epsilon?
@jotaeleoh 5 years ago
Beta_0 is the intercept: the expected value of the outcome "y" when all the dummy variables are zero (the control group before the program). Epsilon is the error term, which captures everything else that affects "y" but is not included in the model.
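To make the roles of the coefficients concrete, here is a sketch of the two-group, two-period model and what each term means. The coefficient numbering (\beta_3 on the interaction) and the dummy names D_{post} and D_{tr} are taken from other comments in this thread and may not match the slide at 8:06 exactly; the derivation also assumes the error term has mean zero within each of the four cells.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 D_{post} + \beta_2 D_{tr} + \beta_3 (D_{post} \times D_{tr}) + \epsilon \\
E[y \mid \text{control, pre}]    &= \beta_0 \\
E[y \mid \text{control, post}]   &= \beta_0 + \beta_1 \\
E[y \mid \text{treatment, pre}]  &= \beta_0 + \beta_2 \\
E[y \mid \text{treatment, post}] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\
\text{DiD} &= \big[(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_2)\big]
            - \big[(\beta_0 + \beta_1) - \beta_0\big] = \beta_3
\end{align*}
```

Read this way, \beta_0 is a fixed number (the control group's average before the program), while \epsilon differs from observation to observation and averages out within each cell.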
@Dniem 3 years ago
Hello Professor Armstrong!
@mkerryable 3 months ago
In this EX, are y-scores the post-scores or the pre-post differences? I'm guessing just post scores? Thanks for clarifying!
@linearseller2835 8 years ago
What a great video. I did miss conclusions about the example, though. Beta3 is 30, but it has a p-value equal to 0.228. So we can conclude that the effect of the free lunch plan is not statistically significant (at the 95% level), right? Those 30 points could have been by chance, right?
@dougmckee673 8 years ago
+Linear Seller Absolutely correct and not that surprising given there were only 10 observations in this sample.
@alfonsoga95 4 years ago
Thanks, I have one question though, what's the name of the program you're using for the regression? I'm not familiar with it, I find it quite practical
@oyvsni6679 4 years ago
Doug is using Stata
@hd81504 7 years ago
First off, thanks for the great video, Doug! I have a follow-up question to one of the comments below. One person commented: So do I understand correctly that an extension of the model for 3 treatment groups and 1 control with pre and post could look like the following:
y = β0 + β1 * Dpost + β2 * Dtr1 + β3 * Dtr2 + β4 * Dtr3 + β5 * Dpost * Dtr1 + β6 * Dpost * Dtr2 + β7 * Dpost * Dtr3 + β8 * X
β5: DiD effect for Treatment 1
β6: DiD effect for Treatment 2
β7: DiD effect for Treatment 3
And you replied that is correct. So my question is: can you do this same procedure in logistic regression when your dependent variable is dichotomous (e.g., disease vs. no disease)?
@dougmckee673 7 years ago
Interpreting coefficients on interaction terms in nonlinear models (like logistic) is tricky. If it were me, I would just estimate a linear probability model, but there's a much longer (and better) answer here: stats.stackexchange.com/questions/89513/difference-in-differences-estimator-for-logistic-regressions
@lemoncobra2563 5 years ago
In response to Doug, a word of caution on using an LPM: you can get predicted probabilities outside the [0, 1] range, and your errors will be heteroskedastic. The latter can be fixed with an extra option, but the former is a fundamental issue with the estimator itself. I would argue the point of using DiD is to examine the magnitude of change from a program, etc., and with a logit regression you will get your coefficients, calculate the margins, and use the margins to calculate the change in probability that the DD had on your dependent variable. You're kind of muddling the point of using a logit in this regard, but it still works. It kind of loses some explanatory power and loses the charm. Still doable though.
@zeinebouni8764 8 years ago
Hi Mr Doug, Thank you for this interesting video. Is it possible to do DID with ordinal outcomes? My variables: firm rating (Y); D1 (D1 == 1 if treated sample; 0 if control sample); D2 (D2 == 1 if after treatment; 0 if before). I didn't find any examples showing whether it is possible or how to interpret the estimators. Your response is very important to me. Thank you.
@dougmckee673 8 years ago
+Zeineb Ouni I haven't seen it done, but I believe you could estimate an ordered logit model (ologit) with the same covariates shown above (D1, D2, and D1*D2 in your case). You have to be careful with interpreting interactions in the ordered logit, but I think the basic idea is valid.
@zeinebouni8764 8 years ago
+Doug McKee Thank you so much.
@hassanmurtzakhan 9 years ago
I am trying to run this in Stata and it omitted Beta3 because of multicollinearity between variables. Can you guide me on how to handle it? Thanks
@dougmckee673 9 years ago
Hassan Murtza Khan I don't usually answer Stata questions on YouTube, but I'll make an exception just this once. :) There are two possibilities. The first is that you don't have observations for each group (treatment and control) in both the before and after periods. Tabulate your treatment dummy against your post-period dummy and make sure all four cells have observations. The second possibility is that you made a mistake constructing the interaction variable. Check this by tabulating the interaction with each of the dummies to make sure the result makes sense. Now your job is to try these and report back so everyone can learn!
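The two checks described in this reply can be sketched outside Stata as well. The Python below is only an illustration under assumed names: the field names `treated`, `post`, and `treated_x_post` and the sample rows are hypothetical, not taken from the commenter's data.

```python
from collections import Counter

# The four (treated, post) combinations a 2x2 DiD design needs.
ALL_CELLS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def empty_cells(rows):
    """Return the (treated, post) cells that have no observations."""
    counts = Counter((r["treated"], r["post"]) for r in rows)
    return [cell for cell in ALL_CELLS if counts[cell] == 0]


def bad_interactions(rows):
    """Return rows whose interaction is not the product of the two dummies."""
    return [r for r in rows if r["treated_x_post"] != r["treated"] * r["post"]]


# Hypothetical data with every cell present and a correct interaction term.
good = [{"treated": t, "post": p, "treated_x_post": t * p} for t, p in ALL_CELLS]

# Same data without any treated observations in the pre-period. Here the
# interaction equals the treated dummy in every row, so the two columns are
# perfectly collinear and a regression package has to drop one of them.
no_treated_pre = [r for r in good if (r["treated"], r["post"]) != (1, 0)]

print(empty_cells(good))            # no missing cells
print(empty_cells(no_treated_pre))  # the treated, pre-period cell is missing
print(bad_interactions(good))       # no badly constructed interactions
```

If `empty_cells` returns anything, the design cannot identify the interaction coefficient, whichever software is used.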
@johndupont8596 8 years ago
Hi Doug, Thanks a lot for the video! I just have a question. I want to estimate a difference-in-differences model in Stata comparing students who received maths lessons with those who didn't. I would like to test whether having extra maths lessons helps students achieve higher marks. My variables are: "StudentID" "TIME" "MATHS_LESSON" "MARKS". But the problem I have is that not every student received maths lessons over the period, and I would like to create 2 groups, one "maths_lesson" and one "Nomaths_lesson", by adding them to the variable column "StudentID". How should I proceed? Let me recap: what I am now trying to obtain is a graph with "time" on the x axis and "marks" on the y axis with two lines (one for the group of students who took maths classes and one for the group that didn't), but I am struggling a bit to achieve this. Hope I am clear in describing my problem! Best regards, John
@dougmckee673 8 years ago
+John Dupont Using your TIME variable, you should divide your observations into "before" and "after" groups. You've already divided your students into those that got the treatment (MATHS_LESSON) and those that didn't. Once you have that, you can compute means of the four cells and subtract them to get the DD estimate. I advise first understanding your data and computing the required numbers before worrying about communicating those numbers with a graph. Hope this helps!
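The steps in this reply can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the records are invented, `PROGRAM_START` is a hypothetical cutoff, and MATHS_LESSON is treated as a fixed group indicator (1 for students who get the lessons at some point, 0 otherwise).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records using the commenter's variable names.
records = [
    {"StudentID": 1, "TIME": 2014, "MATHS_LESSON": 1, "MARKS": 55},
    {"StudentID": 1, "TIME": 2016, "MATHS_LESSON": 1, "MARKS": 70},
    {"StudentID": 2, "TIME": 2014, "MATHS_LESSON": 1, "MARKS": 65},
    {"StudentID": 2, "TIME": 2016, "MATHS_LESSON": 1, "MARKS": 80},
    {"StudentID": 3, "TIME": 2014, "MATHS_LESSON": 0, "MARKS": 60},
    {"StudentID": 3, "TIME": 2016, "MATHS_LESSON": 0, "MARKS": 64},
    {"StudentID": 4, "TIME": 2014, "MATHS_LESSON": 0, "MARKS": 50},
    {"StudentID": 4, "TIME": 2016, "MATHS_LESSON": 0, "MARKS": 56},
]

PROGRAM_START = 2015  # assumed date when the lessons began

# Sort every observation into one of four cells: (group, before/after).
cells = defaultdict(list)
for row in records:
    post = int(row["TIME"] >= PROGRAM_START)
    cells[(row["MATHS_LESSON"], post)].append(row["MARKS"])

# Average marks within each cell.
means = {cell: mean(marks) for cell, marks in cells.items()}

# Change for the lesson group minus change for the no-lesson group.
dd = (means[(1, 1)] - means[(1, 0)]) - (means[(0, 1)] - means[(0, 0)])
print(means)
print(dd)
```

The four cell means are also the points needed for the requested graph: one line joins the two means for the lesson group, the other joins the two means for the comparison group.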
@rohangopalakrishnan7417 3 years ago
Big from you Doug
@JM-fr9bc 3 years ago
What are the assumptions of dif in dif?
@monicabraga4344 2 years ago
how did you do it can you share with me , thank you
@shubrathak.p.7198 8 years ago
Hi Doug. Please help me! Can I use DID if my data does not follow the assumption of normality? If not..is there a non-parametric DID?!
@dougmckee673 8 years ago
If you have a large enough number of observations (at *least* 25, and I'd feel comfortable over 100), then your outcome doesn't need to be normal--The Central Limit Theorem says your estimate of the treatment effect will be approximately normal. I believe there are nonparametric DiD-like methods when you have a continuous treatment and you believe the effect is nonlinear, but I don't know much about them.
@shubrathak.p.7198 8 years ago
Thank you Doug!
@ec.juanfranulcuangolee3294 4 years ago
Any impact evaluation is supposed to start with #Building the #DataBase, and only then should a methodology such as DID be analyzed, isn't it?
@brucelee7782 5 years ago
I didn't get the DiD effect of 30 from 7:35, somebody help please! 😓
@liveybeha 4 years ago
I didn't either at first! Remember to average (rather than add) each set of observations before doing the DiD calculation.
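The averaging step mentioned here can be sketched as follows. The scores are invented for illustration and are not the numbers shown in the video, so the result differs from the 30 obtained there.

```python
from statistics import mean

# Hypothetical test scores for each group and period (not the video's data).
scores = {
    ("treatment", "pre"): [60, 70, 80],
    ("treatment", "post"): [90, 100, 110],
    ("control", "pre"): [50, 60, 70],
    ("control", "post"): [55, 65, 75],
}

# Step 1: average each of the four sets of observations (do not add them up).
cell_mean = {cell: mean(values) for cell, values in scores.items()}

# Step 2: compute the change over time within each group.
change_treatment = cell_mean[("treatment", "post")] - cell_mean[("treatment", "pre")]
change_control = cell_mean[("control", "post")] - cell_mean[("control", "pre")]

# Step 3: the difference between those two changes is the DiD estimate.
did = change_treatment - change_control
print(change_treatment, change_control, did)
```

Summing instead of averaging would scale each difference by the number of observations in the cell, which is why the by-hand result comes out wrong when the averaging step is skipped.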
@tjahangon7286 9 years ago
Thank you very much. This video really helps me. What statistic program did you use in this video? Stata?
@dougmckee673 9 years ago
I did use Stata to get some of the numbers shown, but the content is fairly independent of the software in this video. Stata plays a bigger role in some of my other videos.
@tjahangon7286 9 years ago
Thank you very much.
@tjahangon7286 8 years ago
Doug McKee May I ask one more question? I am using a binary dependent variable (dummy). I have searched the internet and found that it is possible to have a regression model with a binary dependent variable (in Stata: the .probit and .logit commands). In your opinion, can this also be used for the regression form of a DD model (I mean, using the command .logit y DTr DPost DTrXDPost)?
@dougmckee673 8 years ago
Short answer: Yes. Longer answer: If you use your binary dependent variable in a linear regression model exactly as shown here, you are estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one. Most economists would do this. You *could* estimate a logistic model with the same variables on the right hand side, but it is much harder to interpret the magnitude of the coefficient on the interaction.
@tjahangon7286 8 years ago
Doug McKee Do you mean that if y is a binary dependent variable and: 1. I use command [regress y DTr DPost DTrXDPost], then I am "estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one." 2. I use command [.logit y DTr DPost DTrXDPost], then "it is much harder to interpret the magnitude of the coefficient on the interaction." I hope your answer is "yes".
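To illustrate option 1, here is a minimal sketch of a linear probability model fitted by ordinary least squares in plain Python. The binary outcomes are invented, and `ols` is a small hypothetical helper written for this example rather than a library function. Because the model contains only the two dummies and their interaction, the interaction coefficient should equal the difference-in-differences of the four cell proportions.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical binary outcomes (1 = event occurred) for each (DTr, DPost) cell.
cells = {
    (0, 0): [0, 0, 0, 1],
    (0, 1): [0, 0, 1, 1],
    (1, 0): [0, 0, 0, 1],
    (1, 1): [0, 1, 1, 1],
}

# Design matrix columns: constant, DTr, DPost, DTrXDPost.
X, y = [], []
for (d_tr, d_post), outcomes in cells.items():
    for outcome in outcomes:
        X.append([1, d_tr, d_post, d_tr * d_post])
        y.append(outcome)

beta = ols(X, y)

# The same quantity by hand: difference-in-differences of cell proportions.
share = {cell: sum(v) / len(v) for cell, v in cells.items()}
did_by_hand = (share[(1, 1)] - share[(1, 0)]) - (share[(0, 1)] - share[(0, 0)])

print(round(beta[3], 6), did_by_hand)
```

The interaction coefficient reads directly as a change in probability, which is the interpretability advantage described in the reply above. The exact match with the by-hand figure relies on the model having no extra control variables.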
@vegasastras9194 3 years ago
What is that program at 8:17? Looks very neat
@donasp5391 3 years ago
Stata
@ahmedseliem3201 3 years ago
how to do a difference in difference method using SPSS? need practical steps
@brothermalcolm 3 years ago
everything made sense until @7:55 help!
@matinhewing1 6 years ago
Who down voted this video? Someone who didn't get a free lunch?
@weoweoteo 6 years ago
lol! this vid was super helpful. especially for my econometrics exam tomorrow xd
@joaoluistbarroso6917 3 years ago
Show
@Nem3siS4o 7 years ago
Thanks!