I just want to say thank you for making this video. It gives access to all researchers who don't have the benefit of having Prof. King right there in the office to talk to about methods. I really appreciate it
@DrMichael_Psychology 5 years ago
I have just read your Ho et al. (2007) paper on this topic and I have NEVER NEVER read a statistics/methodology paper that was as clear as this one. Written in a way that tries to make it as simple and down to earth as possible. I understood every single word and tip, which never ever happens with this kind of reading. A big thanks, I'm very grateful
@juliocardenas4485 3 years ago
Can you please share the citation here ?
@DrMichael_Psychology 3 years ago
@@juliocardenas4485 "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference" that's the title
@juliocardenas4485 3 years ago
@@DrMichael_Psychology thank you
@Robokopp96 1 year ago
Absolutely amazed at how simply and concisely he explained this. Thank you so much for this lecture.
@davidaustin6962 4 years ago
Every medical researcher in the world should be required to watch this. Every medical doctor should be made aware that a properly matched high-population observational study can be as good as an RCT so long as there isn't an endogeneity or simultaneity problem.
@chunhuigu4086 2 years ago
At 33:13 Dr. King gave an example where the propensity score is constant for all observations (pi = 0.5), and he showed that pruning by matching on the propensity score did worse at creating pairs when we could have had perfect matches by Mahalanobis distance matching. But I would say that PS matching is designed to solve problems where exact matching (or even coarsened exact matching) is not possible, and the PS contains information about the covariates' distribution. Dr. King basically gave an example that PSM was never designed for. It is like being able to do a paired t-test on the original data but deciding to do a two-sample t-test instead. Is the two-sample t-test inferior to the paired t-test in this case? Of course, if your data is perfect. But how often do you have perfect data like this by chance in an observational study? The answer is almost never. I don't think the title should be Why Propensity Scores Should Not Be Used for Matching. It should be that PS matching should not be used without thinking (you need to check whether other methods can be applied to the data).
@bossdunkz 2 years ago
I agree with your note here as I had a similar reaction. However, I will say that in my field people tend to default to PSM without much critical thought, as it's just what they've seen in the literature. I appreciate Dr. King explaining why choice of matching algorithm is important and what considerations we should keep in mind.
@nbwork500 2 years ago
Amen. Why in the world would anyone use the output of a PS model (or any model) that is such absolute trash that it always predicts the same value?!?! If you have a garbage model, you get matching at random. It's a straw man argument (he doesn't address GOOD PS models, he claims that they are all bad because he found that they don't work when they are done terribly). I disagree with King's conclusions EXCEPT when models are automated and no one is watching how the DV is random or the IVs don't relate to the DVs. In those cases, you get a trashy model that really does predict everyone at the same value (or nearly so). I think we can rename this "The dangers of blindly applying models that have no fit and predict nothing." I love King's way of showing how we need to make our PSs meaningful or we are adding bias and hurting ourselves. Those explanations are nice!
@kypriotis1 2 years ago
My understanding was he used that as a thought experiment to illustrate that even if you match with propensity scores on a dataset where treatment was randomly assigned (which is the dataset we hope to have), then when you prune it would be at random. But I think the more practical point is that if you had a normal observational dataset and project multiple covariate dimensions into a single dimension with a propensity score, then match and prune, and then project the remaining observations back to multidimensional space, your matched pairs could be very different covariate by covariate even though they had similar propensity scores. The distributions of each covariate overall will be similar between the treatment and control group, but the matches themselves likely will not. CEM or MDM don't have that problem and are strictly superior to PSM when possible.
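The point about similar scores hiding dissimilar covariates can be illustrated with a tiny sketch. Everything in it is hypothetical: the propensity model (logit = x1 + x2) is assumed to be known rather than estimated, and the three units are invented so that two very different covariate profiles land on the same score.

```python
import numpy as np

def propensity(x):
    # Hypothetical, assumed-known propensity model: logit = x1 + x2.
    return 1.0 / (1.0 + np.exp(-(x[..., 0] + x[..., 1])))

treated = np.array([2.0, -2.0])
controls = np.array([
    [-2.0, 2.0],   # control A: same score as the treated unit, opposite covariates
    [1.5, -1.4],   # control B: slightly different score, nearly the same covariates
])

# Distance in the one-dimensional score versus distance in covariate space.
ps_gap = np.abs(propensity(controls) - propensity(treated))
cov_gap = np.linalg.norm(controls - treated, axis=1)

ps_choice = int(np.argmin(ps_gap))    # control chosen by matching on the score
cov_choice = int(np.argmin(cov_gap))  # control chosen by matching on covariates

print("matched on propensity score:", "AB"[ps_choice])
print("matched on covariates:", "AB"[cov_choice])
```

Matching on the score selects the control whose covariates are the mirror image of the treated unit's, while matching on the covariates selects the near twin. Whether that matters for the estimate depends on how the outcome relates to the covariates, which this sketch does not model.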
@melkamumerid827 1 year ago
What a terrific lecture ❤️❤️❤️ Thank you very much, Professor Gary King!
@luanacuadro9709 8 years ago
Thanks for sharing! It makes it easier to understand the subject.
@desmondcampbell9358 3 years ago
I believe Prof King knows what he's talking about. However at 32:18 he gives an example of propensity score matching where all subjects (cases and controls) get the same propensity score and then goes on to say this doesn't produce very good matching. Of course it won't. The propensity score contains no information. I don't understand how this is a good example of propensity score matching. Can anyone explain?
@RubenMartinezCuella 3 years ago
My understanding here is that he is giving best-case scenarios. The best-case scenario for PSM is that all observations have the same PS (meaning that they are as comparable as they can be; they would belong to the same stratum). Thus, rather than the PS containing no information, it does contain the information that those units are comparable to the highest degree according to the PS.
@nbwork500 2 years ago
The only way for a model to produce only a single value is for the model to have no predictive power... no fit....a terrible and useless model. In PS application, this is where you are predicting a random selection -- where your groups are already balanced. They are ALREADY balanced. You don't need to PSM them to get an outcome measurement, you can just take one naively on the total T and C groups.
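The constant-score scenario debated in this thread can be sketched numerically. This is a toy setup, not the example from the lecture slides: it assumes every treated unit has an exact covariate twin among the controls (a fully blocked design), and it models matching on a constant propensity score as a random pairing, since every control is tied at distance zero in the score.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical fully blocked data: each treated unit has an exact twin
# among the controls, stored in shuffled order.
x_treated = rng.normal(size=(n, 2))
x_control = x_treated[rng.permutation(n)]

# Pairwise Euclidean distances between treated and control covariates.
dist = np.linalg.norm(x_treated[:, None, :] - x_control[None, :, :], axis=2)

# Matching on the covariates finds the twin for every treated unit.
nearest = dist.argmin(axis=1)
nearest_mean = dist[np.arange(n), nearest].mean()

# A constant score gives no reason to prefer any control, so the pairing
# is arbitrary; a random permutation stands in for that here.
random_pairs = rng.permutation(n)
random_mean = dist[np.arange(n), random_pairs].mean()

print("mean covariate distance, covariate matching:", nearest_mean)
print("mean covariate distance, constant-score matching:", random_mean)
```

Both pairings are unbiased on average when treatment really was randomized. The difference the sketch shows is only in how close the matched pairs are in covariate space.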
@lucasng9617 4 years ago
Very clear explanation. Prof. King is awesome.
@molloarden8938 7 years ago
Great video. It would appear that when using the PS, the focus should not be on minimizing overall PS distance in the sample, but on finding suitable comparator cases to establish a believable counterfactual to the treated case, hence the preference for blocking, stratification, and other nearest-neighbor caliper-based (one-to-one or many-to-one) PS methods, whatever the distance measure being used (Mahalanobis, logit, etc.). And, yes, pruning should be kept to a minimum. Thus, IF the PS is well estimated (i.e., reflecting an accurate understanding of treatment selection), an IPTW approach may suffice without any sort of formal matching and the associated pruning.
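For reference, the IPTW idea mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the propensity scores `e` are taken as given (in practice they would be estimated), the normalized (Hajek) form of the estimator is used, and the function name `iptw_ate` and the four data points are made up.

```python
import numpy as np

def iptw_ate(y, t, e):
    """Normalized inverse-probability-of-treatment-weighted ATE estimate.

    y: outcomes, t: 0/1 treatment indicators, e: propensity scores P(T=1 | X).
    """
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    # Treated units get weight 1/e, control units get weight 1/(1 - e).
    w = t / e + (1 - t) / (1 - e)
    treated_mean = np.sum(w * t * y) / np.sum(w * t)
    control_mean = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated_mean - control_mean

# Invented toy data: two treated units, two controls.
ate = iptw_ate(y=[3.0, 2.0, 1.0, 0.0], t=[1, 1, 0, 0], e=[0.8, 0.5, 0.5, 0.2])
print(ate)
```

No units are pruned here, but scores close to 0 or 1 produce very large weights, so the estimate is only as good as the score model.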
@annemontgomery6167 5 years ago
You just saved this researcher from PS matching, thank you! Will have to find out how to do Euclidean matching but it can't be worse haha.
@MsLinda5555 4 years ago
Hi! I am working on this as well atm, did you find out how to do the matching with Euclidean?
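One possible way to do Euclidean matching, as a rough sketch rather than a recommended implementation: standardize the covariates, then pair each treated unit with its nearest unused control, pruning treated units that have no control within a caliper. The function name `greedy_euclidean_match`, the greedy order, and the toy data (age, height) are all assumptions made for illustration; dedicated matching packages use more careful algorithms.

```python
import numpy as np

def greedy_euclidean_match(x_treated, x_control, caliper=np.inf):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs. Treated units with no
    unused control within the caliper are left unmatched (pruned).
    """
    x_treated = np.asarray(x_treated, dtype=float)
    x_control = np.asarray(x_control, dtype=float)

    # Standardize with pooled mean and standard deviation so that no
    # covariate dominates just because of its units.
    pooled = np.vstack([x_treated, x_control])
    mean, scale = pooled.mean(axis=0), pooled.std(axis=0)
    scale[scale == 0] = 1.0
    zt = (x_treated - mean) / scale
    zc = (x_control - mean) / scale

    dist = np.linalg.norm(zt[:, None, :] - zc[None, :, :], axis=2)

    pairs, used = [], set()
    for i in range(len(zt)):
        # Walk through controls from nearest to farthest.
        for j in map(int, np.argsort(dist[i])):
            if dist[i, j] > caliper:
                break  # every remaining control is even farther away
            if j not in used:
                used.add(j)
                pairs.append((i, j))
                break
    return pairs

# Invented toy data: columns are age and height in cm.
treated = [[30, 170], [50, 180]]
controls = [[52, 181], [31, 169], [70, 150]]
pairs = greedy_euclidean_match(treated, controls, caliper=0.5)
print(pairs)
```

Greedy matching depends on the order of the treated units. Optimal matching avoids that at a higher computational cost.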
@yulinliu850 3 years ago
I really enjoyed the lecture. Thanks a lot!
@anneshrestha6603 7 years ago
Thank you very much for this video. Would it be possible to expand on why propensity score matching does not help with the curse of dimensionality problem?
@seckinbilgic 3 years ago
Matching based on propensity score calculations has 4 important advantages: - A balanced distribution is achieved - Model dependence is eliminated - The researcher's judgment is eliminated - The possibility of bias is eliminated
@ashumuluneh2051 4 years ago
Thank you for your detailed explanations. Do you have a video explaining endogenous treatment switching regression?
@imgamaleldin 8 years ago
Thanks for sharing the video. It is very useful.
@remkoning8915 9 years ago
Great explanation of matching, model selection, and why using propensity scores with observational data will lead you astray.
@JC-dl1qr 7 years ago
Very great video, thanks for uploading.
@nextcontext 2 years ago
Hm, but you only get the treatment effect on the treated if you drop all the controls? Also, matching will not help if you have unobserved confounders, right?
@juliocardenas4485 3 years ago
Absolutely fabulous!!
@muhammadrashidjaved6588 7 years ago
Hello, I am using Stata 14. How should I take sampling weights into account when using psmatch (my dataset is a survey dataset, and I did the estimation with survey commands)?
@tomasarriaza938 1 year ago
Prof. King suggests that unobserved variables are balanced on average using matching, but matching doesn't work that way. As far as I know, if you don't observe an important variable (a confounder), you will get a biased estimate, won't you?
@lunafeliz1669 6 years ago
Great and helpful contents!
@ivansteenstra9678 7 years ago
Conclusion: "Matching methods still highly recommended; choose one with higher standards." So question is: which one(s)?
@184mauro 6 years ago
No! Clearly CEM. Read the paper and watch the whole video.
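For readers unfamiliar with CEM (coarsened exact matching), the core step can be sketched as follows: coarsen each covariate into bins, match exactly on the bin signature, and prune strata that lack either treated or control units. This is a simplified illustration, not the published algorithm or any package's API. The function name `cem_strata`, the cutpoints, and the data are invented, and the stratum weights used in a full CEM analysis are omitted.

```python
import numpy as np
from collections import defaultdict

def cem_strata(x, treated, cutpoints):
    """Group units by coarsened covariates and keep only the strata that
    contain at least one treated and one control unit.

    Returns {bin_signature: (treated_indices, control_indices)}.
    """
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)

    # Coarsen each covariate into integer bin labels.
    bins = np.column_stack(
        [np.digitize(x[:, k], cutpoints[k]) for k in range(x.shape[1])]
    )

    strata = defaultdict(lambda: ([], []))
    for i, row in enumerate(bins):
        signature = tuple(int(b) for b in row)
        strata[signature][0 if treated[i] else 1].append(i)

    # Prune strata without both groups.
    return {s: g for s, g in strata.items() if g[0] and g[1]}

# Invented toy data: columns are age and years of education.
x = [[25, 10], [28, 11], [45, 16], [47, 14], [70, 9], [35, 8]]
treated = [1, 0, 1, 0, 1, 0]
matched = cem_strata(x, treated, cutpoints=[[30, 50], [12]])
print(matched)
```

The choice of cutpoints is the substantive decision: coarser bins keep more units but allow larger differences within a stratum.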
@Sara-uj4xw 6 years ago
The paper published in November 2018 has some helpful and concrete guidelines on what they consider preferable options
@khoipham6105 5 years ago
Genetic Matching is a good option if computational resources are not a constraint
@khoipham6105 5 years ago
airjesse123 kNN might not be suitable for cases where features contribute differently to the result. For example, for the probability of hiring, maybe matching on age is more important than matching on height, but kNN would calculate the distance and regard these two dimensions as equivalent.
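The age-versus-height point can be made concrete with a small sketch. The importance weights below are purely hypothetical, chosen by hand for illustration. In practice they would have to come from subject knowledge or from a procedure that searches for them, which is roughly what genetic matching does.

```python
import numpy as np

treated = np.array([30.0, 170.0])     # age, height in cm
controls = np.array([
    [31.0, 185.0],   # control A: close in age, far in height
    [40.0, 171.0],   # control B: far in age, close in height
])

def nearest(weights):
    # Weighted squared Euclidean distance; weights encode covariate importance.
    d2 = ((controls - treated) ** 2 * weights).sum(axis=1)
    return int(np.argmin(d2))

unweighted = nearest(np.array([1.0, 1.0]))   # both dimensions treated alike
weighted = nearest(np.array([1.0, 0.01]))    # height heavily down-weighted

print("unweighted match:", "AB"[unweighted])
print("age-weighted match:", "AB"[weighted])
```

With equal weights the height difference dominates and control B is chosen. Once height is down-weighted, the match switches to control A, the one closest in age.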
@kordigimon 6 years ago
Great lecture!
@philippr.4558 7 years ago
Hey guys, very interesting video. Thanks for uploading! now I have a question. How to cite the content of this video in a masters thesis :D? Is there any paper - I'm not able to find - which contains these findings and is citable?
@philippr.4558 7 years ago
well. I found it. see the QR at the end of the video.
@kavyaravindranath5206 6 years ago
Definitely too late, but here is the paper- gking.harvard.edu/files/gking/files/psnot.pdf
@janus11 4 years ago
Excellent video! Question: Why is a RCT considered the gold standard when it’s more prone to bias? (Or is this a wrong assumption I’m making?)
@binhbui415 1 year ago
I don't think you are wrong. An RCT is the gold standard if the assignment of subjects is not biased. When it is not biased, it reliably indicates the effect of treatment. However, an RCT is prone to bias because it's hard to make sure that the two groups have the same characteristics (i.e., are systematically similar). Even when we can do that, there are ethical concerns. E.g., we cannot assign one group to take a medication and the other not to take it. What if the medication is bad for the treated group? - My personal thinking.
@powermod6772 3 years ago
I am not fully convinced. Sure, you get better matches by using a distance measure directly on the covariates. However, this does not guarantee that the matched control observation gives us a better estimate of the unobserved potential outcome Y_i(0). Certainly not for individual treatment observations i. Maybe on average... But then I wonder if on average PSM is really worse in these terms. Is there a theoretical result that proves that closer covariates yield better estimates of the treatment effect? Such a result is missing in this talk to fully convince me.
@bossdunkz 2 years ago
The theory on why closer covariates would give you a better estimate of the treatment effect is pretty straightforward. Matching is a form of control. For individuals with essentially the same covariates but different treatment status, what is the outcome? Measure the difference in outcome between matched units with and without the treatment and you have your estimate.
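The estimate described in this reply amounts to averaging within-pair outcome differences. A minimal sketch, assuming the matching step has already produced (treated index, control index) pairs. The function name `matched_att` and the numbers are invented.

```python
import numpy as np

def matched_att(y_treated, y_control, pairs):
    """Average treated-minus-control outcome difference over matched pairs."""
    diffs = [y_treated[i] - y_control[j] for i, j in pairs]
    return float(np.mean(diffs))

# Invented outcomes and pairs, e.g. from a prior 1:1 matching step.
att = matched_att(
    y_treated=[5.0, 7.0],
    y_control=[4.0, 3.0, 9.0],
    pairs=[(0, 1), (1, 0)],
)
print(att)
```

This estimates the effect on the treated units that were kept, and it is only as credible as the assumption that no unobserved confounder separates the paired units.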
@richardchen6578 2 years ago
I think PSM's strength is in dealing with high-dimensional X
@daytonkillian2330 7 years ago
Can we get this slide deck anywhere?
@daytonkillian2330 7 years ago
Found it: gking.harvard.edu/files/gking/files/psnot-yale.pdf
@matteozantedeschi 1 year ago
Using PSM collapses all the variables into a single one. The assumption here is that the resulting variable is "all you care about" for measuring differences between treated and control observations. Therefore, you can even have a young educated person matched with an old non-educated one, and this is fine if their propensity scores are similar (of course, the previous assumption must hold). In the extreme case where all observations have PS = 0.25, all observations are equal as far as the experiment is concerned, and therefore you can select one at random for every treated unit: it is as if you had really done a randomized experiment. I think the whole discussion is about whether PSM can effectively collapse/approximate all variables into one (the PS) without losing critical information, and of course this is not always true. On the other hand, fully blocked experiments (based on distance measures) may struggle in high-dimensional spaces where unimportant variables may dominate the matching. I agree PSM is never optimal, but I would prefer it when I'm not sure about all the confounding variables (leaving aside how easy it is to do in real-world scenarios), and CEM or other methods when I'm more confident about the experimental environment and all the relevant confounding variables.
@ZhaoXiliang 9 years ago
Great!
@golagaz 3 years ago
2x speed works better for me, but great lecture.