Propensity Score Analysis in R with Nearest Neighbor, Optimal Pair, and Optimal Full Matching

statsguidetree

Comments: 44
@statsguidetree 1 year ago
I needed to update the R code that loads and cleans the dataset to get the data ready for the analyses. Please use the updated R code here to follow along: gist.github.com/musa5237/78a694bd6663a92a82e45e684e616724
@lizhang9898 15 hours ago
Just to clarify, is the first covariate the same as your DV?
@basser1995 2 years ago
I am pretty desperate because I need to perform a propensity-matched analysis, having never used R (I used SPSS). But 15 minutes into this video I can already tell it's going to be extremely helpful!
@statsguidetree 2 years ago
Thank you so much for the compliment.
@hasanhash12 8 months ago
Hi, thank you for the video. I loaded the dataset coll from the link that you pinned and then ran the script from "identify field names" to "adjust units for continuous variables". After running it, all values in coll become NULL and coll2 has 0 obs. of 6 variables. What should I do?
@hasanhash12 8 months ago
And also at line 136 (#no psa, just regression), if I run mod_test1
@hasanhash12 8 months ago
I suppose the problem is here at line 22: coll
@francyy-ug1qr 3 months ago
thank you sm!!
@brainwt 1 month ago
Very good guide! Thanks
@alexwisniewski7105 11 months ago
Do you include both the quadratic and non-quadratic terms in your propensity match? For example, if my quadratic term had a lower SDM, should I remove the non-quadratic term and just include the quadratic one in my final model?
@statsguidetree 11 months ago
This depends on your data, the type of relationships you want to capture, and what makes sense for the data you are working with. If you have both a linear term and a quadratic term for your explanatory variable in the model, you are saying that the relationship between your response and the explanatory variable is both quadratic and linear (i.e., your model captures both). If you keep only the quadratic term, you are saying the relationship is just quadratic. Generally, if you want to capture a wider scope of relationships you can leave both in, but be mindful that this could lead to overfitting.
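A minimal sketch of the choice described above, using base R only. The data are simulated and the variable names (`treat`, `x`) are hypothetical; this is not the code from the video. It fits a propensity model with the linear term only, with both terms, and with the quadratic term only, so the specifications can be compared side by side.

```r
set.seed(1)
n <- 500
x <- rnorm(n)

# Hypothetical treatment whose probability depends on both x and x^2
treat <- rbinom(n, 1, plogis(-0.5 + 0.8 * x + 0.6 * x^2))
dat <- data.frame(treat = treat, x = x)

# Three candidate propensity score models (logistic regression)
ps_linear    <- glm(treat ~ x,          data = dat, family = binomial)
ps_both      <- glm(treat ~ x + I(x^2), data = dat, family = binomial)
ps_quad_only <- glm(treat ~ I(x^2),     data = dat, family = binomial)

# Compare fit; lower AIC suggests a better trade-off between fit and complexity
AIC(ps_linear, ps_both, ps_quad_only)
```

In a matching workflow the more relevant comparison is usually the covariate balance each specification produces after matching, not the model fit alone.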
@priyankaroy7243 2 years ago
While I'm installing "MatchIt" it shows "There is no package called MatchIt". How do I solve it?
@statsguidetree 1 year ago
Hello, just saw your post. Did you run library(MatchIt) without first running install.packages("MatchIt")? I did not install it again because I had already installed it. I kept that line in the code but put a hash sign (#) in front of it so it was there as a note. Try running it without the hash sign.
@priyankaroy3686 1 year ago
@statsguidetree Yes that's solved. Thanks!
@vikasmishra4485 2 years ago
This video is pretty informative. I have one question. In the covariate balance plot using cobalt, do we need to balance both the mean and variance stats? In my case the mean is balanced within the threshold but the variance is not. Can I say that the matching is balanced based on mean balance only?
@statsguidetree 2 years ago
It is good to have both. I presented only one set of criteria, but other criteria have been suggested. Also, recommendations in the literature are always changing. I would try some other techniques to see if I get better balance. But if I cannot do a better job, I would just report it in the methods and discussion/limitations sections. Balancing the covariates will be a big part of the challenge in PS matching.
@sanjanakhondaker887 9 months ago
What an amazing explanation!!! Hats off. You even provided the R-script. Super helpful! You saved my thesis, thank you so very much.
@maddybond007 2 years ago
Please confirm whether this link has the same data that you posted initially, since your link is no longer accessible: ed-public-download.app.cloud.gov/downloads/CollegeScorecard_Raw_Data_04262022.zip
@statsguidetree 2 years ago
I will try to find a way to load the dataset on my GitHub. But until then, I can email it to you. Just send me an email at statsguidetree@gmail.com
@amalalkalbani4572 2 years ago
Thank you for the comprehensive explanation. I have an issue with my PSA: the variance ratio doesn't appear when I use the summary function. I got dots only! Could you please tell me why? Thank you. (All my covariates are categorical and binary.)
@fleurestethique 2 years ago
I had the same problem when I entered my covariates as factors into the formula, but variance ratios appeared once I converted them with as.numeric. I don't know what that means in terms of interpretation, though.
@statsguidetree 2 years ago
@fleurestethique I noticed that love.plot(), the function used to visualize covariate imbalance, does not allow for categorical variables. However, you can still inspect the covariate imbalance when you use the summary() function.
@user-iq2qr8lb2y 10 months ago
I did the first step (design phase: selecting covariates) but only 3 out of 14 are significant. I want to know whether it is considered balanced or not, and what to do.
@statsguidetree 10 months ago
Whether covariates are significant is not related to whether the values of those covariates are balanced across treatment conditions. To check balance you have to look at the standardized mean difference and/or variance ratio values to see whether they fall within the threshold you decide to use.
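A rough sketch of the two balance statistics mentioned above, computed by hand in base R on simulated data. The data, the variable names, and the helper `balance_stats()` are hypothetical, and the thresholds (|SMD| > 0.1, variance ratio outside 0.5 to 2) are commonly cited rules of thumb used here as assumptions. The pooled standard deviation is one convention for standardizing; packages may standardize differently.

```r
set.seed(42)
n <- 400

# Simulated (hypothetical) data: treated units are older on average and
# have more variable income
dat <- data.frame(treat = rbinom(n, 1, 0.4))
dat$age    <- rnorm(n, mean = 40 + 3 * dat$treat, sd = 10)
dat$income <- rnorm(n, mean = 50, sd = ifelse(dat$treat == 1, 15, 8))

# Hypothetical helper: standardized mean difference (pooled SD) and
# variance ratio (treated / control) for one covariate
balance_stats <- function(x, treat) {
  x1 <- x[treat == 1]
  x0 <- x[treat == 0]
  pooled_sd <- sqrt((var(x1) + var(x0)) / 2)
  c(smd = (mean(x1) - mean(x0)) / pooled_sd,
    var_ratio = var(x1) / var(x0))
}

# One row per covariate, one column per statistic
bal <- t(sapply(dat[c("age", "income")], balance_stats, treat = dat$treat))
round(bal, 3)

# Flag covariates outside the assumed thresholds
imbalanced <- abs(bal[, "smd"]) > 0.1 |
  bal[, "var_ratio"] > 2 | bal[, "var_ratio"] < 0.5
imbalanced
```

The same statistics would be recomputed on the matched sample (with matching weights, if any) to judge whether matching improved balance.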
@sharmilibalarajah1940 2 years ago
Thank you, this was really helpful! Do you have any ideas about how I can approach this if I want to match three groups (i.e., a non-binary treatment)?
@statsguidetree 2 years ago
I can say that generally PS analyses can be conducted with non-binary treatment groups (i.e., a treatment variable with more than 2 levels). But I do not think the MatchIt package supports it (I could be wrong, because it may have been updated). There is another package, TriMatch, for when your treatment variable has 3 levels instead of 2. I am not too familiar with the package, but here is the general documentation: cran.r-project.org/web/packages/TriMatch/TriMatch.pdf
@lanredaodu945 2 months ago
excellent tutorial i watched 3x
@muhammedhadedy4570 1 year ago
I've watched many tutorials explaining propensity score matching on KZbin, and I can tell that this video is the best I've ever seen. Well done, sir. You helped me a lot. ❤❤❤❤
@manonkinaupenne2090 1 year ago
Thank you very much for this clear explanation! I have a small question: would you use PSM to match patients to healthy controls in a cross-sectional case-control study? I want to look at the difference in physical activity, expressed in minutes per day (dependent variable), between these two groups. Thank you!
@statsguidetree 1 year ago
Yes, PSM should always work when you have a control group.
@SCaRaB6288 2 years ago
Can we use categorical covariates (e.g., 1 = male, 2 = female), or should they be dummy coded? Thank you
@statsguidetree 2 years ago
Yes. Categorical covariates can be included.
@festusattah8612 1 year ago
Great video! What would you advise if I have more treated than control units, and which matching approach should I use if treatment is not randomized (take, for example, a state legislation)?
@statsguidetree 1 year ago
You can try K-to-1 matching and optimization, or you can try full matching. You can run both and compare which gives you better balance across your covariates.
@김수연-h2r5k 2 years ago
Thank you for the informative video. I did full matching based on your video and ran comparisons after propensity matching. But the means, standard deviations, and p score did not change at all compared to the unmatched data. How can I solve this problem?
@statsguidetree 1 year ago
That is a good question. I assume you are talking about p-values in your final model after matching. If that is the case, ultimately with PS matching you are just attempting to balance the data between your treatment and control groups so that you can make more reliable interpretations of your final model. It could be that after balancing your data you find no average treatment effect.
@fleurestethique 2 years ago
This was extremely helpful, thank you so much! When working with subsets, should I calculate the propensity scores on the whole dataset first and then apply them to the subset, or calculate the propensity scores directly for only the observations in my subset? Also, the dataset I am working with requires me to incorporate additional weights due to the way the sampling was done. How can I apply both the propensity score and the other weights in my regression? Thank you
@statsguidetree 2 years ago
I may need some more information on the nature of the dataset, but generally you could calculate the PS for the whole dataset. For your other question about weights, not all PS matching methods produce weights. For example, if 1:1 matching without replacement is used, all the weights equal 1. But if you are using a PS matching method that does produce weights and you already have a set of weights you need to apply, there are a few things you can do. The issue is that the 'weights' argument in the lm() function only accepts a single vector. Depending on the nature of your dataset, you may have a reason not to use the whole dataset and to consider subsets instead, if that makes sense. Or you may want to consider combining the two sets of weights by multiplying them; however, you would need to look at the resulting weights and see whether they make sense before carrying out your regression analysis. Ultimately, my suggestions are just general statements; you may want to consult other sources (e.g., previous PS analyses using your dataset or a similar dataset, content experts, etc.).
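A minimal sketch of the "multiply the two sets of weights" option mentioned above, in base R. All data are simulated, and the column names `ps_weight` (standing in for matching weights) and `svy_weight` (standing in for sampling weights) are hypothetical. Whether multiplying is appropriate depends on the survey design, so treat this as an illustration of the mechanics only.

```r
set.seed(7)
n <- 300

dat <- data.frame(
  treat      = rbinom(n, 1, 0.5),
  x          = rnorm(n),
  ps_weight  = runif(n, 0.5, 2),  # hypothetical weights from PS matching
  svy_weight = runif(n, 1, 5)     # hypothetical survey sampling weights
)
dat$y <- 1 + 0.5 * dat$treat + 0.3 * dat$x + rnorm(n)

# lm() accepts a single weight vector, so combine the two by multiplying
dat$combined_weight <- dat$ps_weight * dat$svy_weight

# Inspect the combined weights before using them (look for extreme values)
summary(dat$combined_weight)

# Weighted outcome regression using the combined weights
fit <- lm(y ~ treat + x, data = dat, weights = combined_weight)
coef(summary(fit))
```

Note that standard errors from `lm()` do not account for a complex survey design; a design-based approach may be needed for inference.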
@velonty 26 days ago
@statsguidetree Thank you for the awesome video. I have a similar question. I am using a dataset that has a survey design requirement. Do I carry out propensity score matching with just the sample data or with the weighted dataset? Or can I carry out the PS matching with the sample and then, in my final regression on the matched data, use the weighted data (survey design weights)?
@analyticspipeline2526 2 years ago
Great video, thank you for that!
@praveena6095 2 years ago
Great video. If I want to include in the analysis phase some additional covariates that were not used for matching, how can I get them in my data after using match.data?
@statsguidetree 2 years ago
If you want to use additional variables in the analysis phase, you can enter the variables that were not included in the matching process into the final regression model.
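A short sketch of the point above, assuming the MatchIt package is installed and using its bundled `lalonde` example data instead of the dataset from the video. The choice of matching covariates and extra covariates is purely illustrative. `match.data()` returns the matched rows with all columns of the original data, so variables left out of the matching formula remain available for the outcome model.

```r
library(MatchIt)  # assumes install.packages("MatchIt") has been run already

data("lalonde", package = "MatchIt")

# Match on age and educ only (illustrative choice)
m_out <- matchit(treat ~ age + educ, data = lalonde, method = "nearest")

# Matched dataset: original columns plus distance, weights, and subclass
md <- match.data(m_out)

# Covariates not used for matching are still present in the matched data
c("married", "re74") %in% names(md)

# Outcome model that adds covariates not used in the matching step
fit <- lm(re78 ~ treat + age + educ + married + re74,
          data = md, weights = weights)
summary(fit)
```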
@katieweir4166 1 year ago
The data doesn't work anymore!
@statsguidetree 1 year ago
My apologies for the delayed response. You can use the following code to load it into R: coll