Multiple Imputation: A Righteous Approach to Handling Missing Data

37,889 views

Omega Statistics

Comments: 77
@georgezisis1122 3 years ago
OMG!!!!!!!!!!!!!!!!!!! you have no idea how many years of life you saved me. Thanks!
@omegastatistics 3 years ago
Glad to have helped!
@joat9105 3 years ago
So which dataset should we use to fill in the blanks, the first one or one of the others? Thanks.
@omegastatistics 3 years ago
Hi The Suck, all of the imputed datasets are used when you run the analysis. You use the multiple sets and that is why it is called multiple imputation. Choosing any of the analyses with the little sea shell looking icon next to them from the analysis menu does this automatically for you.
@talzabidi1569 3 years ago
Hi Dr, thanks so much for your efforts. I would like to ask: are there any conditions under which we can't use MI to treat missing data?
@omegastatistics 3 years ago
Hello Talal, I am not a Dr., but thanks for thinking so :) The most important condition is that your data is not MNAR. If your data is MNAR, then imputation and other methods are not going to return good estimates (read more about missingness and the basics of imputation here: www.ncbi.nlm.nih.gov/pmc/articles/PMC2818781/). In this presentation I showed MI using linear regression, which is used for continuous variables. You can also impute nominal variables with logistic regression imputation or discriminant analysis imputation. Here is a link to a paper about imputation of nominal variables: support.sas.com/resources/papers/proceedings/proceedings/sugi30/113-30.pdf
@majid85 3 years ago
Can this method be used for precipitation data when a whole year of data is missing, or when a few months in a row are missing in some years?
@omegastatistics 3 years ago
It depends on how your data is structured. Also, remember that your data must be MCAR or MAR to use Multiple Imputation. If your data is MNAR then MI should not be used (but it often is). Try running the imputation and see what you get.
@Collinskoma 4 years ago
Hi, I'm trying to impute this data and it has a lot of NAs, and I can't remove the NAs because there are so many. When I run the mice code I get this error: Error in solve.default(xtx + diag(pen)) : system is computationally singular: reciprocal condition number = 3.35108e-20. Any idea how to handle this kind of problem?
@omegastatistics 4 years ago
Hi Collins, you will need to remove the NAs, otherwise the program will think you have string variables, and imputation works on numeric variables. If you can put your dataset into Excel, you can search for the NAs by clicking on the arrow by "Find and Select", then "Replace"; under "Find what:" type NA (exactly as it appears in the data), and in "Replace with:" hit the space key. This will erase all of your NAs. Be sure to make a backup of the file before trying this in case things go awry. Hope this helps.
@Collinskoma 4 years ago
@omegastatistics Thanks, will do.
@20jakubukaj08 5 years ago
Another question: what is the pooling method that is employed in SPSS to aggregate the imputed results? Is this just the mean of the 5 regression coefficients, for example?
@omegastatistics 5 years ago
Look at slide 34 in the presentation at this link; I think it may help: rmc.ehe.osu.edu/files/2018/02/0.0-Workshop_missing-data-with-SPSS_Finalaudience.pdf
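For anyone who wants the arithmetic behind the question above, here is a minimal sketch of the pooling usually described as Rubin's rules. The point estimate is indeed the mean of the coefficients, but the standard error combines the within-imputation and between-imputation variance. The function name `pool_rubin` and the example numbers are hypothetical; this is an illustration, not SPSS's own code.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from at least 2 imputations")
    pooled = mean(estimates)                     # pooled point estimate
    within = mean(se ** 2 for se in std_errors)  # average within-imputation variance
    between = variance(estimates)                # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between       # total variance
    return pooled, math.sqrt(total)


# Hypothetical coefficients and standard errors from 5 imputed datasets
coef, se = pool_rubin([0.50, 0.54, 0.46, 0.52, 0.48], [0.10] * 5)
print(coef, se)
```

The pooled standard error comes out larger than the average standard error because the spread between imputations is added in.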
@bevansmith3210 5 years ago
Thank you for the video. Could you kindly respond to this question? For future analyses, which imputed dataset do we use? I know we are meant to use the pooled data, but that means you need to use all 5 imputed datasets each time you want to measure something (MSE, regression, etc.). But what if you just want to impute the missing values and end up with ONE dataset for future use? Thank you, Bevan
@omegastatistics 5 years ago
Hello Bevan, Please look at Dr. Kumar's response at this link: www.researchgate.net/post/How_can_one_create_a_pooled_dataset_in_SPSS_for_further_analysis There is a part where he states to pool the 5 imputed variables by summing: V1+V2+V3+V4+V5, for instance. However, he forgot to say to divide the sum by the number of imputations, in this case 5. So basically you can take the average of the imputed values to derive the pooled values. I am not sure if SPSS now gives pooled standard deviations, so you may have to do some averaging with standard deviations and standard errors too.
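The averaging described above can be sketched as follows, assuming the five imputed versions of one variable are available as equal-length lists. The function name `average_imputations` and the values are hypothetical. Note that a single averaged dataset no longer carries the between-imputation uncertainty.

```python
from statistics import mean


def average_imputations(imputed_columns):
    """Average one variable cell by cell across m imputed versions (V1..Vm)."""
    lengths = {len(col) for col in imputed_columns}
    if len(lengths) != 1:
        raise ValueError("all imputed versions must have the same number of cases")
    # zip(*...) lines up case i from every imputed version
    return [mean(values) for values in zip(*imputed_columns)]


# Hypothetical V1..V5 for three cases; only the second case was imputed
v1, v2, v3, v4, v5 = [10, 12, 9], [10, 14, 9], [10, 13, 9], [10, 11, 9], [10, 15, 9]
print(average_imputations([v1, v2, v3, v4, v5]))
```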
@SereniTy_Corner 4 years ago
Please, what add-on do I need in SPSS to be able to perform multiple imputation? I can't seem to find it under "Analyze". Also, how do I add it to what I already have?
@omegastatistics 3 years ago
Please check with IBM regarding obtaining a license for the Missing Values add-on and adding it to what you have.
@joannayeung6695 3 years ago
Thank you so much! This is really helpful! May I know if my data sets include both categorical and continuous data, how should I handle it? I have coded the categorical missing data as dummy variables, then should I just do the multiple imputations on the continuous missing data?
@snigdhodas2848 3 years ago
Great video. I wanted to ask: is there any method by which we can impute only certain missing cells and leave the other missing cells unimputed, as they are?
@omegastatistics 3 years ago
Hi Snigdho, You can easily leave entire variables out of the mix, but cell by cell, you may need to use some code with filters. Here is a link to some information on using filters: wlm.userweb.mwn.de/SPSS/wlmssel.htm
@snigdhodas2848 3 years ago
@omegastatistics Thank you
@omegastatistics 3 years ago
You're welcome
@bethanyhendriks4596 4 years ago
Can somebody please help? I have a very large data set, and when I try to run the imputation I get an error message in my output which says "contains more than 100 parameters, no missing values will be imputed". Please help.
@omegastatistics 4 years ago
Hi Bethany, Here is a link to IBM with info on what you can check. I would check the level of measurement for your variables in the Variable View of your dataset first, then go from there. www.ibm.com/support/pages/multiple-imputation-warning-model-contains-more-100-parameters
@zoiyaehtisham818 3 years ago
Hello, this is very helpful information. I am stuck on one thing: can you please tell me why my minimum and maximum values do not appear in the constraints section when I type 0 and 100?
@omegastatistics 3 years ago
Hi Zoya, It is hard for me to know what exactly you are asking and to give an answer without seeing your study/data. I do know that the minimum and maximum options for constraints are only available for when you choose "Linear Regression" as the scale variable model type in the "Method" tab. I hope you were able to figure this out.
@zoiyaehtisham818 3 years ago
@omegastatistics Thank you so much for your reply. I figured it out. My data set is large, the variables are categorical and ordinal, and the data is missing not at random. I was running into the maxmodelparam limit; after setting maxmodelparam to the desired number, an error occurred: "Warnings: The procedure cannot access a file with the given file specification: imputeddata for keyword IMPUTATIONS of subcommand OUTFILE. The file specification is either syntactically invalid, specifies an invalid drive or directory, specifies a protected directory, specifies a protected file, or specifies a non-sharable file. Execution of this command stops."
@HealthbeautyluckyshahBlogspot 3 years ago
Great way to teach, and I solved the issue I was having. But I have a few questions: 1) Can you share the reference article to cite? 2) I have read a few papers and they suggest different methods (linear or logistic) for different types of variables. I noticed you used one method for all variable types. Can you explain, or give a reference to an article? 3) If I understood correctly, the data is MAR, not MNAR?
@omegastatistics 3 years ago
Hello Beautywdbrain, which article did you want? Please give me the name and I will see if I can find a copy. The example I did was for linear regression with a continuous outcome variable. Of course there will be different ways of working with different models, more than I can explain here. Google is your friend; do some searching and you will find info. :)
@HealthbeautyluckyshahBlogspot 3 years ago
@omegastatistics Thank you for the reply. I wanted to know which article to cite if I use the same method you did in my research. Hope this clears it up.
@gracexu602 4 years ago
Great presentation. I have a question. I am analysing longitudinal surveys (3 time points: T1, T2, T3) using a few scales. I need to compute one of the scales because some of its questions are negatively worded. Then I need to sum each scale at each time point before I run a post hoc analysis. Because some data is missing at one time point (either T2 or T3), I plan to run imputation. Should I impute the original survey results only, or should I impute all the variables, including the computed data and the sum of each scale?
@omegastatistics 4 years ago
Hi Grace, Typically you will impute your data before you run your tests. But of course, make sure your data is nice and clean first, i.e., no strange numbers that shouldn't be there, like an age of 200 or things like that.
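A cleaning pass like the one described above can be as simple as a range check before imputing. This is a minimal sketch; the function name `out_of_range`, the 0 to 120 bounds, and the sample ages are assumptions for illustration. Missing values are represented as `None` and are left alone, since those are what the imputation will fill.

```python
def out_of_range(values, low, high):
    """Return (index, value) pairs for non-missing values outside [low, high]."""
    return [
        (i, v)
        for i, v in enumerate(values)
        if v is not None and not (low <= v <= high)
    ]


# Hypothetical age column with one impossible value and one missing value
ages = [34, 200, None, 51]
print(out_of_range(ages, 0, 120))
```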
@gracexu602 4 years ago
@omegastatistics Hi Omega, thanks for the prompt reply.
@datascientist2958 4 years ago
How can we extract pooled imputed data set from SPSS?
@omegastatistics 4 years ago
Look into Rubin's rules: www.ncbi.nlm.nih.gov/pmc/articles/PMC2727536/
@roiad876 3 years ago
This has been very instructive, thanks! I'd just ask if you could please show us your syntax output? It would add to the reproducibility of research.
@omegastatistics 3 years ago
If you would like syntax and output, please email me at info@omegastatistics.com with your request. Thanks!
@qimeng9800 4 years ago
Thanks for this wonderful video! May I ask how I can get, or deal with, the pooled p, F, and SD values that are not shown when I do the t-test/regression analysis?
@omegastatistics 4 years ago
Hi Qi, I am not sure exactly what you are asking. I know SPSS doesn't give pooled stats for many things, so if you are asking how to pool the standard deviations etc., then you can start by checking out this link: stats.stackexchange.com/questions/460238/creating-a-pooled-data-set-from-multiple-imputation-output-in-spss
@lidiabezerra32 4 years ago
Hey, I loved this video. Could you please point me to some references in the literature that address this issue?
@omegastatistics 4 years ago
Hi Lidia, Here is a link to start with. There are many references at the end which you can check out too: www.bmj.com/content/338/bmj.b2393#:~:text=Multiple%20imputation%20is%20a%20general,obtained%20from%20each%20of%20them.
@interwebzful 7 years ago
hey people: start 6 mins into it
@olegstupak7687 4 years ago
It would be interesting to see more theory behind the procedures. Also, the lecture is inconsistent with the literature in some places.
@omegastatistics 4 years ago
Hi Oleg. I know it would be nice, but I am an applied statistician by trade, so although I've been taught the theory, my expertise is in the work. In most of my presentations I provide references that are often more theory based. As for being inconsistent with the literature on things like constraining the limits, I wanted to show that it could be done. Many studies are not like the textbooks, not as pretty, and sometimes it is necessary to relax assumptions or allow for limitations. I hope you still enjoyed the presentation.
@20jakubukaj08 5 years ago
Thank you very much for this useful and clear presentation. I have a question: one thing that you repeat a number of times in the presentation is that you cannot tell if your data are MAR or MNAR. It seems to me that if you find a correlation between missingness in one variable and another manifest variable (i.e. a variable for which you have observed data), then the data is certainly MNAR. Or perhaps I have misunderstood what you meant?
@omegastatistics 5 years ago
Correlations can be spurious, and caused more by a latent variable. What I meant by not knowing for sure, is that you really don't typically know for sure why something is missing because in many cases you can't go back to ask the respondent or data source what happened. Of course if you know something is systematic then you know. But often you won't know for sure.
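For readers who want to look at the kind of relationship discussed in this exchange, here is a minimal sketch that compares an observed variable between cases where another variable is missing and cases where it is present. The function name `compare_by_missingness` and the data are hypothetical. A clear difference is evidence against MCAR, but a check like this cannot separate MAR from MNAR, because that distinction depends on the values that were never observed.

```python
from statistics import mean


def compare_by_missingness(target, observed):
    """Mean of `observed` where `target` is missing vs. where it is present."""
    pairs = [(t, o) for t, o in zip(target, observed) if o is not None]
    when_missing = [o for t, o in pairs if t is None]
    when_present = [o for t, o in pairs if t is not None]
    if not when_missing or not when_present:
        raise ValueError("need cases with and without missing target values")
    return mean(when_missing), mean(when_present)


# Hypothetical data: income is missing (None) for some respondents; age is fully observed
income = [None, 52000, None, 61000]
age = [60, 30, 70, 40]
print(compare_by_missingness(income, age))
```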
@Silverwing_99 3 years ago
great tutorial - i abhor SPSS, wish the practical was in R
@omegastatistics 3 years ago
Yes, we all have our favorites :) I am glad you still found the information useful. Here is a nice tutorial in R: data.library.virginia.edu/getting-started-with-multiple-imputation-in-r/
@bonniekenaley3831 7 years ago
Thank you for your most valuable presentation!
@oscarbecerril8343 4 years ago
Thank you for sharing, kind Lady.
@omegastatistics 4 years ago
Thanks for watching Oscar!
@monalisadas4186 6 years ago
Can we get the PPT, please? Thanks.
@omegastatistics 6 years ago
Yes! Anyone who wants the handouts, please email your request to info@omegastatistics.com and let us know the particular presentation you need the slides for. Thanks!
@omegastatistics 5 years ago
Hello Monalisa, you can email me at info@omegastatistics.com and request a powerpoint for any of my presentations. If I still have it I will email it to you.
@luanafantini6366 4 years ago
It was very clear. Thank you!
@omegastatistics 4 years ago
You're welcome!
@bramantios5797 7 years ago
Nice, but I have a question. I ran 20 iterations and need a single pooled dataset before running the regression. How can I get from the 20 iteration datasets to a single pooled dataset? Thank you in advance.
@mohamedhabashy1262 6 years ago
It is easy if you are using R: there is a package called mice for multiple imputation, and it has a function called pool() for aggregating your results across any number of iterations.
@chrisjoyce92 6 years ago
I have similar problem. Did you get an answer on how to do this in SPSS?
@mohamedhabashy1262 6 years ago
You can aggregate your data: go to the Data menu and select Aggregate. The function is the mean, mode, or median, depending on your variable types.
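The idea of picking the aggregation function by variable type can be sketched like this. The mapping of measurement levels to functions and the name `aggregate_cell` are assumptions made for illustration, not SPSS behaviour.

```python
from statistics import mean, median, mode

# Hypothetical mapping from measurement level to aggregation function
AGGREGATORS = {"scale": mean, "ordinal": median, "nominal": mode}


def aggregate_cell(imputed_values, level):
    """Collapse the m imputed values of one cell using the function for its level."""
    return AGGREGATORS[level](imputed_values)


print(aggregate_cell([2.0, 3.0, 4.0], "scale"))
print(aggregate_cell([3, 4, 4, 5, 2], "ordinal"))
print(aggregate_cell([1, 1, 2, 1, 2], "nominal"))
```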
@bramantios5797 6 years ago
Mohamed Hussein, please let me know how to do that. Can you give specific instructions so I can make a single pooled dataset? Also, is there any academic source on how to make a single pooled dataset? Thanks in advance.
@omegastatistics 5 years ago
Hi Bramantio, SPSS will do that for you if you choose regression and run as usual after imputing. Any processes that can make use of the imputed data will have a little sea shell looking icon next to the routine names in the Analyze menu items.
@Mariajoseschulz 5 years ago
It's so clear.. Thank you very much.
@estat2127 4 years ago
helpful thanks!
@omegastatistics 4 years ago
Glad it was useful Black Hero!
@sefinehfenta4248 3 years ago
Thank you
@omegastatistics 3 years ago
You're welcome!
@ayhanayhanayhan1 7 years ago
Perfect! Thanks a lot.
@guesswhatteapots 5 years ago
Thank you!
@abc10il 6 years ago
Great, Thank you
@sillyflowerdance 5 years ago
Thanks :)