How to Use SPSS-Replacing Missing Data Using Multiple Imputation (Regression Method)

Views: 341,033

Comments: 148

@Flaya12 10 years ago

A simple thank you might not be enough for the great work you did and shared with the public and, by doing so, with me. So I want to tell you: if you ever feel down or even worthless, remember that somewhere in Austria you made someone really happy by doing this tutorial! Thanks a lot. At first I thought it might be a bit long, but it was worth every second and you did a really good job.

@TheUgly0duckling 11 years ago

Thank you! Saved me and my thesis.

@gregl4740 9 years ago

Thank you for the tutorial. I just ran this on my dataset successfully. However, I was wondering if there is a way to obtain pooled means and 95% CIs across imputations. For inferential analyses (e.g., correlation), I am able to obtain the pooled statistics. However, when I use Analyze -> Descriptive Statistics -> Explore, it only gives me the descriptives for the original data and for each imputation *individually*. Is there a way to obtain pooled descriptives for variables? Also, is there a way for SPSS to generate a dataset that contains only the imputed data after the final iteration? Thanks!

@stephaniesmith6047 11 years ago

This was a very informative video. I am currently examining some longitudinal data and of course there is a significant amount of attrition. I initially ran a regression analysis using exclude cases listwise but I didn't feel this was the best way to analyze the data. This technique definitely helps address some of those issues. Thank you so much for posting this!

@eligardner5436 9 years ago

Hi, thank you for the very helpful video. I followed all the steps, but the output from my first ANOVA showed only the 5 imputations, not the pooled figures. How do I get the pooled figures?

@jessicabarton93 9 years ago

What happens if your data is missing not at random? I did Little's test and it was significant. I can't figure out which MI to do in that case.

@duallumni369 9 years ago

The video explains the concept in such easy-to-follow steps. A great video on the multiple imputation technique.

@kimconsultants 11 years ago

I have a large number of variables and SPSS does not seem to be able to do the imputation with all the variables at once. So, I did groups of variables separately. However, I now have multiple imputed data files. How do you recommend combining the data files?

@masumarahim 10 years ago

This video was very useful; thank you. However, even when splitting the file by imputation, I cannot get pooled analyses. SPSS will perform the analysis for the original data and each of the five imputations, but will then only give me the means and standard deviations for the pooled data, not, for example, chi-square or t-test values; nor will it give me a p-value. Why might this be?

@AldoAguirreC 9 years ago

How are degrees of freedom reported after a t-test is performed using multiple imputation? I see that the number of df for the pooled data can be in the thousands, and it does not feel right to report such a high number when N = 50, for example. Any advice, or a paper that discusses this issue? Thanks!
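
For readers wondering where such large df come from, here is a minimal Python sketch, not SPSS output. It assumes the classic Rubin (1987) degrees-of-freedom formula, which ignores the sample size, alongside the Barnard and Rubin (1999) small-sample adjustment, which caps the result below the complete-data df. The function name and all input values are hypothetical.

```python
def pooled_df(within_var, between_var, m, complete_data_df=None):
    """Degrees of freedom for a pooled estimate from m imputations.

    within_var: average squared standard error across imputations
    between_var: variance of the point estimates across imputations
    complete_data_df: df the test would have with no missing data (optional)
    """
    if between_var <= 0:
        raise ValueError("between-imputation variance must be positive")
    inflated_between = (1 + 1 / m) * between_var
    # Share of the total variance that is due to the missing data
    lam = inflated_between / (within_var + inflated_between)
    df_classic = (m - 1) / lam ** 2          # Rubin (1987); can be very large
    if complete_data_df is None:
        return df_classic
    # Barnard & Rubin (1999) small-sample adjustment
    df_observed = ((complete_data_df + 1) / (complete_data_df + 3)
                   * complete_data_df * (1 - lam))
    return df_classic * df_observed / (df_classic + df_observed)


# Hypothetical values: 5 imputations, two groups with N = 50 (complete-data df = 48)
df_classic = pooled_df(0.0113, 0.00073, m=5)
df_adjusted = pooled_df(0.0113, 0.00073, m=5, complete_data_df=48)
print(round(df_classic, 1), round(df_adjusted, 1))
```

With little between-imputation variability the classic df grows without regard to N, while the adjusted value stays below the complete-data df.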

@seanicusvideo 9 years ago

Question: if results and parameters are "pooled" (and not averaged), what is the specific calculation, e.g. for bivariate correlations or linear regression outputs?
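
As a rough illustration of the usual pooling calculation, here is a minimal Python sketch of Rubin's rules for a single parameter such as a regression slope. It is a sketch under stated assumptions, not a reproduction of SPSS's internals: the function name and the numbers are hypothetical, and statistics such as correlations may be transformed before pooling.

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)                 # average of the m estimates
    within = mean([se ** 2 for se in std_errors])     # within-imputation variance
    between = variance(estimates)                     # between-imputation variance
    total = within + (1 + 1 / m) * between            # total variance
    return pooled_estimate, total ** 0.5              # estimate and pooled SE


# Hypothetical slopes and standard errors from m = 5 imputed datasets
slopes = [0.50, 0.54, 0.47, 0.52, 0.49]
ses = [0.10, 0.11, 0.10, 0.12, 0.10]
pooled_slope, pooled_se = pool_rubin(slopes, ses)
print(round(pooled_slope, 3), round(pooled_se, 3))
```

The point estimate is a plain average; what makes it "pooling" is that the standard error also carries the spread of the estimates across imputations.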

@mrflowers1234 9 years ago

When writing a manuscript for a trial that has used multiple imputation to address missing data, what additional reporting should I include? Data pre and post imputation? Anything else?

@janecooper358 11 years ago

Thanks so much for your reply - sorry you misunderstood me, I've got 570 participants so I'll do an EM and see how I go. Thanks again, and thanks for doing the videos - I've just started my PhD and I'm sure I'll be tuning in quite a bit!

@2Luhna7 11 years ago

Hello! Thank you so much for the video. I have a question, however. From what I understood, you don't get one single database with the missing values replaced; you should work with the pooled results. So, my question is whether there is any way to create a new single database to import into other programs (for instance Mplus or LISREL) and work on. I need to do that for a CFA on my data...

@benjaminucr6636 10 years ago

I have run an analysis like the one shown, in SPSS 19, and the output provided neither the pooled results nor the fraction of missing information. Under "Edit --> Options --> Multiple Imputation" the option "results for imputed and observed data" is chosen. Any idea how I can get the pooled results and the fraction of missing information in my output?

@seanicusvideo 9 years ago

This is helpful. The use and purpose of the extra imputation history file might be better elaborated. It was very nice to include some references! Thanks!

@shirleyanynameiwant5883 10 years ago

Thanks for this great video. I found it easy to understand. I now have a data file containing multiple imputations (5 imputations). My question is when reporting the univariate statistics and normality statistics, which results should I report given I have results from the original data set and results from the 5 separate imputations? Thank you in advance.

@carilynne1 10 years ago

Thanks for this video--your YouTube channel is saving my life! My question is similar to ones asked previously, but I could not make sense of the reply about merging data in SPSS. I have completed multiple imputation for missing data (went great!) but I want to move this dataset into LISREL for structural equation modelling. How can I get a single data set with the pooled information, rather than having the individual datasets for the imputation displayed and then SPSS pooling them during any further analysis? Thanks, Carilynne

@jackcannon3359 10 years ago

This is brilliant! Thanks for posting.... What is the minimum total number of observations (including missing obs) that this technique will work with? I have a dataset with 18 observations from 10 cases (should have 180 points in total) and I am missing 10 data points... Would multiple imputation be appropriate for this two-way repeated measures design? Thanks.

@tacappaert 10 years ago

That should work fine. I don't know that there is a minimum per se, but as long as your missing data is not the majority of the possible observations, it should work for you.

@katy8791 3 years ago

Hello! Nice video! Any idea how to calculate the reliability of a questionnaire in SPSS if we have missing data in some questions? Is it done with the usual Cronbach's alpha? And what should be put in the cells of the missing data in SPSS?
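
To make the reliability question concrete, here is a minimal Python sketch of Cronbach's alpha computed on complete cases only (listwise deletion). That is one common way to handle blank cells, assumed here for illustration; it is not a recommendation from the video. The function name and the responses are hypothetical, and missing answers are represented as None.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents (each a list of item scores).

    Respondents with any missing item (None) are dropped before computing.
    """
    complete = [r for r in rows if None not in r]      # listwise deletion
    k = len(complete[0])                               # number of items
    item_vars = [variance([r[i] for r in complete]) for i in range(k)]
    total_var = variance([sum(r) for r in complete])   # variance of sum scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 4-item questionnaire; the third respondent skipped one item
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, None, 3, 4],
    [5, 5, 4, 5],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```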

@Jbalisasa 9 years ago

This is a great presentation. I really enjoyed it. Unfortunately for me, as I tried to follow it to impute my missing data, I keep receiving a warning which says that the imputation model for some variable contains more than 100 parameters. Below is an example of such warnings: "An iteration history output dataset is requested, but cannot be written. The imputation model for SYNC2 contains more than 100 parameters. No missing values will be imputed. Reducing the number of effects in the imputation model, by merging sparse categories of categorical variables, changing the measurement level of ordinal variables to scale, removing two-way interactions, or specifying constraints on the roles of some variables, may resolve the problem. Alternatively increase the maximum number of parameters allowed on the MAXMODELPARAM keyword of the IMPUTE subcommand. Execution of this command stops." This is repeated for quite a number of variables. Can someone help me understand how to handle this problem? Thank you. Juvenal Balisasa
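
The remedies listed in that warning all shrink the imputation model. A rough Python sketch of why: it assumes ordinary dummy coding (a categorical predictor with k categories contributes k - 1 parameters, a scale predictor contributes 1) and one equation per non-reference category of a categorical target. This is an illustration only, not SPSS's exact counting rule; the function and variable names are hypothetical.

```python
def approx_parameter_count(predictors, target_categories=None):
    """Rough main-effects parameter count for one variable's imputation model.

    predictors: dict mapping name -> number of categories, or None for scale
    target_categories: categories of the imputed variable, or None if scale
    """
    per_equation = 1                                   # intercept
    for n_categories in predictors.values():
        per_equation += 1 if n_categories is None else n_categories - 1
    equations = 1 if target_categories is None else target_categories - 1
    return per_equation * equations


# Ten hypothetical 5-point Likert items used as predictors
as_ordinal = {f"item{i}": 5 for i in range(1, 11)}
as_scale = {f"item{i}": None for i in range(1, 11)}

count_ordinal = approx_parameter_count(as_ordinal, target_categories=5)
count_scale = approx_parameter_count(as_scale)
print(count_ordinal, count_scale)
```

Under these assumptions, treating the Likert items as categorical pushes the count past 100, while treating them as scale keeps it small, which matches the advice in the warning text.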

@tacappaert 9 years ago

+Juvenal Balisasa This is likely happening because there is data that SPSS cannot categorize or that falls outside the expected range you specified. Be sure all categorical data has a coding value and be sure there are no numeric values outside the specified range.

@iskyrisky1969 9 years ago

+TheRMUoHP Biostatistics Resource Channel I have the same problem.

@iskyrisky1969 9 years ago

+Juvenal Balisasa Have you solved your problem? I have the same problem.

@Sspecial_KK 11 years ago

I saw in your earlier comment what to do about this issue, but I was not able to set min or max values. However, I found out that you can adjust the parameter in the syntax. It worked, as I saw all the imputed values in the output. Unfortunately, on the Data View tab I couldn't see any imputed variable, nor the upper-right option to switch between data files. So, what went wrong? Could you help me out? Thanks in advance!

@rhissarobinson970 10 years ago

Hello, I really appreciate you sharing this video. It has helped me tremendously to figure out how to understand and implement this method for my data. Would it be possible for you to share the syntax? For some reason, my output for percentage missing (the first output you show us) does not show the mean and standard deviation of the variables. I'm sure it's just a line I missed in the syntax. Thank you!

@barakatunnisakmohdyusof9938 8 years ago

Thank you. Great presentation. I have one question: can the imputation only be applied to primary outcomes?

@sivabalaji30 9 years ago

I have a query. I am currently working in SPSS on survey data that contains many missing values. The data are not missing at random (MNAR). What method should I use to replace the missing data?

@guanlin6123 10 years ago

Thanks for the helpful video. If we need to remove outliers, should that be done before or after imputing data? If it should be done after imputing, I wonder how to do that when we have 5 imputed datasets.

@tacappaert 10 years ago

Look for and remove outliers before imputation. These videos may help: "How to Use SPSS: Identifying Outliers" and "How to Use SPSS: Dealing with Outliers".

@Anika-ze1bh 9 years ago

Hey, thank you for the helpful tutorial. I still have a huge problem with my imputation. After running, it imputes values that are way too high or even negative. So I defined the range, which leads to an error that says something like this (mine is in German): "After 200 draws SPSS can't find an imputed value for the variable xxx within its defined constraints. Please check whether the defined min and max are appropriate or choose a higher maximum case draws." So it stops the imputation. Can you help me with that? Anika

@singularity00001 10 years ago

Excellent work you did here! Thank you.

@chetanm12 11 years ago

First off, thank you so much for posting this video...it was very well made and I look forward to exploring other videos you have. As a follow up question to enemenoff's question...what are the differences for MI for random vs. non-random patterns? Did I miss that part in the video? Do you have a source I could visit? Thank you in advance!

@ruskamihkas9723 10 years ago

Not sure if you mentioned these, but I didn't succeed until I changed my missing value codes from 99 to blank (.) and changed ordinal variables into scale. Otherwise it wouldn't do the imputations and didn't even let me specify constraints. I had a Likert scale of 1-5.

@tacappaert 10 years ago

Glad that worked. That is a pretty common issue.

@iskyrisky1969 9 years ago

+TheRMUoHP Biostatistics Resource Channel I have to change my nominal to scale

@anastemi 10 years ago

Hello, thank you for the video, it was very helpful. However, when I ran multiple imputation on my data set, I got this message: "The imputation model for sex contains more than 100 parameters and no missing values will be added". So a new data set was not created. Others have cited this as a problem; what should we do? I also have a large amount of missing data, about 50%.

@tacappaert 10 years ago

Try this solution direct from IBM: www-304.ibm.com/support/docview.wss?uid=swg21482103

@yad-c3662 11 years ago

Thanks for doing this! Very clear and helpful

@sabrinadickey2205 10 years ago

Great video, and very easy to understand! If I wanted to remove multiple imputation from a data set is it possible and how would it be done? Thank you!

@tacappaert 10 years ago

Do you mean reversing the process, so that the missing values become missing again? I don't know if that is possible but as long as you saved the original data set, you can always revert to that.

@Zanthias 11 years ago

Hello, thank you very much for showing this video. My question is: once you get all five imputed values, is there any rule of thumb as to which of the five you should use for your analysis? Also, I noticed that in your t-test example the pooled values did not have standard deviations. What if you want to report standard deviations in your study? And if I want to create a composite, which one of the 5 imputed values should I use? If you can kindly let me know, I will appreciate it. Thanks

@annapease 10 years ago

Hello. I am using longitudinal survey responses with a biased drop out - so there is a great big red patch at the bottom right of my missing value patterns graph! Can you tell me the best multiple imputation method to use? If I delete cases I am also biasing the dataset. I have analysed the raw data so I know what I'm comparing it with but I am struggling with the method of imputation. It's also saved across 5 different datasheets - I need to combine it into one, don't I?! Thanks!

@tacappaert 9 years ago

I would use the regression method of imputation.

@carolineroth2710 9 years ago

+Anna Pease I used the AGGREGATE command to get all the pooled datasets back into one. (I needed to do further imputations on my data, which I couldn't do, or couldn't figure out how to do, once I had done one imputation.) The thing to note with this, though, is that you won't be able to see which cases/variables have imputed data, like you can when they're not pooled. The syntax I used was this:
AGGREGATE
  /OUTFILE='[location on my computer]\[newfilename.sav]'
  /BREAK=[variable to break by, which for me was survey participant ID]
  /[string variable 1]=FIRST([string variable 1])
  /[string variable 2]=FIRST([string variable 2])
  ...
  /[string variable x]=FIRST([string variable x])
  /[imputed scale variable 1]=MEAN([imputed scale variable 1])
  /[imputed scale variable 2]=MEAN([imputed scale variable 2])
  ...
  /[imputed scale variable x]=MEAN([imputed scale variable x]).
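
For anyone doing the same step outside SPSS, the averaging idea in that syntax can be sketched in plain Python. This is a minimal illustration under assumptions: rows are dictionaries, the imputation index is stored in a column named Imputation_ with 0 marking the original data, and the id and score names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_imputations(rows, id_key, value_keys, imputation_key="Imputation_"):
    """Collapse stacked imputed datasets into one row per case by averaging."""
    grouped = defaultdict(list)
    for row in rows:
        if row[imputation_key] == 0:       # skip the original, unimputed rows
            continue
        grouped[row[id_key]].append(row)
    return {
        case_id: {key: mean(r[key] for r in case_rows) for key in value_keys}
        for case_id, case_rows in grouped.items()
    }


# Hypothetical stacked file: original data (0) plus two imputations
stacked = [
    {"id": 1, "Imputation_": 0, "score": None},
    {"id": 2, "Imputation_": 0, "score": 3.0},
    {"id": 1, "Imputation_": 1, "score": 4.0},
    {"id": 2, "Imputation_": 1, "score": 3.0},
    {"id": 1, "Imputation_": 2, "score": 6.0},
    {"id": 2, "Imputation_": 2, "score": 3.0},
]
averaged = average_imputations(stacked, id_key="id", value_keys=["score"])
print(averaged)
```

Averaging gives a single file that other programs can read, but it discards the between-imputation variability, so standard errors computed from the averaged file will tend to be too small.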

@sameeral-abdi6870 11 years ago

Thanks for this wonderful demonstration. I am facing a problem when I run this test. Number of missing values entered in the multiple imputation analysis was less than number of missing values across all the variables with missing data. Subsequently, completed data after imputation were less than my original data (valid plus missing). So, how can I fix this problem?

@mandyruth9954 10 years ago

Great video! I have run multiple imputation for 2 variables (missing categorical data for 12% of values), however, I notice after 5 iterations, I still have some missing values. Is this normal?

@tacappaert 10 years ago

Not usually. Be sure that you designated those variables to be imputed.

@SamanthaBalemba 11 years ago

Will it automatically use the pooled estimates even for more advanced later techniques, like SEM? I'm using AMOS to run my SEM, but I want to make sure the MI results will automatically get used for this (seeing as how it's an addon to SPSS). I was recently informed that you can't run a proper SEM if you have ANY missing data, so I wanted to make sure I fixed that problem...

@SamanthaBalemba 11 years ago

Is there a cut-off for using this method in terms of the percentage of cases missing for specific variables? All of my var's are missing

@Zisis21r 11 years ago

Thanks, very helpful! I have a question: under Analyze > Impute Missing Data > Constraints tab, in the lower "Define Constraints" grid, SPSS won't allow me to set Min and Max values for my variables, and I notice the table rows are coloured blue and not white as in the tutorial. Could anybody help me work out how I can define my min and max values?

@janecooper358 11 years ago

Hi - thanks for the video - it was really informative. Just a question though... my data has a small amount of missing data - 4 variables, 32 cases, 104 values with .5% missing data overall (104 cases). MCAR was non-significant, p=.052 - just! I am running a CFA using AMOS, therefore I cannot have missing data. Do you recommend conducting an EM or a Multiple Imputation or neither? Plus, how can I get AMOS to look at the pooled data when conducting the CFA? Thanks!

@sabrinadickey2205 11 years ago

Hello, the video was very helpful. I have a question regarding the use of the iterations. I had 5 iterations and the pooled iteration was not significant p > .05, but I noticed some of the others were significant. Do you ever use one of the iterations or are you only supposed to use the pooled results?

@sameeral-abdi6870 11 years ago

The observed discrepancy arose because some cases had missing values in all three variables included in the multiple imputation. The problem was resolved by adding variable(s) in which these cases had values.

@lirpatex 11 years ago

I would like to do a MANOVA using my imputed dataset, however, when I run the analysis, there isn't any pooled output. Is it okay to report the output from the 5th imputation? Thank you for your great video and help.

@Bardiyaz 10 years ago

Hi, thanks for this helpful video. I have two questions: 1. Do we have to include non-missing variables in order to get a better prediction for the missing variables? 2. I need to do propensity score matching after doing multiple imputation, so I actually need a "pooled only" dataset, which is the average of all of them, as you said. Is there a way to save the pooled-only dataset, or do I have to calculate the average for each variable and save it separately? Thank you.

@tacappaert 10 years ago

1. Yes, you should use as many variables as you can to improve the estimation of the missing values. 2. To the best of my knowledge you will have to calculate the mean for each variable.

@johannahedlund3708 10 years ago

I don't understand what values to use. I want a full table instead of the original one that has gaps where the missing data is. Can I use the New Imputation table? But what imputation do I use? Or do I fill in all the gaps with missing data with the same pooled mean from the t-test analysis?

@johannahedlund3708 10 years ago

I also get an empty Group Statistics table when doing the t-test. The mean is put out as zero and I get a warning saying "No statistics are computed for a split file in the Independent Samples table. The split file is: Imputation number=... The Independent Samples table is not produced." What am I doing wrong?

@colanfrost3518 11 years ago

In your video you said you could only use imputed data for the analyses that have a swirl on it. Is there any possibility to use imputed data with repeated measures analyses in SPSS and how might that work?

@suzanneveger7148 9 years ago

I have a question. I have imputed the data, and I want to conduct an ANOVA. In order to interpret the data, how do I need to read this ANOVA table? There is the original solution, and 5 other solutions. However, I do not find a pooled solution. What do I need to do here?

@tacappaert 9 years ago

+Suzanne Veger Unfortunately, not all inferential techniques pool the results the way the t-test example did.

@Ulli0664 10 years ago

Thanks for the great video, it helps a lot! I have 2 types of missing data in my dataset (I am working with a questionnaire that has several versions), coded as follows: -9 for actually missing (respondent didn't know / didn't want to give an answer); -99 for n.a. (respondent didn't see this question and therefore couldn't answer it). Therefore, I need to somehow exclude the -99 data points from the replacement. Any idea how to do this? Many, many thanks in advance!

@tacappaert 9 years ago

You can exclude certain data points by using the Select Cases function and then run the analysis.
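
The logic of separating the two kinds of missing codes before imputing can be sketched in Python. This is a minimal illustration, not SPSS behaviour: the codes -9 and -99 come from the question above, while the function name and the sample answers are hypothetical.

```python
REFUSED = -9            # respondent saw the question but gave no answer
NOT_APPLICABLE = -99    # respondent never saw the question


def split_missing(values):
    """Return (values with -9 blanked for imputation, eligibility flags).

    Cells coded -99 are flagged as ineligible so they can be filtered out
    before imputation and restored unchanged afterwards.
    """
    to_impute = [None if v == REFUSED else v for v in values]
    eligible = [v != NOT_APPLICABLE for v in values]
    return to_impute, eligible


answers = [4, -9, 2, -99, 5]
blanked, eligible = split_missing(answers)
print(blanked, eligible)
```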

@efi225 10 years ago

Thank you so much for this helpful video! I ran multiple imputation on my data, but I would like to ask: where are the pooled values you mentioned? I can only see the values in each imputation. Also, I have used many different questionnaires in my research. Do you think it's better to run multiple imputation on each questionnaire separately? Some of them are multidimensional; does this affect multiple imputation? Maybe I should run multiple imputation on the items of each factor separately?

@tacappaert 10 years ago

The pooled data should be found in the output as demonstrated in the video. Be sure the data is categorized properly in the Variable View. Be sure it is set up as "Scale/Numeric" data.

@tacappaert 10 years ago

I would run separate imputations for each questionnaire if they are measuring different constructs.

@khushbeensohi4364 10 years ago

Can I use this method to replace missing data if my data is not normally distributed and hence, I use non-parametric methods?

@tacappaert 10 years ago

Yes, you can.

@ilmamufidah6272 10 years ago

Then which imputed value is to be used? The fifth one? Or do we have to average all 5 imputed datasets? That would be exhausting, right? I also have another question. My data are mostly ordinal (Likert scale). But when I tried to run the multiple imputation, the imputed values were beyond the allowable range: some of the imputed values were negative, others were not integers. When I changed the "measure" to "scale" instead of ordinal, and then set the max and min range as well as the rounding, I got much nicer values. Was my approach right? The last question is the same as anastemi's. I tried to solve it by specifying the role of each variable and it worked. The problem was, I actually didn't know the role of each variable. I just guessed what the role might be (it was actually my hypothesis for a model I was trying to investigate). What do you think? I am afraid that my approach is wrong, so the imputed values are not valid or something like that.

@tacappaert 10 years ago

There should be a pooled data value that you can use that aggregates all the imputed attempts. In regards to the ordinal data that was the correct approach.

@ilmamufidah6272 10 years ago

***** Is it OK if I just average it? I heard that calculating the average is the simplest way to get the pooled data. Where can I find the pooled data? I didn't find any of it.

@tacappaert 10 years ago

+Ilma Mufidah The pooled data should be found in the output as demonstrated in the video. Be sure the data is categorized properly in the Variable View. Be sure it is set up as "Scale/Numeric" data.

@MoCowbell 11 years ago

I have missing item level data (from a scale with some missing items) and variable level missing data. Should I first impute the missing items so that everyone has a score on the variable with the items or should I just ignore the fact that I have items and just estimate the missing variable that is composed of the items? Thanks!

@TokenFun105 11 years ago

Thank you very much for this tutorial. However, I notice you did not mention Little's Missing at Random test. Should this not be done prior to all imputation methods? Or is it sufficient to look at the Missing Pattern Values Graph? many thanks

@ertugrulsahn 10 years ago

Do we estimate missing nominal and ordinal values too? If not, what can we do for missing nominal and ordinal values (for example, nominal: gender; ordinal: perceived income categorized as low, medium, high)?

@tacappaert 10 years ago

Yes, the procedure can estimate those values as well.

@ertugrulsahn 10 years ago

***** thanks for your care

@timw.5528 9 years ago

I have a large number of variables in the imputation model (most of them are nominal) and I keep getting the same error message mentioned by Juvenal below, "...The imputation model for MODEL contains more than 100 parameters. No missing values will be imputed...." I checked all of the variables and they look fine (the nominal variables have values and the numeric variables are within the expected range). If I change one or more of the variables from nominal to scale it seems to work, but then it seems as though the imputations are not going to be accurate, as they will be based on linear rather than logistic regression. Any suggestions?

@tacappaert 9 years ago

+Tim Wadsworth Variables that are ordinal in scale should be categorized as Scale.

@nohadarwish3053 8 years ago

+TheRMUoHP Biostatistics Resource Channel; Thank you so much for the helpful tutorial. I have this same Warning message every time I am trying to do multiple imputation "The imputation model for Q2_3_TO contains more than 100 parameters. No missing values will be imputed. Reducing the number of effects in the imputation model, by merging sparse categories of categorical variables, changing the measurement level of ordinal variables to scale, removing two-way interactions, or specifying constraints on the roles of some variables, may resolve the problem. Alternatively increase the maximum number of parameters allowed on the MAXMODELPARAM keyword of the IMPUTE subcommand"; although I have checked that all variables are either scale or nominal ones. I have something like 85 variables. What should I do?

@yaldaamir2571 11 years ago

I want to compare this result with a less advanced way of dealing with missing data. Do you know a reference that says which of the following techniques produces less biased results: listwise deletion, mean imputation (sample mean), or imputing the mean of the subject's own scores on the scale or sub-scale items (within subject)? Many thanks.

@tombailey4262 10 years ago

Hi, thank you for this excellent video. My question appears to be a bit more basic than those below, but I was wondering whether there is any way to store the pooled data set in a separate file. You see, I would like to use it with an SPSS plug-in, e.g. PROCESS, which I don't think will recognise the pooled values as SPSS did with the t-test above. Regards, Tom

@tacappaert 10 years ago

I don't know if that is possible. I would suggest contacting IBM SPSS technical support.

@sajatorunczyk6195 9 years ago

+Tom Bailey Tom, I am looking for exactly the same thing: a way to use PROCESS with data that have been imputed. Did you figure this out?

@tombailey4262 9 years ago

+saja torunczyk Not very simply, although I think you could do it in R and port it back in. One option (not as good) might be to use expectation maximisation in SPSS?

@sylviaherbozo5811 11 years ago

Thanks, I was able to get it to work. But I had another question. After running my analyses (t-tests and chi-squares) with the imputed data, I noticed that the sample sizes for each variable on the output are still uneven which normally means some cases weren't used due to missing values. Are these sample sizes supposed to still be uneven? And I just report the total sample size?

@tacappaert 10 years ago

+Shirley anynameIwant, you should report descriptive statistics for pre and post imputation.

@Sspecial_KK 11 years ago

I had the same problem, but I managed to run the multiple imputation by adjusting MAXMODELPARAM (in syntax), because I was unable to change the min and max values. However, I did not see the imputed variable in the Data View table, even though I did see the results of the imputed values in the output file. How do I get to see the imputed variable in the Data View? Thanks in advance

@chavianddavid 11 years ago

What if I have a conjoint study where I have 36 variables and 300 respondents but each respondent only saw a subset of the 36. So I now have a table where each row is a respondent with a constant and then coefficients for only 25 (or more) of the 35 variables. What would be the approach for replacing the missing values (i.e. the missing coefficients for those variables for that specific respondent)?

@yaldaamir2571 11 years ago

I did, but I want to replace them as well! MI should work with dummy-coded variables as well, shouldn't it? By the way, how can I round the results of imputation for the rest? Thank you very much for your help.

@yaldaamir2571 11 years ago

Do we replace the outcome variables as well? If we do, it seems a little bit awkward, because we actually want to see whether we can predict the outcome variables from other variables (our research question/hypothesis). If we replace them with values predicted from other existing variables, aren't we just increasing the probability of a Type I error (the probability that the statistical analysis would support our alternative hypothesis even if it is not true)?

@yaldaamir2571 11 years ago

Is there any pooled data(set) in the file that SPSS creates? I just see the original data file identified by 0, and 1 to 10 representing each imputation.

@jamie10157 11 years ago

Hi, thanks for the video, it's really useful! Can I just check what exactly the y axis represents when you are looking at the patterns in the diagram with grey and red squares?

@skincare2010 11 years ago

Hello, I have a few issues with my dataset. First of all, my dependent variable has a whopping 20% missing values (the question is rather sensitive, so I am considering running two models, one that uses this variable and another that uses a similar one, asked in a different way). Is this OK? Also, many of my variables are categorical or nominal (yes/no, agree/disagree, etc.). Can I still use this imputation method or is it just for numerical variables? Thanks.

@missyp017 11 years ago

Paul, did you figure this out? I need to do the same thing...

@alipolat5393 11 years ago

I have MNAR-type data with sometimes 60 percent missing. What I understand is that if my data is NOT missing at random and I choose Automatic on the imputation Method tab, then SPSS will take care of the non-randomness problem in the data. Is that correct?

@chrislittle9839 11 years ago

For a scale score, would you calculate the aggregated variable from the pooled imputation iterations?

@yaldaamirkiaie5303 11 years ago

Hi, thanks for the video, it is very helpful! A question I have so far, just from watching the video: when applying "Constraints" (at 27:19) there are 2 other options, "maximum case draws" and "maximum parameter draws". Could you please let me know what those are?
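
As I understand it, when a minimum and maximum are set, the procedure keeps drawing candidate values for a case until one falls inside the range, and "maximum case draws" limits how many attempts are made before it gives up or redraws the model parameters. A minimal Python sketch of that retry idea follows. It is an assumption-laden illustration, not SPSS's algorithm: the normal draw, the function name, and the numbers are hypothetical.

```python
import random


def draw_within_bounds(mu, sigma, minimum, maximum, max_case_draws=50, rng=None):
    """Draw a value from N(mu, sigma) and retry until it lies within the bounds."""
    rng = rng or random.Random()
    for _ in range(max_case_draws):
        value = rng.gauss(mu, sigma)
        if minimum <= value <= maximum:
            return value
    raise RuntimeError(
        f"no value within [{minimum}, {maximum}] after {max_case_draws} draws"
    )


# Hypothetical 1-5 Likert item with a predicted mean of 3
value = draw_within_bounds(3.0, 1.0, 1, 5, rng=random.Random(0))
print(1 <= value <= 5)
```

If the bounds sit far from what the model predicts, every draw is rejected and the loop runs out, which is the situation the "check the min and max or raise the maximum draws" warnings describe.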

@georginamartin9337 9 years ago

I have a dataset that has some missing values represented only by a . and others that have it as a -1 or -9. When I do the imputation the . values are imputed but the assigned missing values remain the same. How do you rectify this?

@tacappaert 9 years ago

+Georgina Martin If the -1 or -9 values are not actual outcome possibilities, then the values should be cleared and then you can run the imputation.

@georginamartin9337 9 years ago

Excellent, that's what I did and it works!

@yaldaamir2571 11 years ago

I see that your variables are not the items within the scale; they look more like the sub-scales or the final variable. I mean, it looks like you are working with missing values in final scale/subscale scores. May I use MI to impute values for missing data within my scales? And may I put all the different scales/measures together in one file and use all the "items" completed across the scales to impute the missing values? Thanks in advance.

@TokenFun105 11 years ago

Would you use the same process to determine the mean and standard deviation of the 'pooled data'? I would imagine you could use these estimates to standardise all variables and then re-run the regression on those to obtain the standardised regression coefficients (that SPSS also does not provide)?

@fazlihaleem6603 9 years ago

How would I know whether the missing data are MAR, MCAR, or missing systematically? If the data are not MAR or MCAR, is there a solution for systematically missing data?

@tacappaert 9 years ago

+Fazal Haleem If your data are missing systematically, that typically means there is response bias of some kind (e.g. questions asking for sensitive information or questions that are unclear). You should try to figure out why that might be happening so you can address it as a possible validity issue. The technique that I demonstrate can be used with data that are missing systematically.

@antimandril2281 10 years ago

When I click OK (pattern), the computer freezes. Does anyone know what happened?

@yaldaamir2571 11 years ago

MI is very complicated to implement for my data, but I cannot convince myself not to use it! May I average the imputed values (e.g., 5 imputed values from 5 imputations) and enter these averages into the missing values in my original file? I actually tried it with one of my analyses, and the results (e.g. B weighting coefficients) are slightly different from the pooled data, but it gives me the model summary, which I need! Do you know if this approach is appropriate? Thanks.

@yaldaamir2571 11 years ago

I am trying to run MI, and it gives me a warning message saying that after 100 draws it couldn't impute a value for one variable or another (the variable changes as I change the number of draws), and that I need to raise this number or check the min and max under "Constraints"; then it stops running. I checked both, but it still didn't work. One of the variables that kept coming up was a dummy-coded variable. I took out all my dummy-coded variables and it worked. What should I do with my dummy-coded variables?

@kimconsultants 11 years ago

So you can impute data only for the variables where > 5% of data are missing? Or, if you impute for one you must impute for all variables that have any missing data? I ask because I have many variables and SPSS doesn't seem to be able to handle all of them at once. This means I have to create multiple imputed data sets and I'm not sure how to combine them all.

@yaldaamir2571 11 years ago

I already ran Little's MCAR test and I got: Chi-Square: 17193.367, df = 26009 and sig = 1.000. So I believe it means my data are missing completely at random. I chose to do Multiple Imputation. I am wondering if I can choose the method for MI myself instead of leaving it on automatic, so that I can change the # of iterations, since I think I need about 91 iterations to reach convergence. I found this when I ran Little's MCAR test. Do iterations in MI and EM serve the same function?

@yaldaamir2571 11 years ago

Could you please let me know how to calculate R Square for the model in a General Linear Regression model for the pooled data, and possibly its significance? Is there any way?

@onlyificanloveyou 11 years ago

Thank you for making this great video! I have actually done multiple imputation in Mplus and it generated 10 imputed datasets (all were .dat files). Is there a way to read these files as imputed data sets in SPSS? I need to do matched-pair t-tests by using these values. My stats consultant suggested that I ask SPSS to read these 10 imputed datasets individually, do 10 t-tests, and then average the t-value. However, I like how SPSS pooled the datasets first. Thank you!
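The difference between averaging t-values and pooling can be made concrete with Rubin's rules, the standard way of combining estimates across imputed datasets: average the estimates, then combine the within-imputation and between-imputation variances. The sketch below is a minimal Python illustration, not what SPSS or Mplus runs internally. The function name `pool_rubin` and the example numbers are hypothetical, and the degrees of freedom use Rubin's classic large-sample formula rather than any small-sample adjustment.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one estimate (e.g. a mean paired difference) across m imputations.

    Returns (pooled estimate, pooled standard error, t statistic, degrees of freedom).
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between         # Rubin's total variance
    pooled_se = math.sqrt(total)
    if between == 0:
        df = math.inf  # no between-imputation variability
    else:
        df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return pooled_estimate, pooled_se, pooled_estimate / pooled_se, df


# Made-up mean differences and standard errors from five imputed datasets.
differences = [1.0, 1.2, 0.8, 1.1, 0.9]
standard_errors = [0.5, 0.5, 0.5, 0.5, 0.5]
estimate, se, t_value, df = pool_rubin(differences, standard_errors)
```

The pooled standard error is larger than any single-dataset standard error whenever the imputations disagree, which is the uncertainty that simply averaging t-values would leave out.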

@sylviaherbozo5811 11 years ago

I keep getting a warning message such as "The imputation model for EDEQ14.1 contains more than 100 parameters. No missing values will be imputed..." Any advice on how to resolve this problem? I tried changing the measurement level but it didn't help. I wasn't sure how to do the other suggestions, including: reducing the number of effects in the imputation model, removing two-way interactions, or specifying constraints on the roles of some variables.

@yoox0047 11 years ago

When you run a hierarchical regression with an MI dataset, the output does not provide R, R2, adjusted R square, or the F value for the pooled imputation (it only provides these for the original and each imputed dataset). It also doesn't provide beta (standardized coefficients) for the pooled imputation (there are only unstandardized coefficients: B and Std. Error). However, given that these are typical values reported in our results, how do we obtain this information from the pooled data?

@georginamartin9337 9 years ago

this is really awesome!

@chavianddavid 11 years ago

The raw data was a dummy variable regression so there are only 1 and 0. Also, the experimental design was such that each respondent had their own design where they saw either all or just a subset of the variables. So I am looking to fill in the coefficients for the variables they did not see.

@yaldaamir2571 11 years ago

I understand your point. But by outcome variables I mean Dependent Variable(s)!

@jaishrik8691 11 years ago

The missing variables in my data file have a value of '9'. How do I remove these dummy variables? Thank you.

@TokenFun105 11 years ago

Great thanks! I never trust my 'subjective judgement', so I like to rely on both :)

@marina7181 10 years ago

Great video!! Thank you!

@sputaccount6139 11 years ago

Is there a way to get a pooled R-squared value in multiple regression with MI data?
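One approach that is sometimes suggested for this, offered here as an assumption rather than something the video or the SPSS pooled output provides, is to take the square root of each imputation's R-squared, apply Fisher's z transformation, average the z values, and transform back. The Python sketch below illustrates only that arithmetic; the function name `pool_r_squared` and the input values are hypothetical, and the input must contain R-squared values strictly below 1.

```python
import math


def pool_r_squared(r_squared_values):
    """Average R-squared across imputations on Fisher's z scale.

    Each value must be >= 0 and < 1 (atanh is undefined at 1).
    """
    z_values = [math.atanh(math.sqrt(r2)) for r2 in r_squared_values]
    z_bar = sum(z_values) / len(z_values)
    return math.tanh(z_bar) ** 2


# Made-up R-squared values read off the five per-imputation model summaries.
per_imputation_r2 = [0.30, 0.34, 0.28, 0.32, 0.31]
pooled_r2 = pool_r_squared(per_imputation_r2)
```

The result always lies between the smallest and largest per-imputation values, so it is easy to sanity-check against the individual model summaries.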

@mariakrista100 10 years ago

What do you do after you get results from 5 imputations?

@tacappaert 9 years ago

You use that data to replace the missing data points and then run your additional analyses (e.g. t-test).

@haliltokay3689 10 years ago

Thanks for some great videos. I get a warning message that says that after 100 draws, the imputation algorithm cannot find an imputed value under the constraints for variable [X]. This is strange, because the variable is just a 7-point Likert scale. All "I don't know" responses are coded as 999 and defined as missing values. So I tried changing MAXCASEDRAWS. After a few attempts, it accepted 1000000000. I know. So it ran the imputations. However, I was met with yet another warning message: "Some missing values cannot be imputed because a factor in the model has a value that does not appear in the data used to build the model." Does anyone have any good suggestions for how I can solve this problem?

Just FYI:
- My data is Missing Not at Random (MNAR)
- I have 55 variables
- Sample size of 317
- Measurement scales: 7-point Likert scale + 10-point evaluation scale

I hope someone will be able to help ASAP. Thank you. Halil

@tacappaert 10 years ago

I think you need to take a close look at the data codes you have used in the variables with missing data. For some reason SPSS cannot recognize those codes and cannot perform the imputation. If I read your post correctly you have coded both missing values and "I don't know" responses as "999". That might be the issue.

@haliltokay3689 10 years ago

***** Well, in the software I used for collecting data, all "I don't know" responses received the value 999 so they could easily be identified during data analysis. Then there are also some system-missing data, which just do not have any value at all. But you believe this may be the source of the problem? I have now tried to recode all variables so the only type of missing value is 999. After doing so, I still get the same kind of warnings. I have to change MAXMODELPARAM=500 and MAXCASEDRAWS=400000, and still SPSS does not want to impute the data properly. It says that "some missing values cannot be imputed" AND "after 800000 draws, the imputation algorithm cannot find an imputed value ..." So, any good ideas for how to solve this problem? Btw, thank you guys for such a quick response time!

@haliltokay3689 10 years ago

***** I also see that others have encountered a problem related to having ONE dataset with the pooled values. When I run the imputations, SPSS creates a new data file with the original data and imputations 1-5, and that's it. In your video, it is the same. In the upper right-hand section of the screen you can choose between the original and the five imputations, but there is no option called pooled data. Are there any operations in SPSS that produce one data file with the pooled values WITHOUT the original data and the five imputations? I just want the pooled data for further analysis. How can I do that?

@tacappaert 10 years ago

Halil Tokay Any missing values should have cells without a code. The cells should be empty.

@haliltokay3689 10 years ago

***** Thanks. That worked. Thank you so much! Now, how do I transform my data so that I only have the pooled variables without all five imputations, and the original data set? I just want one dataset without missing values. I do not want all the imputations. Only the pooled variables. I need to do a PCA followed by regression analysis, so I need a dataset without missing values.
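On the structure of the imputed file discussed in this thread: SPSS stacks the original data and each imputation in one file and marks the rows with a variable named `Imputation_` (0 for the original data, 1 to 5 for the imputations). Pooling is applied to analysis results rather than to the data values, which is why no "pooled" dataset appears in the list. The Python sketch below is a hypothetical illustration of separating the stacked rows into one complete dataset per imputation; the function name `split_by_imputation`, the `score` variable, and the sample rows are made up.

```python
from collections import defaultdict


def split_by_imputation(rows, key="Imputation_"):
    """Group stacked rows into one list per imputation number."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


# Made-up stacked data: imputation 0 is the original file with a missing value.
stacked = [
    {"Imputation_": 0, "id": 1, "score": 4.0},
    {"Imputation_": 0, "id": 2, "score": None},
    {"Imputation_": 1, "id": 1, "score": 4.0},
    {"Imputation_": 1, "id": 2, "score": 3.6},
    {"Imputation_": 2, "id": 1, "score": 4.0},
    {"Imputation_": 2, "id": 2, "score": 4.3},
]
groups = split_by_imputation(stacked)

# Keep only the completed datasets (drop the original data, number 0).
completed = {number: rows for number, rows in groups.items() if number != 0}
```

Each entry in `completed` is a dataset without missing values, so an analysis can be run on each one and the results combined afterwards.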

@d3llikz 9 years ago

Is this also possible for panel data?

@tacappaert 9 years ago

+Morten Fjerritslev Can you explain what you mean by panel data?

@yaldaamir2571 11 years ago

Thank you for the clear response. very helpful, Thanks.

@yaldaamir2571 11 years ago

Many thanks, I will try.

@ia1167 10 years ago

Hello: first I would like to thank you for this awesome video! It is super clear and super well explained! I have a question for you.

Procedure: According to IBM, once one runs MI using the "Fully Conditional Specification" method (FCS; in the output SPSS tells you which method it used), one should check whether FCS convergence was achieved.

Problem: This is the part where I am terribly stuck, because I am getting a lot of flat lines in my chart when I test whether FCS convergence was achieved (please look at this link for more info about how to do this: pic.dhe.ibm.com/infocenter/spssstat/v22r0m0/index.jsp?topic=%2Fcom.ibm.spss.statistics.cs%2Fspss%2Ftutorials%2Fmi_fcs-convergence_telco_howto.htm). When I looked at my iteration history, for every set of imputed data I am getting the same value no matter the iteration number. For instance, in my dataset #1 I get the same imputed value for the total score of a questionnaire from iteration 1 to 10000, and so on until the last imputed dataset (these values remain the same within datasets but are different across datasets).

Finally, my question: Why do you think FCS convergence is not achieved in this case, or why are my values not changing from iteration to iteration? I have been looking on the internet for what to do about this besides increasing the # of iterations, but there is almost no info about it. Please, would you mind giving me your thoughts about this? I would be so grateful.

@tacappaert 10 years ago

My guess is that the predicted values do not have any variance or have little ability to vary and your iterated values don't change. Generally, this is a good thing indicating that the predicted values are quite accurate being that they don't change between each iteration.

@kloveinn 10 years ago

Thanks..good stuff but video is zooming stupidly at times...needs better vid editor

@yaldaamir2571 11 years ago

Many thanks. Regards

@yaldaamir2571 11 years ago

Sorry, but you lost me!! What is especially important? Being cautious with imputation for dependent variables? Maybe the answer to this question would help me more: Is multiple imputation based on running several regression equations (predictions), or just on multiple pattern simulations in accordance with the pattern found in the existing data? Or are both the same thing?! Am I making sense?