Identifying Multivariate Outliers with Mahalanobis Distance in SPSS

Views: 212,625

Dr. Todd Grande

Comments
@Swityie 2 years ago
Dr Todd, you've saved my life! I was dying with the Mahalanobis!!! Was crying at midnight while getting stuck at this. Thank you again!
@scarlettthorn9060 3 years ago
Honestly, at this point I want to acknowledge you in my thesis thank-you notes. Thank you Dr Grande, you are a gem.
@diegohenriquez1542 6 months ago
Did you do it, lol?
@mimimcgee5512 3 years ago
Thank you for another helpful video. I am just a month or so away from receiving my doctorate and your videos have greatly assisted me in that! I'm brushing up in prep for my final defense and appreciate all of your videos. Thank you!
@yacinehajji1784 8 years ago
I would like to thank you for speaking loudly and slowly; very useful for someone who is not a native English speaker, like me.
3 years ago
This video was very helpful! Thanks for sharing your knowledge for free on KZbin!
@shapsgh 3 years ago
Just realized that the values of MD and the chi-square test exactly match the output of AMOS's outlier table. Thanks Dr. Grande
@naftalibendavid 3 years ago
This has proven so helpful again and again! Thanks.
@arnelferaer6486 4 years ago
Dude you're a legend. Thank you for this.
@thomasbarnes5703 3 years ago
Thank you Dr. Grande. I have no background in statistics... yet had to take a course as a portion of my degree requirements. Your videos have really helped me understand this very difficult subject!!
@krunal699 2 years ago
Dr Grande you are a saviour! Thank you!
@ThatFellowOnline 7 years ago
Fabulous video, explained clearly and concisely. I like how you have also shown the importance of labelling data properly and of presentation (decimals, etc.), as this is really important when keeping data organised, i.e. not just focusing on having a tidy output.
@DrGrande 7 years ago
I am glad you found this video useful - thanks for watching.
@efrestein 4 years ago
Your videos add a ton of value!
@voltisathartori6451 6 years ago
Thank you Dr Todd, for such an awesome explanation. It was very beneficial for my study to move on.
@DrGrande 6 years ago
You're welcome!
@oscarespinozaparra6840 8 years ago
Thank you Todd Grande for this extraordinary how-to video. This was a prayer answered, and I feel so much better listening to and following your instructions. I want to express how sincerely grateful I am for the detailed analysis and steps you indicated in this video.
@CoreFocusCoaching 3 years ago
Amazing!! You should do a separate video for the chi-square distribution. Nowhere on KZbin is the second part to the explanation, and because it is not overtly flagged in the title it does not show up. Either way, thank you so much!!
@zarifbaihaqi8538 4 years ago
Thank you very much Dr Todd... you helped me a lot...
@fasamad6730 7 years ago
Wonderful explanation. Enjoyed the session. Thank you Todd Grande, it was a great help.
@DrGrande 7 years ago
You're welcome, thanks for watching.
@lyrahazel2079 4 years ago
OMG thank you, I was so frustrated. My data wouldn't meet the multivariate normality assumption until I stumbled onto this!
@mohammedimam3651 3 years ago
Wooooow! This is extremely useful! Thank you! 👌
@denniscraggs8393 6 years ago
I liked your presentation. SPSS has evolved from the old text script product. I am a current user of both Minitab and Matlab. I am studying the Mahalanobis distance and see that it has many applications. The SAE and ZVEI published a standard where electronics were judged to be fit for use in a temperature x voltage environment defined by a potato shape. However, they never provided a method of dealing with the different unit scale distances. I am thinking the Mahalanobis distance would be a more technically correct means of classifying a component's fitness for use in a temperature x voltage environment.
@payonrayaneh 9 years ago
Very useful... Thanks a lot, Professor Grande.
@alibezzaa809 4 years ago
I really appreciate the effort you are putting into making concepts easy to understand. Do you have a video on transforming a multivariate outlier to a dummy variable?
@ibrahimmkheimer5311 4 years ago
Awesome video, Dr.
@ammaarkidwai2732 4 years ago
Hi Todd! Great video as usual. Why was the cutoff for the probability_MD column .001? Is that the normal cutoff, or is it based on your data?
@hafizahusairi 5 years ago
Thank you!! I understand more after watching your video =)
@jaynastics2 4 years ago
Very helpful video!
@Thejubeabides24 5 years ago
Excellent video!
@thoshsamanthar4815 5 years ago
Dr Todd, the video helped me a lot. I have 2 questions. 1) I have an integrated framework, where analysis is done in 2 stages. Should I check MD for each stage? One of my variables will look like a mediator but it is not. It will be a DV in the first stage and subsequently an IV in the 2nd stage of the analysis. Stage 1 and stage 2 do not have any connection. I have done each test and got different Prob_MD values / outliers to be deleted. 2) Should I include demographic questions as part of the df, as the probability/outlier results are different when I omit or include them?
@muhammadfaisal9918 5 years ago
Thank you Dr. Todd for your awesome work. This is a very useful video. I am wondering if you could mention the reference for this process (or a reference for the significance value - is it Tabachnick & Fidell 2007?). Many thanks
@sebastiankruse4981 2 years ago
Hair et al. (2010) also recommend this process. They suggest dividing MD by the number of predictors and then designating outliers in small samples if these values surpass 2.5, and in large samples if they surpass 4. I think the 2.5 cutoff point corresponds very closely to the .001 p-value used by Dr. Grande.
@Lello991 2 years ago
@sebastiankruse4981 Hi! Could you please provide the full reference for Hair et al. 2010? Is it this one? Hair, J.F., Black, W.C., Babin, B.J., & Anderson, R.E. (2010). Multivariate Data Analysis. Seventh Edition. Prentice Hall, Upper Saddle River, New Jersey.
@sebastiankruse4981 2 years ago
@Lello991 Yes, that's the one.
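The ratio rule attributed to Hair et al. in this thread can be sketched in a few lines of Python. This is only an illustration of the rule as the comment states it: the function name `hair_flags` and the `large_sample` switch are hypothetical, the cutoffs 2.5 and 4 are taken from the comment rather than checked against the book, and where "small" ends and "large" begins is left to the reader.

```python
def hair_flags(md_values, n_predictors, large_sample=False):
    """Flag cases whose MD / number-of-predictors ratio exceeds the cutoff.

    md_values    -- saved Mahalanobis values, one per case (e.g. SPSS MAH_1)
    n_predictors -- number of predictors entered in the regression
    large_sample -- hypothetical switch: use cutoff 4 instead of 2.5
    """
    cutoff = 4.0 if large_sample else 2.5  # cutoffs as quoted in the comment
    return [md / n_predictors > cutoff for md in md_values]


# Example with three predictors and three made-up MD values.
flags_small = hair_flags([6.0, 9.0, 13.0], n_predictors=3)
flags_large = hair_flags([6.0, 9.0, 13.0], n_predictors=3, large_sample=True)
```

For three predictors, the .001 chi-square critical value is about 16.27, which is a ratio of about 5.4, so the ratio rule and the .001 rule will not always flag the same cases.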
@thewaterhub 9 years ago
Thank you, very useful video and clear explanation.
@Elianaco 11 months ago
Hello, thank you for your helpful videos. Quick one: I'm running a moderation with multiple mediators. Are mediator variables independent variables? I'm trying to run the Mahalanobis distance but am unsure if I should add my mediators to the IV box. Thank you
@St0rytell3r 6 years ago
Thanks for the video, very thorough.
@thankyou6555 2 years ago
Thank you! Very helpful.
@herix7342 3 years ago
Great contribution! Is there any reference for the described procedure?
@GeeWhit 8 years ago
Thanks for the great video! Does this method expose two-tailed outliers? If not, how can this be achieved?
@HarerimanaAlexis 6 years ago
Dear Dr Todd, thank you very much for this wonderful video. I have the same question about how you decide on the degrees of freedom, and whether .001 is an absolute rule. Thank you
@felipemcse 8 years ago
Thanks for the video, Todd. Do you have some references that explain why the number of degrees of freedom should be the same as the number of variables?
@jongsuksong7493 8 years ago
Thank you so much for your great explanation! It really helped me a lot!
@DrGrande 8 years ago
I'm glad you found the video useful. Thanks for watching.
@zohalh14 3 years ago
Thanks for the video! Can you use Mahalanobis distance if your IVs are categorical in a mixed ANOVA?
@HughMupfunya 5 years ago
Awesome... Thank you very much
@RichardMcCrory_Neph 7 years ago
+Todd Grande - could I check whether the degrees of freedom for the chi-square distribution is n or n-1? E.g. for 20 variables, is the d.f. 20 or 19?
@chinchinhoh7893 6 years ago
Dr Grande, 1 question. Frequently, the examples of identifying and handling outliers are about independent variables. Does it mean that we don't have to identify and handle the outliers of dependent variables? TQ!
@cecyliaadamczak4301 2 years ago
Hi Dr. Grande, can we include the outcome variable (DV) with the IVs in the Mahalanobis distance analysis?
@polomarco1256 4 years ago
Hi Dr. Todd. Thanks for sharing knowledge. May I ask you something? Can I use Mahalanobis distance to identify multivariate outliers with ordinal data?
@wongjanice7753 8 years ago
Thanks for sharing! I would like to ask a question: if I detected 8 outliers with Mahalanobis distance, is it necessary for me to delete all outliers? Or is 8 outliers out of 200 respondents still in an acceptable range? Is there any reference that mentions this?
@j.a.o.5535 8 years ago
+Wong Janice According to Mead and Craig (2012, Identifying Careless Responses in Survey Data), you may have up to 20 careless responders, especially if you used web-based questionnaires, so I would eliminate those 8 outliers to improve the quality of the data, although it is not always a straightforward rule.
@chinhankim 5 years ago
Dr. Grande, I have two independent variables and three mediation variables of one dependent variable. The question is: should I put in five variables (independent plus mediation variables) to figure out outliers, or should I put in only the two independent variables? Thanks.
@khaledlahlouh6944 4 years ago
Dear Dr. Todd, what should we do when we have a model with many IVs, two mediators and two DVs? Should we consider the mediators as IVs?
@Oz4rmEg 3 years ago
Best vid ever
@frajtervivien 9 years ago
Thank you so much, it was a lifesaver!
@harithfarhan5535 3 years ago
Thanks for this
@kamrannawaz 7 years ago
Thanks, very helpful... I understand why you used 3 as DF; however, please explain what chi-square is.
@nargisali7298 4 years ago
He does not reply, Dr. Nawaz :)
@guitaqui 2 years ago
Perfect!!! Thank you!!!
@ljubomirpupovac2009 8 years ago
Hi Todd. Thanks for the video. Just one question: your main independent variable is program? Shouldn't we compare the MAH_1 value for samples that received treatment and ones that didn't? The thing is, the main independent variable is not used in the analysis, so whatever value I put there, the results (removed cases) will be the same. Regards
@jahanzaibalvi2010 7 months ago
That's great. Thank you so much, sir
@rahimbehrad63 8 years ago
Thanks, dear Todd. Great!
@karimatouati5256 4 years ago
Thank you for this useful video. I have a question, please: what should I do in the case of ordinal variables when checking for these outliers? Which method is the adequate one, Mahalanobis distance or Cook's distance? Does it make sense to apply this method when my data is composed only of ordinal variables and not continuous ones?
@annabelleatkin1884 6 years ago
Would you include control variables as predictors in the regression? And if you're testing a latent interaction in Mplus, do you simply input the observed variables into the regression in SPSS to do this test?
@94bfm 6 years ago
Great explanation! Thank you so much!
@DrGrande 6 years ago
You're welcome!
@KristinColletteScott 6 years ago
Hi Dr. Grande, I've got 7 constructs (3 IVs, 3 intermediary, and 1 DV), each with multiple items. How do you recommend handling these when searching for D2? I also need to test for multivariate normality using the Wald statistic on the same data set. Do you have a video on that?
@evannadhim6631 8 years ago
Todd, thank you so much for this clear explanation, but you've done the identification of multivariate outliers with Mahalanobis distance for the cases. My question: is there any difference if we do it for variables? The variables have their own distributions while they are affected by the outliers.
@henkpiet1908 1 year ago
What do I do if there's a missing value in one of the scales when I use pairwise deletion for my regression? In that case the Mahalanobis distance returns a missing value as well.
@patfennell 7 years ago
Great video - thanks for posting!
@DrGrande 7 years ago
You're welcome - thanks for watching.
@ravindarmadishetty736 7 years ago
Dear Todd, good explanation. Are the outliers we got similar to residual (actual - predicted) outliers to remove from the data?
@marinacuk1400 9 years ago
Thank you for this very helpful video. Can this method be applied to lognormal datasets? Is it necessary for the data to follow a normal distribution?
@prof.thakshilakumari7847 5 years ago
Thank you so much. I followed your video and did the test with my sample. But I have a question on the degrees of freedom: why do you consider it to be 3?
@xunzhou962 8 years ago
Exactly what I need! Thank you!
@kathrinho9136 9 years ago
Hi, I have one question on the method. Hope you can help me :). In your data set, you have your manipulations, described as "program", and then you said that you have your independents named "functioning, severity, motivation". Why do additional metric independents exist in your file? In my data set I have 2 independents but they are on a nominal scale. So, what do I put in the text box of the linear regression where it says "independents"? Thanks in advance!!
@omidmahdieh7882 2 years ago
Hello Dr. Grande. Thanks for your helpful demonstration. Can items be used to calculate Mahalanobis distances? Or should I use variables, I mean composite variables?
@devildman3128 9 years ago
Hi, are there any changes to be made if I find negative values for the probability_MD?
@MrFoganholo 9 years ago
Todd, great explanation! Thanks. One question: why did you use 3 as the degrees of freedom? Why did you use .001 as the reference? Can I use it for any sample? Thanks again.
@DrGrande 9 years ago
+André Foganholo Three degrees of freedom were used because there were three variables in the analysis. Using the probability of .001 is a common practice when identifying multivariate outliers.
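The rule in Dr. Grande's reply (degrees of freedom equal to the number of variables, flag cases whose probability falls below .001) can also be sketched outside SPSS. The following is a minimal standard-library Python sketch under stated assumptions, not the video's SPSS procedure. The function names `chi2_sf`, `solve`, `mahalanobis_sq` and `flag_outliers` are hypothetical. It assumes the distance is the squared Mahalanobis distance from the sample centroid, computed with the n-1 sample covariance matrix, and that there are no missing values.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability P(X > x) for integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = sum(z ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-z) * terms
    # Odd df: start from df = 1 (erfc) and add one term per extra 2 df.
    tail = sum(z ** (i - 0.5) / math.gamma(i + 0.5)
               for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * tail


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of each case from the sample centroid."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # d^2 = c' S^-1 c, computed as c . solve(S, c) to avoid an explicit inverse.
    return [sum(d * s for d, s in zip(c, solve(cov, c))) for c in centered]


def flag_outliers(rows, alpha=0.001):
    """Return (distance, probability, flagged) per case; df = number of variables."""
    df = len(rows[0])
    result = []
    for d2 in mahalanobis_sq(rows):
        p = chi2_sf(d2, df)
        result.append((d2, p, p < alpha))
    return result
```

Each row passed to `flag_outliers` is one case and each column is one of the variables entered as independents, so three variables give three degrees of freedom, matching the reply above.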
@n.einstein6088 8 years ago
+André Foganholo As a reference for the .001 threshold I used Tabachnick, B.G., & Fidell, L.S. (2007). Using Multivariate Statistics (5th ed.). Boston: Pearson (p. 74), according to www-01.ibm.com/support/docview.wss?uid=swg21480128. Just in case anyone needs that.
@wenyuanliu4602 6 years ago
Thanks everyone!
@richguides10 5 years ago
He used 3 because of the number of independent variables. Thank you
@barbaratoson6455 7 years ago
Great video. Could you recommend a method to identify outliers in an RM ANOVA setup? I am looking for something similar to the INFLUENCE option in the SAS MIXED procedure, but for SPSS.
@maheshvykuntam2809 7 years ago
+Todd Grande - Thanks a lot for the great explanation. Could you please help me understand: 1. Will this process work even if we have missing values? 2. Why do we use DF as 'n' and not n-1? Thanks a lot for the help.
@jiffylimborks 11 days ago
Many thanks 🧡
@sskshats6453 8 years ago
Thanks a lot. May Allah bless you.
@ainannur5836 7 years ago
Mr Todd, I have 4 variables: AsliG, AsliB, GreenBP, and BlueBP. I want to know the value of the Mahalanobis distance between (AsliG, AsliB) and (GreenBP, BlueBP). Can I calculate it using Mahalanobis distance in SPSS? Why can't I input 2 variables as dependent and the other two as independent in SPSS?
@wpadilla72 4 years ago
Dear Dr. Grande, my variables are measured on a Likert scale... how should the Mahalanobis test be applied in this case? Thanks
@loversloss101 5 years ago
So what happens when you follow these instructions and every number you get for MAH_1 is the same?
@moeshams4504 4 years ago
Excellent!
@micahgardner7836 3 years ago
What if one of your variables was excluded by SPSS when calculating the Mahalanobis distance? Are the degrees of freedom the same, or would you subtract one? Example: 5 variables entered but one was excluded. Would the degrees of freedom be 5 or 4?
@madiharazzam1098 7 years ago
I have a sample of 300 and 2 predictors. What would be the Mahalanobis distance for it???
@ninab6136 8 years ago
So I guess Mahalanobis can't be calculated when you have missing values somewhere in the items. Any other way I can include those cases?
@rashidsaid-ti3jz 5 years ago
Thank you Dr. Todd for these useful lessons. Please can you mention the reference for the formula which you wrote in Compute Variable: 1-..chi(mahalanobis, df). Thanks a lot
@nahk-lx2tn 5 years ago
rashid said, he is not replying to actual questions. That's sad
@rashidsaid-ti3jz 5 years ago
@nahk-lx2tn Hi Wasim, I found the reference (Hair, 2014).
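For readers asking about the formula itself, its general shape is sketched below. This is the standard textbook definition rather than a quotation from the video, and the symbols are assumptions introduced here: x_i is case i's vector of the k variables entered, x-bar is the sample mean vector, S is the sample covariance matrix, and F is the cumulative distribution function of a chi-square with k degrees of freedom. In SPSS terms this would presumably be written with the saved distance (MAH_1 in this thread) and the `CDF.CHISQ` function.

```latex
D_i^2 = (\mathbf{x}_i - \bar{\mathbf{x}})^{\top} \mathbf{S}^{-1} (\mathbf{x}_i - \bar{\mathbf{x}}),
\qquad
p_i = 1 - F_{\chi^2_k}\!\left(D_i^2\right)
```

A case is then treated as a multivariate outlier when p_i falls below the chosen cutoff, .001 in this thread.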
@jameslebron9412 7 years ago
Dear Todd, nice video clip. I have a question: in your video I think you are using 3 independent variables and 1 dependent variable, so you are actually using 4 variables in total. I guess the degrees of freedom in this case are 4-1 = 3, since you are measuring distance on the 4-dimensional scales.
@farhanselfatan 11 months ago
Thank you, Dr.
@moroomario4007 2 years ago
Sir, if I used a Likert scale, should the DV be the mean score of all the items and the IVs the scores of each item?
@godnkr236 5 years ago
Thanks for this amazing video!
@chriskeran4480 9 years ago
Dr. Grande, thank you kindly. Awesome demonstration. The question I have relates to the number of independent variables (IVs) chosen when calculating a Mahalanobis distance (MD). Should the particular IVs chosen be related in some way, or can you throw all of your numeric variables into the one regression when attempting to find multivariate outliers using MD?
@next_trip_loading 6 years ago
Can we apply ANOVA for a factor with 2 levels? I have seen a lot of studies using 2 levels and testing it with ANOVA. Secondly, I don't know how they check normality when they use a single-item Likert scale. Could you please explain this concept to me?
@oliviasimms3897 3 years ago
Hi, does anyone know why it won't give me output when I add two variables to the 'independents' box? I can get output for them both separately but cannot get one output for them both.
@shafeekafadlikhzamri7068 5 years ago
Hello Dr. Todd. Your video helped a lot and the steps are easily understood, but I seem to have too many outliers. I would like to have your contact details to ask you about this matter.
@desterward 7 years ago
Hi. Is it possible to use it in non-linear multivariate analysis as well? Thanks
@abdulmoeed4661 2 years ago
If we have more than one independent latent variable, mediators and a final dependent variable, how would we place them in the 'Independent & Dependent' variables list boxes while doing this test? Thanks. Waiting for a response.
@sinemkaraoglu1717 2 years ago
Hello, did you find the answer to this question?
@theresiabusagara7909 6 years ago
Great, it was very useful
@DrGrande 6 years ago
Thanks!
@selamawitweldegebriel3421 4 years ago
This was very helpful. How do we contact you? Because I have an urgent problem.
@adrianfajar323 5 years ago
Prof, I have 3 dependent variables and 6 independent variables. How do I get the Mahalanobis distance?
@alexandrafiedler3113 4 years ago
For a CLP analysis (2-wave longitudinal design), do I use the dependent variable at time 1 or time 2? Sorry, but I am confused whether I compute the Mahalanobis distance for the regression term in my CLP model with: dependent variable (t2) regressed ON --> dependent variable (t1), independent variable (t1), moderator (t1). Or won't it matter if I do the Mahalanobis for a simple regression at time 1: Y1 regressed ON --> X1, M1 (and what about my second independent variable? Should I put it into the regression for timepoint 1, too?) I would be very glad if anybody could help me with this confusion!! :D
@priyas8052 9 years ago
What if you get zero as a result for one of the rows?
@karolinakonciute9299 1 year ago
God bless
@sskshats6453 4 years ago
What if we have 5 dependent variables and just one independent??
@zubairawan9088 1 year ago
Why have you selected the p-value to be 0.001?