A Two Step Transformation to Normality in SPSS

137,013 views

Gary Templeton

This video shows how to transform continuous variables toward normality in SPSS. This approach retains the original series mean and standard deviation to improve the interpretation of results. Use of this method can be justified by citing the following published paper:
Templeton, G. F. (2011). A Two-Step Approach for Transforming Continuous Variables to Normal: Implications and Recommendations for IS Research. Communications of the Association for Information Systems, 28, Article 4. doi.org/10.177...

Comments: 268
@mikemmoon 9 years ago
Wow! I have been trying every transformation under the sun for several of my variables for 2 straight weeks with no luck. This is like magic. Now I just might finish my PhD dissertation by the end of the summer after all. Many thanks!
@abigailfulton9185 5 years ago
For anyone who doesn't know you use the series mean and standard deviation that he uses in the video. IT WORKS AND HE SAVED MY LIFE!
@chukwuemekaemenekwe746 9 years ago
Gary your tutorial just saved my day. Been struggling with different transformation techniques. Seeing yours just brightened my day. Danke!!!
@oscaronam7862 8 years ago
Thanks a lot Gary, I had been struggling to normalize my skewed data but when I used the two steps in your paper and video that you explain clearly, my data is now normal - confirmed by Kolmogorov-Smirnov and Shapiro-Wilk tests. Very helpful video!
@gftempleton 8 years ago
+Oscar Onam That's great, Oscar. Good luck.
@aristrolltle8580 2 years ago
Sometimes the KS and SW tests reject the normality hypothesis even though the skewness and kurtosis values are OK.
@gftempleton 8 years ago
People often ask why their sample size is reduced by 1 when using this technique. It happens because the values produced by the first step range from 1/n to 1, and every value must be a proper fraction (strictly between 0 and 1) for step 2 to work, so step 2 skips the 1 (associated with the biggest value). To fix this, replace the 1 with 1-(1/n) before running step 2. For example, if you start with a sample of 1,000, the Two-Step will likely result in a sample of 999. To recover the missing record, find it (it's the case with the value "1" after the first step) and replace the 1 with 1-(1/1000), or 1-.001, or .999. This won't change results much, but will ensure every case is used. Of course, I'd put a small note in the paper about any transformation step needed.
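The two steps and the 1-(1/n) replacement described in this reply can be sketched in Python as follows. This is only an illustration, not the author's SPSS procedure: the function names and the sample values are made up, ties are assumed to share their mean rank, and the standard library's `NormalDist.inv_cdf` stands in for `IDF.NORMAL`.

```python
from statistics import NormalDist, mean, stdev

def fractional_ranks(values):
    """Step 1: rank / n, with tied values sharing their mean rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared / n
        i = j + 1
    return ranks

def two_step(values):
    """Step 2: inverse normal CDF, imposing the original mean and sample SD."""
    n = len(values)
    target = NormalDist(mean(values), stdev(values))
    # A fractional rank of exactly 1 has no finite inverse, so use 1 - 1/n.
    probs = [1 - 1 / n if p >= 1 else p for p in fractional_ranks(values)]
    return [target.inv_cdf(p) for p in probs]

skewed = [1, 2, 2, 3, 5, 8, 13, 40, 95, 300]  # made-up, right-skewed sample
normalized = two_step(skewed)
```

One consequence worth noting: when the second-largest value is not tied, 1-(1/n) equals its fractional rank, so the two largest cases end up with the same transformed value.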
@Marie-sh6zm 7 years ago
Hi, I would like to ask: how many times am I allowed to normalize the same data? Thanks in advance!
@gftempleton 7 years ago
In my opinion, there are no rules as long as you report exactly what you have done. Let the reviewers or advisors help you.
@amadeo3844 6 years ago
Is there an automated way to do this replacement? I have over one hundred columns of ranked data and each one has a value of "1.00". Do I need to go in manually, find the 1.00 in each column, and change it to .999?
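Outside SPSS, the replacement asked about here is easy to automate. The sketch below is a hypothetical Python illustration, not an SPSS feature: the column names and values are invented, and `None` is assumed to mark a missing case. It replaces any fractional rank of 1 with 1-(1/n) in every column, where n is that column's number of non-missing cases.

```python
def cap_top_rank(column):
    """Replace a fractional rank of 1 with 1 - 1/n (n = non-missing cases)."""
    n = sum(1 for p in column if p is not None)
    return [None if p is None else (1 - 1 / n if p >= 1 else p) for p in column]

# Hypothetical Step 1 output: one list of fractional ranks per variable.
ranked = {
    "RVar1": [0.25, 0.5, 0.75, 1.0],
    "RVar2": [0.2, 0.4, None, 0.6, 0.8, 1.0],
}

# Apply the same fix to every column in one pass.
fixed = {name: cap_top_rank(col) for name, col in ranked.items()}
```

Because n is computed per column, columns with different amounts of missing data each get their own replacement value.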
@arturogarcialomeli5745 3 years ago
You saved my day, thanks a lot. I think that if you want to obtain the mean and the standard deviation, you need to process your data before applying this method. You are going to be cited in my thesis!
@asmaae1993 3 years ago
Hi Mr. Arturo, do you have any idea how we can transform the data back from this form when we want to report the results? Thank you.
@gftempleton 9 years ago
When using this technique in research, it may help in the peer review process to cite the published article referenced at the end of the video: Templeton, G.F. 2011. "A Two-Step Approach for Transforming Continuous Variables to Normal: Implications and Recommendations for IS Research," Communications of the AIS, Vol. 28, Article 4.
@charlottekik6475 8 years ago
Thank you so much!
@gftempleton 8 years ago
You're welcome, Charlotte!
@melissacagle3707 5 years ago
Hi Gary! Should I transform my variables like this before conducting a Principal Component Analysis in order to form an index? Thank you for your video!
@veronicawong9023 3 years ago
Thank you so much... It's 2021 now and your video saved my life!
@davine1301 3 years ago
@veronicawong9023 Where are you from?
@khalidrehman4202 6 years ago
Thank you so much. This video is very informative. My data were not normally distributed, but after watching this video I applied this procedure and now my data are normal.
@shahmahmood3908 9 years ago
My question is where the values for the second and third question marks (?) come from. You didn't show where you got the series mean and the standard deviation that you copied and pasted from Notepad. I would really appreciate your help with this. IDF.NORMAL(RDistanc,?,?)
@norwegianresearchtraininginsti 4 years ago
Did you find the answer to your question? If so, please share it with me. He mentioned that he copied them from Notepad; that is what I heard.
@juliabachmann639 4 years ago
@norwegianresearchtraininginsti You can find it in his paper. It says: To accomplish Step 2 in Excel, use the NORMINV() function, having the following syntax:
NORMINV(Step 1 result, imposed mean, imposed standard deviation)
where
Step 1 result = the result of Step 1, which must be in probability form
Imposed mean = mean of the variable resulting from the transformation (!!)
Imposed standard deviation = standard deviation of the resulting variable (!!)
@norwegianresearchtraininginsti 4 years ago
@juliabachmann639 I will check his paper. I am interested in that method.
@doancongthanh93 4 years ago
@juliabachmann639 I think it's the mean of the variable that has been transformed. You can see these values in the histogram chart at 2:33.
@juliabachmann639 3 years ago
@doancongthanh93 Ahh okay, I see. Thank you, that's very helpful!!
@marhan2757 9 years ago
I guess, the interpretation does not change much because of this transformation, i.e. st. deviations and means stay the same, while kurtosis and skewness significantly improve. Also, this technique solves the problem with outliers (that are actually not). thanks a lot for such a great solution!
@alexjimenez9940 3 years ago
What a great solution! Thank you very much Gary for your help!
@aaljumaili 4 years ago
This video has great value. Thank you so much, Gary for saving my day
@ellieking4132 5 years ago
I was really struggling to work out how to make my data normally distributed in order to do my analysis and this video has saved me. Thank you so much for taking the time to share this method with us, and answer our queries! I really appreciate it :)
@riderho1 5 years ago
Ellie King, where did he get the second and third values (,?,?)?
@ellieking4132 5 years ago
@riderho1 My understanding (and what I used) is that the second and third values are the mean and SD of the variable that you are transforming.
@learningwithms8293 2 years ago
Thanks a lot, Dr. Templeton. It is really helpful. I used the process you described and found it useful not only for me but also for my entire department.
@jjabb 4 years ago
Thank you. This saved my life. I had been struggling with numerous transformations, but none worked. This one worked for me. I also used the 1-(1/n). You get a citation from me.
@FooodConfusion 3 years ago
Wow, so easy. Loved your way of explaining: simple and to the point.
@jaimeb.384 7 years ago
What mean and standard deviation are you using? It is not clear in the video.
@tarignassr6690 6 years ago
Hi, you didn't show where you got the series mean and the standard deviation that you copied and pasted from Notepad.
@dashama 9 years ago
Loved your video! Blessings and Love, Dashama
@smrutimokal7452 4 years ago
Thanks Gary for the wonderful video and the article. I always have trouble when normalising data, since transformations like log don't usually work, but this is great. Simply wonderful. Thank you again.
@gftempleton 4 years ago
I'm glad it helped, Smruti.
@gftempleton 4 years ago
Awesome to hear that it worked for you, Smruti. Good luck on your research.
@kanya1998 8 years ago
much appreciated Mr. Gary, it works perfectly well!
@gftempleton 8 years ago
+mahirwe anthony Great and good luck!
@kamalpreetrakhra8071 9 years ago
Hello Dr. Templeton, I am a PhD student and have found that using the technique mentioned greatly improves the skewness and kurtosis values. However, the data is still not normally distributed. I have also tried log10 and natural log transformations. Is there anything else that I can use? I don't have the option of dichotomizing the data. Could I please have a copy of your article for further details?
@tsedesiree 2 years ago
Hi Kamalpreet, I got the same problem too. I applied Dr. Templeton's technique but my data is still skewed. Did you figure out how to transform the data into normal distribution? I want to perform a 3way anova so can't just use KW test. Appreciate if you could get back, thanks:)
@duckhunterforex7577 2 years ago
@tsedesiree Same here. I hope you have already found the answer and will share it with us all. Thank you in advance.
@10VGomez 2 years ago
Hi, I'm also interested in normalizing a variable. I have used ln(x), log10(x), 1/x, sqrt(x) and this method, but nothing works. I have heard about the Johnson transformation method. I haven't tried it yet, but it is said to work almost always since it finds an optimal function that normalizes your data. Let me try it and I will tell you. If somebody knows how to use this method in SPSS, please share the info =)
@fatosakbulut3171 4 years ago
Hi Mr. Templeton. I have 8 groups of data to analyze. Some of them are normal and some of them are not. Should I apply your method to all of the groups in order to compare them in a one-way ANOVA? Please, please help...
@MrFantastic161 8 years ago
Thank you so much for this! I'm currently doing my dissertation and the non-normal data kind of shot me in the foot for the proposed analytical methods. Much appreciated!
@sasali6727 7 years ago
Wow, thanks Gary. This is a great method. I used log10 and square root transformations to normalize the distribution of my data. Neither worked, and my data was still negatively skewed. I used this two-step transformation and it worked great. At first, I had a hard time finding the mean and SD to use. After looking at the referenced paper, I found out that you have two choices: you can either use 0 as the mean and 1 as the SD for the arguments of the function (e.g. IDF.NORMAL(new ranked series,0,1)) or use the mean and SD of the original series to maintain the units of the data. When I used the original mean and SD of the series, it worked just fine and the data looked normal. However, when I replaced the parameters with 0 and 1, SPSS returned no value. Not sure why. I also noticed that my sample size didn't reduce by 1 after the transformation. My sample size is small (50). Are you supposed to lose one case after the transformation? Am I doing something wrong if I don't see this?
@gftempleton 7 years ago
So you went from standardized normal (0 and 1 parameters) to normalized (original mean and sd) and back to standardized? Did you save variables with the same names? That may be the problem. I've used the technique en masse and have never heard of that. It seems like you need to make sure you use unique variable names. Low sample size would help retain all records. I don't know what the threshold is. Do you know what the fix is if it does become a problem? Replace the max value (1) in the results of Step 1 with 1-(1/n).
@sasali6727 7 years ago
Given the problems with SPSS, I actually ended up using the Excel formula you provided in the paper, which I found much easier for data with many variables (33 in my case). It worked just fine and all my variables except one are normally distributed now (that non-normal variable looked normal to me, but the Shapiro-Wilk test was significant after all). I was able to retain my whole sample (50) using the Excel formula. What do you think about that? I also went ahead and ran outlier testing with a g value of 2.2. To my surprise, most variables have at least two outliers. That's surprising, as I replaced all outliers with the mean from the original data before running the transformation in Excel. A g factor of 2.2 is usually considered large enough to retain almost all data. It's very interesting to see such a pattern. But I still trust this data with outliers more than my original non-normally distributed data, unless I am missing something significant here.
@gftempleton 7 years ago
I perform the Two-Step in Excel often - there is no difference as far as I can tell - the same two steps are readily available. I would never replace an outlier with the mean! You're changing the data and may be suppressing or hiding results. If your data is sufficiently normal, don't worry about outliers. And, your data doesn't have to be perfectly normal. If it is terribly non-normal and you use non-parametrics, your data isn't normal anyway. For example, if you use Spearman's rank, the data is transformed to uniform (not normal).
@sasali6727 7 years ago
Thanks for taking the time and providing the great pointers. The scale of my data is continuous. I don't remember where I saw it, but replacing outliers with the mean was described as a reliable method. Unfortunately, my data is pretty skewed, with a skewness value of .9 or something like that. My sample size is also pretty small (45), so not losing even one case is important. What I will do is put all the outliers back in place and run everything again. That's a pain, but I am curious to see the difference.
@sasali6727 7 years ago
Gary, I ran the procedure several times in both SPSS and Excel using the same data set, and the outputs are inconsistent. I'm not sure what might cause the difference. I double-checked the formulas against your paper. Here are the Excel formulas.
To get the percent rank:
=IF(B4="","",IF(PERCENTRANK(B$2:B$50,B4)=1,0.9999,IF(PERCENTRANK(B$2:B$50,B4)=0,0.0001,PERCENTRANK(B$2:B$50,B4))))
To get the inverse of the cumulative normal distribution:
=IF(B115="","",NORMINV(B115,0,1))
Running the procedure on the data set with outliers replaced by the mean and on the original data produces some significant changes, so replacing outliers with means doesn't look like a reliable method. Now I am thinking of Winsorizing my original data. Do you have any recommendation so that I don't miss a single outlier? My data is both hugely negatively skewed and has outliers, which makes it hard to figure out the best approach. I am also considering robust statistics given my data. Any thoughts on that? Huge thanks.
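One possible source of the SPSS/Excel mismatch reported here, offered only as a guess, is that the two Step 1 definitions differ. The Python sketch below assumes SPSS's fractional rank is rank/n and that Excel's PERCENTRANK behaves like (rank-1)/(n-1) for untied values; the data are made up, and the 0.0001 and 0.9999 caps are the ones from the comment's formula.

```python
from statistics import NormalDist

data = [3.1, 4.7, 5.0, 6.2, 9.8]  # made-up sample with no ties
n = len(data)
ranks = [sorted(data).index(x) + 1 for x in data]  # 1-based ranks

fractional = [r / n for r in ranks]               # 0.2 ... 1.0
percentrank = [(r - 1) / (n - 1) for r in ranks]  # 0.0 ... 1.0

z = NormalDist()  # standard normal, as in NORMINV(p, 0, 1)
# Fractional rank with the top value capped at 1 - 1/n.
z_fractional = [z.inv_cdf(min(p, 1 - 1 / n)) for p in fractional]
# Percent rank with the comment's caps of 0.0001 and 0.9999.
z_percentrank = [z.inv_cdf(min(max(p, 0.0001), 0.9999)) for p in percentrank]
```

Under these assumptions the two routes assign different probabilities to every case, not only to the extremes, so some disagreement between the outputs would be expected.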
@saro4761 7 years ago
Thanks Gary for this absolutely great video.
@franciscosanchez-narvaez9474 6 years ago
Thank you Gary, your tutorial is very clear and helpful.
@gftempleton 6 years ago
Thanks, Francisco!
@aishasebunya2675 4 years ago
Gary, thank you so much. This is awesome. All other methods failed for my work. I really appreciate this and will of course cite you :)
@gftempleton 4 years ago
I'm glad it worked for you, Aisha. Good luck on your research.
@alice-nckucsielee8265 3 years ago
@gftempleton Thank you so much, but a lot of people are asking about the mystery of the true mean and SD XD
@gftempleton 3 years ago
@alice-nckucsielee8265 I'm not sure I understand your question. Units are interpreted as "normalized x." I hope that helps.
@alice-nckucsielee8265 3 years ago
@gftempleton Hi Gary, thanks for replying. I meant the mean and the standard deviation we have to put into the expression at 1:38. What should we put there: the original mean and SD, or the mean and SD after the step 1 transformation? :)
@l.briant3537 6 years ago
Great video Gary, thank you. Just like below, some of my variables produce an "out of range" error:
>At least one of the arguments to the IDF.NORMAL function is out of range. The
>first argument (probability) must be positive and less than one. The third
>argument must be positive. The result has been set to the system-missing
>value.
Why is this the case?
@sepiahell1417 2 years ago
Same :( Can anyone help, please?
@lanuit9733 4 years ago
Thank you so much for this video!!! You saved my life! Thank you. Thanks again!
@gftempleton 4 years ago
Glad to help - good luck!
@mostafajerari7560 2 years ago
Thank you for your effort. I would like to know how to achieve normality of several variables at once (not one by one). Thanks again.
@mohammedkhalid9799 6 years ago
THANK YOU, Mr. Gary
@NoeWanKenobi 7 years ago
Thank you so much for your easy and helpful explanation! You really saved my life (and thesis, which are the same thing right now) :P
@oliviapenaramirez4379 4 years ago
Thanks Gary. But every time I run the transformation, this error appears:
>At least one of the arguments to the IDF.NORMAL function is out of range. The
>first argument (probability) must be positive and less than one. The third
>argument must be positive. The result has been set to the system-missing
>value.
What should I do? Where do I copy the mean and the standard deviation from? Thanks in advance.
@gftempleton 4 years ago
Step 2 will not work on 0's or 1's. If the problem is a 0, convert it to 1/n as an estimate. If the problem is a 1, convert it using 1-(1/n) to estimate.
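The rule in this reply can be illustrated with a small Python sketch. It is an analogy rather than SPSS itself: `safe_probability` is a hypothetical helper, and the standard library's `NormalDist.inv_cdf` plays the role of `IDF.NORMAL`, rejecting probabilities of exactly 0 or 1 in much the same way.

```python
from statistics import NormalDist, StatisticsError

def safe_probability(p, n):
    """Move 0 to 1/n and 1 to 1 - 1/n; leave proper fractions alone."""
    if p <= 0:
        return 1 / n
    if p >= 1:
        return 1 - 1 / n
    return p

z = NormalDist()  # standard normal
try:
    z.inv_cdf(1.0)  # no finite value has cumulative probability 1
    failed = False
except StatisticsError:
    failed = True

n = 1000
repaired = z.inv_cdf(safe_probability(1.0, n))  # inverse CDF at 0.999
```

With n = 1000 the repaired top case lands a little above three standard deviations from the mean, instead of becoming a missing value.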
@Afra.Rezagholizadeh 4 years ago
I did all the steps several times, using my own data series' mean and standard deviation, his numbers, and 0,1, but every time this error appears:
>At least one of the arguments to the IDF.NORMAL function is out of range. The
>first argument (probability) must be positive and less than one. The third
>argument must be positive. The result has been set to the system-missing
>value.
What should I do? When I used 0,1 and my own numbers, a new series of normalized data appeared despite this error, but I don't know whether it is reliable...
@gftempleton 4 years ago
Did you use three arguments? The syntax for the second step requires 1) the result of step 1 (this is in fractional or probability form), 2) mean, 3) standard deviation.
@Afra.Rezagholizadeh 4 years ago
@gftempleton Thanks for your answer. Yes, I did. For both data series this error came up, but new columns were also added to my SPSS worksheet! I'm going to use them, but considering these errors I don't know how reliable they are...
@madiharazzam1098 7 years ago
This is not for Likert scales. How do I transform Likert scale data? Please help.
@abigailfulton9185 5 years ago
Where do you get the series mean from and the standard deviation? Please can anyone help!
@gftempleton 5 years ago
Calculate the mean (average) and standard deviation from the original data. Use those in the second step if you want to approximate original units.
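For readers asking where the two numbers come from, the reply above can be sketched in Python. The values are made up, the sample (n-1) standard deviation is assumed to be the one intended, and `NormalDist.inv_cdf` is used as a stand-in for `IDF.NORMAL`.

```python
from statistics import NormalDist, mean, stdev

original = [12.0, 15.5, 9.0, 30.0, 11.5, 14.0]  # made-up original data

# These two numbers are the second and third arguments of step 2.
m = mean(original)
sd = stdev(original)  # sample SD (n - 1 denominator)

# Comparable in spirit to IDF.NORMAL(p, m, sd) for a fractional rank p = 0.5.
middle = NormalDist(m, sd).inv_cdf(0.5)
```

A fractional rank of 0.5 maps back to the original mean, which is what keeps the transformed variable in roughly the original units.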
@andrifadillahmartin8074 9 years ago
Thanks Gary.. Very much appreciated
@rraj3167 4 months ago
This is great!!
@linyuliao3417 6 years ago
Thank you for sharing this video. Can I ask a question? How is the first step related to the second step?
@OmisileKehindeOlugbenga 7 years ago
Thanks a lot for saving my day. But you did not mention initially that I would need a 1-(1/n) transformation before the final inversion. Thanks all the same.
@monicaalas4421 5 years ago
Once we have normality, how can I run a regression with the original data (taking into account the normalized data) so that I can use it in my predictive model?
@riderho1 5 years ago
Monica Alas, hi Monica, would you mind sharing how you obtained the predictive model, and the criteria taken into consideration, like the correlation matrix, etc.?
@elijahd.spragueph.d8905 9 months ago
Can these steps be used after taking ordinal questions and converting them to scale in SPSS?
@TheEmanuelDaniel 4 years ago
Thank you! Very useful and clearly explained.
@ker329 1 year ago
Hi Gary. I followed all the steps but I got warning 4940: at least one of the arguments to the IDF.NORMAL function is out of range! Would you know why?
@ellieseager589 2 years ago
You just saved me. Thank you!
@pauls1571 7 years ago
Hi Gary. First, thanks for your informative video. I was dealing with a few very non-normal distributions, and this method worked wonderfully in normalizing the data. That said, I have one question for you, the answer to which I cannot seem to figure out. Namely, in the video description, you note that, "This approach retains the original series mean and standard deviation to improve the interpretation of results." However, I have not found this to be the case. Although the means and SDs for the transformed variables are quite similar to the original series' means and SDs, they are not perfectly retained. At least, this was true in cases where a value of 1 was generated after completing the first (fractional rank-order) step, even when I used the formula you mentioned in your response to a comment below to replace the 1 (i.e., replace the 1 with 1-(1/n)). Any clarification here would be much appreciated. Thanks again for your informative video.
@gftempleton 7 years ago
Another reason they aren't exactly the original mean and standard deviation is inflated frequencies (stacks of the same value) that are some distance from the mean. If there were no "same values" in the dataset, the resulting mean and standard deviation would be the exact mean and standard deviation of the original set. The approach "tries" to do that at least.
@gftempleton 7 years ago
1-(1/n) is a close approximation that allows researchers to avoid losing a record. Consider it part of the procedure, just like the first two steps. You may be right that it causes the mean and standard deviation parameters to vary slightly. I don't think it would affect interpretations much. Sample size is a bigger issue in a lot of cases. This should be up to the researcher to decide.
@linduchyable 8 years ago
I saw this video before and it was helpful. I followed your steps and posted the results to you in my first comment. I also posted the results on your friend James Gaskin's channel, and he recommended this video to me. Please help, I don't know what to do :(
@amalhussein9960 4 years ago
Thanks Gary Templeton for this informative video. After doing the two steps, how can we interpret the output of the regression analysis?
@gftempleton 4 years ago
Not original units, but normalized units. For example, if you transform assets to normal and put it in an equation, it is interpreted as normalized assets. It's the same as with any transformation.
@haziziesa4534 2 years ago
Dr. Templeton, where do the mean and SD come from? Do you get them from the original (non-normal) data?
@gftempleton 2 years ago
Yes - both the mean and SD come from original data.
@mahanesti3990 5 years ago
Hi Mr. Templeton, thank you for your transformation to normality method. What should we call this transformation method?
@gershomhabile7215 2 years ago
Very useful information, but I'm getting lost at the point where you copy the mean and standard deviation, so I'm stuck. Kindly explain where to copy the mean and the standard deviation from. You only mentioned that you copied them from your Notepad, but where do I copy them from? I'm stuck, someone help asap please.
@franck-paulinpehnmayo7148 2 years ago
It's from the original data
@fatimabezbiq4212 1 year ago
Do we transform only the dependent variable, or all variables of our model?
@gftempleton 1 year ago
There is no rule when you are trying to satisfy the assumptions of the test. Just be sure to report all procedures.
@fatimabezbiq4212 1 year ago
@gftempleton Thank you
@porscheboddicker1443 9 years ago
Why do we need to do 2 steps? Can't I just use the fractional rank? For example, my BMI variable was skewed and we wanted to do a GEE with that. Can't I just use the fractional rank as my new BMI?
@DrJolly 9 years ago
Thanks, extremely helpful!
@ensarifadi457 6 years ago
Dear Gary, can you please tell me what the implications are of using this technique on a Likert scale? For instance, I have used a Likert scale on which 1 is strongly disagree and 7 is strongly agree. Does it invert the relation, or what? Thanks
@naghmesheibani3877 2 years ago
awesome awesome awesome
@ibrahimsaid28 6 years ago
Thank you for posting it. Which method is better for normalizing data? And what if all methods (log, ln, sqrt, trunc) fail to normalize my data?
@rameshprasad2341 3 years ago
Respected sir, kindly explain how to fill in the values of the 2nd and 3rd question marks (?).
@lllv1989 8 years ago
This is an amazing method. I'm wondering if there's an added value to winsorizing or otherwise capping variables before the transformation. Some of my variables are clinical variables for which a case or two are extreme outliers, and they are also non-normal. Using the means and standard deviations for these variables seems a little weird to me because the Ms and SDs before winsorizing don't seem within the range of values usually seen in my patient population. If I winsorize before transforming, the Ms and SDs seem a little more representative... Am I completely off here?
@zeric_raiz 7 years ago
Thank you very much for such an instructional article and for the follow-up video. I've been able to normalize my data following your method but still have a doubt. I have a non-normally distributed variable "SOM" (metric) with values for the years 2002 and 2012 (a nominal variable named SAMPLE takes the value '1' for samples collected in 2002 and '2' for 2012). I'm now able to 'globally' normalize my SOM variable with the 'Two-Step Transformation', BUT when I run ANALYZE --> DESCRIPTIVE STATISTICS --> EXPLORE with a split file or a 'Factor List' by the variable SAMPLE and re-run the normality tests, I find that only the 2012 samples are normally distributed; the 2002 samples are not, and I don't know how to resolve this. I'm describing this particular case, but I also need to split my variable SOM even further (i.e. "date collected AND soil type" or "date collected AND soil type AND cultivation system", etc.). Or is this a non-issue because the 'global variable SOM' is now normally distributed? I keep running into this issue and simply cannot find the answer. If you find the time to enlighten me on the issue it will help a lot. Thanks anyway for such a great transformation.
@j-m.s.6646 6 years ago
Would using this method after the fact be considered a linear-linear regression, a log-log or what? Also, would transforming the variables post-processing back to say a scale from one to ten be considered good practice in terms of easing interpretation?
@fizzaabidi.3094 11 months ago
Everything is OK, but after using this method it gives me an outlier. What should I do?
@afaquehussain7678 4 years ago
Q1. I have 3 dependent variables. Two of them are in the normal range for skewness (+1 to -1) and kurtosis (+3 to -3), but the third is not in the normal range for skewness or kurtosis. I want to apply a square root transformation to that variable so I can run parametric tests. Can I transform that one variable only and run parametric tests, or should I transform all three variables, even though two of them are already normally distributed? Will it create problems to transform only the one non-normal variable? Q2. Can I assess normality on the basis of skewness and kurtosis only, rather than running the Shapiro-Wilk test?
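The skewness and kurtosis screening mentioned in this question can be computed by hand. The sketch below is a hedged Python illustration with made-up data: it uses the plain moment formulas for skewness and excess kurtosis, whereas SPSS reports small-sample-adjusted versions, so the numbers will not match SPSS output exactly.

```python
def skew_kurtosis(values):
    """Moment-based skewness and excess kurtosis (no small-sample correction)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - m) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

symmetric = [1, 2, 3, 4, 5]          # made-up symmetric sample
right_skewed = [1, 1, 2, 2, 3, 20]   # made-up sample with a long right tail

skew_sym, kurt_sym = skew_kurtosis(symmetric)
skew_right, kurt_right = skew_kurtosis(right_skewed)
```

Whether cut-offs such as the ranges quoted in the question are acceptable in place of a formal test is a judgment call for the researcher and reviewers; the code only shows how the two statistics are obtained.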
@jayaprakashsalian1804 6 years ago
How do we get the values back from the transformed data? That is, after I perform the transformation I run a regression using the normalized values; after the result, I need to know how to get the actual data back from the transformed data. For a logarithmic transform we use the base to get the value back. How do we do it here?
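Because the Two-Step is rank-based, there is no closed-form inverse comparable to exponentiating a log. One possible workaround, which is an assumption of this note and not something stated in the video or this thread, is to reverse the two steps empirically: turn a normalized value back into a probability, then look up the original value with the nearest rank. A minimal Python sketch with made-up data:

```python
from statistics import NormalDist, mean, stdev

original = sorted([1, 2, 3, 5, 8, 13, 40, 95, 300])  # made-up, no ties
n = len(original)
target = NormalDist(mean(original), stdev(original))  # imposed mean and SD

def back_to_original(x):
    """Map a normalized-scale value to the nearest-ranked original value."""
    p = target.cdf(x)                    # undo step 2: value -> probability
    rank = min(max(round(p * n), 1), n)  # undo step 1: probability -> rank
    return original[rank - 1]

forward = target.inv_cdf(3 / n)  # normalized score of the 3rd-ranked case
```

This only recovers values that occur in the sample; predictions that fall between ranks would need interpolation, which is left out here.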
@aminfarzaneh8142 8 years ago
Thank you so much for the method. I cannot normalize my data. Could I have the data set you used?
@aishwarypawar7728 5 years ago
Hello, I have 4 non-normal variables in my dataset. Do I need to perform these steps individually for each of the 4 non-normal variables, or is there another method?
@hidaghasemi6538 1 year ago
Very helpful, thank you.
@eda1976bdy 8 years ago
Yes, it is a great explanation, but I have tried it on my own data and was unable to get normality. I've also used log10 and sqrt. The results are still the same: a few changes, but no change in normality. What should I do? Please advise. Thank you.
@schummanr 8 years ago
Thanks for the video and the reference, Prof Templeton. When computing the Fractional Rank of some of my variables I end up having a value = 1 (the highest value on that variable), which then creates an "out of range" error in the IDF.NORMAL function, as the range of values it accepts is 0 to less than 1. This does not happen with all variables, just some. Any hints as to why this happens and how to address it in this transformation to normality would be appreciated. Thanks
@amadeo3844 6 years ago
I too would like to know how to correct this problem, as this results in missingness in the data that I would like to avoid.
@l.briant3537 6 years ago
Hi Fidel Vila, I think I've worked it out. I think there must be some rounding errors, which means that the probability (first argument) ends up being interpreted as being out of range (it has to be between 0 and 1). I'm not sure how this happens (perhaps the calculations for the mean and SD need to be to more significant places, but I've tried this and it doesn't remove the error), but I have worked out a fix which is a bit of a "fudge". Say you have a variable X to be normalised, with mean MEAN and standard deviation SD. Let's suppose you have conducted the fractional rank and made a variable RX. You then do the following:
COMPUTE X_norm=IDF.NORMAL(RX/1.001,MEAN,SD).
EXECUTE.
Dividing RX by 1.001 ensures that the variable is kept within the allowed range. (Although I repeat: I am not sure why it is interpreted as being out of range. As far as I can see, my variables all fall between 0 and 1, so it must be to do with rounding errors in the mean and SD calculations.) Hope this helps!
@ellieking4132 5 years ago
@@l.briant3537 Thank you SO SO much for this! I was genuinely despairing about the method not working, and your solution worked perfectly!!!
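For readers following this thread outside SPSS, the two steps and the workaround above can be sketched in Python. This is a minimal illustration, not the author's implementation: `fractional_rank`, `two_step`, and the `shrink` parameter are hypothetical names, and the fractional rank is assumed to be rank divided by n, with ties sharing the mean rank. Under that assumption the largest case gets a fractional rank of exactly 1, which the inverse normal function rejects, so the out-of-range error may not need rounding to explain it.

```python
from statistics import NormalDist, StatisticsError, mean, stdev


def fractional_rank(values):
    """Step 1 (assumed definition): rank / n, ties share the mean rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of cases tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return [r / n for r in ranks]


def two_step(values, shrink=1.001):
    """Step 2: inverse normal CDF using the original mean and SD.

    Dividing by `shrink` mirrors the RX/1.001 workaround from the comment
    above; it keeps every probability strictly below 1.
    """
    dist = NormalDist(mean(values), stdev(values))
    return [dist.inv_cdf(p / shrink) for p in fractional_rank(values)]


data = [1, 2, 2, 10, 50]          # made-up, right-skewed sample
print(fractional_rank(data))      # the last value is exactly 1.0

try:
    NormalDist(0, 1).inv_cdf(1.0)  # a probability of 1 is rejected
except StatisticsError as err:
    print("inverse CDF failed:", err)

print(two_step(data))             # all five cases get a transformed value
```

The divisor 1.001 shifts every fractional rank slightly, not just the top one, so it is a pragmatic workaround rather than a neutral correction.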
@sameeral-abdi6870 9 years ago
Thanks, Gary. It is just awesome. I have one inquiry: how can I transform back from step 2 to the original data? For example, after I did the two steps I got a mean of -0.007. So how am I going to report that?
@farhaniqbal3421 4 years ago
Thanks, Gary. One question about the Shapiro-Wilk test: once I transformed my variables, all improved in terms of skewness and kurtosis. However, the Shapiro-Wilk test still shows a non-normal distribution (p
@gftempleton 4 years ago
Feel free to try other transformations (e.g., natural log, power transformations, truncating, winsorizing). However, that is time consuming. If you can find a statistical package that uses Box-Cox, which tests many different power options, that may be a good use of time. However, reviewers of your work could also tolerate that you attempted a normality transformation that improved the situation. Worst case, you'll have to use non-parametric procedures (which, coincidentally, utilize transformations, usually ranking).
@Belcebub69 7 years ago
Thanks, Gary, what you have done is very helpful. Are you aware of any critical peer review or papers out there regarding this method of yours? Thanks for the answer.
@gftempleton 7 years ago
This paper has been peer reviewed. It will be published in print in early August: aaajournals.org/doi/abs/10.2308/isys-51510?code=aaan-site
@pungozeng3860 8 years ago
Thank you. Could you confirm that the series mean and std. dev. are those of the original variable (i.e., the market cap)?
@radina3737 6 years ago
First of all, thank you very much for this approach; it saves me a lot of time and effort. My question is: I have a dependent variable measuring "click intention", which can range from 0 to 100. After normalizing the data, however, I get 3 negative results and 2 above 100. Is it acceptable to keep it this way? Thank you very much!
@user-xk4ho7ph3o 1 year ago
Hello, I followed the steps above. One of the fractional rank values was 1, and it became missing data (no value shown) after the transformation. I don't know how to solve the problem 😅 Looking forward to your reply!
@user-xk4ho7ph3o 1 year ago
I found the answer in a previous comment! Thanks!
@289993 8 years ago
By series mean and standard deviation, do you mean the mean and SD of our original variable or of the fractional rank? Sorry if this is a silly question! I am a bit lost and am running out of time. Thank you!
@gftempleton 8 years ago
Original variable - this will retain the original units of analysis. You can do that or use 0 for the mean and 1 for the standard deviation. This will standardize all your variables so they have equal weighting.
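The two choices of arguments can be compared with a short Python sketch. Everything numeric here is made up for illustration: the fractional ranks, `orig_mean`, and `orig_sd` are hypothetical, and `NormalDist.inv_cdf` from the Python standard library stands in for SPSS's IDF.NORMAL. Under those assumptions, the two results differ only by a linear rescaling, so the shape of the transformed distribution is the same either way.

```python
from statistics import NormalDist

# Hypothetical fractional ranks from step 1 (all strictly between 0 and 1).
frac_ranks = [0.1, 0.25, 0.5, 0.75, 0.9]

# Hypothetical mean and SD of the original, untransformed variable.
orig_mean, orig_sd = 120.0, 35.0

# Option 1: original mean and SD, so results stay in the original units.
in_original_units = [NormalDist(orig_mean, orig_sd).inv_cdf(p) for p in frac_ranks]

# Option 2: mean 0 and SD 1, so results are standardized (z-scores).
as_z_scores = [NormalDist(0, 1).inv_cdf(p) for p in frac_ranks]

# Each option-1 value is just orig_mean + orig_sd * (option-2 value).
for x, z in zip(in_original_units, as_z_scores):
    print(round(x, 3), round(z, 3), round(orig_mean + orig_sd * z, 3))
```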
@genniebaello1712 5 years ago
@@gftempleton Hi! Would it greatly affect ANOVA if I use the original units or just 0 and 1 for the mean and SD?
@minaorang5094 5 years ago
Thank you for the video and your paper! I used the two-step method for my non-normal data, and all turned into normal distributions! My only remaining concern is whether I am allowed to use this method for my data, which are based on 4- and 5-point Likert scales. I read in your article that this method is mostly for scales with more levels (up to 100). I would appreciate it if you could tell me whether I can use this method for 4- and 5-point Likert scales or not. Thanks in advance!
@fatihaelagri7753 3 years ago
Hello. My question is: I have a series whose distribution does not follow the normal distribution. I tried the logarithmic transformation in Eviews, but the p-value of the Jarque-Bera test is always lower than 0.05. So what transformation should I do in Eviews?
@gftempleton 3 years ago
Eviews has each step. The first is a fractional rank (rank represented in proportions) and the second is a normal inverse function. It appears "Normal (Gaussian)" is showing you this here: www.eviews.com/help/helpintro.html#page/content/mathapp-Statistical_Distribution_Functions.html
@mosun6390 3 years ago
Great one!!!
@diananazaryan 9 years ago
Thank you soooooooo much! Particularly for the decent reference. Couldn't download the article, though, any other sites I can take it from?
@MrLunadecancer 7 years ago
Sorry, but when I try to apply the formula 1-(1/n), the result is the same or minus another value.
@risausa4796 2 years ago
Hi Gary! Thanks for this video. Where did you get the value for the MEAN and STANDARD DEVIATION?
@gftempleton 2 years ago
Two options: 1) the original variable mean and standard deviation or 2) 0 for mean, 1 for standard deviation (z-scores).
@oumelkhirmoulay1416 1 year ago
Thank you very much for this video. Please, I have a question about the nature of this transformation. I want to write a sentence describing the transformation method, like "the data were arcsin transformed". If I use this method (RV.Normal), what should I say?
@gftempleton 1 year ago
Just say it was transformed to normal using a two-step procedure described in Templeton (2011). The full reference is at the end of the video.
@fong9615 9 years ago
Wow, nice technique! Thanks! I have one question. Can I use this on data that are already normal? I am using a paired-samples t test, and I think it should be applied to both sets of data for an equal comparison. One more question: is this technique good for observations? It seems so perfect. Thanks!
@spz145 9 years ago
Hi, thanks for sharing. I tried the method, and it works to normalize the dataset. However, why is the sample size reduced after the procedure? For example, why was the sample size reduced from 6843 to 6842 above? Would that affect the conclusion?
@alauddinmohammad1517 6 years ago
Please give us a reasonable answer as to why the sample size decreases after the two-step process. Why does the missing value appear? How can we explain this problem in a research paper? Thank you in advance.
@HucheshBudihal 8 years ago
Thank you, sir, for this video. I referred to your paper, but I am facing a small problem. I did the procedure in Excel, in SPSS, and manually (percentile rank = 1-(rank of xi/n)), as given in your article, and I got different values in step one and step two with each of the methods you mention in the article. I think the values should be the same, but they are not. Please respond; it would help me a lot.
@abhigilli 5 years ago
Thanks a ton for this! This was of great help! I had a quick question, and I would be really grateful if you could help me out with this. Tried this on 7 of my variables. 4 of them got transformed, but 3 of them still haven't. Does this mean that they cannot be transformed or is there another way to do this? The sample size is 50, and I am using the Q-Q plot and Shapiro Wilk to test for normality. Thanks in advance, Any help would be greatly appreciated. Thanks again!
@greensea6398 4 days ago
Thanks
@AmrArafat 4 years ago
Can you perform this method in stata, please? Thanks
@steven-el3sw 4 years ago
Thank you for this video Gary, it was very helpful. Just to clarify: we are using the mean and standard deviation from the original, non-normal data...correct?
@gftempleton 4 years ago
That is one option; the other is to standardize, using mean=0 and sd=1
@rockfortpete 8 years ago
What do you mean by series main?
@gftempleton 8 years ago
+pete entjade Mean=average - the average of all values in a variable
@Davao420 6 years ago
the original Marketvalue or the fractionally ranked one?
@SHSATNewYork 5 years ago
This two-step process has three major issues. One: by using 'Function group', 'Inverse DF', and data from Notepad such as the 'series mean and SD', you transformed the data. Can we use 0 and 1 as the series mean and SD, as you claim in your paper? Two: after transforming, you will reduce the sample size by 1. That means if you have five Three: we have another major issue with this transformation. Now you can do statistical tests on the transformed data, and here is the big problem. Reporting the mean and standard deviation in the 'transformed unit' is not the purpose of almost any research. How do you back-transform your result after this 'two-step transformation' to explain the results in the original data? If you cannot back-transform the results, this is not acceptable in research. Please explain how to deal with all three of these issues, thank you.
@gftempleton 5 years ago
The three items you list are in no way "major issues": 1) simply standardizes the data. Another option is to use the original series mean and standard deviation; 2) there is a simple fix: impute the missing value using 1-(1/n); 3) this is an issue with every transformation, including the natural log and probabilities; the Two-Step is the only approach that allows the researcher to use the original mean and standard deviation as arguments so the result will emulate the original units.
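The 1-(1/n) fix mentioned in point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `impute_top_rank` is a hypothetical helper name, the fractional ranks are made up, and `NormalDist.inv_cdf` from the standard library stands in for SPSS's IDF.NORMAL.

```python
from statistics import NormalDist


def impute_top_rank(frac_ranks):
    """Replace any fractional rank equal to 1 with 1 - (1/n)."""
    n = len(frac_ranks)
    return [1 - 1 / n if p >= 1 else p for p in frac_ranks]


# Made-up step-1 output for n = 5; the largest case has rank 1.0.
frac_ranks = [0.2, 0.4, 0.6, 0.8, 1.0]

fixed = impute_top_rank(frac_ranks)

# Step 2 now succeeds for every case, so no record is lost.
z_scores = [NormalDist(0, 1).inv_cdf(p) for p in fixed]
print(fixed)
print(z_scores)
```

One thing this sketch makes visible: with untied data, 1-(1/n) is also the fractional rank of the second-largest case, so the top two cases end up with the same transformed value.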
@linduchyable 8 years ago
Hello, I have a problem I need help with. Is the process of removing outliers from a variable more than once considered manipulating or changing the data? I have a variable for loans to the public: mean .17093, st. dev. .955838, skewness 7.571, kurtosis 61.436. Most of the cases of this loan variable are outliers. After several rounds of ranking and replacing the missing values with the mean, I reach this output: mean .2970, st. dev. .22582, skewness 2.301, kurtosis 3.885, and it ends up positively skewed. I don't know what to do. Shall I keep it this way, take the very first one, or continue, knowing that the 5th, 10th, 25th, 50th, and 75th percentiles all end up with the same number, .2072? And I still have to do the regression. Please help :(
@carloschaveznava7882 9 years ago
Great video and explanation...make more!!! If I have questions can I write you an email?
@norwegianresearchtraininginsti 4 years ago
My question is the same as Shah's from 4 years ago: where did you get the values you entered for the series mean and standard deviation?
@franck-paulinpehnmayo7148 2 years ago
From the original data.... Mean and SD
@researchory 8 years ago
Thanks, Gary. How do I interpret coefficients after converting the dependent variable using the IDF.normal function? For example, for a one-unit increase in the independent variable, how is the dependent variable affected? Thanks,
@gftempleton 8 years ago
Assuming you transform using the series mean and standard deviation, interpret exactly the same as you would original units. I would note that you normalized the original units. Alternatively, you can transform using mean=0 and sd=1 and interpret as standardized normal original units.
@chenlin920 7 years ago
Very helpful video!
@RusiantiSugio 6 years ago
Thank you for your nice video
@gftempleton 6 years ago
You are welcome, Rusianti!