Dealing with an outlier - Winsorize

Views: 60,977

how2stats

Comments: 24
@JakeSlater3 8 years ago
Excellent video, thank you. I also hope this catches on more over time--we are throwing away a lot of extreme yet valid data.
@alejandrodominguezrodrigue8431 6 years ago
Thanks for the video. I also wondered why the plus 1; it would be nice if you explained why. However, reading the scientific literature, I find that there are other studies that do this. As I understand it, it makes sense: if the extreme values are replaced by the highest remaining value, the modified extreme value and the original high value will look the same, so you can no longer tell from your data how many extreme values were changed. With the +1 option you can still see this in the analysis. I recommend the following article: Wainer H. Robust statistics: A survey and some prescriptions. Journal of Educational Statistics 1976;1:285-312
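Editor's sketch of the point made in the comment above, in Python. It is not taken from the video: the function name `winsorize_upper`, the `cutoff` and `offset` parameters, and the scores are all hypothetical, chosen only to compare replacement by the next highest value with replacement by that value plus 1.

```python
def winsorize_upper(scores, cutoff, offset=0):
    """Replace every score above `cutoff` with the highest score at or
    below `cutoff`, plus `offset` (hypothetical helper, for illustration)."""
    kept = [s for s in scores if s <= cutoff]
    if not kept:
        raise ValueError("no scores at or below the cutoff")
    replacement = max(kept) + offset
    return [s if s <= cutoff else replacement for s in scores]


# Hypothetical data: 19 is treated as the outlier, 8 is the next highest score.
scores = [3, 4, 5, 5, 6, 7, 8, 19]

plain = winsorize_upper(scores, cutoff=10)               # 19 becomes 8
plus_one = winsorize_upper(scores, cutoff=10, offset=1)  # 19 becomes 9

print(plain)
print(plus_one)
```

In the plain version the altered score is indistinguishable from the genuine 8, whereas with `offset=1` it remains the unique maximum, so the number of altered scores can still be read off the data.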
@pooyax61 6 years ago
Actually, in Winsorizing you are altering a "percentile" of your data, not always the outliers flagged by the 3 IQR rule. For instance, you may have 10 data points and want a 1st Winsorization. That means only "one" extreme number will be changed, whether or not that number is an outlier. Or you may have more than one outlier, but a 1st Winsorization replaces only the largest one. It is a bit tricky :) In this video, the outlier and the Winsorized number were the same, but as I mentioned, you may have more outliers.
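Editor's sketch of the count-based idea described in the comment above, in Python. The function name `winsorize_counts`, its parameters, and the data are hypothetical; this is one plausible reading of the comment, not the procedure shown in the video.

```python
def winsorize_counts(values, k_low=0, k_high=0):
    """Replace the k_low smallest values with the next smallest remaining
    value, and the k_high largest values with the next largest remaining
    value. The original order of `values` is preserved."""
    n = len(values)
    if k_low < 0 or k_high < 0 or k_low + k_high >= n:
        raise ValueError("k_low and k_high must be non-negative and "
                         "leave at least one value unchanged")
    # Indices of `values` sorted from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    low_fill = values[order[k_low]]
    high_fill = values[order[n - 1 - k_high]]
    result = list(values)
    for i in order[:k_low]:
        result[i] = low_fill
    for i in order[n - k_high:]:
        result[i] = high_fill
    return result


# Hypothetical data: 10 points, with two large values at the top end.
data = [2, 4, 5, 5, 6, 6, 7, 8, 30, 45]

one_high = winsorize_counts(data, k_high=1)
one_each_end = winsorize_counts(data, k_low=1, k_high=1)

print(one_high)
print(one_each_end)
```

With `k_high=1` only the 45 is replaced (by 30), and the 30 is left untouched even though it might also be flagged as an outlier. That is the "tricky" part: the number of altered values is fixed in advance rather than decided by an outlier rule.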
@Crystal29121997 4 years ago
@pooyax61 Hi! Do you have any papers that I can cite for this? I would like to use this method :)
@nicolasupton7965 8 years ago
Hi, thanks for the great video! One question though, do you have any references to support winsorizing the outlier value to the outer percentile +/-1 rather than just changing it to the value itself? In other words, are there any papers I could reference that would support changing the value of 19 to 9 rather than 8? Thanks for any help!
@ShynyMagikarp 5 years ago
Still wondering this. I have never heard of this before.
@aazeerahamed8028 4 years ago
It is really useful! Expecting more from you.
@sarahhatcher6781 6 years ago
Do you have a reference for the Winsorize +1 method described in this video?
@dianamintal8480 4 years ago
Hi Sarah - Although I cannot give you a reference or publication for this method, I can confirm that I studied exactly the same method for winsorizing (the next highest/lowest data point value +/-1) in my statistics course in Hungary.
@sandorattilapabar6773 3 years ago
@dianamintal8480 That's reassuring :D I came up with this method on my own, and it's good to see that it is used.
@Some1rishGuy 6 years ago
Really helpful video. I'm just wondering how you would report any winsorizing that you have done to the dataset?
@sarahkrahenbuhl5654 7 years ago
Really clear and informative - thanks
@baharehghanbarikondori1965 3 years ago
Really helpful, thank you
@TR-rx5ej 2 years ago
Is it possible to highlight extreme outliers (when there are too many) in the data set (Data View) of SPSS in order to remove them?
@Isuppose12 4 years ago
Thank you so much! Now I just need to find a reference for winsorize +1
@d.l.563 7 years ago
Very interesting video! It was very useful!! Can you give me the page number of the text shown in the video, and the references for Dixon, 1960 and 1957? I would like to cite them. Thanks a lot!!
@stephensanjaya6548 4 years ago
How do you winsorize at the 1st and 99th percentiles?
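Editor's sketch of one possible answer to the question above, in Python, using only the standard library. The function name `winsorize_percentiles` and the data are hypothetical. Percentile definitions differ between packages (the "inclusive" method of `statistics.quantiles` is assumed here), so the cutoffs may not match SPSS, Stata, or R exactly.

```python
import statistics


def winsorize_percentiles(values, lower_pct=1, upper_pct=99):
    """Clamp values below the lower percentile up to it, and values above
    the upper percentile down to it. Percentiles are whole numbers 1..99."""
    if not (1 <= lower_pct < upper_pct <= 99):
        raise ValueError("need 1 <= lower_pct < upper_pct <= 99")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical data: the integers 0..100.
data = list(range(101))
winsorized = winsorize_percentiles(data)

print(winsorized[:3], winsorized[-3:])
```

Here the values are clamped to the percentile cutoffs themselves, which is a different choice from the "next value +/- 1" replacement discussed elsewhere in this thread.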
@Isuppose12 4 years ago
I have a question: after some digging, it appears that Winsorization should be applied symmetrically to both ends. So what happens if I only have one outlier at one end but not the other? Would it still be a valid approach to modify a score at one end only?
@how2stats 4 years ago
I have never seen it suggested in a peer-reviewed publication to Winsorize on both sides of the distribution. If you have one or more outliers on only one side, then deal only with those outliers on that one side. That's my recommendation.
@PRO_PRASHANT 4 years ago
There are other techniques you can use in that case besides winsorization.
@naeemchak 5 years ago
Thanks a lot. However, the link in the book doesn't open this video. Perhaps you need to edit the link in the PDF.
@how2stats 5 years ago
The links in my how2statsbook are hosted by my how2statsbook channel: kzbin.info/door/low2dWW7y1uO4kK4SxwsjA
@johnwilson9391 8 years ago
I have measured the response times each participant took to respond to 24 items. However, only the times of each participant's correct responses were considered, which leaves me with a different number of response times for each participant. Some participants took more time than one would expect to answer a question, which appeared to be related to how difficult the task was for them. Thus, I do not want to eliminate the outliers (I used z-scores and identified results 3 SD above the mean as outliers). I was thinking about winsorizing the outliers, but I have many doubts about how to do it. 1. Does one always have to winsorize both extreme ends? In my case, only results above the mean were considered outliers. 2. How does one conduct the winsorization in a uniform way for all participants when they have different numbers of response times to be analysed? Whilst one participant may have given 20 correct responses, another may have answered only 13 or 7 questions correctly. How can I specify that I will winsorize only 10% of the data, for example, when the number of responses differs so much? In the end, all I want to use from this task is the mean of each participant's response times. Thank you.
@Oncus2 5 years ago
But why is it better to add +1 instead of just transforming it to the next highest value?