How to Identify and Treat Outliers in Stata | Stata Tutorial

38,088 views

The Data Hall

1 day ago

Comments: 57
@Mat-mt8pk 4 years ago
Methods of finding outliers
1:14 #1. Sorting
2:52 #2. Box Plot
6:04 #3. Extremes
10:05 #4. Histogram
10:50 #5. Spike Plot
11:42 #6. Z-score
Treatment
13:07 #1. Keep outliers
13:42 #2. Correct error
14:23 #3. Winsorization
19:06 #4. Trimming
@thedatahall 4 years ago
Thanks for the effort.
@gonout8402 2 years ago
In just 20 minutes you have explained everything that my professor taught me in 2 months, and it is much more understandable and useful. Thank you very much.
@thedatahall 2 years ago
😄
@rouniktalukdar872 2 years ago
Among the nicest video lectures I have come across on this topic. Thanks a lot. Please keep uploading more content on Stata.
@thedatahall 2 years ago
Thanks for the appreciation
@wilsonahinful5127 2 years ago
This is all that I have been looking for, thanks very much indeed
@addisugetahun1441 3 years ago
Thank you for your nice and clear lecture in identifying and treating outliers.
@alphadie2012 3 years ago
Clear and concise explanation. Thank you
@tomaxow 3 years ago
Really well done and explained
@thedatahall 3 years ago
Thanks
@jemalhassen2841 5 months ago
It is a very helpful video. Thank you!
@thedatahall 5 months ago
Thanks. Keep sharing
@jibrilyero2263 11 months ago
Great job 🎉
@danishjunaid1659 3 years ago
Very well explained
@thedatahall 3 years ago
Thanks
@isaacasante4060 2 years ago
Awesome video. Could you please do a similar one using panel data?
@thedatahall 2 years ago
Sure, I will make a video on that.
@shafiqullahyousafzai15 3 years ago
Thanks from Afghanistan
@korneliuslanggason5477 3 years ago
Thank you for the explanation.
@thedatahall 3 years ago
Thanks
@yilebesaddisu5314 3 years ago
Thank you dear, very helpful!!
@lottet1945 3 years ago
Thank you for this clear explanation! Do you have a video on Cook's distance and Mahalanobis distance in Stata by any chance?
@thedatahall 3 years ago
Thanks for watching the video. Unfortunately, I don't currently have a video on this. I will see whether I can add one in the future. If you are interested in SPSS, though, there are videos on KZbin.
@aibannongspung1765 2 years ago
Thank you so much for this insightful video! Suppose I want to trim the top and bottom 0.1% of the distribution. How do I write the command?
@thedatahall 2 years ago
I have never tried it with decimals, but the command will look like: winsor2 variablename, trim cut(0.1 99.9)
@thedatahall 2 years ago
Let me know if it works
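A minimal sketch of the trimming asked about above, assuming winsor2 (a community-contributed command from SSC) is installed, and using Stata's bundled auto dataset as a stand-in for the real data. The second half does the same job with the built-in _pctile command, so it does not depend on how winsor2 handles decimal cut points. The names lo, hi, price_trim and price_trim2 are made up for illustration.

```stata
* Stand-in data: the auto dataset shipped with Stata
sysuse auto, clear

* Option 1: winsor2 (install once with: ssc install winsor2)
* trim sets values outside the cut points to missing;
* suffix() creates a new variable instead of overwriting price
winsor2 price, trim cuts(0.1 99.9) suffix(_trim)

* Option 2: built-in commands only
_pctile price, percentiles(0.1 99.9)
local lo = r(r1)
local hi = r(r2)
generate price_trim2 = price if inrange(price, `lo', `hi')

* Compare the original and the trimmed variables
summarize price price_trim price_trim2
```

In a small sample the 0.1 and 99.9 percentiles can coincide with the minimum and maximum, in which case nothing is trimmed.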
@RafiaAli-n8e 11 months ago
Hi, hope you are doing great. Can you share the link to the video on multivariate outliers? I am not able to find it.
@thedatahall 11 months ago
Thanks for your kind words. Unfortunately, we haven't made a video on multivariate outliers. I will add that to my to-do list.
@RafiaAli-n8e 11 months ago
@thedatahall It would be highly appreciated.
@AhaNYS 3 years ago
Thank you for the video! I have a question: I want to use extremes (from SSC) within subcategories. How can I apply extremes to every subcategory?
@thedatahall 3 years ago
You can try the by prefix: bys category: extremes, followed by the rest of the command as usual.
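For the by-group question above, here is a sketch that uses only built-in commands on the bundled auto dataset, with foreign standing in for the category variable. It lists the three lowest and three highest prices within each group. The extremes line is left as a comment because it needs the SSC package, and it assumes extremes can be combined with the by prefix as the reply suggests.

```stata
sysuse auto, clear

* Sort by group, then by price within each group
sort foreign price

* Under by, _n is the row number within the group and _N the group size,
* so this lists the 3 smallest and 3 largest prices per group
by foreign: list make price if _n <= 3 | _n > _N - 3

* The reply's suggestion, assuming extremes is installed (ssc install extremes):
* bysort foreign: extremes price
```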
@badiahahmed2085 4 years ago
Thank you for your great video. I have a question, please: after winsorization, can I take the logarithm of some variables? Thank you.
@thedatahall 4 years ago
Yes, you can take the log after winsorization. But be advised that after taking the log, the interpretation of the coefficient changes to a percent change. I am soon going to make a video on functional forms, so if you are not familiar with interpreting coefficients after taking the log, that video will help.
@badiahahmed2085 4 years ago
@thedatahall Thank you for your response, that will be great. Many thanks.
@tranglephuong1896 1 year ago
Can you give me the dataset you used in the video?
@thedatahall 1 year ago
Unfortunately, I have misplaced the data and do-file for this specific video.
@shrinjoy1234 3 years ago
How do we use the winsor command if we want to replace outliers with Q3 + 1.5 IQR? Can we use the winsor command to handle outliers in multiple columns in one go? Please advise.
@thedatahall 3 years ago
It is not possible using the winsor or winsor2 command. You will have to write code for it. One way is to create a variable that stores the value of Q3 + 1.5 IQR and then use it to replace values in your main variable.
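One way to write that code, sketched on the bundled auto dataset. It stores the fences in local macros rather than in a variable, and loops over several variables in one go. The lower fence (Q1 - 1.5 IQR) and the _cap suffix are assumptions added for illustration; the question only mentions the upper fence.

```stata
sysuse auto, clear

* Loop over several variables in one go
foreach v of varlist price mpg weight {
    quietly summarize `v', detail
    local iqr   = r(p75) - r(p25)
    local upper = r(p75) + 1.5 * `iqr'
    local lower = r(p25) - 1.5 * `iqr'

    * Keep the original variable; cap a copy at the fences
    generate double `v'_cap = `v'
    replace `v'_cap = `upper' if `v' > `upper' & !missing(`v')
    replace `v'_cap = `lower' if `v' < `lower'
}

summarize price price_cap mpg mpg_cap weight weight_cap
```

The !missing() check matters because Stata treats missing values as larger than any number, so without it missing observations would be set to the upper fence.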
@atiyaabdulkarim716 3 years ago
A quick question: if we use the sort function, will it align the observations of all other variables? For example, if we sort by price but also have variables for age, education and ID number, would sorting by price keep age and education matched to the right ID, or would only the one variable be sorted and not the others? That could create problems, no?
@thedatahall 3 years ago
In Stata, the sort command keeps track of all variables and sorts them together. The whole row moves, not just the price column.
@thedatahall 3 years ago
sort only sorts in ascending order. There is another command, gsort: gsort -price sorts in descending order.
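A quick way to check both points, sketched on the bundled auto dataset: list a few rows after each sort and confirm that make and mpg travel with their price.

```stata
sysuse auto, clear

* Ascending sort: entire rows move, so make and mpg stay with their price
sort price
list make price mpg in 1/5

* Descending sort with gsort; the minus sign reverses the order
gsort -price
list make price mpg in 1/5
```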
@atiyaabdulkarim716 3 years ago
Can you take us through the calculator functions in Stata (the syntax for exponents and complex functions)?
@thedatahall 3 years ago
Sure. Do you want me to make a video on arithmetic and similar functions in Stata?
@atiyaabdulkarim716 3 years ago
@thedatahall Thank you for getting back to me. I am a medical student and I have to use the calculate function in Stata to generate a new variable. My problem is that some components enter as exponents: if you look at the MDRD equation used to define chronic kidney disease, or the CKD-EPI equation, you will see that serum creatinine level and age enter the formula. My specific question is how to do this using variables in my dataset. I tried the exponent function, but my calculations appear to be incorrect, and it seems I am not following the right steps. I would highly appreciate it if you could make a video or give me some feedback.
@thedatahall 3 years ago
Which command did you use? If you used the exp() function, that is for inverting the log. If you email me the equation at info@thedatahall.com, and perhaps some sample data or the command you used, I will look into it. If you want to raise a number to a power, e.g. square it, then use gen newvariable=oldvariable^2
@thedatahall 3 years ago
I searched for the MDRD equation, but I am not sure I found the right one.
@atiyaabdulkarim716 3 years ago
@thedatahall Thank you for getting back to me. Here is the link: patient.info/doctor/estimated-glomerular-filtration-rate-gfr-calculator Normal creatinine values range from 0.6 to 1.2 mg/dl, so one can use values at the higher end, or perhaps an older age, and see what the filtration rate is.
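A sketch of how powers can be written for a formula of this kind. The variable names scr, age and female and the three data rows are made up for illustration. The coefficients are those of the commonly cited four-variable (IDMS-traceable) MDRD equation with the ethnicity factor left out, so check them against the reference you are actually using before relying on the result.

```stata
clear
* Made-up example values: serum creatinine (mg/dl), age (years), female (0/1)
input scr age female
0.6 30 0
1.2 75 1
1.0 50 0
end

* Powers use ^ ; exp(x) is e^x and is not a general power function
generate scr_sq = scr^2

* Four-variable MDRD, ethnicity factor omitted (assumed coefficients)
* Negative exponents are parenthesized to avoid precedence surprises
generate egfr = 175 * scr^(-1.154) * age^(-0.203) * cond(female == 1, 0.742, 1)

list
```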
@alfinasintiya7477 3 years ago
I cannot use "extremes". Is there a solution?
@thedatahall 3 years ago
I just used extremes and it is working fine for me. What error are you getting?