Find Outliers in SQL

11,315 views

Absent Data

Comments: 24
@drivetrainerYT 2 years ago
Just went thru this channel's uploads. Many hours of binge watching are queued now. "Thanks" 😀 Great videos!
@carpoolify 2 years ago
Great tutorial, many thanks. Would be interested in the box plot method!
@ronenTheBarbarian 2 years ago
Another great lesson, thanks!
@LinaPosada-f1l 1 year ago
Hello! Thank you so much for making this video! How would I find the outliers within a specific group within the overall SELECT? For example, if I am finding outlier numbers in each state.
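One possible approach to the per-group question, sketched in Python with the standard-library sqlite3 module so that it runs without a database server: compute the mean and variance per group in a subquery, join back to the rows, and flag any row whose squared deviation exceeds the squared z-threshold times the group variance. The `orders` table, its `state` and `amount` columns, and the sample values are made up for illustration. Population variance is used because SQLite has no built-in STDDEV; in PostgreSQL the same idea could likely be written with `STDDEV(amount) OVER (PARTITION BY state)`.

```python
import sqlite3

Z_THRESHOLD = 2.576  # assumed cutoff; pick the one that suits your use case

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (state TEXT, amount REAL)")  # hypothetical table

# Made-up sample data: each state has one obviously extreme value.
sample = [("CA", v) for v in (10, 11, 9, 10, 12, 10, 11, 9, 10, 95)]
sample += [("NY", v) for v in (100, 102, 98, 101, 99, 100, 103, 97, 100, 5)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", sample)

# Per-state mean and population variance, joined back to every row.
# Comparing squared values avoids needing a SQRT function.
query = """
SELECT o.state, o.amount
FROM orders AS o
JOIN (
    SELECT state,
           AVG(amount) AS mean,
           AVG(amount * amount) - AVG(amount) * AVG(amount) AS variance
    FROM orders
    GROUP BY state
) AS s ON s.state = o.state
WHERE (o.amount - s.mean) * (o.amount - s.mean) > ? * ? * s.variance
ORDER BY o.state, o.amount
"""
outliers = conn.execute(query, (Z_THRESHOLD, Z_THRESHOLD)).fetchall()
print(outliers)
```

Each row is compared only against the mean and spread of its own state, so a value that is normal in one state can still be flagged in another.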
@linuxbrad 2 years ago
Beautiful explanations! How did you generate your normal distribution bell curve? (Syntax please?) Thank you!!!
@absentdata 2 years ago
If you want to create a bell curve, a great way is to use Python: you can create a KDE plot using the seaborn package!
@AP-dw6nf 2 years ago
Thank you
@--ShivaS 2 years ago
Gr8 video✌️❤️
@zhangmr7955 2 years ago
Good course, but I cannot find "web_data". Would you please provide the link, as you did for the other datasets?
@CorenMare 2 years ago
Good video.
@pawlowski6132 2 years ago
Awesome. I can put this to good use this week. However, how did you calculate the outlier thresholds, plus or minus 2.576? That wasn't clear.
@absentdata 2 years ago
The thresholds are based on the z-score, which tells us what share of the data falls within that many standard deviations of the mean, assuming the data is roughly normally distributed. A z-score of ±2.576 covers about 99% of your data, so anything beyond that score would be an outlier. This is a little tighter than the 3-standard-deviation rule, which covers about 99.7%. You don't need to calculate the threshold: you choose a coverage level such as 90, 95, or 99 percent and look up the z-score that matches it (1.645, 1.96, or 2.576).
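The link between a coverage level and its z-score can be checked with Python's standard library. This is a minimal sketch that assumes a normal distribution and a two-sided cutoff; the helper name `z_threshold` is made up for illustration.

```python
from statistics import NormalDist


def z_threshold(coverage: float) -> float:
    """Return z such that `coverage` of a standard normal lies within +/- z."""
    # A two-sided interval leaves (1 - coverage) / 2 in each tail,
    # so the upper cutoff sits at the (1 + coverage) / 2 quantile.
    return NormalDist().inv_cdf((1 + coverage) / 2)


for coverage in (0.90, 0.95, 0.99):
    print(f"{coverage:.0%} of the data lies within +/- {z_threshold(coverage):.3f}")
```

The three printed values are the familiar 1.645, 1.960, and 2.576 cutoffs.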
@Drasticism-y2s 1 year ago
Hello, newbie here. I would like to know which standard deviation threshold I should base my values on. I have a task where I need to find the outliers in a set of amounts. Should I base it on 2.576, 1.96, or 1.645? Or should I test all three values?
@absentdata 1 year ago
Well, this is up to your business domain and use case. How important is it for you to identify outliers? What impact do these outliers have on your data? What is the distribution of your data? These are all questions you should be asking yourself and your stakeholders. Then you can determine which threshold is best for your case. For example, in the medical domain, 95% (1.96) may be too conservative. There is no one-size-fits-all approach to setting this threshold.
@abcxyzncl 2 years ago
How can I draw a line using PostgreSQL?
@absentdata 2 years ago
Can you explain what you mean by draw a line?
@binu1455 1 year ago
Hey, why is the z-score value taken as 2.57 and not 2 or 3? Could you please clear that up?
@absentdata 1 year ago
These values correspond to the percentile values in the Z table. I chose 2.57 because it corresponds to the 99.49th percentile (one-tailed). The threshold you set for your outliers is an arbitrary choice.
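The 99.49% figure can be reproduced from the standard normal cumulative distribution, which is what a Z table lists. A small sketch using only Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

z = 2.57
one_tailed = NormalDist().cdf(z)   # share of values below +z (the Z-table entry)
two_sided = 2 * one_tailed - 1     # share of values between -z and +z

print(f"below +{z}: {one_tailed:.4f}")
print(f"within +/-{z}: {two_sided:.4f}")
```

Because z-scores are standardized, these percentages depend only on the chosen z value and the normality assumption, not on the particular dataset.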
@binu1455 1 year ago
@absentdata Okay, is this value of 2.57 the same for all datasets, or only for this particular dataset? How do I calculate that value?
@jacquetrahan8481 2 years ago
Why do some use mean and others use median?
@absentdata 2 years ago
The median is going to be less influenced by outliers than the mean.
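A quick illustration of that difference, using made-up numbers and Python's standard library: adding one extreme value moves the mean a lot and the median hardly at all.

```python
from statistics import mean, median

values = [10, 11, 9, 10, 12, 10, 11]   # made-up sample without outliers
with_outlier = values + [95]            # same sample plus one extreme value

print("mean:  ", mean(values), "->", mean(with_outlier))
print("median:", median(values), "->", median(with_outlier))
```

Here the mean roughly doubles while the median shifts by half a unit, which is why median-based methods are often preferred when the data already contains extreme values.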
@philtoa334 2 years ago
Nice.
@ngocbao2436 8 months ago
Thank you
@absentdata 8 months ago
You're welcome