Just went thru this channel's uploads. Many hours of binge watching are queued now. "Thanks" 😀 Great videos!
@carpoolify 2 years ago
Great tutorial, many thanks. Would be interested in the box plot method!
@ronenTheBarbarian 2 years ago
Another great lesson, thanks!
@LinaPosada-f1l 1 year ago
hello! thank you so much for making this video! how would I find the outliers within a specific group within the overall select? for example if I am finding outlier numbers in each state etc.
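One possible way to approach the per-group question above is to compute the mean and standard deviation separately for each group and score every value against its own group's statistics. The sketch below is only an illustration in plain Python: the function name `group_outliers`, the `(group, value)` row layout, and the sample numbers are all hypothetical, and it assumes the population standard deviation and the 2.576 cutoff discussed elsewhere in this thread.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_outliers(rows, threshold=2.576):
    """Return (group, value, z) for values whose |z|, computed within
    their own group, exceeds the threshold. Hypothetical helper."""
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)

    flagged = []
    for group, values in by_group.items():
        mu = mean(values)
        sigma = pstdev(values)  # population standard deviation of this group
        if sigma == 0:
            continue  # all values identical, so no z-score can be computed
        for value in values:
            z = (value - mu) / sigma
            if abs(z) > threshold:
                flagged.append((group, value, z))
    return flagged


# Made-up sample data: two states with very different typical values.
rows = [("TX", 10)] * 9 + [("TX", 100)]
rows += [("CA", v) for v in (1000, 1010, 990, 1005, 995,
                             1000, 1010, 990, 1005, 995)]

flagged = group_outliers(rows)
print(flagged)
```

In SQL the same idea is usually expressed with window functions, for example an average and standard deviation computed `OVER (PARTITION BY state)`, so that each row is compared with its own state rather than with the whole table.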
@linuxbrad 2 years ago
Beautiful explanations! How did you generate your normal distribution bell curve? (Syntax please?) Thank you!!!
@absentdata 2 years ago
If you want to create a bell curve, a great way is to use Python. You can create a KDE plot using the seaborn package!
@AP-dw6nf 2 years ago
Thank you
@--ShivaS 2 years ago
Gr8 video✌️❤️
@zhangmr7955 2 years ago
Good course, but I cannot find "web_data". Would you please provide the link, as you did for the other datasets?
@CorenMare 2 years ago
Good video.
@pawlowski6132 2 years ago
Awesome. I can put this to good use this week. However, how did you calculate the outlier thresholds? Plus or minus 2.576? That wasn't clear.
@absentdata 2 years ago
The thresholds are based on the Z-score, which tells us how much of the data falls within that many standard deviations of the mean. For normally distributed data, a Z-score of ±2.576 covers about 99% of the values, so anything outside that range would be an outlier. Note that 2.576 means roughly 2.58 standard deviations from the mean, not 3. You don't need to calculate the threshold from your data. You pick a coverage level such as 90, 95, or 99 percent and look up the Z-score that matches it (about 1.645, 1.96, and 2.576 respectively).
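As a rough illustration of how a coverage level maps to a Z-score cutoff, the sketch below uses `NormalDist.inv_cdf` from Python's standard library. The helper name `z_threshold` is made up for this example, and it assumes a two-sided cutoff on a standard normal distribution, meaning the chosen share of the data sits in the middle and the rest is split evenly between the two tails.

```python
from statistics import NormalDist


def z_threshold(coverage):
    """Two-sided cutoff: the central `coverage` fraction of a standard
    normal distribution lies within +/- the returned z. Hypothetical helper."""
    tail = (1 - coverage) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


for coverage in (0.90, 0.95, 0.99):
    print(f"{coverage:.0%} -> z = {z_threshold(coverage):.3f}")
# 90% -> z = 1.645
# 95% -> z = 1.960
# 99% -> z = 2.576
```

The cutoff depends only on the coverage level you choose, not on the dataset, which is why the same 1.645, 1.96, and 2.576 values keep coming up.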
@Drasticism-y2s 1 year ago
Hello, newbie here. I would like to know which threshold I should base my values on. I have a task where I need to find the outliers in a set of amounts. Should I base it on 2.576, 1.96, or 1.645? Or should I test it on all three of these values?
@absentdata 1 year ago
Well, this is up to your business domain and use case. How important is it for you to identify outliers? What impact do these outliers have on your data? What is the distribution of your data? These are all questions you should be asking yourself and your stakeholders. Then you can determine what threshold is best for your case. For example, in the medical domain, 95% (1.96) may be too conservative. There is no one-size-fits-all approach to setting this threshold.
@abcxyzncl 2 years ago
How can I draw a line using PostgreSQL?
@absentdata 2 years ago
Can you explain what you mean by draw a line?
@binu1455 1 year ago
Hey, why is the z-score value taken as 2.57 and not 2 or 3? Could you please clear that up?
@absentdata 1 year ago
These values correspond to the percentile values on the Z table. I chose 2.57 because it corresponds to the 99.49th percentile. This is an arbitrary way to set the threshold for your outliers.
@binu1455 1 year ago
@absentdata Okay, is this value of 2.57 the same for all datasets, or only for this particular dataset? How do you calculate that value?
@jacquetrahan8481 2 years ago
Why do some use mean and others use median?
@absentdata 2 years ago
Median is going to be less influenced by outliers than mean.
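A tiny illustration of that point, using made-up numbers and Python's standard `statistics` module: adding one extreme value moves the mean a long way while the median stays put.

```python
from statistics import mean, median

# Made-up sample values, then the same values plus one extreme point.
values = [10, 12, 11, 13, 12]
with_outlier = values + [500]

print("mean:  ", mean(values), "->", mean(with_outlier))
print("median:", median(values), "->", median(with_outlier))
```

Here the mean jumps from 11.6 to 93, while the median remains 12. This is why median-based methods are often preferred when the data may already contain outliers.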