How to Detect and Remove Outliers in the Data | Python

45,974 views

Hackers Realm


Comments: 70
@pankajgoikar4158 2 years ago
You are amazing, bro. I don't have words to thank you. You have cleared up many of my concepts. Lots of love from the UK, and God bless you. 😊
@HackersRealm 2 years ago
Thank you so much for your kind words ❤️
@BLG120 2 months ago
u r indian
@asadnaeem123 5 months ago
Amazing tutorial. Bro, you made my day. Lots of love from Pakistan.
@HackersRealm 5 months ago
Glad to hear that!!!
@YousafSulaiman 5 months ago
You are from Pakistan!! Amazing! 😀
@Hollysattic 13 days ago
Your videos helped me so much! Thanks a lot 🎉
@HackersRealm 12 days ago
Glad it was helpful!!!
@kennylouries410 1 month ago
Thank you, sir. Clear-cut explanation.
@leekhon108 2 months ago
Thank you very much for the tutorial, it is easy to understand and well explained ☺☺
@negusuworku2375 11 months ago
This is very helpful. Excellent.
@HackersRealm 11 months ago
Glad you liked it!!!
@codename_Rahul 11 months ago
This video helped me a lot. Thanks!
@HackersRealm 11 months ago
Glad it was helpful!!!
@durgak2587 3 months ago
Thank you so much ❤️ This was very helpful 🤜✨
@HackersRealm 3 months ago
@durgak2587 glad you liked it!!
@ArniFuentes 5 months ago
Thank you so much! A question: for what types of distributions can the box plot be used? For example, if the data follows a uniform distribution, does it make sense to look for outliers? What do you recommend?
@HackersRealm 5 months ago
You can use a box plot to check for outliers in any distribution. If there are outliers, process them; if not, move on.
@ArniFuentes 5 months ago
@HackersRealm thanks for your answer
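On the uniform-distribution question, a quick check (not from the video) suggests that the 1.5 × IQR rule behind the box plot flags nothing on uniform data, because the fences fall well outside the data's range. A minimal sketch using only the standard library; the helper name `iqr_fences` and the sample are made up for illustration:

```python
import random
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences of the k * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


random.seed(0)  # fixed seed so the run is repeatable
uniform_data = [random.uniform(0, 1) for _ in range(1000)]

lower, upper = iqr_fences(uniform_data)
uniform_outliers = [x for x in uniform_data if x < lower or x > upper]

print(f"fences: ({lower:.2f}, {upper:.2f})")  # roughly (-0.5, 1.5)
print(f"flagged points: {len(uniform_outliers)}")
```

So the box plot can still be drawn for uniform data; it simply has nothing to mark as an outlier.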
@debangshubarua5345 1 year ago
Good video... Do I need to check all the numeric columns one by one and perform the capping operation?
@HackersRealm 1 year ago
You can use a loop to do it for all numeric columns at once...
@DJnaidu22 8 months ago
Really a great explanation
@HackersRealm 8 months ago
Glad you liked it!!!
@titi-cu8dx 1 year ago
What about dealing with categorical columns in the context of outliers?
@HackersRealm 1 year ago
I don't think there will be outliers in categories
@Fighter_Believer_Achiever 7 months ago
Thank you very much sir!!
@vietttt0104 2 years ago
Great tutorial!! Thanks a lot!! I have a question: how could I do it for the whole dataset, not just a single column?
@HackersRealm 2 years ago
You can iterate over the columns and process the whole dataset.
@aniketlode4808 2 years ago
@HackersRealm So to iterate, we would use a for loop, passing each column name as i?
@HackersRealm 2 years ago
@aniketlode4808 yeah
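For readers asking what that loop could look like, here is one possible sketch with pandas. It is not the code from the video: the function name `cap_outliers_iqr` and the toy DataFrame are invented for illustration, and it assumes IQR-based capping is the treatment you want for every numeric column.

```python
import pandas as pd


def cap_outliers_iqr(df, k=1.5):
    """Return a copy of df with every numeric column capped to its IQR fences."""
    capped = df.copy()
    for col in capped.select_dtypes(include="number").columns:
        q1 = capped[col].quantile(0.25)
        q3 = capped[col].quantile(0.75)
        iqr = q3 - q1
        # Values beyond the fences are pulled back to the fence, not dropped.
        capped[col] = capped[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return capped


# Toy data, invented for this example.
wine = pd.DataFrame({
    "residual sugar": [1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 15.0],
    "alcohol": [9.0, 9.2, 9.4, 9.5, 9.6, 9.8, 10.0, 10.2],
    "quality": ["low", "low", "mid", "mid", "mid", "high", "high", "high"],
})

wine_capped = cap_outliers_iqr(wine)
print(wine_capped)
```

Because the function returns a new DataFrame, the original stays available for comparison, and the capped copy is the one to carry into later steps. Non-numeric columns such as `quality` are left untouched.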
@ocraking 6 months ago
What an amazing video
@DJnaidu22 8 months ago
Bruh, I have a doubt, please explain briefly. These three techniques are used for trimming or capping outliers in the dataset. But why don't we use only the z-score to find outliers? What's the difference between these three techniques?
@ShubhamPlays 12 days ago
Good video
@massoudkadivar8758 11 months ago
Thank you so much. I have a question: do we need to do this process for each column one by one?
@HackersRealm 11 months ago
Yes, that's correct. You can use loops to automate this.
@yvkvlogs 2 months ago
Is using any one of the methods enough to remove outliers? 😊
@SylvanAnugrahSyahPutra 4 months ago
Hi. If the data distribution is not normal, is it okay to use the z-score, or should we use IQR?
@HackersRealm 4 months ago
We should use IQR.
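One way to see why the choice matters: the z-score rule leans on the mean and standard deviation, which a long tail pulls upward, while the IQR rule relies on quartiles, which are less affected. The sketch below compares the two rules on right-skewed toy data; the exponential sample, the seed, and the thresholds (3 for the z-score, 1.5 for IQR) are assumptions for illustration, not values taken from the video.

```python
import numpy as np

rng = np.random.default_rng(42)
skewed = rng.exponential(scale=1.0, size=2000)  # right-skewed toy data

# z-score rule: flag points more than 3 standard deviations from the mean.
z_scores = (skewed - skewed.mean()) / skewed.std()
z_flagged = np.abs(z_scores) > 3

# IQR rule: flag points beyond 1.5 * IQR from the quartiles.
q1, q3 = np.percentile(skewed, [25, 75])
iqr = q3 - q1
iqr_flagged = (skewed < q1 - 1.5 * iqr) | (skewed > q3 + 1.5 * iqr)

print("flagged by z-score:", int(z_flagged.sum()))
print("flagged by IQR:    ", int(iqr_flagged.sum()))
```

On this sample the two rules disagree about how many points to flag, which is one reason to look at the distribution before picking a rule.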
@sushmitarawat6438 1 year ago
Too good... and simple. Thanks a lot ☺️🙏🏼
@HackersRealm 1 year ago
Glad you liked it, Sushmita!!!
@sushmitarawat6438 1 year ago
@HackersRealm Could you suggest a paid internship I could start next month?
@HackersRealm 1 year ago
@sushmitarawat6438 For ML-based internships, it's better to compete in hackathons or contests. You could check out HackerEarth, TechGig, etc. for that.
@sushmitarawat6438 1 year ago
@HackersRealm ok
@nishah4058 3 months ago
Can you please tell me what counts as an outlier in textual data such as comments, and how we can remove those outliers?
@HackersRealm 3 months ago
You could use text embeddings and cluster them; anything that is far from the cluster might be an outlier.
@nishah4058 3 months ago
@HackersRealm I'm not following. Can you please elaborate?
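To make the embedding idea concrete, here is a rough standard-library sketch. Loud assumptions: plain word-count vectors stand in for real text embeddings, all comments are treated as a single cluster, and the sample comments are invented. A real pipeline would swap in an embedding model and a proper clustering step.

```python
import math
from collections import Counter

comments = [
    "great video thanks a lot",
    "thanks for the great tutorial",
    "great tutorial thanks a lot",
    "thanks a lot for the video",
    "buy cheap watches online now",
]

# Stand-in for a real embedding: one word-count vector per comment.
vocab = sorted({word for text in comments for word in text.split()})


def embed(text):
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


vectors = [embed(text) for text in comments]
# Centroid = mean vector of all comments (one "cluster" for simplicity).
centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
distances = [cosine_distance(vec, centroid) for vec in vectors]

farthest = max(range(len(comments)), key=distances.__getitem__)
print("most distant comment:", comments[farthest])
# prints: most distant comment: buy cheap watches online now
```

The comment that shares no vocabulary with the rest ends up farthest from the centroid, which is the sense in which it "might be an outlier".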
@adityachoudhari3596 2 years ago
Yo bro, I'm also learning AI and ML concepts. I just need to work on some project or get training in this. Please tell me if you can help.
@HackersRealm 2 years ago
Check the iris dataset analysis project in the playlist to start.
@mohamads9759 8 months ago
Very great.
@HackersRealm 8 months ago
Glad you liked it!!!
@СулейманК-г8ы 4 months ago
Can somebody please explain where the 1.5 in the IQR method comes from? Why exactly 1.5?
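The thread leaves this unanswered. A common explanation ties the factor to the normal distribution: with 1.5, the fences land at about 2.7 standard deviations from the mean, so well under 1% of normally distributed data is flagged. The factor is a convention (usually credited to John Tukey), not a law. A standard-library check of that arithmetic:

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

q1 = std_normal.inv_cdf(0.25)   # about -0.674
q3 = std_normal.inv_cdf(0.75)   # about +0.674
iqr = q3 - q1                   # about 1.349

upper_fence = q3 + 1.5 * iqr    # about 2.698 standard deviations
share_outside = 2 * (1 - std_normal.cdf(upper_fence))

print(f"upper fence: {upper_fence:.3f} sigma")
print(f"share flagged on normal data: {share_outside:.2%}")  # about 0.70%
```

A smaller factor would flag ordinary points too often, and a larger one would let more extreme points pass; 1.5 sits between those on normal data.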
@santoryuu989 2 years ago
What do you think is the best method out of these three?
@HackersRealm 2 years ago
You can use any of the methods, as they produce similar results, but instead of deleting samples, trim them to the range.
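The difference between deleting samples and pulling them back into the range can be sketched with NumPy. This reads the reply as a recommendation to cap values at the fences; the toy array and the IQR fences are assumptions for illustration.

```python
import numpy as np

values = np.array([2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 9.0])  # toy data

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Option 1: delete the samples outside the fences (the sample count shrinks).
removed = values[(values >= lower) & (values <= upper)]

# Option 2: cap them at the fences (every sample is kept).
capped = np.clip(values, lower, upper)

print(len(values), len(removed), len(capped))  # 8 7 8
```

Capping keeps the row count intact, which matters when the other columns of that row still carry useful information.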
@Sachinnani019 1 year ago
8:35 outliers=26
@karthika8610 1 year ago
Which method is the most preferred?
@HackersRealm 1 year ago
It's not about preference; it depends on the use case you're trying to solve.
@madhulikasuman2803 8 months ago
@HackersRealm What if 40% of the data are outliers?
@HackersRealm 8 months ago
@madhulikasuman2803 It depends on the nature of the data. You need to understand the domain and see why this is the case. We could apply a data transformation, such as a log transformation, to change it.
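As a rough illustration of the log-transformation suggestion, the sketch below counts IQR-flagged points before and after taking logs of right-skewed data. The lognormal sample, the seed, and the helper `count_iqr_outliers` are invented for this example; real data may respond differently.

```python
import numpy as np


def count_iqr_outliers(values, k=1.5):
    """Count points outside the k * IQR fences."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return int(((values < q1 - k * iqr) | (values > q3 + k * iqr)).sum())


rng = np.random.default_rng(0)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=2000)  # strictly positive toy data

before = count_iqr_outliers(skewed)
after = count_iqr_outliers(np.log(skewed))  # use np.log1p if zeros are possible

print("flagged before log transform:", before)
print("flagged after log transform: ", after)
```

When a large share of a column is flagged, that usually says more about the shape of the distribution than about bad samples, and a transformation like this is worth trying before any removal or capping.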
@Serene__Soul98 2 years ago
Hi. My dataset has 19 columns, and at least 10 of them show outliers. So do I have to perform this process for every column each time?
@HackersRealm 2 years ago
Yes, it's better to do the process in a loop and fix it for better results.
@avashchand9623 2 years ago
@HackersRealm Can you kindly show this process too? I've searched everywhere and can't find it.
@HackersRealm 2 years ago
@avashchand9623 Which process are you referring to?
@aniketlode4808 2 years ago
@HackersRealm I think he is asking about the process of looping over the columns.
@nihalkausar2215 8 months ago
Please, after I have handled each column's outliers, how do I save the result, and which DataFrame should I continue using?
@Niyati_11 1 year ago
My df is empty while finding the outliers. Any idea why?
@HackersRealm 1 year ago
In which cell did you face the issue?
@vlog_fiast 5 months ago
@ricesweat9951 1 year ago
Why did you decide to use residual sugar as the column for finding outliers? Any tips and tricks on which columns should be checked for outliers in a dataset?
@HackersRealm 1 year ago
We can use a boxplot or violin plot to find the outliers. The dots outside the whiskers can be considered outliers.