The A to Z of Missing Value Treatment | Data Preprocessing in Python | Data Science

  550 views

Six Sigma Pro SMART

Comments: 7
@meetuchandra7688 1 year ago
Thank you for this in-depth tutorial. Your complete playlist is awesome and helped clear so many of my concepts.
@prosmartanalytics 1 year ago
Thank you! Glad it helped!
@rachitmakhija9703 1 year ago
Thank you for this hands-on tutorial!
@janaosama6010 11 months ago
How do we know whether the variables are independent of or dependent on each other, so that we can choose the right method to fill in missing values? Are there methods we can use to find that out, and if so, what are they? Thanks in advance.
@prosmartanalytics 11 months ago
Generally, the features captured for a given context tend to have some dependence; that is why some models, such as Naive Bayes, are called naive: they assume the features are independent. In practice, therefore, algorithmic approaches tend to do better than the typical mean- and median-based treatments. Many of the preprocessing steps are connected, so you may refer to our playlist titled EDA and data preparation. Hopefully most of your queries will be answered there.
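To illustrate the point about dependent features, here is a minimal sketch comparing mean imputation with one algorithmic imputer. It assumes scikit-learn is installed and uses `KNNImputer` purely as an example of an algorithmic approach; the data is synthetic and the variable names are hypothetical, not taken from the video.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(0)

# Synthetic, illustrative data: x2 depends on x1 (roughly x2 = 2 * x1).
x1 = rng.uniform(0, 10, size=200)
x2 = 2 * x1 + rng.normal(0, 0.5, size=200)
X_true = np.column_stack([x1, x2])

# Hide about 20% of the x2 values completely at random.
X_missing = X_true.copy()
mask = rng.random(200) < 0.2
X_missing[mask, 1] = np.nan

# Mean imputation ignores x1; KNN imputation borrows x2 from rows with similar x1.
mean_filled = SimpleImputer(strategy="mean").fit_transform(X_missing)
knn_filled = KNNImputer(n_neighbors=5).fit_transform(X_missing)

def rmse(filled):
    """Root mean squared error on the values that were hidden."""
    return float(np.sqrt(np.mean((filled[mask, 1] - X_true[mask, 1]) ** 2)))

mean_rmse = rmse(mean_filled)
knn_rmse = rmse(knn_filled)
print(f"mean imputation RMSE: {mean_rmse:.2f}")
print(f"KNN imputation RMSE:  {knn_rmse:.2f}")
```

When the features really are dependent, as in this toy setup, the neighbor-based fill should land much closer to the hidden values than the column mean does. With independent features the gap would largely disappear.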
@ryymtareq5027 11 months ago
@prosmartanalytics Can we calculate correlation to find out whether the features are dependent or independent?
@prosmartanalytics 11 months ago
@ryymtareq5027 Pearson's correlation works for numerical variables that are linearly associated, but it looks at only two variables at a time, pairing all possible combinations. Alternatively, we can look at VIF (variance inflation factor), because it checks each feature against the entire independent feature space. Likewise, we can apply the chi-square test of independence for categorical variables. Please note that these checks are not prerequisites for applying algorithmic imputers.
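The three checks mentioned in this reply can be sketched roughly as follows. This is only an illustration, assuming NumPy and SciPy are available: the numeric features and the contingency-table counts are made up, and the `vif` helper is a hypothetical hand-rolled version rather than a library function.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(42)
n = 500

# Synthetic numeric features: b depends on a, c is independent noise.
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(scale=0.5, size=n)
c = rng.normal(size=n)
X = np.column_stack([a, b, c])

# 1) Pairwise Pearson correlation: linear association, two variables at a time.
corr = np.corrcoef(X, rowvar=False)

# 2) VIF: regress each feature on all the others, then VIF_j = 1 / (1 - R_j^2).
def vif(X):
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # intercept + other features
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1 - resid.var() / y.var()
        out.append(1 / (1 - r2))
    return np.array(out)

vifs = vif(X)

# 3) Chi-square test of independence on a contingency table
#    (rows and columns are two categorical variables; counts are made up).
table = np.array([[90, 10],
                  [30, 70]])
chi2, p_value, dof, expected = chi2_contingency(table)

print("Pearson correlation matrix:\n", corr.round(2))
print("VIF per feature:", vifs.round(2))
print(f"chi-square = {chi2:.1f}, dof = {dof}, p-value = {p_value:.3g}")
```

In this toy setup, `a` and `b` should show a high correlation and a VIF clearly above 1, while `c` should stay near 1. A small p-value from the chi-square test suggests the two categorical variables are not independent.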