The A to Z Complete Guide to Data Preprocessing | Data Pre-processing in Python | Data Science

737 views

Six Sigma Pro SMART


Comments
@gumshuda24 1 year ago
This is pure gold! Thanks for sharing these profound insights drawn from applied knowledge in the AI-ML industry.
@prosmartanalytics 1 year ago
Thank you! We are glad you liked it.
@janaosama6010 11 months ago
Is removing duplicates from the data done before or after handling the missing values?
@prosmartanalytics 11 months ago
Removing duplicates can be tricky. Ideally, we should remove duplicates only when each row in the dataset has a unique identifier and that identifier itself is duplicated. For example, we know two employees can't have the same employee ID, so on that basis we can remove the duplicates or suggest corrections. However, two employees can have the same age, education, location, and salary; as long as they are two different employees, we don't want to remove those rows. Once these points are checked, and if the duplicate records turn out to be data entry errors, we can remove them before handling missing values. Basically, this is data hygiene rather than data preprocessing proper. Hope it helps!
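A minimal pandas sketch of the check described above may help. The employee table and its column names (`employee_id`, `age`, `location`, `salary`) are invented for illustration and are not taken from the video; whether to keep the first record of a repeated ID is also an assumption that depends on your data.

```python
import pandas as pd

# Hypothetical employee data; all column names and values are assumptions.
df = pd.DataFrame({
    "employee_id": [101, 102, 102, 103, 104],
    "age": [30, 41, 41, 30, 30],
    "location": ["Pune", "Delhi", "Delhi", "Pune", "Pune"],
    "salary": [50000.0, 72000.0, 72000.0, 50000.0, None],
})

# Rows whose unique identifier is repeated: likely data entry errors to review.
id_dupes = df[df.duplicated(subset="employee_id", keep=False)]

# Rows that match on every non-ID attribute. Some of these have different IDs
# (different employees), so matching attributes alone is not a reason to drop.
attr_cols = ["age", "location", "salary"]
attr_dupes = df[df.duplicated(subset=attr_cols, keep=False)]

# After review, drop repeated identifiers (keeping the first record),
# and only then move on to handling the missing salary.
deduped = df.drop_duplicates(subset="employee_id", keep="first")
```

Here employees 101 and 103 share every attribute but have different IDs, so they survive deduplication; only the repeated ID 102 is collapsed to one row.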
@younesgasmi8518 1 year ago
When I have positive or negative infinity values, can I replace them with NaN and then convert them to normal values using a median or mean strategy?
@prosmartanalytics 1 year ago
Good question. First, we should find out why a value became infinity. For example, we might have derived a ratio variable, and the infinity could be the result of division by zero. Second, what do the other feature values look like in the rows where some features are infinite, and how many such values and rows are present in the data? You may refer to our tutorial on outlier treatment for the choice of imputation techniques.
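As a rough sketch of these checks in pandas: the `revenue` and `orders` columns and the derived `revenue_per_order` ratio are hypothetical, and the median fill at the end is just the strategy proposed in the question, not a recommendation from the tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are assumptions.
df = pd.DataFrame({
    "revenue": [100.0, 250.0, 80.0, 40.0, 120.0],
    "orders": [4.0, 0.0, 2.0, 0.0, 3.0],
})

# A derived ratio: dividing by zero orders produces infinity.
df["revenue_per_order"] = df["revenue"] / df["orders"]

# Step 1: find where the infinities are and how many there are.
numeric = df.select_dtypes(include="number")
inf_mask = np.isinf(numeric)
inf_counts = inf_mask.sum()            # count of infinite values per column
inf_rows = df[inf_mask.any(axis=1)]    # full rows to inspect (here, orders == 0)

# Step 2: only after understanding the cause, convert to NaN and impute.
cleaned = df.replace([np.inf, -np.inf], np.nan)
median_value = cleaned["revenue_per_order"].median()
cleaned["revenue_per_order"] = cleaned["revenue_per_order"].fillna(median_value)
```

Inspecting `inf_rows` first shows that every infinite ratio comes from a row with zero orders, which may suggest a different fix (for example, treating the ratio as undefined for those rows) than blanket median imputation.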