How to Solve Multi Class Imbalance Problem using SMOTE in Machine Learning ?? || PYTHON

Views: 7,887

Datahat -- Simplified AI

Comments: 11
@afmonsalves, 11 months ago
It is curious that after the first SMOTE application, when every category is brought up to 4,000 samples, many points of the first and second categories are created in regions where they are not supposed to be (see the scatter plot). That looks like a lot of noise for the final model. Could you explain the algorithm's behavior in this case? When you created 1,000 points for those categories, it looked better behaved.
@datahat642, 11 months ago
Hey @afmonsalves, you are right: there is additional noise when we increase the number of points further. This happens because SMOTE fills in the regions between neighboring points of the same class, and those regions may or may not overlap with regions of other classes. The objective of SMOTE is primarily to approximate data points in close proximity to the existing ones, and sampling_strategy is a setting to experiment with to find the best case. Hope this helps!
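The interpolation described in this reply can be sketched from scratch. This is a minimal illustration in plain Python, not the imbalanced-learn implementation used in the video; the function name `smote_sketch`, the toy points, and the choice of `k` are all assumptions made for the example. Each synthetic point is placed on the segment between a minority sample and one of its k nearest same-class neighbors.

```python
import math
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points from a list of same-class points (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbors of base within the same class, excluding base itself
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Hypothetical toy data: one minority class that forms two separate clusters.
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 5.1), (5.1, 4.8)]

# With k=2 every neighbor is in the same cluster, so new points stay inside a cluster.
new_points = smote_sketch(minority, n_new=20, k=2)

# With k=5 neighbors can come from the other cluster, so new points can land in the
# gap between clusters, which is where samples of another class might live.
wide_points = smote_sketch(minority, n_new=20, k=5)
```

Under these assumptions, the more synthetic points are requested, the more of them end up on segments that cross regions belonging to other classes, which would match the extra noise seen in the 4,000-sample scatter plot.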
@lohjjoo8333, 4 months ago
Hi there, may I know whether this is the same as upsampling?
@datahat642, 4 months ago
Yes, it is one of the upsampling (oversampling) techniques. It creates new synthetic points rather than duplicating existing ones.
@hatembouhjar, 2 months ago
Aren't you supposed to resample only X_train and y_train?
@datahat642, 2 months ago
If you are working with a train/test split, then you can resample the training data only (X_train, y_train) and leave the test data untouched.
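The order of operations in this reply can be sketched as: split first, then oversample the training part only. The code below is a self-contained illustration in plain Python; the helpers `split` and `oversample_train`, the toy rows, and the simplified interpolation between random same-class pairs (a stand-in for SMOTE's nearest-neighbor step) are all assumptions for the example, not the code from the video.

```python
import random
from collections import Counter

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle and split rows into (train, test). Hypothetical helper, not stratified."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def oversample_train(train, seed=0):
    """Balance classes in train by interpolating between random same-class pairs."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    balanced = list(train)
    for label, xs in by_class.items():
        for _ in range(target - len(xs)):
            a, b = rng.choice(xs), rng.choice(xs)
            gap = rng.random()
            point = tuple(p + gap * (q - p) for p, q in zip(a, b))
            balanced.append((point, label))
    return balanced

# Hypothetical imbalanced data: 40 rows of class 0, 12 of class 1, 8 of class 2.
rows = (
    [((float(i), float(i % 7)), 0) for i in range(40)]
    + [((10.0 + i, float(i % 5)), 1) for i in range(12)]
    + [((20.0 + i, float(i % 3)), 2) for i in range(8)]
)

train, test = split(rows)            # 1. split first
balanced = oversample_train(train)   # 2. oversample the training rows only
# The test rows are never resampled, so evaluation uses real data only.
```

With scikit-learn and imbalanced-learn, the same order would typically be `train_test_split(X, y, stratify=y)` followed by `SMOTE().fit_resample(X_train, y_train)`. Resampling before the split would let synthetic points derived from test rows leak into training.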