XGBoost Model in Python | Tutorial | Machine Learning

32,627 views

Harsh Kumar

A day ago

Comments: 43
@mattc3738 · 3 years ago
How does this not have more views!? Excellent video, EXACTLY what I needed to finish my project at work. This video could have saved me 10 hours of head-scratching if I had seen it sooner.
@harsh1kumar · 3 years ago
Thanks, Matt. I am glad to know that the video helped.
@ThePaintingpeter · 1 year ago
I cannot overstate how clear and terrific this video is. Absolutely fantastic effort on your part. Thank you very much for doing this.
@lxkhati4272 · 3 years ago
All the advanced terms are simply described. Thanks, Harsh.
@dehumanizer668 · 3 years ago
Exactly what I needed. Explained very clearly. Thank you.
@saisarath623 · 1 year ago
Really nice video and explanation, Harsh.
@kiranchowdary8100 · 3 years ago
Good video, sir. Thanks for making videos and educating us.
@MrLordmaximus · 2 years ago
This is a very well explained video!
@mosherchtman · 1 year ago
More videos [like this] that teach optimization of all the parameters in the model, please.
@riskamulliani3390 · 9 months ago
Thank you sir 🙏, this video is very helpful 😊
@harshchoudhary279 · 2 years ago
This video covers a lot of things in a short time.
@alexandergawrilow6255 · 3 years ago
Thank you for the great content. I'm wondering why you don't use early_stopping_rounds during grid search? That way you could set num_trees to a fixed, large number (like you did later when building the final model) and wouldn't have to grid search over it. Also, with your approach you probably overfit during grid search (due to the high number of estimators) and only get the best parameters when using all of the 1000, 2000, or 3000 trees. In the final model, because you use early_stopping_rounds, a different number of estimators will be used, and therefore the optimal hyperparameters from the grid search are probably not the optimal hyperparameters for the final model. What do you think about it?
@harsh1kumar · 3 years ago
Hey Alexander, thank you for this good question. You are right: ideally we would want to use something like early_stopping_rounds during grid search. As far as I know, this feature is not available when performing grid search using sklearn; grid search will simply check all the parameter combinations that have been specified. You are also right that there will be a difference between the number of estimators we get from grid search and from using early_stopping_rounds in the final model. I consider grid search an initial estimate of which hyperparameters would give better results, but the final model can have slightly different values. Thank you for your interesting question :)
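For reference, a minimal sketch of the two-stage workflow discussed in this thread: grid search with a fixed tree budget, then a final refit where early stopping picks the number of boosting rounds. The data and parameter values below are illustrative, and the early-stopping API differs across xgboost versions (it moved from fit() to the constructor around 1.6).

```python
# Sketch only: assumes xgboost >= 1.6 (early_stopping_rounds as a constructor argument).
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Stage 1: tune tree shape / shrinkage with a fixed, moderate n_estimators.
param_grid = {"max_depth": [3, 5], "learning_rate": [0.05, 0.1]}
grid = GridSearchCV(
    xgb.XGBClassifier(n_estimators=500, eval_metric="auc"),
    param_grid,
    scoring="roc_auc",
    cv=3,
)
grid.fit(X_train, y_train)

# Stage 2: refit the best candidate with a large tree budget and let
# early stopping on a holdout set choose the actual number of rounds.
final_model = xgb.XGBClassifier(
    n_estimators=3000,
    early_stopping_rounds=50,
    eval_metric="auc",
    **grid.best_params_,
)
final_model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
print("best iteration:", final_model.best_iteration)
```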
@romaljaiswal8 · 2 years ago
Disliking this video because it's too good and I don't want others to know about it 😂😂
@Sam98961 · 2 years ago
Thanks for the video! Great learning experience.
@milanchetry1168 · 10 months ago
eval_metric throws an error; can anyone suggest the reason?
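One common cause, depending on the installed version: since xgboost 1.6 the sklearn wrapper expects eval_metric in the constructor, and passing it to fit() was deprecated (and later removed). A hedged sketch:

```python
import xgboost as xgb

# Works on xgboost >= 1.6: the metric goes in the constructor...
model = xgb.XGBClassifier(eval_metric="auc")

# ...whereas passing it to fit() warns or errors on newer versions:
# model.fit(X_train, y_train, eval_metric="auc")  # deprecated/removed
```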
@v1hana350 · 2 years ago
How does parallelization work in the XGBoost algorithm? Please explain it with an example.
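Short version, for anyone landing here: boosting itself is sequential (each tree corrects the previous ones), but XGBoost parallelizes the split search within each tree across CPU threads. A rough sketch of observing this via the n_jobs parameter (synthetic data; timings will vary by machine):

```python
import time

import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=50_000, n_features=50, random_state=0)

# Same model, different thread counts: boosting rounds run in order,
# but the per-tree split search fans out across n_jobs threads.
for n_jobs in (1, 4):
    model = xgb.XGBClassifier(n_estimators=100, n_jobs=n_jobs)
    start = time.time()
    model.fit(X, y)
    print(f"n_jobs={n_jobs}: {time.time() - start:.1f}s")
```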
@fscode5021 · 2 years ago
In my project I get only 45% accuracy in training and 44% in testing. What do you think I can do to get better accuracy, please?
@pradyutmazumdar1441 · 2 years ago
I have a doubt: during cross-validation, where we choose which model to use, I get some accuracy, but after hyperparameter tuning the accuracy jumps by 2%. Is this normal? This is in XGBoost.
@henilshah6962 · 2 years ago
How do you do it for multiclass classification?
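For anyone with the same question: the sklearn wrapper handles multiclass targets directly, inferring the number of classes from y. A minimal sketch on the iris dataset (not the video's data):

```python
import xgboost as xgb
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # 3 classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# multi:softprob yields one probability per class; num_class is inferred.
model = xgb.XGBClassifier(objective="multi:softprob", n_estimators=100)
model.fit(X_train, y_train)
print(model.predict_proba(X_test[:3]))
```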
@LLoBBHa · 2 years ago
Great video, thank you!
@vbcsaransekar9058 · 3 years ago
I appreciate your effort.
@AkshayDudvadkar · 2 years ago
Just wanted to know whether EDA and feature selection are needed for XGBoost?
@harsh1kumar · 2 years ago
EDA should be done irrespective of the model. Feature selection can also help remove unnecessary complexity from the model. But the benefit of techniques like XGBoost is that they can take in a large number of features and give importance to the relevant ones. I would advise doing a first iteration with all possible features and then removing features with lower importance, while monitoring model performance metrics.
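A minimal sketch of the iterate-on-importance idea described in this reply; the data, threshold, and parameters below are illustrative, not from the video:

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=5000, n_features=100, n_informative=10, random_state=0)

# First iteration: fit on all features.
model = xgb.XGBClassifier(n_estimators=200)
model.fit(X, y)

# Drop low-importance features (0.005 is an arbitrary illustrative cutoff),
# then refit and compare performance metrics before settling on the subset.
keep = np.where(model.feature_importances_ > 0.005)[0]
model_small = xgb.XGBClassifier(n_estimators=200)
model_small.fit(X[:, keep], y)
print(f"kept {len(keep)} of {X.shape[1]} features")
```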
@jrobledo25 · 2 years ago
Nice video! Thank you so much! A couple of questions: is there a way to download the notebook with outputs from Kaggle? And is it possible to train models like XGBoost on a GPU? The last time I tried there, the error message suggested it was only possible with sequential models like neural networks.
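On the GPU part of this question: XGBoost does support GPU training; it is not limited to neural networks. The exact flag depends on the version, so this is a hedged sketch:

```python
import xgboost as xgb

# xgboost >= 2.0: select the device explicitly.
model = xgb.XGBClassifier(device="cuda", tree_method="hist")

# xgboost < 2.0 used a dedicated tree method instead:
# model = xgb.XGBClassifier(tree_method="gpu_hist")
```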
@saumyen1 · 2 years ago
I have a question: what are the two classes here that are being separated?
@harsh1kumar · 2 years ago
We are trying to identify which customers will make a specific transaction in the future. These customers are tagged as 1 in the data. For more details, see www.kaggle.com/competitions/santander-customer-transaction-prediction/overview
@ratishr6003 · 3 years ago
Thank you, this was explained really well. I'm working on a scorecard model with over 400 variables. Can we use 'from xgboost import plot_importance' to print out the important features after hyperparameter tuning and training the model, and then re-run the model with the subset of features?
@shivankarora1264 · 2 years ago
Hi, I'm working on the same thing. Please help me with the approach you used. Thanks.
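A self-contained sketch of the plot_importance approach asked about above (synthetic data standing in for the 400-variable scorecard):

```python
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.datasets import make_classification
from xgboost import plot_importance

X, y = make_classification(n_samples=5000, n_features=400, n_informative=20, random_state=0)

model = xgb.XGBClassifier(n_estimators=200)
model.fit(X, y)

# Plot the top features; re-running the model on a subset is then a matter
# of slicing the training matrix, as in the importance sketch further up.
plot_importance(model, max_num_features=20)
plt.show()
```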
@saadarbani · 2 years ago
Where can I get the API of XGBoost?
@harsh1kumar · 2 years ago
API reference for Python: xgboost.readthedocs.io/en/stable/python/python_api.html For other languages, see the same website.
@vbcsaransekar9058 · 3 years ago
Really awesome.
@javohirxusanov1229 · 3 years ago
Hey man, you're doing a good job! Why did you stop making videos?
@harsh1kumar · 3 years ago
Thank you very much, man. I will start uploading more videos next month 😀
@livesinging3924 · 3 years ago
Great content...
@Islam101_Uganda · 4 months ago
Thanks, boss.
@hunterlee9413 · 2 years ago
Where is the data?
@harsh1kumar · 2 years ago
You can access the data from this link: www.kaggle.com/competitions/santander-customer-transaction-prediction/data
@anmol_seth_xx · 2 years ago
The program takes too much time to run 😵 But thank you, sir, for explaining the program and its arguments very well.
@harsh1kumar · 2 years ago
You can try LightGBM. It may be faster depending on your context. I have a video about it on my channel.
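LightGBM's sklearn API is close to a drop-in replacement for the XGBoost wrapper; a minimal sketch, assuming the lightgbm package is installed:

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)

# Same fit/predict interface as xgb.XGBClassifier, often faster on large data.
model = LGBMClassifier(n_estimators=200, learning_rate=0.1)
model.fit(X, y)
print(model.predict(X[:5]))
```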
@alishazel · 1 year ago
I think I'm the stupid one... the video is detailed, but I still can't get it to work... a head-scratching moment in Spyder :(
@CharanSaiAnnam · 2 years ago
Your justification for the learning rate is not right.