Machine Learning Tutorial Python - 21: Ensemble Learning - Bagging

  91,955 views

codebasics

A day ago

Ensemble learning is all about combining the predictive power of multiple models to get better predictions with low variance. Bagging and boosting are two popular techniques that allow us to tackle the high-variance issue. In this video we will learn about bagging with a simple visual demonstration. We will also write Python code in sklearn to use BaggingClassifier. And oh yes, in the end we have an exercise for you, as always!
Code: github.com/codebasics/py/blob...
Exercise: github.com/codebasics/py/blob...
⭐️ Timestamps ⭐️
00:00 Theory
08:01 Coding
22:25 Exercise
Do you want to learn technology from me? Check codebasics.io/ for my affordable video courses.
🌎 Website: www.codebasics.io/
🎥 Codebasics Hindi channel: / @codebasicshindi
#️⃣ Social Media #️⃣
🔗 Discord: / discord
📸 Instagram: / codebasicshub
🔊 Facebook: / codebasicshub
📱 Twitter: / codebasicshub
📝 Linkedin (Personal): / dhavalsays
📝 Linkedin (Codebasics): / codebasics
🔗 Patreon: www.patreon.com/codebasics?fa...
❗❗ DISCLAIMER: All opinions expressed in this video are my own and not those of my employers.
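
The bagging idea from the description (train several models on resampled data, then combine their votes) can be sketched in plain Python. This is only an illustration under assumptions: the toy data, the `fit_stump` weak learner and all function names are hypothetical and are not the code from the video, which uses sklearn's BaggingClassifier.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows at random with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows):
    """Hypothetical weak learner: best single threshold on a 1-D feature."""
    best = None
    for threshold, _ in rows:
        for above in (0, 1):  # label predicted when x > threshold
            errors = sum(
                (above if x > threshold else 1 - above) != y for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above)
    _, threshold, above = best
    return lambda x: above if x > threshold else 1 - above

def fit_bagging(rows, n_models=25, seed=0):
    """Train one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_models)]

def predict(models, x):
    """Majority vote over all trained models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy data (assumed): (feature, label), label 1 for larger feature values.
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0),
        (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1)]
models = fit_bagging(data)
print(predict(models, 0.5), predict(models, 9.5))
```

Each model sees a slightly different sample, so individual mistakes tend to be outvoted by the rest of the ensemble.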

Comments: 103
@codebasics 2 years ago
Check out our premium machine learning course with 2 industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced
@ritvijmishra4727 2 years ago
Thank you so much, sir, for this ML playlist. Your explanations are simple, exact, and extremely easy to follow. Your method of first familiarizing us with the theory, then giving a practical example and then an exercise is really effective. Looking forward to more such videos in your ML series. Thanks once again, sir.
@justin.c249 a year ago
So far this is the best explanation of the bagging technique I have found on YouTube! Great work.
@malikhamza9286 2 years ago
Thank you so much, sir, for teaching us so many things. I was searching here and there for ensemble learning and your video just showed up. You are a lifesaver. Thanks a lot!!
@paulkornreich9806 2 years ago
This exercise was a challenge. Thank you. By just taking a pure z-score over the set, some outliers were missed. Basically, all the outliers were the 0s for blood pressure and cholesterol. With those eliminated, I got significantly higher scores than the solution. All bagged models gave a similar 86% accuracy. The biggest jump from non-bagged to bagged model was the decision tree, which went from 79% accuracy without bagging to 86% with bagging. Also, I did the exercise several months after this video was posted (not sure when it was made), so the libraries (especially SVC) may have improved in their defaults.
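
The zero-value cleanup described in the comment above might look roughly like this. It is a sketch under assumptions: the rows and the column names `blood_pressure` and `cholesterol` are made up for illustration and are not the exercise's actual schema.

```python
# Hypothetical rows; column names are assumptions, not the real dataset's.
rows = [
    {"age": 54, "blood_pressure": 140, "cholesterol": 239},
    {"age": 61, "blood_pressure": 0, "cholesterol": 210},
    {"age": 47, "blood_pressure": 130, "cholesterol": 0},
    {"age": 39, "blood_pressure": 120, "cholesterol": 204},
]

def drop_impossible_zeros(rows, columns):
    """Keep only rows where none of the given columns is 0."""
    return [row for row in rows if all(row[col] != 0 for col in columns)]

cleaned = drop_impossible_zeros(rows, ["blood_pressure", "cholesterol"])
print(len(rows), "->", len(cleaned))
```

A reading of 0 is implausible for these measurements, yet it may still sit within a z-score cutoff, which is one possible reason a pure z-score filter would miss it.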
@SinarJourney 5 months ago
This channel is golden. I really like how you explain the concepts all the way through to practical coding.
@jasonwang-wg8wu 8 months ago
This was nice and straightforward, and the quip about "copy and paste" was hilarious.
@khanshian2077 2 years ago
Your tutorial series is teaching me a lot, sir. The videos are so well organized. You have made these topics so much easier to learn and understand. Hats off to your hard work. A blind follower of yours, sir. Loads of love.
@codebasics 2 years ago
Thanks for your kind words, Khan shian 🙏
@aniketmlk6 9 months ago
Thanks a lot for your awesome series!
@siddheshmhatre2811 a year ago
One of the most underrated playlists for ML. I hope lots of students will join ❤
@elahe4737 2 years ago
That clearly described what the bagging method is. I wish you had a video about boosting as well.
@sagarkumarbala5257 2 years ago
Just as expected...awesome!!!
@PP-tc1zp 2 years ago
Thank you, sir, for a very good explanation. These examples are very good for practicing writing code and are strongly motivating.
@SamjhoPaiseKo 2 years ago
The explanation is very easy...well understood....👍👍👍
@nachiketgalande8125 a year ago
Thank you, sir, for this amazing playlist on machine learning!
@danialsoleimany259 5 months ago
The most helpful video on bagging, thank you.
@shafinhossain3810 2 months ago
Thanks a lot for providing such necessary and important content!
@bruhjoem 27 days ago
Great video, best video on the topic.
@kushanmadusankha5227 a year ago
Awesome 🔥 Appreciate your effort bro
@1itech-Learn a year ago
Thank you so much, sir, for this ML playlist.
@bravelionable 2 years ago
You are the best! Thank you
@SanjeevKumar-nc2rt 2 years ago
Concept cleared 🥰😍😍
@Koome777 7 months ago
My results for the exercise: SVM standalone 0.8, after bagging 0.8; decision tree standalone 0.65, after bagging 0.79. Bagging helps improve accuracy and reduce overfitting, especially in models that have high variance. It is mostly used for unstable models like decision trees.
@rahulranjan8682 2 years ago
Thumbs up! You cannot learn swimming by watching. Who takes the pains to provide exercises? When I was trying to learn in the beginning, this was what I wanted. At least it's good that someone is providing it now.
@JoyFay 2 years ago
Thanks for the tutorial
@slainiae 4 months ago
Using the standalone model I got a better score than with SVC plus bagging: 0.902 versus 0.851. Using a standalone decision tree I got 0.782 versus 0.84 with bagging. Bagging helps reduce the overfitting (high variance) caused by decision trees by averaging multiple decision trees.
@abebebelew2056 8 months ago
It is a very helpful video for my research project!
@sanskaragrawal5074 2 years ago
Excellent explanation, sir. The whole series has been exceptional. I had one query: how can reducing the size of the dataset decrease variance? Decreasing the number of features might decrease it, but how can decreasing the number of training examples decrease it?
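
One hedged way to look at the question above: in bagging, the variance reduction is usually attributed to averaging many models rather than to each model seeing a smaller sample. A standard identity, assuming $n$ models whose predictions $f_i(x)$ each have variance $\sigma^2$ and pairwise correlation $\rho$ (these symbols are introduced here for illustration and do not appear in the video):

```latex
\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} f_i(x)\right)
  = \rho\,\sigma^2 + \frac{1-\rho}{n}\,\sigma^2
```

The second term shrinks as $n$ grows, and training each model on a different resample is one way to keep $\rho$ below 1 so that the averaging actually helps.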
@pranav2901 2 years ago
Thank you very much for this video.
@DrizzyJ77 2 months ago
Thanks, codebasics
@dataguy7013 2 years ago
Best explanation, EVER!!
@codebasics 2 years ago
Glad you think so!
@atur42 a year ago
Good work, really.
@bea59kaiwalyakhairnar37 2 years ago
You have to do outlier detection because the max is much higher than the 75% value.
@upendrar9323 2 years ago
Hi Dhaval! Simple and useful explanation as always. Keep making more videos. However, at 11:40 I believe we should first do the train/test split and then perform standard scaling, instead of the other way around. Aren't we running into a data leakage problem if we apply standard scaling to all the data points before the train/test split? Let me know your thoughts. Thanks!
@codebasics 2 years ago
Hey, yes, the dataset that I showed at the top is actually the training dataset. I think I mentioned that in the video.
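
The ordering discussed in this thread (split first, then scale) can be sketched without any library. The numbers and helper names below are assumptions for illustration; in sklearn the same idea corresponds to calling `fit` on the training data and only `transform` on the test data.

```python
import statistics

def fit_scaler(values):
    """Learn mean and population standard deviation from training values."""
    return statistics.fmean(values), statistics.pstdev(values)

def transform(values, mean, std):
    """Standardize values with previously learned statistics."""
    return [(v - mean) / std for v in values]

# Toy feature column, already split (assumed values).
train = [10.0, 12.0, 14.0, 16.0]
test = [20.0, 8.0]

mean, std = fit_scaler(train)                # statistics come from train only
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)     # test reuses the train statistics
print(train_scaled, test_scaled)
```

Because the test values never influence `mean` or `std`, nothing about the held-out data leaks into preprocessing.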
@LamNguyen-jp5vh 2 years ago
Hi, can you explain further the difference between bagging and bagged trees? I don't really understand the explanation in the video. Thank you so much for your help! Your videos are amazing.
@kmishy a year ago
Thanks, sir.
@anmoldubey3375 2 years ago
Thank you for making such a clear video on bagging and RF. I have one doubt about RF: since RF samples both rows and features, some of the decision trees might not get the relevant features, or even the features we want to use. Doesn't this affect accuracy and keep us from getting the result we want? P.S. I know this is a lot of writing!!!!!
@rajagopalk9760 2 years ago
Good presentation and preparation; easy to understand. I would like a clarification: why is the term "resampling with replacement" used instead of "sampling with replacement"? Is it incidental, or is there a specific reason? Thank you.
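
Whatever the reason for the wording, the mechanics of drawing with replacement can be illustrated with a small sketch. The helper name and the sample size are assumptions; the point is that a bootstrap sample repeats some rows and leaves others out (the "out-of-bag" rows), with roughly 1 - 1/e of the rows appearing at least once for large samples.

```python
import random

def bootstrap_indices(n, rng):
    """Sample n row indices from range(n) with replacement."""
    return [rng.randrange(n) for _ in range(n)]

rng = random.Random(42)
n = 10_000
indices = bootstrap_indices(n, rng)

in_bag = set(indices)                 # rows that were drawn at least once
unique_fraction = len(in_bag) / n
out_of_bag = n - len(in_bag)          # rows never drawn in this sample
print(f"unique rows: {unique_fraction:.3f}, out-of-bag rows: {out_of_bag}")
```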
@bit-colombo5595 a year ago
Hi sir, can you make a video on how to combine classifiers like decision tree, random forest, naive Bayes and SVM to get a collective result, like a weighted output?
@komalparab7175 2 years ago
Waiting for your NLP series. Please, please make it.
@vikranttripathi2258 2 years ago
Thank you for this wonderful explanation. I have a query here. We scaled X, but everywhere we use X in cross_val_score. Could you please explain why we scaled X?
@OjciecDyktator 4 months ago
I think in the case of this dataset it doesn't matter much. Scaling was applied just in case, and in cross_val_score the results for X and X_scaled are very similar across different models. But yes, it is more logical to use X_scaled later.
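
One common way to make sure scaling is actually used during cross-validation is to put the scaler and the model into a single pipeline. This is a sketch, not the video's code: it assumes scikit-learn is installed and uses synthetic data from `make_classification` in place of the video's CSV.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); the video loads its own dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# The scaler is refit on the training part of every fold, then applied
# to that fold's held-out part before the SVC scores it.
model = make_pipeline(StandardScaler(), SVC())
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```

Passing the pipeline instead of a pre-scaled array also avoids computing the scaling statistics on rows that later serve as validation data.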
@60pluscrazy 2 years ago
Random forest explanation is superb 👌
@codebasics 2 years ago
Glad it was helpful!
@ankitjhajhria7443 a year ago
Why are we fitting our model on X, y? Then what is the use of x_train and y_train? And there is no use for scaling either if we are training our model on the original X and y.
@ogochukwustanleyikegbo2420 11 months ago
I also learnt that bagging doesn't do much to increase the performance of the model apart from lowering the variance.
@juniorvela4614 2 years ago
Is there any tutorial on multi-class classification with deep learning "NLP" (Keras)?
@_muhammadshahrukh_ 2 years ago
You can have as many classes as your problem requires in your DNN or NN. Here's a playlist from codebasics for deep learning projects; check out the potato disease classification course: kzbin.info/aero/PLeo1K3hjS3ut2o1ay5Dqh-r1kq6ZU8W0M
@AkaExcel 2 years ago
Dear @codebasics, we would be very grateful if you could make a tutorial on stacking, please.
@codebasics 2 years ago
Point noted
@ogochukwustanleyikegbo2420 11 months ago
My results after completing the exercise: SVM: standalone 0.82, bagged model 0.87. Decision tree: standalone 0.79, bagged model 0.84.
@bhanusri3732 6 months ago
Why is the original unscaled X used during cross-validation instead of the scaled X? Does it not affect accuracy?
@snehasneha9290 2 years ago
Using df.describe(), how can we decide whether outlier removal is necessary? Can anyone please help me with this question?
@arumoysaha4449 2 years ago
I have a question related to K-fold cross-validation: how do we select the optimal K value for the cross-validation technique?
@emekaobiefuna4509 2 years ago
It's selected arbitrarily. 5 and 10 are the most commonly used. The larger K is, the more computationally expensive the process gets.
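
To see why a larger K costs more, it can help to write the split out by hand: K folds mean K separate model fits, each on most of the data. The function below is a hypothetical, unshuffled sketch and not sklearn's `KFold`, which offers the same idea with more options.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, 5))
for train, test in folds:
    print(len(train), len(test))
```

Every sample lands in exactly one test fold, so each of the K fits is evaluated on data it did not train on.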
@diptopodder1011 a year ago
How do I train on multiple files, give each file a label, and then classify a file?
@Freeak6 a year ago
Shouldn't you use X_train in the cross-validation calls?
@tarunmohapatra5734 2 years ago
I am waiting for the boosting and XGBoost methods, sir.
@ayushgupta80 3 months ago
base_estimator has been renamed to estimator.
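
For readers hitting the rename mentioned above, here is a sketch that works with either keyword by checking the installed version's signature. It assumes scikit-learn is installed; the synthetic data and the parameter values (50 estimators, 80% samples) are illustrative choices, not necessarily those used in the video.

```python
import inspect

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Pick whichever keyword the installed scikit-learn version accepts.
params = inspect.signature(BaggingClassifier.__init__).parameters
keyword = "estimator" if "estimator" in params else "base_estimator"

# Synthetic stand-in data (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

bag = BaggingClassifier(
    **{keyword: DecisionTreeClassifier()},
    n_estimators=50,      # number of trees, each fit on its own resample
    max_samples=0.8,      # each resample draws 80% of the rows
    oob_score=True,       # score each row with trees that never saw it
    random_state=0,
)
bag.fit(X, y)
print(keyword, bag.oob_score_)
```

If you only target a recent scikit-learn release, passing `estimator=DecisionTreeClassifier()` directly is simpler.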
@firedek3208 2 years ago
👍👍
@mohannads2757 2 years ago
Where can I find the StandardScaler explanation?
@nitinpednekar8872 a year ago
Sir, I don't see any videos on time series forecasting models. I request you to upload videos on that.
@laxmiswetha2510 2 years ago
Sir, I have completed my bachelor's degree. Which course is better next for programming and coding? Can you explain?
@snehasneha9290 2 years ago
At 21:50, in cross_val_score you use X and y. Why not x_train and y_train? Can anyone explain this?
@gargisingh9279 2 years ago
Sir, can you please share the machine learning playlist that starts from tutorial 1? I am not able to find the previous tutorials.
@sourav_basu 2 years ago
kzbin.info/aero/PLeo1K3hjS3uvCeTYTeyfe0-rN5r8zn9rw
@tejaswiganti8098 2 years ago
I don't know why I am getting an output and mean of 1 when using DecisionTreeClassifier and RandomForestClassifier. I have tried different values, but the value stays the same and I can't find the exact reason. Can you guys tell me where I have made a mistake? :|
@arianrahman4840 2 years ago
Can we get an update on your DL series?
@troubution 2 years ago
Hi Dhaval, I hope you are doing well. I have a query: at step 35 you provided X and y as input to the model. What if you had provided X_scaled instead of X? I think the accuracy might be different.
@joybhwmck 2 years ago
@codebasics can you please tell us whether we can use X_scaled instead of X in our models? Your videos are great... More power to you...
@sachinnp3526 2 months ago
Sir, you simply use logistic regression
@SohamPaul-xy9jw a year ago
My bagging model score came out to be 0.8027; SVC: 0.8804; decision tree: 0.804.
@jayaprasanthr3241 2 years ago
How do I decide between a classifier and a regressor after the data split?
@its_kumar 2 years ago
It depends on your target value: if it is continuous, use regression; if it is categorical, use classification.
@deb2918 2 years ago
Sir, why are most of the examples you give based on classification? Please give us some examples of regression problems.
@venkisayina6063 2 years ago
Sir, can you do a video on Java, please?
@pranav2901 2 years ago
When will the boosting video be uploaded?
@PP-tc1zp 2 years ago
Thank you for your courses. I have different code to detect outliers. This code also works very well and is simpler. Best regards.
'''
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
outlier_condition = ((df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR)))
df3 = df[~outlier_condition.any(axis=1)]
df3.shape
'''
@kotakarthik-fq5cp a year ago
Bagging SVC gave a far better result than bagging decision tree.
@ayanibA1805 2 years ago
Don't we need to write code from scratch??? Is simply using sklearn libraries enough for a DS???
@kameshyuvraj7969 2 years ago
Sir, is there any chance of getting a job in the DS field for someone who has no coding experience and whose experience is in a non-IT field, i.e. the electrical field? Please help me out, sir.
@codebasics 2 years ago
Yes, it is possible. But of course you need to learn coding and other skills to pass an interview.
@kameshyuvraj7969 2 years ago
@codebasics Sorry, I did not mention that I have been in the electrical field for 10 years, due to a financial crisis when I graduated, sir. But during the corona period I have been learning Python for DS using the pandas, matplotlib, scipy and scikit libraries.
@vikassengupta8427 4 months ago
Sir, I clicked the link without trying the exercise. My laptop is coughing right now. What can I do, sir, nowww 😢😢😢
@ManusaiSRKian 24 days ago
9:37
@venkisayina6063 2 years ago
Sir, I want to do a Java certification. Can you please help me, sir? I have zero knowledge, sir. From the basics.
@duonghi6986 4 months ago
12:04
@venkisayina6063 2 years ago
I have zero knowledge on the software side, sir. Can you please help me, sir?
@abdulds9764 2 years ago
Sir... could you please upload a video about the boosting technique?
@codebasics 2 years ago
Yes, that one is coming next.
@its_kumar 2 years ago
Most important tool for programmers: *copy and paste* 👌 😂😂
@devesh_upreti a year ago
SVC score without bagging: 0.87. DecisionTreeClassifier score without bagging: 0.76. SVC score with bagging: 0.867. DecisionTreeClassifier score with bagging: 1.0. Drastic improvement in the decision tree classifier.
@SohamPaul-xy9jw a year ago
How are you using SVC with bagging? Can you send the code for that?
@BathingAfrican 2 years ago
I hope you're feeling better. I remember you saying you were sick. Inshallah.
@codebasics 2 years ago
I am healthy as a horse, my friend. Thanks for your care and concern 🙏
@nikhilraj93 a year ago
I tried clicking on the solution. Now I have a fever.
@nastaran1010 6 months ago
I hope you see my questions; you never respond to them. Why did you not fit BaggingClassifier with (x_train, y_train) in the exercise?
@omsaichand752 2 years ago
Your tutorials are not properly structured and are not learner-centric!
@coder_62 a year ago
Oh my god, my computer has had a fever for 1 month, wkwkwk. Btw, thank you, sir, for your clear explanation!!!
@sahilgundu5338 a year ago
I'm not able to see the top row with the column headings in the CSV file downloaded from Kaggle (pima-indians-diabetes.csv). Am I making a mistake while downloading?