What is AdaBoost (BOOSTING TECHNIQUES)

Views: 338,591

Comments: 234

@ashisharora9649 4 years ago

AdaBoost (Adaptive Boosting)
AdaBoost combines multiple weak learners into a single strong learner. This method does not follow bootstrapping. Instead, it creates decision trees with a single split (depth one), called decision stumps. The number of decision stumps it makes depends on the number of features in the dataset: if there are M features, AdaBoost will create M decision stumps.
1. We assign an equal sample weight to each observation.
2. We create M decision stumps, one for each of the M features.
3. Out of all M decision stumps, we first have to select the single best one. To select it, we calculate either the entropy or the Gini impurity. The stump with the lowest entropy (the least disordered one) is selected.
4. After the first decision stump is built, the algorithm evaluates it and checks how many observations it has misclassified.
5. Suppose that, out of N observations, the first decision stump has misclassified T observations.
6. For this, we calculate the total error (TE), which is equal to T/N.
7. Now we calculate the performance of the first decision stump: performance of stump = 1/2 * log_e((1 - TE) / TE).
8. Now we update the weights assigned before. The weights of wrongly classified observations are increased and the weights of correctly classified observations are reduced.
9. This is done with the formula: old weight * e^(performance of stump).
10. Then, for each observation respectively, we add and subtract the updated weights to get the final weights.
11. But these weights are not normalized, that is, their sum is not equal to one. To fix this, we sum them and divide each final weight by that sum.
12. After this, we have to make our second decision stump. For this, we make class intervals (buckets) from the normalized weights.
13. After that, we want to make a second weak model. To do that, we need a sample dataset on which the second weak model can be trained. To build it, we run N iterations. On each iteration, a random number between 0 and 1 is generated and compared with the class intervals we created; the row whose class interval contains the number is selected for the sample dataset. So the new sample dataset also has N observations.
14. This whole process continues for M decision stumps. The final sequential tree is considered the final tree.
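A minimal sketch of steps 6 to 11 above, added for illustration. It assumes 7 records with one misclassification and takes the total error as the sum of the weights of the misclassified rows, which equals T/N while all weights are still equal. The function name `adaboost_round` is hypothetical, not from the video.

```python
import math

def adaboost_round(weights, misclassified):
    """One AdaBoost weight update.

    weights: current sample weights (assumed to sum to 1).
    misclassified: booleans, True where the stump was wrong.
    Returns (performance_of_stump, normalized_new_weights).
    Note: a total error of exactly 0 or 1 would break the formula.
    """
    total_error = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    performance = 0.5 * math.log((1 - total_error) / total_error)
    # Misclassified rows are scaled up, correctly classified rows scaled down.
    updated = [
        w * math.exp(performance if wrong else -performance)
        for w, wrong in zip(weights, misclassified)
    ]
    total = sum(updated)
    return performance, [w / total for w in updated]

n = 7
weights = [1 / n] * n
misclassified = [False, False, True, False, False, False, False]
performance, new_weights = adaboost_round(weights, misclassified)
print(round(performance, 3))
print([round(w, 3) for w in new_weights])
```

With these assumed inputs, the unnormalized weights come out near 0.35 (wrong row) and 0.058 (correct rows), and after normalization the misclassified row alone carries half of the total weight.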

@vivektyson 4 years ago

Thanks man, a summary sure is nice. :)

@pavangarige2521 4 years ago

Thanks bro..

@bhargavasavi 4 years ago

Step 12, on how the buckets are created ...need to see that..But very nice summary

@kiran082 4 years ago

Great Job Ashish.Thanks for the detailed explanation it is really helpful.

@shindepratibha31 4 years ago

There are a few points which I want to check. Please correct me if I am wrong. 1) I think the total error is the sum of the weights of the incorrectly classified samples. 2) New sample weight for a misclassified sample: old weight * e^(performance of stump); for a correctly classified sample: old weight * e^(-performance of stump). 3) There is no final sequential tree. We predict the output based on the majority vote of the base learners.

@pankaj3856 4 years ago

My Suggestion will be that first arrange your playlist, so that we do not get confused of topics

@adityadwivedi9159 2 years ago

Bro if someone is doing this much for free then u should also adjust a little

@omprakashhardaha7736 1 year ago

@@adityadwivedi9159 ♠️

@NKARINKISINDHUJA 1 year ago

Adding it to a playlist will only benefit him more.

@TheDestint 1 year ago

He already has a machine learning playlist. It has everything sorted. You people never want to do any research yourselves; you want everything served ready-made.

@World-vf1ts 4 years ago

This was the longest 14min video I have ever seen.... The content of the video is much much more than the displayed duration of video Thanks a lot sir

@bhavikdudhrejiya852 3 years ago

This is an in-depth walkthrough of the AdaBoost algorithm, explained well by Krish Sir. Thank you for making such a wonderful video. I have jotted down the process steps from this video. The iteration is performed until all misclassifications are converted into correct classifications.
1. We have a dataset
2. Assign equal weights to each observation
3. Find the best base learner
   - Create stumps (base learners) sequentially
   - Compute the Gini impurity or entropy
   - Whichever learner has the lowest impurity is selected as the base learner
4. Train a model with the base learner
5. Predict with the model
6. Count the misclassified data
7. Compute the misclassification error
   - Total error = sum(weights of misclassified data)
8. Compute the performance of the stump
   - Performance of stump = 1/2 * log_e((1 - total error) / total error)
9. Update the weights
   - Incorrectly classified data: new weight = old weight * exp(performance of stump)
   - Correctly classified data: new weight = old weight * exp(-performance of stump)
10. Normalize the weights
11. Create buckets on the normalized weights
12. The algorithm generates as many random numbers as there are observations
13. Select the buckets in which the random numbers fall
14. Create a new dataset
15. Run steps 2 to 14 above on each iteration until it reaches its limit
16. Predict with the model on new data
17. Collect the votes from each base model
18. The majority vote is taken as the final output

@omsonawane2848 1 year ago

thanks so much for the summary.

@yuvrajpawar4177 5 years ago

Watched all your videos but still always eager every day for next topic to learn

@lohithv5060 3 years ago

Each and every topics are there in your channel on DS,ML,DL and which is explained clearly.Because of you many of the students learn all these kinds of stuff, thanks for that.I assure no one can explain like this with such a content💯. once again thank u so... much....

@RonaldoRewind-cr7 1 year ago

kzbin.info/www/bejne/faiaemidbtN3Y6s&ab_channel=MixHits

@bhargavasavi 4 years ago

Krish was mentioning 8 iterations for selecting the records for the next learner... there are really 7 records... it will choose a random bucket 7 times... and since the record with the largest weight owns the widest bucket, rand(0,1) will land in that bucket most of the time... Genius technique!!

@bhargavasavi 4 years ago

Sorry, I will take that back... 0.07 + 0.51 + 0.07 + 0.07 + 0.07 + 0.07 + 0.07 + 0.07 = 1, so there are 8 records, so it makes sense... it's 8 iterations

@karangupta6402 4 years ago

One of the best explanations of AdaBoost if I have seen so far... Keep up the good work Krish :)

@sandipansarkar9211 4 years ago

Great video once again. Please don't forget to watch it once more, as things are getting a little bit more complicated. I will watch the same video again, but not today. Tomorrow. Thanks

@raghavendras5331 2 years ago

@Krish Naik: Thank you very much for the video. The concepts are clearly explained and it is simply excellent. One thing I wanted to highlight: in AdaBoost, the final prediction is not the mode of the predictions given by the stumps. It is the value whose group of stumps has the highest total performance (say).
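To make the point about the final prediction concrete, here is a small hedged sketch. The three stump predictions and their performance values are made-up numbers, and `weighted_vote` is a hypothetical helper name.

```python
from collections import defaultdict

def weighted_vote(predictions, performances):
    """Return the label whose stumps have the largest summed performance."""
    totals = defaultdict(float)
    for label, say in zip(predictions, performances):
        totals[label] += say
    return max(totals, key=totals.get)

# Two stumps say "no", one says "yes", but the "yes" stump has more say.
predictions = ["yes", "no", "no"]
performances = [1.2, 0.4, 0.5]
final = weighted_vote(predictions, performances)
print(final)  # "yes": total say 1.2 beats 0.4 + 0.5 = 0.9
```

A plain majority vote would have returned "no" here, which is why the two rules can disagree.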

@aination7302 4 years ago

Indian YouTubers are the best. Always! To the point and clear explanation.

@rahulalshi1093 1 year ago

At 8:13 the 3rd record is incorrectly classified, so shouldn't the updated weight of the 3rd instance be 0.349?

@SUNNYKUMAR-vk4ng 2 years ago

now i got better understanding of ensemble techniques, thanks sir

@sitarambiradar941 2 years ago

One of the best explanatory video of AdaBoost. thank you sir!!

@teslaonly2136 4 years ago

You should have gotten more views for this video. Your explanation is excellent

@sergeypigida4834 2 years ago

Hi Krish! Thanks for the quick and clear explanation. At 11:42 you missed one thing: once we have the new collection of samples, we need to give all samples equal weights again, 1/n.

@gowtamkumar5505 4 years ago

Why do we need to do exactly 8 iterations, and how are the random values generated?

@TheOnlyAndreySotnikov 2 months ago

Basically, besides a lot of "basically," it's a good explanation.

@MatheoXenakis-r9y 8 months ago

You just adaboosted my confidence my guy

@username-notfound9841 5 years ago

Do a comparison b/w ADABOOST and XGBOOST. Also, Proximity matrix in Python, Sklearn does not have it inbuilt.

@maitriswarup2187 3 years ago

Very crisp n clear explanation, sir

@KirillBezzubkine 4 years ago

dude u r good at explaining. Found your channel after watching StatsQuest

@sushantrauthan5704 4 years ago

They both are legendary teachers

@nikhiljain4828 3 years ago

And one tries to copy from other😀

@abhijeetsoni3573 4 years ago

Krishna, thanks for these videos, could you please make XGBoost , CATBoost and Light GBM videos too..It will be great help from you Thanks in advance :)

@sandeepsandysandeepgnv 4 years ago

Hi krish can you explain what is the difference between ada boosting and XG boosting. Thanks for your efforts

@ritikkumar6476 4 years ago

Hello sir. Just a request. Please upload some explanation videos regarding different algorithms like Lightgbm and Catboost etc.

@nikhiljain4828 3 years ago

Ironically it is so very similar (from start till end) to Josh starmer video on Adaboost. 😀

@somnathbanerjee2057 5 years ago

@8:30 minutes of the video, it should be 0.349 for an incorrectly specified classifier. As we got updated weight for the correctly specified classifiers. I love your teaching. Adore.

@rishibhardwaj398 4 years ago

This is really good stuff. Great job Krish

@michaelcornelisse117 2 years ago

Thanks for this explanation, it's the best I've come across! It really helped me understand the fundamentals :)

@KirillBezzubkine 4 years ago

8:25 - u should have updated SAMPLE #3 since it was incorrect.

@owaisfarooqui6485 4 years ago

take it easy bro.....it's just for the sake of explanation ........ BTW human makes mistakes .........

@ashwinshetgaonkar6329 2 years ago

Thanks for this accurate and energetic explanation

@HirvaMehta01 2 years ago

the way you simplify things!!

@madeye1258 3 years ago

@13:34 Isn't the final classification done by adding up the total say of the stumps for each class and finding which class has the highest total say? Or is it the majority vote?

@ananyaagarwal6504 3 years ago

Hi Krish, great video, it would helpful if you could give us a more intuitive explanation of why does adaboost really work

@__-de6he 2 years ago

Unfortunately, there wasn't an explanation of an underlying idea. Just technical details.

@dafliwalefromiim3454 4 years ago

Hi Krish, at around 50 seconds you say, "Most of this particular record will get trained with respect to this particular base learner." Records don't get trained with respect to a learner. A learner gets trained ON the records. You also have sentences like, "This base learner gives wrong records." Do you mean the base learner misclassifies these records?

@muntazirmehdi7299 4 years ago

yes please this is confusing

@gnavarrolema 1 year ago

Thank you for this great explanation 👍

@jasonbourn29 1 year ago

Thanks sir, your videos are great, but one request: please arrange them in order

@sonumis6626 2 years ago

AdaBoost in summary: Unlike random forest, AdaBoost combines weak learners (decision trees) in a sequential manner. The decision trees (DT) in AdaBoost have a single split (depth one) and are called decision stumps (DS). To develop a single base learner, it first compares the DTs built on each feature and selects the one with the best information gain / entropy / Gini impurity. This becomes the weak learner. This method does not follow bootstrapping. The number of decision stumps it makes depends on the number of features in the dataset: if there are M features, AdaBoost will create M decision stumps.
Following are the steps in AdaBoost:
1. A new sample weight column is used to assign a weight to each observation. For N records, the initial weight of each is 1/N.
2. To generate the first base learner / weak learner (BS), M decision stumps are generated for the M features. Based on their information gain, the best DS is selected.
3. For this DS, the total error (TE) is calculated from the samples it misclassifies. If the number of misclassifications is T, TE = T/N, where N is the number of samples.
4. Based on TE, its performance score (PS) is calculated: PS = 1/2 * log_e((1 - TE) / TE)
5. Based on PS, new weights are assigned to the samples that were classified correctly and incorrectly.
6. New weight for an incorrectly classified sample: old weight * (e**(PS))
7. New weight for a correctly classified sample: old weight * (e**(-PS))
8. This increases the weight of incorrectly classified samples and decreases the weight of correctly classified samples, which means that the next BS classifier has to give more importance to learning the incorrectly classified samples.
9. If the sum of the new weights is != 1, we need to normalize each weight as: (new weight) / sum of (all new weights)
10. Based on the new weights, buckets (ranges/classes) of normalized weights are formed. These weights are used to form the new sample set to be classified by the next weak learner.
11. Over N iterations, pseudo-randomly generated numbers between 0 and 1 are used to select the new samples from the old sample list, based on which bucket of normalized weights each number falls in.
12. The process in steps 2 to 11 is repeated until the error is reduced to the minimum.
13. During testing, each data point is classified by the multiple BS, and majority voting is used to generate the final output.
ps: Feel free to correct me if I made any mistake..
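The steps above can be sketched end to end in plain Python. This is an illustrative sketch under several assumptions, not the video's exact procedure: labels are +1/-1, the stump is chosen by lowest weighted error rather than by Gini or entropy, the weights are passed directly to the next round instead of resampling through buckets, and the final output is the performance-weighted vote. The toy data and all function names (`best_stump`, `fit_adaboost`, `predict`) are hypothetical.

```python
import math

def stump_predict(stump, row):
    """A stump is (feature_index, threshold, sign): predicts sign if value <= threshold."""
    feature, threshold, sign = stump
    return sign if row[feature] <= threshold else -sign

def best_stump(X, y, weights):
    """Try every feature, threshold and direction; keep the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                err = sum(w for row, label, w in zip(X, y, weights)
                          if stump_predict(stump, row) != label)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_adaboost(X, y, rounds):
    n = len(X)
    weights = [1 / n] * n                       # step 1: equal weights
    model = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, weights)  # steps 2-3
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        say = 0.5 * math.log((1 - err) / err)   # step 4: performance
        model.append((stump, say))
        # Steps 5-8: label * prediction is +1 when correct, -1 when wrong.
        weights = [w * math.exp(-say * label * stump_predict(stump, row))
                   for row, label, w in zip(X, y, weights)]
        total = sum(weights)                    # step 9: normalize
        weights = [w / total for w in weights]
    return model

def predict(model, row):
    score = sum(say * stump_predict(stump, row) for stump, say in model)
    return 1 if score >= 0 else -1

# Toy one-feature dataset that no single stump can separate.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = fit_adaboost(X, y, rounds=3)
print([stump for stump, _ in model])
print([predict(model, row) for row in X] == y)
```

On this toy data a different stump is picked in each round even though there is only one feature, because the updated weights change which split has the lowest weighted error. That is also why the same stump is not simply selected again after the first iteration.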

@ayesandarmyint-551 2 years ago

I think you did a great summary, but I think in No. 1 it should be 1/M (M = number of records in the dataset)

@sonumis6626 2 years ago

@@ayesandarmyint-551 You are right. It should be records instead of features. Corrected it. Thank you.

@tarunkumar-hc8dg 1 year ago

In AdaBoost the final classification depends on the performance of each stump, so we can't say that majority voting is used for the final prediction.

@satyaajeet 1 year ago

CJT - Condorcet Jury theorem will help in understanding how weak learners become strong learners.

@parthdhir5622 4 years ago

hey @krish can put videos for other boosting algorithms.

@heroicrhythms8302 3 years ago

thankyou krish bhaii !

@abilashkanagasabai3508 5 years ago

Sir please make a video about EDA(exploratory data analysis)

@Ilya_4276 4 years ago

this is the best explanation thanks a lot

@praneethcj6544 4 years ago

After creating the new dataset containing the errors, where are we trying to reduce the errors? How are the errors found in stump 1 passed on to stump 2, and how exactly does that reduce them?

@bhargavasavi 4 years ago

After normalizing the weights and bucketing them (up to here it should be fairly clear), here is the trick. The record with the largest weight owns the widest class interval (in the above example, 0.07 to 0.58), so rand(0,1) will land in that bucket most of the time, and that widest bucket belongs to the wrongly classified record. So when we go through 8 iterations, the probability of sampling the wrong records is high. Hope my explanation helps :)
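A hedged sketch of the bucket sampling described above, using the weights quoted earlier in this thread (0.07, 0.51, then six more 0.07). The helper names `bucket_index` and `resample` are hypothetical, and the random values depend on the seed, so no specific sample is claimed.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def bucket_index(upper_edges, r):
    """Return the index of the bucket [lower, upper) that contains r."""
    # The clamp guards against floating-point round-off at the top edge.
    return min(bisect_right(upper_edges, r), len(upper_edges) - 1)

def resample(records, weights, rng):
    """Draw len(records) rows with replacement, proportionally to weight."""
    upper_edges = list(accumulate(weights))  # running sums = bucket upper edges
    total = upper_edges[-1]
    return [records[bucket_index(upper_edges, rng.random() * total)]
            for _ in records]

weights = [0.07, 0.51, 0.07, 0.07, 0.07, 0.07, 0.07, 0.07]
records = list(range(len(weights)))  # row ids 0..7
edges = list(accumulate(weights))    # 0.07, 0.58, 0.65, ...

print(bucket_index(edges, 0.43))     # 1: 0.43 lies in the 0.07-0.58 bucket
rng = random.Random(0)               # seeded only to make reruns repeatable
new_dataset = resample(records, weights, rng)
print(new_dataset)
```

The number of draws equals the number of rows (8 here) simply so that the new dataset has the same size as the old one. Each draw is a uniform random number in [0, 1), so it always lands in exactly one bucket, and the heavily weighted row tends to appear several times.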

@theshishir24 4 years ago

@@bhargavasavi Could you please explain why 8 iterations? BTW Thanks for the above explanation :)

@Miles2Achieve 4 years ago

Suppose there are two wrongly classified record, then weight for those will be same and comes under the same bucket, in that case after eight iterations there will be more records for training or what if generated random number in iterations belongs to the same bucket for more than 1 time

@amitmodi7882 4 years ago

Thanks Krish for wonderful explanation. I have few questions regarding this video: 1. Will this not cause over fitting? If yes then how to overcome? 2. Where Adaboost is used in real time use cases?

@adityachandra2462 4 years ago

you always have a cross-validation technique for overfitting treatment, if I am not wrong!!

@parthnigam1782 3 years ago

XGBoost is what is used in today's scenarios, since AdaBoost is old and is the base of them all

@kunal7503 3 years ago

best explanation ever

@chiranjeevibelagur2275 1 year ago

After the first iteration when you spoke about the buckets, post that your explanation became a little ambiguous. If you are considering the Gini impurities or the entropy whichever of them, you would still have the similar information gain and the same feature gets selected and that feature would still classify the records in the same way (just as the 1st iteration) and hence the misclassifications would still remain the same. I think you have to get a bit of clarity on that and then could explain about the iterations post updating weight what exactly happens differently so that the misclassifications might go a Lil less or chances of Miss classification goes a Lil down. Other than that everything is fine.

@dmitricherleto8234 3 years ago

May I ask why we need to randomly select the number ranging from 0-1 to compare with class intervals instead just of choosing the misclassified record since we need to change the weights of the misclassified record?

@RashmiUdupa 4 years ago

you are our dronacharya :)

@abhisekbehera9766 2 years ago

Hi Krish Awesome tutorial on Adaboost.... just one question i have: how to calculate total error and performance of stump in case of regression and how does ensemble happen in this case

@i_amanrajput 4 years ago

really easily explained

@aditiarora2128 1 year ago

Sir, please make videos on how we can use AdaBoost with CNNs

@ashutoshbhasakar 11 months ago

Long live Krish Bhaiya!!

@papachoudhary5482 5 years ago

Thanks

@shaelanderchauhan1963 2 years ago

Question: when the second stump is created, after creating the new dataset, do we reinitialize the weights or use the previously updated weights? I also watched the StatQuest video, where the weights were reinitialized to what they were in the beginning.

@armaanzshaikh1958 4 months ago

We will reinitialize the weights for every stump

@smartaitechnologies7612 1 year ago

nice one. even me as trainer felt it better.

@KirillBezzubkine 4 years ago

5:35- more often i see people use LOG base 2 (since information represented in BITS)

@annperera6352 3 years ago

sir please do a video to implement Adaboost. and CART.please Sir

@lakshmitejaswi7832 4 years ago

Good explanation. At test time it will multiply the error and the weight and then sum. Am I right?

@aafaqaltaf9735 3 years ago

explained very well.

@tanmayisharma5890 3 years ago

I wish you made a video on Gaussian mixture models

@kiran082 4 years ago

Excellent Video Krish

@mfadlifaiz 4 years ago

why we must increase sample weight of the error prediction and decrease sample weight of true prediction?

@Jtwj2011 4 years ago

you are my lifesaver

@dibyanshujaiswal8333 3 years ago

Sir, the part where you explain about creating bins, with bin1=[0.07, 0.51], bin2=[0.51,0.58], bin3=[0.58,0.65] and so on. Post that how you got values 0.43 randomly and its purpose was not clear. Please explain.

@nagarajsundar7931 4 years ago

From 10:40 -- How the random value of 0.43, 0.31 is getting selected ? How are you telling that it will perform 8 iteration ? Im not getting that point. Can you please help me out on this ?

@deepakkota6672 4 years ago

Lot of us missed that, Thank you for bringing up. Can we get answer to this?

@arjunmanoharan5113 2 years ago

Any reason why decision stumps are used?. Can't we use trees with more depth for each iteration?.

@tonysimon4826 4 years ago

Just had one doubt. At 3:47 you mentioned that a tree is created for each feature. But 8 or 9 minutes in, after getting the new sample weights and creating the new data, how is the decision tree or weak learner made? It's not based on another feature, f2 or f3, as mentioned at the beginning of the video, hence the doubt. Also, is creating the new dataset an alternative method? Without creating a new dataset, could we create the weak learner based on the next most useful feature along with the new weights?

@gowthamprabhu122 4 years ago

We create a tree (stump) for each of the features f1, f2 and f3. We then select the tree with the lowest entropy or Gini and make it the basis for adjusting the sample weights. After that we repeat the process, see again which of the three trees has the lowest Gini or entropy, and readjust the weights. My question is: when does this process end?

@tonysimon4826 4 years ago

@@gowthamprabhu122 you mentioned that we repeat the process and find the tree. But after the first tree is made on feature 1(based on entropy or gini). Then a bootstrapped data is making is mandatory according to him! I had the doubt whether it's mandatory or optional. And to answer your question i think the process should end when all features are accounted provided they have a good amount of say

@rohitrathod8150 4 years ago

@@gowthamprabhu122 it will end when number of stumps equal to number of feature

@pranavbhatnagar804 4 years ago

Great Work Krish! Loving your work on ML algorithms. Can you please create a video or two on Gradient Boosting? Thanks again!

@sunnysavita9071 4 years ago

sir ,we also decrease the weight in xgboost algo??

@shadiyapp5552 1 year ago

Thank you♥️

@pranavreddy9218 1 year ago

Please complete the full problem sir, everywhere mentioning so and so, and closing the session...no one understood fully ADA boost from your session..

@nikhiljain4828 3 years ago

Krish, if the data had 7 records, how is your calculation of updated weights corresponding to 8 records. Also you mentioned to create a new data with 8 records. Looks like something very similar was explained in statsquest video. Copying is not bad but should be done with some cleverness.

@AnujKinge 3 years ago

Perfect explanation!!

@padhiyarkunalalk6342 4 years ago

Sir, you are great. But I have doubts. 1) Why do we use a decision tree as the weak learner in ensemble techniques? 2) Which types of ML models are used in ensemble techniques? 3) Can we use only weak learners in ensemble techniques? Please, sir, help me clear these doubts. Thank you

@joeljacob3957 3 years ago

The initial statement is bit confusing. You said the wrongly predicted data points will be sent to the next classifier and said if the next classifier also makes a wrong prediction, those data points will be moved forward, at this moment you pointed out bottom set of data points. So my question is, does the whole data set is forwarded or just wrongly classified data points? If only the wrongly classified data points are forwarded, then what's the point of using weight then?

@prashanths4455 3 years ago

U r too awesome Krish

@neilgurnani9204 4 years ago

At 5:00, shouldn’t the sum of the total always be 7? When you said 4 and 1 that only sums to 5?

@joeljoseph26 3 years ago

There is another node for the decision tree on the right side.

@arshaachu6351 1 year ago

Sir, thank you for your class, it was really helpful to me. Can you explain how AdaBoost is used in face detection? If you see my message, please reply

@saikiranrudra1283 3 years ago

well explained sir

@abdulahmed5610 3 years ago

How do we do for Regression problem... How we calculate and update weights in Regression problem???

@mirjanamiljkovic7574 2 years ago

Did you get an answer? If yes, please, share.

@ellentuane4068 3 years ago

incredible as always !!!!

@sheinoo 5 months ago

First you said only the records got errors will populated to the next model but last you said the selection works n times where each time one record being selected and on the next DT there will be n records as the first DT, so which is correct ? can someone clarify this part

@hemantdas9546 4 years ago

Sir please explain Adaboost Regression. Please Sir 🙏

@mohammedazeem3303 4 years ago

Please clarify on the random value which it selects for 8iterations before checking for buckets...... Anyone? How those random values are generated & whats the guarantee that it will lie in one of the buckets..?

@manukhurana483 4 years ago

e^0.895 = 2.44 and 1/7 * e^0.895 = 0.35; e^-0.895 = 0.408 and 1/7 * e^-0.895 = 0.058: your weights (incorrect). Actual weights: 1/2 * log(6) = 0.389 (base-10 log), so 1/7 * e^0.389 ≈ 0.21 and 1/7 * e^-0.389 ≈ 0.096
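The two sets of numbers in this comment differ only in the base of the logarithm. The check below is a sketch that assumes 7 records with a total error of 1/7, as in the comment. The standard AdaBoost formula, and the "loge" written in the summary at the top of this thread, use the natural log, which gives the 0.35 and 0.058 shown in the video; the base-10 log gives the commenter's alternative values.

```python
import math

n = 7
total_error = 1 / n
odds = (1 - total_error) / total_error   # 6, up to floating-point rounding

say_natural = 0.5 * math.log(odds)       # natural log
say_base10 = 0.5 * math.log10(odds)      # base-10 log, for comparison only

for name, say in (("natural log", say_natural), ("base-10 log", say_base10)):
    wrong = (1 / n) * math.exp(say)      # weight of the misclassified row
    right = (1 / n) * math.exp(-say)     # weight of a correctly classified row
    print(f"{name}: say={say:.3f} wrong={wrong:.3f} right={right:.3f}")
```

Either way these are the unnormalized weights. Switching the log base only rescales the performance value by a constant factor, but that does change how strongly the weights move in each round.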

@desperattw12 4 years ago

when selecting the first base model, are we passing some random sample to m models for calculating the entropy? since all of our base models are decision tree what is the right approach to calculate the entropy

@anatomnatureatomic3156 4 years ago

I don't get why you selected 0.43 as the random value... what range (x, y) is the random value selected from? Also, I didn't get the formula for the 8 iterations.

@vivekkumar-ij3np 2 years ago

How to decide, how much iteration we can perform to select randomly data points for second decision tree. Does it depends on no. of rows. Plz reply someone.

@prachiraol7645 2 years ago

Can we use random forest as a base learner?

@guptarohyt 2 years ago

How do you find if an instance is incorrectly classified? If the Algorithm knows it then why it doesn't classify correctly first time?

@anoushk 3 years ago

In the updated weights you put 0.349 for the wrong record or was it correct?

@sumitgalyan3844 3 years ago

Bro, please make videos on statistics too

@vishalkailaswar5708 4 years ago

Bro can u add this video to the playlists which you created, we could not find this video in playlists

@souravdey1086 4 years ago

What if the total error is larger than 0.5? Please try for error greater then 0.5.
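A quick way to explore this question is to evaluate the performance formula quoted throughout this thread for a few error values. This is only a sketch of the arithmetic; `performance_of_stump` is a hypothetical helper name.

```python
import math

def performance_of_stump(total_error):
    """0.5 * ln((1 - TE) / TE); undefined for TE equal to 0 or 1."""
    return 0.5 * math.log((1 - total_error) / total_error)

for te in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(te, round(performance_of_stump(te), 3))
```

The value is positive below 0.5, exactly zero at 0.5 (a coin-flip stump gets no say), and negative above 0.5. With +1/-1 labels and a weighted vote, a negative say means the stump's prediction effectively counts for the opposite class.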

@esakkiponraj.e5224 4 years ago

5:12 Could you explain Total error ? How it comes 1/7 ?

@akshatw7866 4 years ago

since there is just 1 error (misclassification) in the classification by that stump, we only have to add 1/7 to find the sum of errors.

@Raja-tt4ll 4 years ago

Nice Video

@pramodyadav4422 3 years ago

Hello Sir, I've a doubt related to selection of stump. As you said there will be M stumps for M number of feature. We will select 1 stump out of M stumps. This selection is based on Entropy/Gini Impurity, the lowest the better. So just in case we found stump with Feature1 have lowest entropy/gini we will select it as a base model to train and test. So does that means we are always going to select Feature1 stump throughout the whole process of Adaboost? and also that means the only 1 feature is used to predict? rest other features can be dropped?