I watched your videos to help me through a data analytics degree, and now that I'm working in a role similar to business analyst I'm looking back at them. Very easy to follow, to the point, and informative for getting the job done. Thank you
@bkrai 2 years ago
You are welcome and good luck!
@askpioneer 2 years ago
Hello sir, your way of explaining is so simple and effective; it made the topic simple. I would like to add a comment for everyone as well: I was getting an error while using controls=ctree_control, and after some Googling and forum support I am now able to run it and view the tree. Great work sir.
@bkrai 2 years ago
Thanks for the update!
@vijayarjunwadkar 3 years ago
Take a bow sir! For the first time, I had full clarity on Decision Tree and its usage! Thanks a lot for this superb tutorial, lucky to find your channel, stay blessed! 👌👍🙏
@bkrai 3 years ago
Thanks for comments!
@animeshdevarshi 7 years ago
Sir, I've been following a lot of courses but never found something with such clarity. Thanks for posting these!
@bkrai 7 years ago
Thanks for the feedback!
@ivanjcardona 2 years ago
You really made it simple. I have been watching other tutorials, but not anymore. I already subscribed. Thanks a lot.
@bkrai 2 years ago
You are welcome!
@user-uf5bk8zc7n 4 years ago
Thanks Doc, after my 6 hrs class ...you went through all my confusions in just 18:43 mins. Such a worthy job!!!
@bkrai 4 years ago
Thanks for your feedback and comments!
@ShivaKumarbudda 4 years ago
Hi, a video posted 4 years ago has today become a saviour for my internal assessment. Thank you 😃
@bkrai 4 years ago
Welcome! You may also find this recent one useful: kzbin.info/aero/PL34t5iLfZddvGr66DPf-L-sSJ50XNwN3K
@kabeeradebayo9014 7 years ago
Thank you again for these complete episodes. You have been of great help to me, "Rai". Please, I'd appreciate a complete episode on ensembles, especially heterogeneous ensembles using DT, SVM, etc. as the base classifiers. Comprehensive videos on ensembles are not common; in fact, I haven't come across any. It would go a long way if you could put something together on this. Thank you for your help!
@bkrai 7 years ago
Thanks for the suggestion, I'll do it in the near future!
@kabeeradebayo9014 7 years ago
Sounds really great. Looking forward to it. Can't wait!
@plum-ish6679 2 years ago
You are truly remarkable! The way you explain things is very simple to understand.
@bkrai 2 years ago
Thanks for comments!
@sujitcap 6 years ago
Sir, so much clarity ...How simple and easy you created ! Thank you .
@bkrai 6 years ago
Thanks for comments!
@UmairSajid 6 years ago
Hello Dr. Rai, thank you for a very informative video. One thing that I would like to add based on my limited knowledge: for a skewed class distribution such as the one in this data, it is more important that the model is able to predict the abnormal cases than it is to predict the normal cases. If we just look at the mis-classification error, the model may be biased towards the class with the higher percentage of data. One way to avoid that is to reduce the disparity between the class types with over/under-sampling techniques. Another way is to use the area under the precision-recall curve as a measure of model evaluation. Your comments and feedback on this would be appreciated.
@bkrai 6 years ago
That's correct. For more details about the class imbalance problem, refer to this link: kzbin.info/www/bejne/fqCVfJ-sr8-Ynck
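As a rough illustration of the over-sampling idea mentioned above, here is a minimal base R sketch. The toy data frame, its class counts, and the column names are made up for the example; only the name NSP is borrowed from the thread. In practice this would normally be applied to the training partition only.

```r
# Toy data standing in for a skewed three-class response (values are made up).
set.seed(1234)
toy <- data.frame(
  x   = rnorm(100),
  NSP = factor(rep(c("1", "2", "3"), times = c(80, 15, 5)))
)

# Random over-sampling: resample every class, with replacement,
# up to the size of the largest class.
target <- max(table(toy$NSP))
rows_by_class <- split(seq_len(nrow(toy)), toy$NSP)
balanced_rows <- unlist(lapply(rows_by_class, function(idx) {
  idx[sample.int(length(idx), target, replace = TRUE)]
}))
balanced <- toy[balanced_rows, ]

table(toy$NSP)       # skewed class counts
table(balanced$NSP)  # equal class counts after over-sampling
```

Under-sampling works the same way with `min(table(toy$NSP))` as the target and `replace = FALSE`, at the cost of discarding rows from the larger classes.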
@wasafisafi612 2 years ago
Thank you so much for your videos. I am learning everyday with them. May God bless you
@bkrai 2 years ago
Thanks for comments!
@sudzbyte2215 5 years ago
This is a great example of decision trees. Thank you!
@bkrai 5 years ago
Thanks for comments!
@ekfistek 5 years ago
Dr Rai, thanks for your videos. I have found them useful in explaining basic machine learning methods. Thank you!
@bkrai 5 years ago
Thanks for comments!
@christan7434 6 years ago
Thank you Professor Rai for taking the time to show us the ropes. Regarding the mis-classification error table, may I know what the difference is between that and the confusion matrix? I notice the calculation for "accuracy" is the same as for the confusion matrix, simply "sum(diag(tab))/sum(tab)", but for the confusion matrix the actuals are on the vertical, whereas in the video you put the actuals on the horizontal. Thanks, and looking forward to more videos from you
@bkrai 6 years ago
The confusion matrix and the mis-classification table are the same thing. Whether the actual values go in rows or columns is only a matter of layout; the accuracy calculation is unchanged.
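To make the point about layout concrete, here is a small base R sketch with made-up labels (the vectors are illustrative, not the video's data). It computes accuracy with the `sum(diag(tab))/sum(tab)` formula quoted above and shows that transposing the table does not change the result.

```r
# Made-up actual and predicted class labels for ten cases.
actual    <- factor(c("1", "1", "1", "2", "2", "3", "1", "2", "3", "1"))
predicted <- factor(c("1", "1", "2", "2", "2", "1", "1", "1", "3", "1"),
                    levels = levels(actual))

# Predicted in rows, actual in columns.
tab <- table(Predicted = predicted, Actual = actual)
accuracy <- sum(diag(tab)) / sum(tab)
misclassification <- 1 - accuracy

# Transposed layout: actual in rows, predicted in columns.
accuracy_t <- sum(diag(t(tab))) / sum(t(tab))

print(tab)
c(accuracy = accuracy, misclassification = misclassification)
```

Naming the arguments inside `table()` is the design choice that matters here, because the printed table then states which dimension is which.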
@akshitbhalla874 5 years ago
Your videos are honestly so amazing.
@bkrai 5 years ago
Thanks for comments!
@tarapaider1729 8 years ago
Your videos are always very easy to follow!!
@bkrai 8 years ago
+Tara Paider thanks for the feedback 👍
@shesadevsha1994 6 years ago
Hi Sir, I am so glad to see all your videos related to machine learning in R. One request: it would be great if you could share the datasets you used in your sessions.
@bkrai 6 years ago
You can get the data file from the link in the description area below the video.
@ehtishamraza2623 4 years ago
Really Great Explanation
@bkrai 4 years ago
Thanks for comments!
@bkrai 4 years ago
Also here is a link to a more recent one: kzbin.info/www/bejne/iHTHpmOwZ7usqNk
@halyad4384 7 years ago
Very informative and easy to understand. Thanks for sharing such a useful video.
@bkrai 7 years ago
Thanks for the feedback!
@hridayborah9750 5 years ago
very very clear and helpful. thanks tons
@bkrai 5 years ago
Thanks for comments!
@DABANG125 5 years ago
Sir, greetings from the US. I have enrolled in a machine learning course through Udemy as well, but your explanation is super simple and easier to implement. Please guide me to any book I can use to practice on more such datasets.
@bkrai 5 years ago
Deep learning is currently the hottest topic within the machine learning field. To get started with practical examples you can try: www.amazon.com/Advanced-Deep-Learning-designing-improving/dp/1789538777
@lorihearn6859 3 years ago
Is it only useful for numerical data, when all the independent variables are continuous? Or can it be used for categorical ones too?
@bkrai 3 years ago
It's useful for both. See this more detailed example: kzbin.info/www/bejne/bISwo517rKuch7s
@rakeshv6322 2 years ago
Thanks sir for detailed video..
@bkrai 2 years ago
Most welcome!
@bonelwamnyameni 7 years ago
This video has helped me a lot with my assignment, thank you so much.
@bkrai 7 years ago
that's great!
@shaliniguha1822 6 years ago
Sir, it'd be really nice if you could make a blog explaining the output in more detail. For instance, an explanation of the statistical parameters measured in the confusion matrix. Your videos are really helpful! :)
@bkrai 6 years ago
Thanks for your comments and suggestion! You may find decision tree related explanations in the following video too: kzbin.info/www/bejne/gGPEaqyMaNCfY68
@rithishvikram1759 5 years ago
Wow, thank you sir! Please make a video on how entropy-based splits are calculated; it would be very useful.
@bkrai 5 years ago
Thanks for the suggestion, I've added it to my list.
@AmarLakel 5 years ago
Thank you for your help and all your videos. They've helped me a lot
@bkrai 5 years ago
Thanks for your comments!
@nayeemislam8123 6 years ago
Sir, I have a few questions:
1. How do you find the statistically significant variables after developing a decision tree model with all variables?
2. Suppose all variables in a decision tree are coded as POOR, FAIR, GOOD. How do I find the probabilities of each (POOR, FAIR, GOOD) at the non-terminal nodes of the tree, and also the number of samples in each category? I need to show this in my plot.
3. What is the best approach to developing a decision tree model: developing a model on the training data using k-fold cross-validation, or developing a model on the training data and then going through cross-validation and pruning with a function like cv.tree(), which lets us choose the tree with the lowest cross-validation error rate? Which method is better?
4. How do I find the standardized importance of the independent variables using CART in R?
@bkrai 6 years ago
1. P-values on the tree indicate statistical significance.
2. You can find it only at the terminal node.
3. k-fold CV is always better to avoid over-fitting.
4. The higher a variable appears in the tree, the more important it is. For variable importance you can also try this link: kzbin.info/www/bejne/mnvGnYF_g5KHhtE
@carlosfernandezgalvez3023 5 years ago
Hi! Thank you for all your videos. I'd just like to make a little comment: the ctree function implements a 'Conditional Inference Tree', not a 'Classification Tree'. It can build classification trees, but the fundamentals are different. Thank you for all the work you are doing! Very useful. Carlos
@bkrai 5 years ago
Thanks for the update!
@takakosuzuki2514 5 years ago
Hi Dr. Rai. I encountered an error in the #Misclassification part. I got the table when using library(party), but I got "all arguments must have the same length" when using the rpart() one. If I use the validate set with the rpart package, though, the table can be generated.
@bkrai 5 years ago
Difficult to say much without looking at the code. But you can review your code again, there may be some typo.
@harishnagpal21 6 years ago
Nice video Bharatendra. One question: you said that we need to optimize the model. How do we do that, i.e. how do we optimize our model? Thanks
@bkrai 6 years ago
You can make changes to settings in 'control' to see what helps to improve the model. In the example, I used only 3 variables just for illustration, but you must start with all variables for a better performance.
@harishnagpal21 6 years ago
thanks :)
@rakeshvikhar 2 years ago
I am a beginner. Could you help me understand whether we can use linear/logistic regression to do the prediction here? I referred to your vehicle example and got confused about whether we can use that model here.
@bkrai 2 years ago
Yes, you can use logistic regression as the response variable is of factor type. For more see: kzbin.info/www/bejne/d4fbaIqZZqiEbbs
@MatiToGuzior 2 years ago
Greetings! I came back to this video after a while as it still seems to be the best one regarding Decision Trees out there. I have a question regarding significance of variables. Do you have a video covering this subject? Any techniques I could apply while working on my Decision Tree? Thank you
@bkrai 2 years ago
You can use this link. For tree based methods, it provides variable importance plots to show which variables are important and which ones do not contribute much. kzbin.info/www/bejne/nnSvfICfj6eHqLc
@MatiToGuzior 2 years ago
Great video, everything explained step by step. I have a question though. Some of my data in the DB file is char and I keep getting the error "data class "character" is not supported". How can I include this data in my experiments?
@bkrai 2 years ago
You can change such variables to 'factor'.
@MatiToGuzior 2 years ago
@@bkrai omg thank you. so I can just use data$variableF
@bkrai 2 years ago
yes that should work.
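For anyone hitting the same "character is not supported" error, here is a minimal base R sketch of the conversion discussed above. The data frame and its column names are hypothetical; the idea is simply to turn every character column into a factor before fitting the tree.

```r
# Hypothetical data with two character columns and one numeric column.
df <- data.frame(
  age    = c(23, 35, 41, 29),
  rating = c("POOR", "GOOD", "FAIR", "GOOD"),
  region = c("north", "south", "south", "east"),
  stringsAsFactors = FALSE
)

# Convert a single column ...
df$rating <- as.factor(df$rating)

# ... or convert every remaining character column in one step.
is_char <- vapply(df, is.character, logical(1))
df[is_char] <- lapply(df[is_char], factor)

str(df)  # rating and region are now factors; age is still numeric
```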
@sushantchaudhary2008 4 years ago
Thank you Dr Rai. I have a question about tree pruning. Prior to pruning, some of the trees were able to classify patients as pathological, but after pruning (by changing the control functions) none of the trees identify the pathological patients. If we wanted to specifically identify patients with suspected pathology, how could we modify the control functions or the initial formula included in the "ctree()" function?
@oguzyavuz2010 4 years ago
Let me ask: the variable at the top of the picture is not the dependent variable, right? 5:46
@bkrai 4 years ago
It's an independent variable.
@oguzyavuz2010 4 years ago
@@bkrai Sir, can I ask some simple questions about the tree diagram, if you do not mind? I'll leave my Gmail address here: ogzhnyvzz@gmail.com
@vishalaaa1 4 years ago
ctree doesn't support dates. I tried dates converted from POSIX. Can you please suggest the parameter in ctree that resolves this problem?
@bkrai 4 years ago
A decision tree is not a good method for working with dates. For dates you should use time series: kzbin.info/aero/PL34t5iLfZddt9X6Q6aq0H38gn-_JQ1RjS
@mohityadav8261 3 months ago
nice explanation
@bkrai 3 months ago
Thanks!
@aditidalvi255 6 years ago
Sir, please can you suggest a good book for beginners in machine learning, to get basic knowledge of all the statistical tools?
@sallymusungu8983 1 year ago
How do you remove ticks on the axes? Or realign the axis labels?
@fadedmachine 6 years ago
You're the man. Keep up the great work!
@atiquerahman3766 7 years ago
Hi Sir, your videos are really helpful. They have helped me a lot, though I have a few doubts. I have just started learning data science, so these doubts may be naive. 1) On what basis do we decide how much data to put into training, validation, and testing respectively? 2) Is there any criterion (such as R-squared in regression models, or chi-square for logistic regression) for decision trees, so that we can say how good our model is?
@bkrai 7 years ago
1) One may experiment with different partitions such as 50:50, 60:40, 70:30, etc., and see what works best. There is no single partition ratio that will work well in all situations. 2) If your y variable is categorical, mis-classification error is used for model performance assessment.
@atiquerahman3766 7 years ago
Thank you, sir!!
@anananan3635 2 years ago
Is it just for numeric variables? Is there other code for character variables?
@bkrai 2 years ago
Change character variables to factor variables before using this.
@MrCaptainJeeves 8 years ago
love all your videos...Please keep uploading
@bkrai 8 years ago
+pradeep paul Thanks for your feedback!
@Twiste_Z 5 years ago
I followed your method with a dataset I created. It's a simple one, but the output just prints the values of my dataset rather than plotting a tree and predicting. Can you help me understand why?
@bkrai 5 years ago
Difficult to say much without looking at data and code.
@Fsp01 4 years ago
brilliant! thank you Dr
@bkrai 4 years ago
You're most welcome!
@ricardobrubaker4109 2 years ago
How can we export the first tree prediction (View(predict(tree,validate,type="prob"))) to Excel? When using a data frame they come out horizontally and unreadable.
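One possible approach to the export question above, sketched in base R. It assumes the probability output is a list holding one vector of class probabilities per row, which would explain why a direct conversion to a data frame spreads the values horizontally. The stand-in list, the column names, and the file name are all hypothetical.

```r
# Stand-in for predict(tree, validate, type = "prob"):
# assumed to be a list with one vector of class probabilities per row.
prob_list <- list(
  c(0.90, 0.08, 0.02),
  c(0.10, 0.70, 0.20),
  c(0.05, 0.15, 0.80)
)

# Stack the vectors row by row so each observation becomes one row.
prob_df <- as.data.frame(do.call(rbind, prob_list))
names(prob_df) <- c("NSP_1", "NSP_2", "NSP_3")  # hypothetical class labels

# Write a CSV file, which Excel can open directly.
out_file <- file.path(tempdir(), "tree_probabilities.csv")
write.csv(prob_df, out_file, row.names = FALSE)
```

If `predict()` already returns a matrix, the `do.call(rbind, ...)` step is unnecessary and `as.data.frame()` followed by `write.csv()` should be enough.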
@raymondjiii 2 years ago
That was awesome but I found that with my dataset I get a completely different decision tree using the rpart package. Without rpart, the tree is what I expected it to be and with rpart - in some ways it's almost opposite. I'm only comparing the two trees with my training data.
@raymondjiii 2 years ago
I think I know what the problem is - with rpart trees you only get a little "yes" and "no" marker on the root node. In my case "yes" goes to the left of the tree and "no" goes to the right of the tree. If I assume that direction is always the case then things are okay. I do wish that the "yes", "no" little white boxes were printed at every non leaf node so it's very clear which way the path is going. (I wonder if there's an option for that?) Thanks for the great video.
@bkrai 2 years ago
See link below that has more detailed coverage: kzbin.info/www/bejne/bISwo517rKuch7s
@leolee618 7 years ago
Thank you so much for your awesome video. I've learned a lot from it.
@bkrai 7 years ago
Thanks for your feedback!
@mayankhmathur 7 years ago
Nice explanation. thanks.
@TheIanoTube 4 years ago
Would this work just as well if some variables were categorical? I.e. written in text but limited options Thanks for the video
@bkrai 4 years ago
Yes, absolutely
@bkrai 4 years ago
You may also try this link: kzbin.info/aero/PL34t5iLfZddvGr66DPf-L-sSJ50XNwN3K
@TheIanoTube 4 years ago
Thank you, great channel. Subscribed!
@bkrai 4 years ago
Thanks!
@bala4you01 8 years ago
Thank you, Dr. Rai, for sharing a simple and detailed explanation of Decision Trees. My query is whether we can plot an ROC curve for multiclass data. (The pROC package can calculate the AUC, but I could not find how to plot an ROC graph for multinomial data.)
@bkrai 8 years ago
At this time it only does it for the binomial situation. You can now find the ROC curve video here: kzbin.info/www/bejne/r6GyYneGerCMfNE
@m.z.1809 5 years ago
How can we validate the accuracy or discriminatory power of this model? I believe you can use the model outputs from train and validate to somehow calculate chi-square etc.?
@bkrai 5 years ago
You can validate the model built on training data with the help of validate data.
@satyanarayanajammala5129 8 years ago
very nice explanation keep it up
@bkrai 8 years ago
thanks for the feedback!
@vairachilai3588 5 years ago
In the confusion matrix (tab), the columns are predicted data and the rows are actual data.
@bkrai 5 years ago
In this video I have used predicted data in rows and actual data in columns for the confusion matrix.
@vairachilai3588 5 years ago
Kindly check it: table(predict(tree), data$NSP). The output will then have predicted data in columns and actual data in rows.
@bkrai 5 years ago
Try this, it will make it more clear: table(Predicted = predict(tree), Actual = data$NSP)
@sachiniwickramasinghe1912 5 years ago
thank you ! so helpful !
@bkrai 5 years ago
Thanks for comments!
@nagarajaraja2546 7 years ago
Hi sir, I am S. Nagaraj Adiga. Your videos are very simple to listen to and easy to understand. Thank you very much.
@bkrai 7 years ago
Thanks for the feedback!
@ronithNR 8 years ago
Hello sir, it's a great video. Does rpart use the Gini index?
@bkrai 8 years ago
Yes, by default rpart uses the Gini index for classification splits. You can switch to information gain with parms = list(split = "information").
@kartikchauhan2845 4 years ago
Sir, how would you increase the number of nodes?
@bkrai 4 years ago
You can change mincriterion and minsplit in the controls part for that.
@bkrai 4 years ago
For a more recent one, see below: kzbin.info/aero/PL34t5iLfZddvGr66DPf-L-sSJ50XNwN3K
@ateendraagnihotri9744 3 years ago
Sir, can you provide the dataset which you have used?
@bkrai 3 years ago
There is a link below this video.
@abhinavmishra7786 6 years ago
Hi sir, nice explanation. I learnt about the ctree function. Can you please illustrate how we can tune the decision tree model?
@bkrai 6 years ago
Around the 7:30 point in the video, tuning is shown using "mincriterion" and "minsplit".
@abhinavmishra7786 6 years ago
Bharatendra Rai my mistake sir...I meant pruning the decision tree
@bkrai 6 years ago
You can do pruning by increasing the values for "mincriterion" and "minsplit".
@abhinavmishra7786 6 years ago
Bharatendra Rai thank you for clarifying sir
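To illustrate the effect of "mincriterion" and "minsplit" described above, here is a hedged sketch that fits two trees on R's built-in iris data rather than the video's data. It assumes the 'party' package is installed and is skipped otherwise; the specific control values are arbitrary choices for the example. Note that party::ctree takes the argument `controls`, while the newer partykit::ctree names it `control`, which is one possible source of errors around this argument.

```r
# Runs only if the 'party' package is available (an assumption of this sketch).
if (requireNamespace("party", quietly = TRUE)) {
  library(party)

  # Looser settings: a split needs 90% confidence and at least 20 cases.
  loose <- ctree(Species ~ ., data = iris,
                 controls = ctree_control(mincriterion = 0.90, minsplit = 20))

  # Stricter settings: 99.9% confidence and at least 100 cases per split,
  # which can only make the tree the same size or smaller.
  strict <- ctree(Species ~ ., data = iris,
                  controls = ctree_control(mincriterion = 0.999, minsplit = 100))

  # where() returns the terminal node each training case falls into.
  n_loose  <- length(unique(where(loose)))
  n_strict <- length(unique(where(strict)))
  cat("terminal nodes, loose vs strict:", n_loose, "vs", n_strict, "\n")
}
```

Lowering the two values has the opposite effect and is how one would grow a tree with more nodes.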
@uhsay1986 6 years ago
Hi sir, how do we apply the predict function to a test set where the target variable has NA values? When I run the function it says the predictor must have 2 levels.
@bkrai 6 years ago
You need to impute missing values before developing the model.
@sovon08 6 years ago
Sir, if you could create a video on how to calculate Gini and KS using R, that would be really great
@bkrai 6 years ago
Thanks for the suggestion, I've added this to my list.
@OrcaChess 6 years ago
Hello! I gave my decision tree 97 different features but the decision tree only picked one of these features to make its decision. Is it normal that it doesn't consider all the features for its decision?
@bkrai 6 years ago
It runs with default settings. By making changes to the default settings you may be able to make it include some more. But features that have very little impact on the response are unlikely to be included.
@DhingraRajan 6 years ago
It can happen when one of the features is a close predictor of y. Then that feature alone is enough to predict y.
@sudanmac4918 5 years ago
Sir, what is the difference between rpart() and ctree(), and when should each be used?
@bkrai 4 years ago
They are different algorithms. rpart() implements CART, which picks splits using impurity measures such as the Gini index, while ctree() builds conditional inference trees, which pick splits using statistical significance tests.
@mahumadil 8 years ago
I have a query and I tried to google it, but I couldn't find any satisfactory answer. The question is: what is the difference between a ctree tree and an rpart tree?
@bkrai 8 years ago
+Mahum Khan ctree is a function within the package called "party" for decision trees. Similarly, rpart is a function within a package with the same name, "rpart". Both are used for decision trees. I prefer party as it is said to be more accurate. If you search "party vs rpart" you can see many good explanations.
@satishbharadwaj9539 6 years ago
Sir, please post a video on Regression Splines, Polynomial Regression & Step Functions etc
@bkrai 6 years ago
Thanks for the suggestion, I've added it to my list.
@ningrongye339 7 years ago
Hi sir, thank you for the video, it's very helpful! But I still don't understand why your model could not predict class 3. If we use all the items, could we predict more precisely? Thank you!
@bkrai 7 years ago
That's correct! To obtain the final model we need to include all items and that will improve model performance.
@akkimalhotra26 8 years ago
Dear sir, how can I get the data set that you are using?
@bkrai 8 years ago
your email?
@bkrai 8 years ago
Actually I don't need email. You can get data from: sites.google.com/site/raibharatendra/home/decision-tree
@muhammadnurdzakki1605 4 years ago
Reading/preparing CSV data: 0:32. Decision tree using the rpart package: 11:22
@bkrai 4 years ago
Thanks!
@vishnukowndinya 7 years ago
Hi sir, can you please explain pruning of the tree? On what basis do we prune?
@bkrai 7 years ago
When you have decision trees that are too big, 'pruning' helps to reduce the size of the tree by removing those parts that do not help much in correct prediction of the outcome. It helps to avoid over-fitting and improve prediction model accuracy.
@javeda 7 years ago
Hi, I wanted to ask which software is most appropriate for conducting SEM along with moderation analysis when the outcome/dependent/endogenous variables are categorical, nominal (binary and multinomial) and ordinal. P.S.: The predictor variables are scale, nominal and ordinal variables. Regards
@kanhabira 3 years ago
Thanks sir for this interesting video. I am facing a problem. My dependent variable is binary (0, 1). When I run predict, the estimated values appear as decimals even after removing "type", so the misclassification error is close to 1. Could you please suggest how I can get the predicted values as 0/1?
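A likely cause of decimal predictions is that the 0/1 response is stored as a number, so the tree is fitted as a regression. The sketch below shows the difference using the rpart package and the built-in mtcars data (both are stand-ins chosen for this example, not the commenter's setup); the same idea of converting the response with factor() should carry over to ctree.

```r
library(rpart)

cars <- mtcars  # 'am' is a numeric 0/1 column in this built-in data set

# Numeric response: rpart fits a regression tree, so predictions are decimals.
fit_num  <- rpart(am ~ mpg + wt, data = cars)
pred_num <- predict(fit_num, cars)

# Factor response: rpart fits a classification tree, so predictions are classes.
cars$am  <- factor(cars$am)
fit_cls  <- rpart(am ~ mpg + wt, data = cars, method = "class")
pred_cls <- predict(fit_cls, cars, type = "class")

# Misclassification error from the labelled table.
tab <- table(Predicted = pred_cls, Actual = cars$am)
1 - sum(diag(tab)) / sum(tab)
```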
@romanozzie3530 7 years ago
Amazing, thanks
@uchenzei5160 5 years ago
When I try to create the misclassification table, it always gives me an error "all arguments have to be the same". Please, what can I do? I am new to data science
@neera842006 5 years ago
I am also getting the same error message
@dhavalpatel1843 5 years ago
You should always pass the model as the first argument to the predict function. The second argument should be a data frame of predictor variables only. You can specify type="prob" as an extra argument to get the probabilities for every level of y. Alternatively, type="class" directly gives you the class of the predicted values. The default for the type argument depends on the package and model class, so it is safer to set it explicitly.
@bkrai 5 years ago
Thanks for the update!
@raniash3ban383 6 years ago
very helpful thanks
@bkrai 6 years ago
Thanks for comments!
@subashinirajan2841 7 years ago
Hello sir, I'm implementing the same steps for my own set of data, but I am getting an error in the Misclassification part: "all arguments must have the same length". Would it be OK if you checked my code and let me know where I am going wrong? If that's OK with you, I will send you the code and data.
@bkrai 7 years ago
yes send the code.
@subashinirajan2841 7 years ago
Thank you sir. To which email id should I send the code? My email id is subashinivec@gmail.com
@piyalichoudhury3493 6 years ago
I like your videos. Can you upload some on ensembles and AIC as well? It would be very kind of you
@bkrai 6 years ago
Thanks for comments and suggestion, I've added it to my list.
@anandsalunke180 8 years ago
What if there are two target variables, like NSP and some other? Which decision tree techniques should be used? What will the formula be?
@bkrai 8 years ago
You can make two separate trees.
@anandsalunke180 8 years ago
How will we derive the formula? Based on what attributes?
@bkrai 8 years ago
Decision tree algorithm will automatically choose the attributes or independent variables depending on the parameters such as minimum sample size for splitting, statistical significance, etc., that you choose.
@gebriadinda6405 8 years ago
Excuse me, sir. Can you help me? I tried this script on my data. I have 100 observations of 1383 variables. I got the result "Conditional inference tree with 1 terminal nodes" and "Number of observations: 83". However, I can't get the decision tree, I just get the histogram. Can you help me, sir? Why does this happen? Thank you, sir.
@bkrai 8 years ago
+Gebri Adinda you can send data and I can look into it.
@aisha555ms2000 6 years ago
@@bkrai Sir, I get the same result: "Conditional inference tree with 1 terminal nodes", only a histogram, and number of observations = 144. Can you help?
@sudiptomitra 3 years ago
A comparative analysis on pre/post pruning of model would have completed the tutorial on Decision Tree.
@ronithNR 8 years ago
Sir, could you make a video on Random Forest?
@divyadamodaran53 9 years ago
What does the p-value represent?
@bkrai 9 years ago
+divya damodaran The p-value at a node comes from the test of association between that variable and the response. A p-value below 0.05 means the variable is statistically significant at the 95% (1 - 0.05 = 0.95) confidence level.
@divyadamodaran53 9 years ago
okay thankyou..
@VenkateshDataScientist 7 years ago
RStudio doubt: I am building a predictive model with 1 million observations and 15 variables. I am getting errors like "Can not allocate the vector of 432GB" or "Can not allocate the vector of 3.8 GB". I am using 16GB of RAM and my file size is just 140MB. I closed all the applications on my system, but the error remains the same. Any suggestions are much appreciated.
@bkrai 7 years ago
With huge data you can probably take a sample for creating the model. The difference between a model based on a good sample and one based on all the data may not be significant. You can also try faster algorithms such as extreme gradient boosting: kzbin.info/www/bejne/raC5hYGth9d5fqc
@VenkateshDataScientist 7 years ago
Bharatendra Rai sure sir ,I will try today
@aravindhp5612 5 years ago
Sir, why do you use set.seed(1234)? Why not set.seed(12345)? Can you please explain?
@bkrai 5 years ago
It can be any number, but to get the same samples use the same number next time too.
@sriharshabsathreya 7 years ago
Sir, how do we choose the complexity parameter (CP value) for tree pruning?
@kumarmithun2723 7 years ago
For this, you will have to build an rpart model; you can then prune the tree based on the CP value. Use printcp(rpart_model) to list the candidates. A common choice is the CP value with the lowest cross-validated error (xerror).
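The CP-based pruning described above can be sketched as follows, using rpart on the built-in iris data. The data set, the deliberately small starting cp and minsplit, and the rule of picking the row with the lowest xerror are assumptions of this example; other rules, such as the one-standard-error rule, are also used.

```r
library(rpart)

set.seed(1234)  # rpart's internal cross-validation is random

# Grow a deliberately large tree by using a small cp and minsplit.
fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(cp = 0.001, minsplit = 5))

printcp(fit)  # table of CP values with cross-validated error (xerror)

# Pick the CP value with the lowest cross-validated error.
cp_table <- fit$cptable
best_cp  <- cp_table[which.min(cp_table[, "xerror"]), "CP"]

# Prune the tree back to that complexity.
pruned <- prune(fit, cp = best_cp)
c(nodes_before = nrow(fit$frame), nodes_after = nrow(pruned$frame))
```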
@vishnukowndinya 7 years ago
How is cross-validation useful in pruning the tree?
@bkrai 7 years ago
When you develop different trees with different validation data, you can choose the one that has a smaller size as well as better accuracy. This way you are able to prune the decision tree.
@preeyank5 9 years ago
Thanks a ton!!
@bkrai 8 years ago
+Preeyank Pable 👍👍👍
@tayabakhanum9707 8 years ago
Sir, please tell me about the classical or crisp decision tree
@sriharshabsathreya 7 years ago
Sir, how can a decision tree be used for variable selection?
@bkrai 7 years ago
Importance of a variable in the tree is reflected by its position. For example, the one at the top of the tree is the most important.
@raghul4457 7 years ago
Hi, can you explain how over-fitting occurs in a decision tree?
@bkrai 7 years ago
When terminal nodes have very small sample sizes, the decision tree model is likely to over-fit. Due to the small sample sizes, decisions arrived at in the terminal nodes may not be very stable.
@ITGuySam 8 years ago
Thank you for your video. I'd like to know what you mean by "set.seed(1234)". Why not use set.seed(2) or something else? And can we use "ifelse" instead of the definition of "pd"? Which way is better?
@bkrai 8 years ago
+Info A set.seed(1234) is just an example, you may use any other number. The idea is to reproduce results, which any number can achieve. 'pd' was used for 'partitioning data' and it's just a name; you may use any other name, that will be fine too.
@caterinacevallos9822 6 years ago
Could you please explain this to me a little bit more? pd
@bkrai 6 years ago
You can go over this video, which has more detail: kzbin.info/www/bejne/l4SUgGt7nqx_msk
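A minimal base R sketch of the seeding and partitioning steps discussed above. The exact definition of 'pd' is not shown in this thread, so the sample() call, the 80:20 ratio, and the use of the built-in iris data are assumptions made for illustration; the ifelse() variant at the end answers the question of whether ifelse could be used instead.

```r
data <- iris  # stand-in for the data set used in the video

# Any seed works; reusing the same seed reproduces the same partition.
set.seed(1234)
pd <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))
train    <- data[pd == 1, ]
validate <- data[pd == 2, ]

# Same seed again gives exactly the same assignment.
set.seed(1234)
pd_again <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))
identical(pd, pd_again)

# An ifelse() alternative: also a random 80:20 split,
# though not the same rows as the sample() version.
set.seed(1234)
pd_ifelse <- ifelse(runif(nrow(data)) < 0.8, 1, 2)
table(pd_ifelse)
```

Neither form is better in terms of results; sample() is shorter, while the ifelse() version makes the 80% threshold explicit.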
@atanunow 7 years ago
I am getting an error in the "#Misclassification error in testing data" part. It is prompting "all arguments must have the same length". Sir, please help me out.
@bkrai 7 years ago
Probably there could be some mix up with training and testing data.
@atanunow 7 years ago
Bharatendra Rai okay sir! Let me try once again. If I get stuck again, can I share my code here?
@atanunow 7 years ago
Bharatendra Rai sir, it was my fault, you were right. Now it is working fine.
@sndrstpnv8419 8 years ago
Maybe add more about CHAID trees
@bkrai 8 years ago
Thanks! I'll keep it in mind.
@bharathjc4700 7 years ago
Hi sir, how much of the math behind the algorithm do we need to learn?
@bkrai 7 years ago
In business applications you don't really need any math. It's more about how to correctly apply a method and interpret the results to solve a business problem.
@bharathjc4700 7 years ago
Thanks sir for your valuable inputs
@Steamlala 6 years ago
Dear Sir, thank you for your video. Can you do a tutorial in R where multiple models (Decision Tree, Random Forest, Gradient Boosting, Logistic, etc.) are compared with each other on the same chart using ROC, split by training vs. validation data set? It would be a great help for this type of visualization, especially when presenting to management. Thank you!
@bkrai 6 years ago
Thanks for the comments and the suggestion, which I'll work on in the near future. Meanwhile, here is a link where you can quickly get an ROC plot that compares several methods such as decision tree, logistic regression, SVM, random forest, etc. on the same chart: kzbin.info/www/bejne/gGPEaqyMaNCfY68
@Steamlala 6 years ago
Thank you Sir. The above YouTube tutorial is really good. Looking forward to your tutorial comparing multiple classification models in one graph, split between train and validate.
@bkrai 6 years ago
Thanks!
@saniamadoo5558 6 years ago
Hello sir, can you please make a tutorial on how to implement FP-growth in RStudio? It's urgent! Please help!
@kapilkaramchandani5471 6 years ago
Sir, it is showing the error: object 'tree' not found
@bkrai 6 years ago
What code did you run?
@kapilkaramchandani5471 6 years ago
Oh, I did it, it ran successfully. Thanks for your concern