Hi Dr. Bharatendra, I'm not sure why you split the dataset into "training" and "testing"?
@bkrai 3 years ago
When we test the performance of a model, we use data that was not used when building the model. Keeping the 'testing' data separate lets us assess the model's performance on unseen data later on.
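A minimal R sketch of this hold-out idea follows. It is an illustration, not the code from the video: the built-in iris data is an assumed stand-in, and the seed 222 and the 60:40 split are simply the values discussed elsewhere in this thread.

```r
library(nnet)  # provides multinom()

set.seed(222)
# iris is a stand-in dataset (assumption); Species is already a factor
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.6, 0.4))
training <- iris[ind == 1, ]
testing  <- iris[ind == 2, ]

# Fit on the training rows only
model <- multinom(Species ~ Sepal.Length + Sepal.Width, data = training, trace = FALSE)

# Score rows the model has never seen
pred <- predict(model, newdata = testing)
tab  <- table(Predicted = pred, Actual = testing$Species)
accuracy <- sum(diag(tab)) / sum(tab)

print(tab)
print(accuracy)
```

Accuracy computed on `testing` is usually a more honest estimate than accuracy on `training`, because the model was never fitted to those rows.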
@Lilian.Chidinma.Nwafor 4 months ago
You're just amazing sir. Thank you
@bkrai 4 months ago
You are most welcome!
@maehovland6222 2 years ago
here's a copy of the code (with my variables instead of Dr. Rai's) in case y'all want to just copy and paste set.seed(45) ind = sample(2, nrow(crash), replace = TRUE, prob = c(0.6,0.4)) training
@bkrai 2 years ago
Thanks!
@pshani3512 3 years ago
Very clear and helpful..as always...Thank you very much Sir...!
@bkrai 3 years ago
You're most welcome!
@drbanan4168 2 years ago
Thank you so much Dr, you are very clear in your videos. I have some questions, please. After I make sure that the model is accurate using the training and testing datasets, do I use the whole data to generate the model for my project? Or do I report the testing/training results only? Another question: is the accuracy you tested the same as goodness of fit? Is there a way to get AIC for our models? I need to develop many models using different sets of variables and reach the best model using forward/backward selection. Can I do that with the multinom function? Thank you very much Dr.
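On the AIC part of this question, a small R sketch may help. It assumes the built-in iris data as a stand-in and two arbitrary candidate formulas; it shows only that `AIC()` can be called on `multinom` fits, not a full selection procedure.

```r
library(nnet)

# iris stands in for the real data (assumption); the formulas are arbitrary examples
full    <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
reduced <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

# Lower AIC suggests a better trade-off between fit and model size
aic_full    <- AIC(full)
aic_reduced <- AIC(reduced)

print(c(full = aic_full, reduced = aic_reduced))
```

The same comparison can be repeated for each candidate set of variables. Note that AIC measures fit on the data used for estimation, which is a different quantity from accuracy on the testing data.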
@leonnorblad758 3 years ago
Thank you for the amazing video, but why did you use replacement in your sample? Don't you wanna run tests on other data than you used to create the model? Thank you!
@bkrai 3 years ago
See link below at 42:30 point for details: kzbin.info/www/bejne/iHPSm6RmeaZ0iZo
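For readers who cannot follow the link, here is a short R sketch of what `replace = TRUE` does in this kind of call. The row count of 1000 is a hypothetical value. The point is that the call samples group labels (1 or 2), not rows, so no row ends up in both sets.

```r
set.seed(222)
n <- 1000  # hypothetical number of rows

# Draw one label (1 or 2) per row; replace = TRUE is needed because
# we ask for n draws from only two possible values
ind <- sample(2, n, replace = TRUE, prob = c(0.6, 0.4))

rows       <- seq_len(n)
train_rows <- rows[ind == 1]
test_rows  <- rows[ind == 2]

# No row appears in both sets, and every row is used exactly once
length(intersect(train_rows, test_rows))     # 0
length(train_rows) + length(test_rows) == n  # TRUE

# Shares are close to 60/40 but not exact, since each row is assigned independently
prop.table(table(ind))
```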
@prashu25925 4 years ago
As always, amazing video sir...
@bkrai 4 years ago
Thanks for comments!
@navthavanesan4343 2 years ago
Dr Rai, can you do a video or advise how we build cross-validation or bootstrapping into your nnet multinom regression model code?
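One possible way to add k-fold cross-validation around `multinom` is sketched below in R. This is an assumption-laden illustration rather than Dr. Rai's code: iris is a stand-in dataset, and `k = 5` and the formula are arbitrary choices.

```r
library(nnet)

set.seed(222)
k <- 5  # number of folds (arbitrary choice)

# Assign each row to one of k folds at random (iris is a stand-in dataset)
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))

acc <- numeric(k)
for (i in seq_len(k)) {
  train <- iris[folds != i, ]
  test  <- iris[folds == i, ]
  fit   <- multinom(Species ~ Sepal.Length + Sepal.Width, data = train, trace = FALSE)
  pred  <- predict(fit, newdata = test)
  # Share of held-out rows classified correctly in this fold
  acc[i] <- mean(as.character(pred) == as.character(test$Species))
}

print(acc)        # accuracy per fold
print(mean(acc))  # cross-validated accuracy
```

Each row is held out exactly once, so the averaged accuracy uses all of the data for testing without ever testing on rows the fitted model has seen.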
@YatiChoudhary 3 years ago
Sir, one doubt: what if our sample size (number of observations) is less than 100, say 80? Can we still use multinomial logistic regression? (It is for a qualitative study, but I want to study the variables in detail.) And is there a way to figure out how much of a train-test split one should do, or whether the model is possible without a train-test split? Thank you 🙏
@bkrai 3 years ago
When the response is a factor-type variable, you can use this. It's OK if you have 80. For splits you can try different ones such as 60:40 or 70:30.
@YatiChoudhary 3 years ago
@@bkrai Thank you so much, sir, for the patience in replying to all my questions. It has genuinely helped me in clarifying doubts and building my own model. I would be regular in learning more about statistical models from your KZbin videos. Thank You
@bkrai 3 years ago
Thanks Yati!
@adityaupadhyaya6441 2 years ago
Do you have a video on multinomial mixed effects regression? Thank you!
@bkrai 2 years ago
I've added it to my list of future videos.
@wereskiryan 3 years ago
Fantastic video. Many thanks
@bkrai 3 years ago
Many thanks!
@manikandankrishnakumar5430 4 years ago
Thanks for the video
@bkrai 4 years ago
Welcome!
@fatikanabila2131 2 years ago
Hi, sir. Actually this is a very nice video, it is so easy to understand with your explanation. But I want to ask: what if I get NaN values for the standard errors in the summary? What should I do then? Thank you very much, I hope you see my comment and reply to it.
@bkrai 2 years ago
Likely because error is too small.
@fatikanabila2131 2 years ago
@@bkrai Then how to handle it?
@lianjek5788 4 years ago
Hi sir, thanks for the video, it is very clear! Why do we need to convert the independent variable from integer to factor? Is there any problem if all my variables, i.e. IV and DV, are integers? Thanks.
@bkrai 4 years ago
For that you should use multiple linear regression.
@lianjek5788 4 years ago
@@bkrai Hi sir, sorry to bother you again. As with the dependent variable, do I have to convert the independent variables to factors as well to do the multinomial logistic regression? I converted both the DV and the IVs to factors, and after the regression only the categorical variable that I turned into a factor becomes insignificant. The other variables become highly significant. My multiple country datasets show the same type of findings. I am not sure whether I am on the right track. Please advise. Thanks.
@bkrai 4 years ago
Using regular regression or logistic regression depends more on what type of DV you have and not that much on type of IV.
@jolojololo3221 3 years ago
Hi, I want to know if you can help me to find how to calculate R2 or pseudoR2 for my model
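One commonly used pseudo R2 is McFadden's, defined as $1 - \ln L_{model} / \ln L_{null}$, where the null model contains only an intercept. A minimal R sketch, assuming the built-in iris data as a stand-in and an arbitrary formula:

```r
library(nnet)

# iris is a stand-in dataset (assumption); replace the formula with your own
fit  <- multinom(Species ~ Sepal.Length + Sepal.Width, data = iris, trace = FALSE)
null <- multinom(Species ~ 1, data = iris, trace = FALSE)  # intercept-only model

# McFadden's pseudo R-squared: 1 - logLik(model) / logLik(null)
mcfadden <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))
print(mcfadden)
```

Values closer to 1 indicate that the predictors improve the log-likelihood substantially over the intercept-only model. Other pseudo R2 definitions exist and can give different numbers for the same fit.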
@thejuhulikal6290 3 years ago
Sir, if I fit a multinomial logistic regression, it gives results as 0 power b for some variables in some comparisons. What does that mean? What can I do to get rid of that?
@bkrai 3 years ago
Can you give a more specific example of what you are getting?
@Lilian.Chidinma.Nwafor 4 months ago
Good morning Dr. I wish I could get urgent attention because I'm in a tight corner right now. Your videos have been helpful in my journey into the data field. Please, is there an alternative to the 0.6, 0.4 prob split? I am getting "error in sample.int(x, size, replace, prob ): incorrect number of probabilities". I also tried 0.7, 0.3, and it gives the same error.
@bkrai 4 months ago
Thanks for comments! Check your code again, you should not get any error. Let me know if you still get error.
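One common cause of this message, though not necessarily the cause in this case, is that the first argument of `sample()` does not match the length of `prob`. The R sketch below reproduces the message with a hypothetical row count of 100.

```r
n <- 100  # hypothetical number of rows

# Works: 2 values to choose from, 2 probabilities
ind <- sample(2, n, replace = TRUE, prob = c(0.6, 0.4))

# Fails: n values to choose from, but only 2 probabilities supplied
msg <- tryCatch(
  sample(n, n, replace = TRUE, prob = c(0.6, 0.4)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If the first argument is `nrow(data)` instead of `2`, changing the split from 0.6/0.4 to 0.7/0.3 does not help, because the number of probabilities still differs from the number of values being sampled.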
@Lilian.Chidinma.Nwafor 4 months ago
@@bkrai Thank you for your quick response Dr. I honestly don't know what the problem is today. I have a research article to submit, and I used a 5-point Likert scale survey to generate my data, hoping to analyze it with multinomial logistic regression, but I keep getting the error. I hope I get it tonight, otherwise I will just do normal descriptive statistics or correlation. Feel so frustrated.
@bkrai 4 months ago
We can do a quick zoom meeting where you can show me where you see a problem.
@Lilian.Chidinma.Nwafor 4 months ago
@@bkrai please can you drop a link or email so I can contact you
@bkrai 4 months ago
seemabharat@gmail.com
@adrianaroca9127 3 years ago
amazing video! it saved my life! - I am just worried that I get an error after I create mymodel saying : Error in `contrasts
@bkrai 3 years ago
Make sure the response variable shows as 'factor'.
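A quick way to check and fix the type of the response in R is sketched below. The data frame `d` and its columns are hypothetical and only illustrate the check.

```r
# Hypothetical data where the response was read in as numeric codes
d <- data.frame(
  y = c(1, 2, 3, 1, 2, 3),
  x = c(0.5, 1.2, 3.1, 0.7, 1.9, 2.8)
)

str(d$y)            # numeric before conversion
d$y <- factor(d$y)  # convert the response to a factor
str(d$y)            # factor after conversion
levels(d$y)
```

Running `str()` on the whole data frame also shows whether any predictor stored as a factor has fewer than two levels in the training rows, which is another possible source of `contrasts` errors.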
@deprofundis3293 4 years ago
Hi, I have a DV with 6 levels, although 2 of them only have a few observations each and likely need to be excluded. So, it'll probably have to be a DV with 4 levels. The problem is that I cannot partition my data because my sample size is too small. (This kind of data is extremely difficult to collect, and it took years to collect even what I did, so it's not possible to increase sample size). I saw that you recommended Random Forest to someone else with a similar DV, but Random Forest also requires partitioning of the data. Is there really no other way to do my analysis, given my rather small but extremely-hard-earned dataset?
@bkrai 4 years ago
You can also explore this for oversampling: kzbin.info/www/bejne/fqCVfJ-sr8-Ynck
@rgemsph7339 3 years ago
Good day sir, what to do sir if my multinomial function converged after some iterations?
@bkrai 3 years ago
That should be fine.
@sudanmac4918 4 years ago
Sir, if we have 5 levels in the dependent variable and the data are imbalanced, how do we rectify it?
@dexterrity 4 years ago
Are you able to combine any of the 5 levels of the dependent variable to make the data less imbalanced?
@bkrai 4 years ago
That could be a good solution, where you group low-frequency classes into one, perhaps calling it 'other'.
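The grouping suggested here could look like the following R sketch. The response `y`, its class counts, and the cut-off of 10 observations are all hypothetical choices for illustration.

```r
# Hypothetical response with three rare classes
y <- factor(c(rep("A", 50), rep("B", 40), rep("C", 6), rep("D", 3), rep("E", 1)))

counts <- table(y)
rare   <- names(counts)[counts < 10]  # cut-off of 10 is an arbitrary choice

# Relabel the rare classes as "other"
y_grouped <- as.character(y)
y_grouped[y_grouped %in% rare] <- "other"
y_grouped <- factor(y_grouped)

print(table(y_grouped))
```

After grouping, the response has three levels instead of five, and the smallest class holds the 10 observations that were previously spread over C, D, and E.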
@sallu.mandya1995 4 years ago
Hi sir, where can I get the medical datasets?
@bkrai 4 years ago
I've now added the link. Data: goo.gl/MYgpLX
@sallu.mandya1995 4 years ago
@@bkrai dear sir , i mean different medical datasets
@sallu.mandya1995 4 years ago
@@bkrai to work on
@patriciageletkova9772 4 years ago
Hey sir, I don't understand why those numbers: 0,6 and 0,4. Could you help me, please? Thank you.
@bkrai 4 years ago
It means about 60% of the data will be randomly assigned to training data and about 40% to testing data.
@patriciageletkova9772 4 years ago
Dr. Bharatendra Rai, is it a necessary step? can I use this procedure if my dependent variable is only 0 and 1 (not 1, 2, 3)? or would it be better if I rewrote them to 1, 2? I have 225 observations. Thank you so much!
@bkrai 4 years ago
If you have only 0 and 1 situation, use this link: kzbin.info/www/bejne/d4fbaIqZZqiEbbs
@Zizuzot 3 years ago
Why do you take the number 222 for the seed?
@bkrai 3 years ago
It's for reproducibility so that anyone partitioning the data has same train and test data.
@Zizuzot 3 years ago
@@bkrai Yes I understand that, but I was wondering why specifically the number 222 :)
@bkrai 3 years ago
There is no significance attached to 222. It could have been any other number as in other videos.
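The effect of the seed can be checked directly in R. In this sketch, 222 and 45 are just the two seed values that appear in this thread, and 10 draws is an arbitrary size.

```r
set.seed(222)
a <- sample(2, 10, replace = TRUE, prob = c(0.6, 0.4))

set.seed(222)
b <- sample(2, 10, replace = TRUE, prob = c(0.6, 0.4))

set.seed(45)
other <- sample(2, 10, replace = TRUE, prob = c(0.6, 0.4))

identical(a, b)  # TRUE: same seed, same partition
print(rbind(seed_222 = a, seed_45 = other))
```

Any integer works as a seed. What matters for reproducibility is that everyone uses the same one.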
@kapiljhalani20 4 years ago
Dear Sir, I have data for 10000 patients. Each patient has one CSV, which contains 50,000 rows and 36 columns. This is a multi-label classification problem with disease name y = (0, 1, 2, 3, 4), where each value maps to a disease name in blood cancer. Dear Sir, is it possible to build a model on such data? If yes, I would be very happy if you could share a URL, a video, or just an idea. That would be enough. Thank you so much in advance. Kapil Jhalani from Munich, Germany!
@bkrai 4 years ago
You can use this link: kzbin.info/www/bejne/mnvGnYF_g5KHhtE
@kapiljhalani20 4 years ago
@@bkrai Thank you for your reply. But I do not have a single CSV, I have 10K CSVs. Each CSV has 50K rows and 36 columns, and the 50K rows together represent one disease name. The example shown in the video was for one CSV. How do I handle multiple CSVs and multiple rows in machine learning classification? Thank you again for your time and help. Looking forward to hearing from you. Kind regards, Kapil
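One possible approach, offered only as a sketch, is to reduce each patient's file to a single row of summary features and stack those rows into one table for modelling. The R code below creates three tiny temporary CSV files as stand-ins. The file names, the columns `x1` and `x2`, and the use of column means as features are all assumptions.

```r
# Create three small CSV files in a temporary folder to stand in for patient files
csv_dir <- file.path(tempdir(), "patients_demo")
dir.create(csv_dir, showWarnings = FALSE)
set.seed(222)
for (i in 1:3) {
  write.csv(
    data.frame(x1 = rnorm(5), x2 = rnorm(5)),
    file.path(csv_dir, paste0("patient_", i, ".csv")),
    row.names = FALSE
  )
}

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Summarize each file into one row: here, the mean of every column
features <- do.call(rbind, lapply(files, function(f) {
  d <- read.csv(f)
  data.frame(patient = basename(f), t(colMeans(d)))
}))

print(features)
```

The disease label for each patient would then be added as a factor column, giving one row per patient. Whether column means keep enough information is a modelling decision that depends on the data.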
@mohamedabdullah9061 4 years ago
Sir, in my project I have 207 dependent variables. What do I do, sir? Please help me.
@bkrai 4 years ago
And how many independent variables?
@mohamedabdullah9061 4 years ago
@@bkrai One independent variable, which is the user ID.
@bkrai 4 years ago
I guess you may be referring to dependent variable as independent. Usually data have one dependent variable and several independent variables.
@mohamedabdullah9061 4 years ago
@@bkrai ya sir ..i have these type data what i do
@bkrai 4 years ago
If the response variable is of factor type, you should be able to use this method.