Hi everyone, I mistakenly said that the p-value should be greater than 0.5. It should be 0.05.
@DasSquadBureauhalt 2 years ago
Literally popped in my recommended 10 minutes ago. This is great, thank you!
@HackersRealm 2 years ago
Hope it's helpful!!!
@javidhesenov7611 1 year ago
Nice explanation, thanks. Before this I was looking at scipy's chi2, and it was a little difficult. But it turns out sklearn's chi2 is pretty straightforward and well explained on the website. Thanks for introducing it.
@HackersRealm 1 year ago
Glad it was helpful!!!😄
@isaackodera9441 1 year ago
Just what I was looking for.
@HackersRealm 1 year ago
😄
@kennylouries410 1 month ago
Hello sir, excellent work. Kindly share the playlist link for the previous video.
@HackersRealm 1 month ago
Thanks. Which video are you referring to?
@pradeeppaladi8513 1 year ago
Hi Ashwin, your explanation is very good. I liked it, and in fact I have subscribed to your channel as well.
@HackersRealm 1 year ago
Glad you liked the video!!! I will try my best to share more videos like this!!!
@sravanirekha 11 months ago
Can we do label encoding if one of the features has more than 10 categories?
@HackersRealm 11 months ago
Yes, you can
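To illustrate the reply above, here is a minimal sketch of label encoding a feature with more than 10 categories. The `month` column is a made-up example, not a column from the video's dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical feature with 12 distinct categories (not from the video's dataset).
months = pd.Series(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"] * 2,
    name="month",
)

encoder = LabelEncoder()
codes = encoder.fit_transform(months)  # one integer code per category

# The codes follow the sorted order of the labels, not calendar order.
print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print(sorted(set(codes.tolist())))
```

The number of categories does not matter to the encoder; each distinct label simply gets its own integer. Note that scikit-learn documents `LabelEncoder` as intended for target labels and offers `OrdinalEncoder` for feature columns, so either could be used here.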
@owurakuagyekum3871 11 months ago
What do you do next after finding the chi-square values and p-values and plotting the graph? How do you use them to analyse the data and come to a conclusion?
@HackersRealm 11 months ago
You can use the scores to rank the features by importance and drop the least important ones when you have many features, e.g. 1000.
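As a rough sketch of that workflow, the snippet below computes chi-square scores and p-values with sklearn's `chi2` and keeps only the features whose p-value is below 0.05. The data is synthetic and the column names `informative` and `noise` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import chi2

# Synthetic binary target: 250 zeros followed by 250 ones.
y = np.array([0] * 250 + [1] * 250)

# "informative" copies the target but flips every 10th value.
informative = y.copy()
informative[::10] = 1 - informative[::10]

# "noise" alternates 0, 1 and is balanced within each class.
noise = np.tile([0, 1], 250)

X = pd.DataFrame({"informative": informative, "noise": noise})

scores, p_values = chi2(X, y)
result = pd.DataFrame({"chi2": scores, "p_value": p_values}, index=X.columns)

# Keep the features whose p-value is below the usual 0.05 threshold.
selected = result.index[result["p_value"] < 0.05].tolist()

print(result.sort_values("chi2", ascending=False))
print(selected)
```

With many features, the same table can be sorted by score and truncated to the top few, or `SelectKBest(chi2, k=...)` from `sklearn.feature_selection` can do the ranking and selection in one step.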
@pradeeppaladi8513 1 year ago
Hi Ashwin, I have a question. In the list of categorical variables that you extracted, why did you include "Dependents" and "Credit_History"? Are they not numerical variables? I just want to understand the basis for adding them to the categorical variables list. An early response would be highly appreciated.
@HackersRealm 1 year ago
If you check the data, Dependents is categorical because it contains the value 4+, which is a string, and Credit_History is a category in the same way Gender is. Only continuous values are treated as numerical.
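One way to apply that rule in code is sketched below. The tiny DataFrame, the `MAX_LEVELS` cutoff, and the `is_categorical` helper are all hypothetical; only the column names and the `4+` value come from the discussion above.

```python
import pandas as pd

# Toy rows shaped like the columns discussed above (values are made up).
df = pd.DataFrame({
    "Dependents": ["0", "1", "2", "4+", "0", "1"],       # strings, because of "4+"
    "Credit_History": [1.0, 0.0, 1.0, 1.0, 0.0, 1.0],     # numeric dtype, but only two levels
    "LoanAmount": [128.0, 66.5, 120.0, 141.0, 267.0, 95.0],  # continuous
})

# Assumed cutoff: numeric columns with this many distinct values or fewer
# are treated as categories.
MAX_LEVELS = 2


def is_categorical(col: pd.Series) -> bool:
    """Treat non-numeric columns and low-cardinality numeric columns as categorical."""
    return (not pd.api.types.is_numeric_dtype(col)) or col.nunique() <= MAX_LEVELS


cat_cols = [name for name in df.columns if is_categorical(df[name])]
print(cat_cols)
```

The cutoff is a judgment call for each dataset; checking `df[col].unique()` by hand is a reasonable way to confirm the result.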
@pradeeppaladi8513 1 year ago
@HackersRealm Where can we find this dataset? Could you please share the link here?
@HackersRealm 1 year ago
@pradeeppaladi8513 It's in the GitHub repo, and the link is in the description!!!
@kartikjha5704 1 year ago
Do we need to label encode the variables before applying this, or will it work as it is?
@HackersRealm 1 year ago
You need to encode them before applying it.
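A minimal sketch of the encode-then-test order follows. sklearn's `chi2` expects non-negative numeric features, so string columns are converted to integer codes first. The rows below are made up; only the column names resemble the loan dataset.

```python
import pandas as pd
from sklearn.feature_selection import chi2
from sklearn.preprocessing import LabelEncoder

# Made-up rows; the real dataset is not reproduced here.
df = pd.DataFrame({
    "Gender": ["Male", "Female"] * 4,
    "Married": ["Yes", "Yes", "No", "No"] * 2,
    "Loan_Status": ["Y", "Y", "N", "N"] * 2,
})

features = ["Gender", "Married"]

# Encode each string column to integer codes (a separate encoder per column).
X = pd.DataFrame({col: LabelEncoder().fit_transform(df[col]) for col in features})
y = LabelEncoder().fit_transform(df["Loan_Status"])

scores, p_values = chi2(X, y)
chi_scores = pd.Series(scores, index=features)
chi_p_values = pd.Series(p_values, index=features)

print(chi_scores)
print(chi_p_values)
```

In this toy data `Married` matches the target exactly while `Gender` is balanced across both classes, so `Married` gets the higher score.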
@DharmendraKumar-DS 1 year ago
Great explanation. Can I use this technique with any dataset for regression?
@HackersRealm 1 year ago
This is mostly for categorical data...
@SWJ-MKhyathi 6 months ago
Hi, this is a helpful video. But how can we use chi-square for malware detection in Android applications? Could you please reply?
@HackersRealm 6 months ago
Could you please explain in more detail, e.g. which attributes you are considering?
@joseluisbeltramone599 1 year ago
¡Tremendous explanation! Thank you very much.
@HackersRealm 1 year ago
Glad you liked it!!!
@shuvamsingh4014 11 months ago
My chi scores come out as NaN values in the array, and the pandas Series attribute is also not working. Could you please help me with this problem?
@HackersRealm 11 months ago
Are you using a different dataset or the same one?
@shuvamsingh4014 11 months ago
@HackersRealm A different dataset.
@69nukeee 1 year ago
Thank you! This video was very clear and insightful. I have just one quick question about something that still isn't clear to me: what is the null hypothesis H0? Is it the hypothesis of some correlation between the categorical variables and the target variable y? If so, then only Credit_History and Education have a p-value lower than 0.05, and hence they mean something (H0 valid), while the other categorical variables are to be dropped (as their p-values are higher than 0.05, hence rejecting H0). Did I get that right? Anyway, really nice job, keep it up ;)
@AnasAbid-zm1lk 1 year ago
The end result is correct, but the reasoning isn't. I think you have misunderstood the chi-square test of independence, so let me clarify:
- H0: the target and the feature are independent
- H1: the target and the feature are dependent
The p-value is linked to the test statistic Chi2, which measures the distance between the observed and expected results. The greater the Chi2, the greater that distance, and therefore the less likely it is that the variables are independent (if they were independent, the observed and expected results would be close and Chi2 would be small). Also, the greater the Chi2, the smaller the p-value. To sum up, if the p-value is small (0.05 is a common threshold), independence is unlikely and we reject H0. So we keep only the variables whose p-values are lower than 0.05, since they depend on the target (and are therefore useful).
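The relationship described above can be checked numerically. This sketch uses `scipy.stats.chi2_contingency` on two made-up 2x2 contingency tables (feature level by target class), which is the classical independence test and not the sklearn `chi2` function. The counts are invented for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: feature levels, columns: target classes. Counts are made up.
tables = {
    "independent": np.array([[50, 50], [50, 50]]),  # observed equals expected
    "dependent": np.array([[80, 20], [20, 80]]),    # observed far from expected
}

results = {}
for name, table in tables.items():
    # correction=False turns off Yates' continuity correction for 2x2 tables.
    stat, p_value, dof, expected = chi2_contingency(table, correction=False)
    results[name] = (stat, p_value)
    decision = "reject H0" if p_value < 0.05 else "fail to reject H0"
    print(f"{name}: chi2={stat:.2f}, p={p_value:.4g}, {decision}")
```

The table whose observed counts sit far from the expected counts gets the larger Chi2 and the smaller p-value, so H0 (independence) is rejected only for that one.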
@69nukeee 1 year ago
@AnasAbid-zm1lk Thank you for getting back to me!
@Leanmonkeyvr38217 1 year ago
The p-value should be > .05 (not .5) to fail to reject H0.
@HackersRealm 1 year ago
thanks for finding the mistake, I will update it!!!