Tutorial 33- Chi Square Test Implementation with Python- Hypothesis Testing- Part 2

Views: 91,123

Krish Naik

Comments: 70
@GauravSharma-ui4yd 4 years ago
Great video. Please upload videos implementing various other tests.
@roksanarezaei9608 4 years ago
Krish, thank you for the explanation. I have a question: why didn't you use the p-value and chi-square values that the contingency function provides, instead of calculating them separately? The numbers you got are not even the same.
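
One likely reason for the mismatch, offered as a hedged sketch rather than a statement about the video's notebook: for a 2x2 table (one degree of freedom) `scipy.stats.chi2_contingency` applies Yates' continuity correction by default, while a hand-computed Pearson statistic does not. The counts below are hypothetical and only illustrate the effect.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Hypothetical 2x2 table of observed counts (illustrative numbers only).
observed = np.array([[10, 20],
                     [30, 40]])

# Default call: with dof == 1, SciPy applies Yates' continuity correction.
stat_corr, p_corr, dof, expected = chi2_contingency(observed)

# Same table with the correction switched off.
stat_raw, p_raw, _, _ = chi2_contingency(observed, correction=False)

# Manual Pearson statistic and its p-value (upper tail of the chi-square distribution).
manual_stat = ((observed - expected) ** 2 / expected).sum()
manual_p = chi2.sf(manual_stat, dof)

print("corrected  :", stat_corr, p_corr)
print("uncorrected:", stat_raw, p_raw)
print("manual     :", manual_stat, manual_p)
```

With `correction=False` the library result matches the manual calculation, so either route can be used as long as the same convention is applied throughout.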
@lenaara4569 1 year ago
Your explanations are really awesome! Thank you 😊
@pramodkumargupta1824 4 years ago
Krish, really nice video. What steps should we take after we perform these tests? I have the following questions: 1. What should we do if two features are related to each other? Do we need to exclude one during feature selection, or what should we do? 2. If they are independent, are we good to use both features in our model for prediction?
@6shipra 4 years ago
I have the same questions.
@Rahul-gn7px 4 years ago
If one feature can be derived from, or is highly dependent on, another variable, it would be wise to remove it, for example age and birth date.
@ankitayadav2690 3 years ago
Very nice explanation, sir.
@pranabmishra2609 4 years ago
Thanks, the explanation is clear and concise. I was able to understand it properly.
@amalsunil4722 4 years ago
Guys, just a tip here: you can simplify the process of obtaining the X2 statistic with `X2_statistic = (observed_values - estimated_values)**2 / estimated_values` followed by `X2_statistic = X2_statistic.sum()`. Make sure observed_values and estimated_values are NumPy arrays.
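
The tip above can be sketched as follows. The arrays hold hypothetical counts chosen for illustration; the variable names follow the comment.

```python
import numpy as np

# Hypothetical observed and expected counts for a 2x2 table (illustrative only).
observed_values = np.array([[10.0, 20.0],
                            [30.0, 40.0]])
estimated_values = np.array([[12.0, 18.0],
                             [28.0, 42.0]])

# Element-wise (O - E)^2 / E for every cell, then a single sum over all cells.
X2_statistic = ((observed_values - estimated_values) ** 2 / estimated_values).sum()
print(X2_statistic)
```

Plain Python lists do not support the element-wise subtraction and division used here, which is why the arrays need to be NumPy arrays.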
@vanessaleiko 4 years ago
Thank you so much for this explanation!
@kaifahmed316 3 years ago
Great explanation, sir, thank you 😊
@ajithshenoy5566 4 years ago
Hey Krish, love your videos. Kindly upload more videos in the machine learning pipeline section. The last one is feature selection. Interpretation and deployment videos would be greatly appreciated.
@krishnaik06 4 years ago
Sure
@badiyabhargav8597 3 years ago
Sir, I have a doubt. At 11:42 you said that chi2_statistic should always be greater than the critical value, and only then do we retain the null hypothesis. But in the code our chi2_statistic value is smaller than the critical value, and the if condition you gave is: if (chi2_statistic >= critical_value): print(reject H0 and accept H1, there is a relation) else: print(retain H0, there is no relation). I think we have to reject the null hypothesis H0 if chi2_statistic is greater than the critical value.
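
The rule as the commenter states it (reject H0 when the statistic reaches the critical value) can be sketched like this. The significance level, degrees of freedom, and test statistic are assumed values for illustration, not numbers from the video.

```python
from scipy.stats import chi2

alpha = 0.05             # assumed significance level
dof = 1                  # (rows - 1) * (cols - 1) for a 2x2 table
chi2_statistic = 0.7937  # hypothetical test statistic

# Critical value: the point that leaves probability alpha in the upper tail.
critical_value = chi2.ppf(1 - alpha, dof)

if chi2_statistic >= critical_value:
    decision = "Reject H0: there is evidence of a relationship"
else:
    decision = "Retain H0: no evidence of a relationship"

print(critical_value, decision)
```

A statistic below the critical value falls in the non-rejection region, so the null hypothesis of independence is retained.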
@akashsoni5870 4 years ago
Am I the only one who watched StatQuest by Josh Starmer? I was following StatQuest before Krish Naik sir's lectures... believe me, StatQuest is very good for in-depth knowledge.
@bhagyashreemohanta7826 4 years ago
Thank you so much... 🙂 Highly obliged... 🙏
@omkarpatil2854 4 years ago
Hello Krish, awesome video series as always. If the p-value is high, then both samples are related to each other, right? In your code, there is a condition where if p_value
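
On the interpretation asked about here: in a chi-square test of independence, a small p-value (at or below the chosen significance level) is evidence against H0, that is, evidence that the variables are related, while a large p-value means H0 is retained. A minimal sketch, assuming a significance level of 0.05 and two hypothetical tables; `interpret` is an illustrative helper, not a function from the video.

```python
import numpy as np
from scipy.stats import chi2_contingency

ALPHA = 0.05  # assumed significance level

def interpret(observed, alpha=ALPHA):
    """Return (p_value, verdict) for a contingency table of counts."""
    _, p_value, _, _ = chi2_contingency(observed)
    if p_value <= alpha:
        verdict = "reject H0: evidence that the variables are related"
    else:
        verdict = "retain H0: no evidence of a relationship"
    return p_value, verdict

# Hypothetical tables (illustrative counts only).
strongly_associated = np.array([[50, 10], [10, 50]])
weakly_associated = np.array([[10, 20], [30, 40]])

p_strong, verdict_strong = interpret(strongly_associated)
p_weak, verdict_weak = interpret(weakly_associated)
print(p_strong, verdict_strong)
print(p_weak, verdict_weak)
```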
@sandipansarkar9211 4 years ago
Thanks Krish. Great explanation.
@mohammedahtesham2021 4 years ago
If the p-value is high, then both samples are related to each other, right? In your code, there is a condition where if p_value
@louerleseigneur4532 3 years ago
Thanks Krish
@srinathganesh6985 4 years ago
Doesn't `scipy.stats.chi2_contingency` already return the p-value directly?
@hilmanrevisionery130 4 years ago
This is what I'm confused about: it already returns it at index [1] (0.925417020494423) of the `chi2_contingency` result, so why should we recalculate the p-value? Hopefully someone can explain.
@varunupadhyay1576 2 years ago
@hilmanrevisionery130 Did you get it?
@AK-ws2yw 3 years ago
Hey Krish, I have a doubt. Suppose I have 4 columns, A, B, C and D, and all 4 are categorical with Yes/No values. My aim is to find whether all 4 columns have a Yes. Which test should I go for in that case?
@PravinKumar-zc2eq 2 years ago
Hi Krish, your videos are really helping me understand these concepts in an easy way, thank you. Is there any possibility of a video on ANOVA?
@biranchinath8428 3 years ago
Thank you sir for your help.
@shobitjain9619 4 years ago
Sir, can you make videos on the different pairwise metrics in sklearn, like cosine similarity, sigmoid kernel, RBF kernel, etc.?
@SHIVAMBAJPEYIMIM 4 years ago
Thank you so much, this makes my day :)
@abhinavsharma7291 3 years ago
Krish, thank you! Any video on an ipynb file explaining the ANOVA test?
@solar_girl_here 3 years ago
Amazing. Thanks
@lokanathshroff3301 4 years ago
Sir, I'm not able to see the big data playlist.
@abhinaygupta8243 3 years ago
Suppose there are around 50 features in my dataset. Should I run the chi-square test for every pair of features? That would be very time consuming. Or should we directly look at the correlation in a pair plot and select one out of each group of similar features?
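
Testing every pair does not have to be manual. A minimal sketch, assuming all columns are categorical and using a small hypothetical DataFrame: `itertools.combinations` enumerates the pairs, `pandas.crosstab` builds each contingency table, and the results are collected into one summary table. The column names and values are made up for illustration.

```python
from itertools import combinations

import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical categorical data (illustrative only).
df = pd.DataFrame({
    "sex":    ["M", "F", "M", "F", "M", "F", "M", "F"],
    "smoker": ["Y", "Y", "N", "N", "Y", "N", "Y", "N"],
    "day":    ["Sat", "Sun", "Sat", "Sun", "Sat", "Sat", "Sun", "Sun"],
})

results = []
for col_a, col_b in combinations(df.columns, 2):
    table = pd.crosstab(df[col_a], df[col_b])   # contingency table for this pair
    stat, p_value, dof, _ = chi2_contingency(table)
    results.append({"feature_a": col_a, "feature_b": col_b,
                    "chi2": stat, "p_value": p_value, "dof": dof})

# One row per pair, smallest p-values first.
summary = pd.DataFrame(results).sort_values("p_value").reset_index(drop=True)
print(summary)
```

With 50 features this loop runs 1,225 tests, so some pairs will look significant by chance alone, and the toy table above has expected counts too small for the chi-square approximation to be reliable. Both points are worth keeping in mind before dropping features based on the output.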
@priyaduttbhatt5691 3 years ago
simply perfect!
@pratikchatterjee5992 4 years ago
Hi Krish. Nice video. Where are the big data videos?
@AsifMarazi 2 years ago
Instead of writing chi_square_statistic = chi_square[0] + chi_square[1] ... for every row, just replace that line with chi_square_statistic = chi_square.sum(), so you need not worry about writing out all the rows when there are more of them.
@nanditasharma6766 4 years ago
Krish, you said there is a relationship and that one variable will have some effect on the other because they are related. So do we have to consider one variable or both? If we consider only one, it will definitely be affected, as they are related to each other... won't considering just one give a misleading effect on the target?
@Balubindass 4 years ago
Hi Krish Naik, I am following your channel and it is very clear and easily understandable. After your Z-test and t-test video, I tried doing some hypothesis tests. Here is my example, and I would need your help if I am doing it wrong. I have a file with 5000 rows, which I am treating as the population, and I am assuming a hypothesis: Null Hypothesis as age 30. This is a one-tailed test. So here is my question: do I need to create a sample from the population, or do I need to filter age >= 30 and consider that as the sample? And if the z-score table gives 1.694 and the z-test gave 3.54, do I need to reject the null hypothesis? Please kindly help me.
@akshayvishnukishore2282 3 years ago
Question: why did we calculate the p-value again? Can't we just use the p-value returned from chi2_contingency()?
@amansinghrathore8308 3 years ago
+1
@varunupadhyay1576 2 years ago
@amansinghrathore8308 Did you get it?
@rajulshakya4899 4 years ago
Nice video
@sushantshekhar8082 4 years ago
Krish, please upload a similar implementation video for the ANOVA test also.
@MJAYRECORDS 4 years ago
Hi Krish, the video is good. Can you tell me the solution for the chi-square test coding for the marital status and different education levels problem?
@AkshayDudvadkar 3 years ago
What do we do when we have multiple categorical columns?
@anupamasonnad220 4 years ago
Hi Krish, if I have to figure out the association/relation between more than 2 categorical variables, can that be done using chi2? If I have to test the multicollinearity between more than 2 categorical variables, can we convert them to numeric and apply VIF?
@saumyamishra5203 3 years ago
Please tell me what the first 2 values in the result of the function chi2_contingency are... I was thinking that the 1st one is the chi-square statistic and the 2nd one is the p-value?
@gurdeepsinghbhatia2875 4 years ago
Too good, sir.
@mohammedahtesham2021 4 years ago
If the p-value is high, then both samples are related to each other, right? In your code, there is a condition where if p_value
@gurdeepsinghbhatia2875 4 years ago
@mohammedahtesham2021 Dear Mohd, the p-value is just a probability value that assures our result; the main result is in the correlation. Let us take an example: suppose we got a correlation of +1 with a p-value of 0.05. Then it means that the 2 variables have a correlation of +1 with 0.05 probability, i.e. with 5% accuracy. As for why only 5 percent, you must see Krish sir's hypothesis testing video. For further doubts, mail me at gsbhatia111@gmail.com if you like. I hope my reply helps you, thanks.
@alextjflorida 3 years ago
Thank you for the video. It seems Python is not efficient at running statistical tests: you have to take too many steps to get a single test result. Other software packages can do a better job in this department.
@PrinceKumar-eb8hd 4 years ago
This went over my head... no worries... when the need arises, I will research it again... anyway, thank you sir.
@Tejashri_Kate 2 years ago
And how do we know if there is a Type I or Type II error?
@vincetechclass3390 2 years ago
What about negative values?
@sushantrauthan5704 4 years ago
Thanks for the video, but I have a doubt. I've never really grasped how you choose the hypothesis: in some cases you choose the null hypothesis for the motion, and in some cases you choose the hypothesis against the motion. How does that work?
@amalsunil4722 4 years ago
Yes, it's very important, as we always assume the H0 hypothesis to be true while testing/finding the p-value. H0: there's no significant difference (just do this for all cases; it can be between 2 variables, a sample mean and a given population mean, etc.).
@amitjajoo9510 4 years ago
Thanks
@dheerajkumark2268 4 years ago
Sir, while finding the p-value, can we use pdf instead of cdf?
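
A short sketch of why the cdf (or the survival function) is the right tool here: the p-value is the upper-tail probability beyond the observed statistic, which is `1 - cdf`, whereas the pdf returns the height of the density curve at a point and is not a probability. The statistic and degrees of freedom below are assumed values for illustration.

```python
from scipy.stats import chi2

chi2_statistic = 3.0  # hypothetical test statistic
dof = 1               # assumed degrees of freedom

p_from_cdf = 1 - chi2.cdf(chi2_statistic, dof)  # upper-tail area via the cdf
p_from_sf = chi2.sf(chi2_statistic, dof)        # same area, computed directly
density = chi2.pdf(chi2_statistic, dof)         # curve height, not a tail area

print(p_from_cdf, p_from_sf, density)
```

`chi2.sf` is generally preferable to `1 - chi2.cdf` for large statistics because it avoids the rounding loss of subtracting a number very close to 1.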
@vaibhavmohite468 4 years ago
Can you explain the goodness-of-fit test in Python?
@rayhankabir645 8 months ago
Please, would you help me with this dataset?
@minakshi_119 3 years ago
Can anyone please help me with Expected_Values=val[3]? What does val[3] mean here?
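
A hedged sketch of what the indices hold, assuming `val` is the result of `scipy.stats.chi2_contingency`: index 0 is the test statistic, 1 the p-value, 2 the degrees of freedom, and 3 the array of expected frequencies under independence. The observed counts are hypothetical, and the manual calculation is included only to show where the expected values come from.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table of observed counts (illustrative only).
observed = np.array([[10, 20],
                     [30, 40]])

val = chi2_contingency(observed)
Expected_Values = val[3]  # expected frequencies if the variables were independent

# The same numbers by hand: row total * column total / grand total, per cell.
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
manual_expected = np.outer(row_totals, col_totals) / observed.sum()

print(Expected_Values)
print(manual_expected)
```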
@parikshitgurjar5545 3 years ago
Hello guys, please can anyone explain: the degree of freedom = 1 in the output, what does this "1" signify?
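
For a test of independence the degrees of freedom are (rows - 1) * (columns - 1), so a 2x2 table gives 1. A minimal sketch with two hypothetical tables; `degrees_of_freedom` is an illustrative helper name.

```python
import numpy as np
from scipy.stats import chi2_contingency

def degrees_of_freedom(table):
    """(rows - 1) * (cols - 1) for a two-way contingency table."""
    rows, cols = table.shape
    return (rows - 1) * (cols - 1)

# Hypothetical tables (illustrative counts only).
table_2x2 = np.array([[10, 20],
                      [30, 40]])
table_3x4 = np.array([[12, 15,  9, 14],
                      [10, 11, 13, 16],
                      [ 8, 14, 12, 11]])

print(degrees_of_freedom(table_2x2), chi2_contingency(table_2x2)[2])
print(degrees_of_freedom(table_3x4), chi2_contingency(table_3x4)[2])
```

Intuitively, once the row and column totals are fixed, only that many cells can vary freely; in a 2x2 table, choosing one cell determines the other three.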
@rushin3090 2 years ago
Can anyone send this playlist?
@amadoum.jallow620 4 years ago
Sir, can you please recommend a very good book for statistics?
@chetanmazumder310 4 years ago
What about ANOVA?
@manishbolbanda9872 4 years ago
Getting an error for sns.load_dataset('tips') even though I have imported seaborn.
@minakshi_119 3 years ago
Can anyone please help me with Expected_Values=val[3]? What does val[3] mean here?