Great video, kudos! But we should never say "we are accepting the null hypothesis"; we should say "we are failing to reject the null hypothesis", as there is always a possibility of error in our sample data.
@shreyaskulkarni76123 жыл бұрын
It's really true
@ammar462 жыл бұрын
Height and age don't follow a Poisson distribution; they follow a normal distribution.
@zijieliu66542 жыл бұрын
This is the best explanation of the t-test on YouTube, no doubt!!!
@nilupulperera4 жыл бұрын
Thank you Krish for the introduction to statistical tools in Python. Only now have I realized how to do comprehensive statistical analysis without depending on Microsoft Excel (and add-on software), which has limited capabilities in the data science field.
@manikantasai7215 жыл бұрын
Very near to...100k sir congrats sir!
@ivancarrillo18894 жыл бұрын
Thanks for the explanation. By the way this is the second video from you I watched. I understood you much more clearly this time I guess because of the microphone (non native English speaker). Please keep it in mind.
@mallikharjunv68053 жыл бұрын
Thank you so much Krish ..excellent.
@Ghodkeshubham6cool2 жыл бұрын
Very great learnings✌🏻
@shrikantagrawal66424 жыл бұрын
Summary:
One categorical variable: One sample t-test
Two categorical variables: Chi-square test
One continuous variable: T-test
Two or more continuous variables: Correlation and then T-test
One continuous and one categorical which has two categories: T-test
One continuous and one or more categorical which has more than two categories: ANOVA test
Two variables and you want to compute if their means are different: Two sample T-test
One variable and we have created one more variable based on the first variable by adding some proportion to it on a time basis: Paired sample T-test
@Krish - Please suggest
@cinemascope88474 жыл бұрын
Shrikant Agrawal please answer this Krish as it makes sense for all of us
@narensingh6728 Жыл бұрын
Thanks bro
@SameerAli-nm8xn Жыл бұрын
You are great Sir, that's a lot.
@mommysboy80152 жыл бұрын
Great teaching skill
@pavanaramu7213 жыл бұрын
The teaching in the previous video was excellent. The present video needs to include chalkboard activity for better understanding...
@sumittagadiya34974 жыл бұрын
very good explanation sir, thanks a lot
@venkivtz99614 жыл бұрын
Hi Krish, your explanation and example are excellent. But a small correction to the conclusions at the end of the test: we have to state the conclusion with respect to the alternate hypothesis. We should never say that "we are accepting the null hypothesis".
@sandipansarkar92114 жыл бұрын
Watched the video for the second time and practiced in a Jupyter notebook. Thanks
@harishkumar-zx6vg4 жыл бұрын
Why did he use np.random.seed(6)?
@harshdewangan19513 жыл бұрын
@@harishkumar-zx6vg np.random.seed(n) is used to make the random numbers reproducible, i.e., we will get the same set of numbers whenever the code is executed.
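For anyone who wants to see this in action, here is a minimal sketch. It assumes the ages are drawn with stats.poisson.rvs as quoted elsewhere in this thread; the helper name draw_ages is just for illustration and is not from the video.

```python
import numpy as np
from scipy import stats

def draw_ages(seed):
    # Hypothetical helper: reset NumPy's global random state, then draw 60 ages.
    np.random.seed(seed)
    return stats.poisson.rvs(loc=18, mu=30, size=60)

first = draw_ages(6)
second = draw_ages(6)

# Same seed, same draw: the two arrays are identical element by element.
print(np.array_equal(first, second))
```

Without the seed call, each run would give a different sample and therefore a slightly different mean and p-value.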
@skumarr535 жыл бұрын
Thank you so much, sir, for the effort you are putting to educate us all. I want to make a career transition in Artificial Intelligence in computer vision and the NLP processing field. My question is do I need to be familiar with the ML concepts like feature engineering etc and ML algorithms or is it enough if I focus only on Deep Learning. I don't see much overlap between those two but both are treated as part of Data Science in the industrial setup.
@snehakoul98184 жыл бұрын
0
@adiflorense1477 Жыл бұрын
1:49 krish, i have question. what t-test we use to see difference in machine learning model?
@soujanyabagam20343 жыл бұрын
Sir, what is the difference between the mean that you are passing as an argument in the Poisson distribution and the mean you are calculating subsequently?
@shivangikdesai2 ай бұрын
+1
@001Debjeet5 жыл бұрын
btw congo on 100k silver incoming
@sandipansarkar92114 жыл бұрын
Awesome video Krish, but don't forget to practice in a Jupyter notebook. Thanks
@batulkhan97724 жыл бұрын
Hey, at 7:07 you're saying to reject the null, but the p_value is more than 0.05.
@swarajkumarsahoo47364 жыл бұрын
yeah, and the output shown is "We are accepting null Hypothesis"
@questforprogramming4 жыл бұрын
We fail to reject the null hypothesis, because the p-value is > 5% (74% > 5%).
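A small sketch of that decision rule, in case it helps. The function name decide and the default alpha of 0.05 are my own choices for illustration; 0.74 is the p-value mentioned above.

```python
def decide(p_value, alpha=0.05):
    # Reject only when the p-value is below the significance level;
    # otherwise we "fail to reject" rather than "accept" the null.
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(decide(0.74))   # p-value from the first one-sample example in this thread
print(decide(0.01))   # a hypothetical small p-value for contrast
```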
@questforprogramming4 жыл бұрын
@@swarajkumarsahoo4736 He didn't run that cell at all. He ran it previously, meaning not in the video. So that output is wrong, I guess, and what he said is also wrong.
@BlueSkyGoldSun2 жыл бұрын
Yes, me too, I am confused. Did he make a mistake?
@06madhav5 жыл бұрын
Bhai, incredibly clear video. One doubt- how to go ahead with the hypothesis testing looking at the dataset? Means, how to decide whether any sort of such tests are required to be done on the dataset?
@mishuchugh17773 жыл бұрын
@madhav srimohan.. did you get this?
@divyanshuaswal1843 Жыл бұрын
@@mishuchugh1777 did you get this?
@wealth_developer_researcher3 жыл бұрын
Sir, I have a doubt. On timeline 7:03 you said we reject null hypothesis in this case. But p_value > 0.05 and output is we are accepting null hypothesis. Please correct me if i am wrong
@kaifahmed3162 жыл бұрын
Same here
@BlueSkyGoldSun2 жыл бұрын
I am also confused
@sh__-- Жыл бұрын
Please correct guys Accepting the Alternative hypothesis and rejecting the null hypothesis is the correct answer. Mistakes will happen sometimes😊 I am also a learner..👍
@louerleseigneur45323 жыл бұрын
Thanks Krish
@rambaldotra22213 жыл бұрын
Thanks a lot Sir
@Himanshusingh-ep1hc3 жыл бұрын
At 10:35 the p-value is 1.1390, which is greater than 0.05, but it still printed that we are rejecting the null hypothesis?
@Abhishek-st4mu3 жыл бұрын
Same here. At 10:00 I am confused by that statement: how can 0.05 be greater than 1.139?
@sumitmaiti22184 жыл бұрын
Great explanation sir... It clears the understanding of the concepts... I have just one doubt: How are we selecting which statement to be the Null Hypothesis and which one for the Alternate Hypothesis? Because based on that and the p value, we would come to the conclusion.... Thanks :)
@samerrkhann4 жыл бұрын
Usually the null hypothesis says there's no difference between two groups. For example, you draw a sample from a population and want to check if there is any difference between the mean of the sample and the mean of the population. You will make the null hypothesis that the two means are no different. Similarly, when comparing two groups, if you want to check whether their means are the same, you will set up the null hypothesis that there is no difference between the two. One last example: let's say you flip a coin 5 times and get heads 5 times. You will make the null hypothesis that your coin is no different from a normal coin. Hope this helps :)
@snehalpophale62872 жыл бұрын
Thank you so much!
@pratikshagwalwanshi86764 жыл бұрын
When we already established poisson mean (mu) as 30 in classA_ages=stats.poisson.rvs(loc=18,mu=30,size=60) Then why do we get different value for classA_ages.mean()?
@mahenderboda13393 жыл бұрын
I also have the same doubt. Did you get any answer?
@madhavilathamandaleeka59533 жыл бұрын
I also ....☹️.....and how can we take those mu values ..?? Plz anyone clear my doubt
@pratikshagwalwanshi86763 жыл бұрын
Nope didn't get it yet. If someone gets this doubt clear then please tell.
@sashpatra884 жыл бұрын
Hi Krish, Can you please share the ANOVA Implementation with Python video as I couldn't find it in your list?
@madhureddy53285 жыл бұрын
Why we do P or T or anova test? If we come to conclusion what we do with the dataset
@gopichand88743 жыл бұрын
Have you got the answer ?
@Arasu892 жыл бұрын
Hi Krish, after rejecting the null hypothesis or failing to reject it, what are the next steps we take with the data? Do we remove the features?
@anveshpoloju73315 жыл бұрын
Hi Krish, for beginners can you please suggest 'order' of preparing for DATA SCIENCE... For example 1st statistics 2nd python 3rd ML 4th DL.... Or simultaneously. Where to start exactly is confusion for many people.... Thank you
@megirija18974 жыл бұрын
Please upload a video on the implementation of ANOVA and the chi-square test...
@somtonnamah57343 жыл бұрын
please i would like to know if the distribution of data groups matter when checking correlation
@sumitsaurav17104 жыл бұрын
Hi Sir The value of mu selected as 30 in "classB_ages=stats.poisson.rvs(loc=18,mu=30,size=60)" is mean of what? and how does it differ from classB_ages.mean()
@kiranchowdary81004 жыл бұрын
Yes, same doubt. I think mu is the Poisson parameter.
@ankitgadwe22004 жыл бұрын
@@kiranchowdary8100 You are right. It is some parameter. You can check it here: docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.poisson.html
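To make the role of mu and loc concrete, here is a rough sketch. It assumes the same call quoted in this thread, stats.poisson.rvs(loc=18, mu=30, ...), but with a much larger size than the video's 60 so the sample statistics settle down. As far as I understand the SciPy docs, loc shifts every draw, so the expected mean is loc + mu = 48, while the variance stays at mu = 30.

```python
import numpy as np
from scipy import stats

np.random.seed(6)

# Same parameters as in the thread, but a large sample (my choice) to reduce noise.
ages = stats.poisson.rvs(loc=18, mu=30, size=100_000)

print(ages.mean())                      # close to loc + mu = 48
print(ages.var())                       # close to mu = 30 (loc does not change the spread)
print(stats.poisson.mean(30, loc=18))   # theoretical mean of the shifted distribution
print(stats.poisson.var(30, loc=18))    # theoretical variance
```

With only 60 draws the sample mean will wander a bit around 48, which would explain values like 46 and why .mean() and .var() do not match each other or mu.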
@architagarwal73792 жыл бұрын
Is bootstrapping enough to implement all of these: t-test, chi-square test, etc.?
@payalbhattad804810 ай бұрын
Hey Krish, it was a great explanation. I have 1 doubt though, I am trying to use it on my dataset where I am finding if there is a significant difference between gender based on salary. But I am getting p-value as nan even if there is no null values in the dataset. Whereas if I am performing the same on SPSS it gives a p-value of 0.43. any suggestions?
@soujanyabagam20343 жыл бұрын
sir, are you using poisson distribution to create a fake data set of ages since we dont have a real data set?
@veereshbk43944 жыл бұрын
99k subscribers were there while making this video, now it is 262k subscribers, means 262k-99k people are on the way to become data scientist during covid pandemic. 2021 will have less demand for data scientists as supply is increasing! This is just sample testing
@shahidraza7965 Жыл бұрын
Can someone tell me why the mean calculated differs from mu in one t test of class_ages
@DataInsights20013 жыл бұрын
Nice! Test control analysis is good for if promotion programs apply to test group and test whether there is any difference? Also nice to know that which direction to test, whether it is two tailed, left tailed, or right tailed? Also need to consider Type 1 error and type 2 errors?
@neelroy32 жыл бұрын
which statistical test can be used to find difference between two groups' percentage values?
@prashu25925 Жыл бұрын
can we perform hypothesis testing on multiple column data?
@itzmekallam72772 жыл бұрын
how can 1.13 < 0.05, at 10:10 , is it mistake or just Krish Naik logic
@NishatJillani Жыл бұрын
At the end it shows -13 in the exponent, which you are actually missing. So how would a number with a power of -13 be greater than 0.05?
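A quick way to check this yourself. The value 1.139e-13 is my reading of the numbers quoted in this thread (1.139 with a -13 exponent); the exact digits in the notebook may differ.

```python
p_value = 1.139e-13  # scientific notation: 1.139 times 10 to the power -13

print(f"{p_value:.3e}")    # the short form Python prints by default
print(f"{p_value:.16f}")   # the same number written out in full decimals
print(p_value < 0.05)      # far below the 5% threshold
```

So the printed "1.139..." is only the leading digits; the e-13 at the end is what makes it tiny.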
@gunjanchaturvedi-m1b Жыл бұрын
Let’s suppose 5 years ago, the average cost-per-person at a cafe was 300, has it changed now. (perform hypothesis testing to conclude). how to solve this which test need to perform here ..
@kusumakatamneni34045 жыл бұрын
Hi Krish we have to perform T-test in between two population is meaningful but how can we do on one population? How can we get significance difference on one population?
@i_black_hawk5 жыл бұрын
if you need to test effect of drug on a sample of population then same sample of population needs to be taken before and after drug dose . This is done to reduce bias. And in such type of stats modelling we use paired sample t test
@i_black_hawk5 жыл бұрын
Another example you can take is effect of workout on weight of a person. You need same sample of people whose weight were recorded before work out
@kusumakatamneni34045 жыл бұрын
@@i_black_hawk tanq so much
@nishabhatt5268 Жыл бұрын
Thanks for sharing this video it is very helpful, can you please advise on how to integrate python script for p value in power bi? Many thanks
@ankita6844 жыл бұрын
Thank you for this video Krish. In the two sample t test , 'ttest_ind' function you have taken equal _var to be 'False'. The code reads as : _,p_value=stats.ttest_ind(a=classA_height,b=ClassB_ages,equal_var=False). Shouldnt the 'equal_var' be True? As T test assumes that the populations have identical variances by default. Could you please check once. Thanks
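For comparison, a minimal sketch of both settings. As far as I know, equal_var=True is SciPy's default (the pooled-variance Student test) and equal_var=False switches to Welch's t-test, which does not assume equal population variances. The mu values 33 and 30 and the seed below are my own illustrative choices, not the video's.

```python
import numpy as np
from scipy import stats

np.random.seed(12)

# Hypothetical samples for two classes; parameters are illustrative only.
class_a = stats.poisson.rvs(loc=18, mu=33, size=60)
class_b = stats.poisson.rvs(loc=18, mu=30, size=60)

# Student's t-test: assumes both populations share one variance.
_, p_pooled = stats.ttest_ind(a=class_a, b=class_b, equal_var=True)

# Welch's t-test: drops the equal-variance assumption.
_, p_welch = stats.ttest_ind(a=class_a, b=class_b, equal_var=False)

print(p_pooled, p_welch)
```

When the two groups have the same size and similar spread, the two p-values come out very close, so the choice matters most when sizes or variances differ a lot.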
@munishasharma52655 жыл бұрын
Hi Krish, I hope you are doing well. Awesome explanation. However, I have one question: what will we do after the analysis of these p-values?
@piyushsaurav57914 жыл бұрын
Hypothesis testing has been used by analysts to make inferences about the population .These tests are done to answer business questions eg . A/B testing ( version A is better than version B etc)
@adityamahimkar61383 жыл бұрын
I'm not an expert, but I think on a dataset we can make a lot of changes statistically and train a model on the data, but using such tests we can first look for corrections in the data before training, thus saving time and computation power. Do correct me if I'm wrong, it's just a null hypothesis 😅 :)
@prasadrajmane46962 жыл бұрын
Thank u sir
@hvchetan14 жыл бұрын
How do we get to know whether the test is one tail or two tail test... How python interpret this thing whether it's a one tail or two tail test?? As we are not specifying that thing .
@sunilpatil19235 жыл бұрын
Hello, thanks for the detailed explanations, Why p value is 5% only? Why can't it is 10% or 8% or any other value? Pls clarify.
@krishnaik065 жыл бұрын
It is decided beforehand, and yes, it may change... this value is decided by domain expertise.
@Andynath1004 жыл бұрын
Look up statistical hypothesis testing (inferential stats). It depends on the significance level (alpha) chosen for the test. In stats this threshold is usually taken as .05 (5%), .01 (1%), or .001 (.1%); in simpler terms, it is how often we are willing to reject the null when the null is actually true. The t-test calculates a p-value, and if it is less than alpha (.05 or 5% in this case) we reject the null, because the probability of getting a sample at least this extreme if the null is true is very small (the magnitude of the p-value). Or in simpler terms, it's very unlikely to get this sample by chance. If you want a better explanation, please follow the Udacity Intro to Inferential Statistics course; it's free.
@abhinavjain55613 жыл бұрын
Sir in 1sample t test second example,we take the mean of classA_ages as 30 but in next step it is coming 46 so what about that
@shashikantrrathod36174 жыл бұрын
Hello Krish, What are the limitation of linear statistical test? why we choose non-linear classifier over linear classifier?
@rsinh37923 жыл бұрын
Sir reviewer has asked me this question I don't know how to address it, can you please guide me "Use some statistical significant test such as T-test or ANOVA to prove you validate the proposed diagnostic model on patients and quality improvements of your method". I have two datasets. Dataset 1 was used to train the model and dataset 2 was used to validate the trained model. I have trained the ML model deployed it and Validated it on new data and presented the results. Actually, I have understood the question. Shall I apply the statistical test between the performance metrics of trained model results and validation results? Please help me, sir.
@aksaini90633 жыл бұрын
Sir, if two independent features are highly positively correlated or highly negatively correlated, what is the best solution for this? Is it right to drop one feature?
@itplacementprep3 жыл бұрын
Summary : A One sample t-test tests the mean of a sample group against a known population mean. Two sample t test, An Independent Samples t-test compares the means for two groups. A Paired sample t-test compares means from the same group at different times (say, one year apart).
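The three tests in this summary map onto three SciPy functions. A rough sketch, using made-up data (the weights, means, and seed below are purely hypothetical and not from the video):

```python
import numpy as np
from scipy import stats

np.random.seed(0)

# Hypothetical data: weights of one group before and after a program,
# plus an unrelated second group.
before = np.random.normal(loc=70, scale=5, size=30)
after = before + np.random.normal(loc=-1, scale=2, size=30)
other = np.random.normal(loc=72, scale=5, size=30)

# One sample: compare the sample mean against a known population mean.
_, p_one = stats.ttest_1samp(before, popmean=70)

# Two independent samples: compare the means of two separate groups.
_, p_two = stats.ttest_ind(before, other)

# Paired samples: same subjects measured at two different times.
_, p_paired = stats.ttest_rel(before, after)

print(p_one, p_two, p_paired)
```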
@pramitbanerjee43813 жыл бұрын
Why the classA_ages.mean() and classA_ages.var() not equal and what is the role of mu if it is not equal to classA_ages.mean()?
@santiagorey13824 жыл бұрын
very good video but please tell me, if the two features are statistically different, p_value> 0.05, then does that mean we should discard or keep that feature?
@autismblessingindisguise4 жыл бұрын
Could not see the next video on chi Sq test implementation using python.. Please load it soon. Thanks
@tannurohela61922 жыл бұрын
Great explanation sir, but I didn't get why are we using np.random.seed() ? Can anyone please help with the seed thing.
@meenadalvi97433 жыл бұрын
@5.43 the p value is greater than 0.05 so in that case we should accept the NULL hypothesis.Correct me if am wrong.
@Abhishek-st4mu3 жыл бұрын
Same here. At 10:00 I am confused by that statement: how can 0.05 be greater than 1.139?
@sohinibanerjee96173 жыл бұрын
In the second one sample t test how is p value of 1.13 less than 0.05? Can someone please explain.
@NishatJillani Жыл бұрын
At the end it shows -13 in the exponent, which you are actually missing. So how would a number with a power of -13 be greater than 0.05?
@zionramdinthara84034 жыл бұрын
Hi Krish, can i compare two different population using t test. I want to compare the height of plants overtime with controlled and uncontrolled temperature. I actually have different datasets for both. Please help
@dipk.mishra4 жыл бұрын
Sir , How do u decide whether Null is there is no difference ? Is there any logic behind?
@hemantsharma79864 жыл бұрын
Is one sample t test and one tail t test same?
@mohinimarathe87693 жыл бұрын
GOD OF STATS :)
@dharamjeetsingh29364 жыл бұрын
Krish, I have 3 years of experience as a business resilience analyst, but we only use Excel, not Python, SQL, or Tableau. Do you think I have an advantage for becoming a DS?
@sohinimitra51314 жыл бұрын
In the first example of ages, you passed the expected NULL hypothesis value as 30 [ttest(ages_sample,30)]. Shouldn't it be 0? [ttest(ages_sample,0)]. Since The NULL hypothesis states the difference between mean of population and sample is 0. Why is the population mean passed there? Also, in many scenarios we will not have access to the population mean too.
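One way to see why 30 is passed: as far as I understand it, ttest_1samp(a, popmean) tests the null "the mean of a equals popmean", which is the same as testing "the mean of (a minus popmean) equals 0". A small sketch with a made-up sample (the mu=12 and size=20 below are my own choices, not the video's):

```python
import numpy as np
from scipy import stats

np.random.seed(6)

# Hypothetical sample of ages centred around 30.
ages_sample = stats.poisson.rvs(loc=18, mu=12, size=20)

# Null hypothesis: the population mean is 30.
t_direct, p_direct = stats.ttest_1samp(ages_sample, popmean=30)

# Equivalent formulation: the mean of the differences from 30 is 0.
t_shifted, p_shifted = stats.ttest_1samp(ages_sample - 30, popmean=0)

print(p_direct, p_shifted)
```

Both calls give the same statistic and p-value, so the 0 in "difference is 0" is already built in once you pass the hypothesized mean.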
@lancelotdsouza47052 жыл бұрын
pls explain this on a real dataset
@c.dharmeshwaran34704 жыл бұрын
In the two-sample t-test, were the samples/groups selected from the same population or from 2 different populations?
@soumyaranjansahu42623 жыл бұрын
Hi Krish, in this school age problem you have taken sample size = 60, which is more than 30. Hence, shouldn't you calculate the p-value based on the z-distribution rather than the t-distribution?
@amansinghrathore83083 жыл бұрын
The t-test can be applied to any size (even n>30 also).
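A quick numerical check of why this works: the t distribution's critical value approaches the normal one as the sample grows, so for n = 60 the two tests give nearly the same answer. The sample sizes in the loop are my own picks for illustration.

```python
from scipy import stats

# Two-sided 5% critical value of the standard normal distribution.
z_crit = stats.norm.ppf(0.975)

# The matching t critical value for several sample sizes (df = n - 1).
t_crits = {n: stats.t.ppf(0.975, df=n - 1) for n in (10, 30, 60, 1000)}

print(z_crit)
for n, t_crit in t_crits.items():
    print(n, t_crit)
```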
@manitachakraborty23483 жыл бұрын
can u please solve this T test problem without python
@arpitjaiswal59723 жыл бұрын
Why is the t-test used? Because no information is given for the population SD, so the Z test can't be used. If the population SD were given, then you would use the Z test. The t distribution is a normal distribution divided by the square root of a scaled chi-square. Check the formula and you will be able to find the relation.
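For reference, the relation mentioned here written out, as I understand the standard definitions (the symbols Z, V, and ν are my own notation, not from the video):

```latex
% Definition of the t distribution with \nu degrees of freedom
T = \frac{Z}{\sqrt{V/\nu}}, \qquad Z \sim \mathcal{N}(0,1), \quad V \sim \chi^2_{\nu}, \quad Z \text{ and } V \text{ independent}

% One-sample t statistic (sample SD s replaces the unknown population SD)
t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}, \qquad \nu = n - 1

% Z statistic when the population SD \sigma is known
z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}
```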
@anupamjamatia Жыл бұрын
Hi, your tutorial is great, but I have a doubt regarding statistical significance in this scenario: I train on data in language Lang0 and generate a model. Afterward, using the Lang0 model, I test on other languages like Lang1, Lang2 ... Lang5 with different algorithms like AlgoA, AlgoB, and AlgoC and get the accuracy. So in that case, is it possible to do a statistical significance test? No cross-validation is done while training. Say I have:
Lang   Algo1  Algo2  Algo3  Algo4  Algo5
Lang1  80     32     95     93     96.67
Lang2  88     11     98     97     92.51
Lang3  49     12     76     80     72.75
Lang4  81     2      95     94     77.7
Lang5  81     43     95     96     94.95
@adidbaker76073 жыл бұрын
Hey guys, I've got a doubt. In the first one-sample t-test he said he is rejecting the null hypothesis when the p-value is 0.740, which is higher than 0.05, so isn't he supposed to accept the null hypothesis??
@richasharma7968 Жыл бұрын
I have the same doubt. Can anyone explain it?
@NishatJillani Жыл бұрын
At the end it shows -13 in the exponent, which you are actually missing. So how would a number with a power of -13 be greater than 0.05?
@pratikbambulkar89813 жыл бұрын
But why we used hypothesis for ML?
@sunitam10252 жыл бұрын
sir, can you provide pdf of this
@OnkarSingh-rg5jp3 жыл бұрын
Sir, in what case do we divide the p-value by 2?
@siyabongamyeza53152 жыл бұрын
He was supposed to divide the p-value by 2 since its a two sided test. Two sided test occur when you use the word "difference". When it is one sided, i.e either less than or greater than, you do not divide alpha by 2
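If it helps, here is a sketch of how the one-sided and two-sided p-values relate in SciPy. To my knowledge the t-test functions return the two-sided p-value by default, and the alternative argument (available in SciPy 1.6 and later) gives the one-sided versions directly. The sample below is made up for illustration.

```python
import numpy as np
from scipy import stats

np.random.seed(1)

# Hypothetical sample; we test against a population mean of 30.
sample = np.random.normal(loc=31, scale=4, size=40)

# Default: two-sided test (H1: mean != 30).
t_stat, p_two_sided = stats.ttest_1samp(sample, popmean=30)

# One-sided tests (H1: mean > 30, and H1: mean < 30).
_, p_greater = stats.ttest_1samp(sample, popmean=30, alternative="greater")
_, p_less = stats.ttest_1samp(sample, popmean=30, alternative="less")

print(p_two_sided, p_greater, p_less)
```

The one-sided p-value in the direction of the observed statistic is half the two-sided one, which is where the "divide by 2" rule comes from when you start with a two-sided result but want a one-sided test.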
@lopamudrachandra24933 жыл бұрын
Thank you so much for your video. Your channel is really helpful for students who cannot afford to online courses. I would like to know if I join your 59/month membership will it help me learn better on Data Science overall?
@ankeshsingh25764 жыл бұрын
If you execute the function ttest_1samp(), the p_value keeps changing after every execution, varying around 0.05. How can we fix it?
@shreyasaxena51694 жыл бұрын
If you execute random.choice then it will resample and change mean accordingly. For same sample , p value cannot vary.
@questforprogramming4 жыл бұрын
Fix a number in random state.
@utsavroy53464 жыл бұрын
What if I reverse the assumptions? I means if H0 becomes H1 and vice versa. In that case how to move ahead?
@akashprabhakar63534 жыл бұрын
Yes, you can, but you need to ensure that the null hypothesis is chosen in such a way that you can conduct the experiment based on it. For example, your observation is that you got 10 heads in 10 coin tosses, and you want to check whether the coin is biased. If you take H0 (null hypothesis): the coin is biased, then the problem is how you will find the p-value or conduct the experiment, because the coin can be biased with any probability. Suppose instead you take H0: the coin is unbiased. Then the probability of getting 10 heads in 10 tosses is (0.5)^10, since the probability of heads on a single toss is 0.5 when the coin is unbiased. Now you will get the value 0.00097.
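The arithmetic in this example can be checked in a couple of lines. This is only a sketch of the calculation described above; using stats.binom.pmf as a cross-check is my own addition.

```python
from scipy import stats

# Probability of 10 heads in 10 tosses if the coin is fair (H0: p = 0.5).
p_manual = 0.5 ** 10

# Same quantity from the binomial distribution: P(X = 10) with n = 10, p = 0.5.
p_binom = stats.binom.pmf(10, 10, 0.5)

print(p_manual)          # 0.0009765625
print(p_manual < 0.05)   # True: such a result is very unlikely under a fair coin
```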
@devmani1004 жыл бұрын
Since you are dealing with the sample and not the population, the relationship you might be getting from the sample may be due to random chance. The idea behind the null hypothesis is that the relationship you are observing in the variables is due to randomness. So, my null hypothesis is always of the form "There is no relationship between the selected variables". This is what I have derived from all the sources from the StatsLand :D. Please correct me if I am wrong.
@pratikbhansali40864 жыл бұрын
What are we even achieving by doing one sample t Test
@thousandsunny1004 жыл бұрын
ttest, p_value = ttest_1samp(covid, 30) TypeError: 'module' object is not callable
@ppriyesh304 жыл бұрын
Sir, this is unfair...I am just trying to build concepts of Data science and I come to know that there has been some term used which are totally new. Some times you import preprocessing, sometimes model_selection, sometime metrics, and, now, import maths..and poisons distribution and scipy stats.. please let us know when to choose what..Thanks..
@ppriyesh304 жыл бұрын
Specially please help us with the scikit learn library..remaining I guess has not that much importance
@ManishKumar-qs1fm5 жыл бұрын
Please explain correlation in detail, because I am confused about this.
@adeyinkaAdedejiNaMe Жыл бұрын
Educative video, but you calculated the population mean to be 30.4375, so why did you assume the population mean to be 30?
@shrikantdeshmukh79513 жыл бұрын
Poisson distribution it's not poison distribution
@BlueSkyGoldSun2 жыл бұрын
Fix the mistake ,how come 1.13 is less than 0.05?
@karanbisht63592 жыл бұрын
It was 1.12 * 10^-something
@SoumyaDasgupta3 жыл бұрын
Krish likes WWE. My Man
@aws61433 жыл бұрын
If you have a brain, it should be like this; after all, even Pappu is alive.
@vaibhavberiwal4 жыл бұрын
Watch khan academy videos for a more intuitive and in-depth understanding of the concepts :)
@rsinh37923 жыл бұрын
Sir, a reviewer has asked me this question and I don't know how to address it. Can you please guide me? "Use some statistical significant test such as T-test or ANOVA to prove you validate the proposed diagnostic model on patients and quality improvements of your method." I have trained the ML model, deployed it, validated it on new data, and presented the results. Actually, I have understood the question. Shall I apply the statistical test between the performance metrics of the trained model results and the validation results? Please help me, sir.