Hypothesis testing Practical Implementation|Hypothesis testing with data example in python

46,375 views

Unfold Data Science


Comments: 146
@kshitijdesai2402 1 year ago
This is the best video for practical hypothesis testing that I have watched!
@UnfoldDataScience 1 year ago
Thanks Kshitij
@isaackodera9441 1 year ago
You are God-sent. Exactly what I have been looking for.
@UnfoldDataScience 1 year ago
Thanks a lot
@out_aloud 3 years ago
This is what I had been looking for... Great and simplified content! Awesome!
@UnfoldDataScience 3 years ago
Thanks Nilut.
@beyou7893 3 years ago
Thank you, sirji. From what I saw, the video was recorded at 1:52 am... shows the damn dedication 🙏
@UnfoldDataScience 3 years ago
Thanks for watching.
@DimiqBaba 5 months ago
My man recorded this at 2 AM to bring us content, much love
@santosh44kumar 3 years ago
Hi Aman, at 5:12 in the video the graph is positively (right) skewed, but you stated it is left skewed. Any input on this would be appreciated.
@UnfoldDataScience 3 years ago
Thanks for the input, I will check. Sometimes I get confused between left and right. In general also 🤣
@whatsuppk6466 1 year ago
@Unfold Data Science Haha, it's true. I remember it by the position of the outliers: if the outliers are on the left, it's left skewed; if the outliers are on the right, it's right skewed.
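The tail-position rule above can be checked numerically. The following is a minimal sketch, not taken from the video: the exponential sample, the seed, and the variable names are assumptions chosen for illustration. A long right tail should give positive skewness, and mirroring the data should flip the sign.

```python
import numpy as np
from scipy.stats import skew

# Illustrative data only: an exponential sample has a long right tail.
rng = np.random.default_rng(0)  # seed chosen arbitrarily
right_tailed = rng.exponential(scale=1.0, size=5000)
left_tailed = -right_tailed  # mirror image, so the long tail is on the left

right_skewness = skew(right_tailed)
left_skewness = skew(left_tailed)

print(f"right-tailed sample skewness: {right_skewness:.2f}")  # positive
print(f"left-tailed sample skewness:  {left_skewness:.2f}")   # negative
```

A positive value means right (positive) skew and a negative value means left (negative) skew, which matches the outlier-position mnemonic.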
@anuragchandnani8037 4 years ago
Hi Aman, I have read about this concept so many times but never understood it like I did today. Thanks!
@UnfoldDataScience 4 years ago
Most welcome, Anurag. Your comments keep me motivated :)
@dev5289 3 years ago
Watching this video before the interview. Thanks a lot.
@UnfoldDataScience 3 years ago
Welcome Devansh.
@maaleem90 1 year ago
And brother, what was the result? I hope you got in.
@shubhamchoudhary5461 3 years ago
Straightforward explanation, literally amazing. I want to learn teaching skills from you, sir. Thanks!
@UnfoldDataScience 3 years ago
Welcome Shubham.
@APARNASHAHARE99 2 years ago
Hello sir, while performing the Shapiro test I am getting nan and 1 as the output for stat and p. Can you please tell me what the reason could be?
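One common cause of a nan statistic is missing values in the column being tested; whether that applies to this commenter's data is an assumption, since the thread does not say. The sketch below uses made-up data with a few NaN entries injected, and shows removing them before calling `scipy.stats.shapiro`.

```python
import numpy as np
from scipy.stats import shapiro

# Made-up sample with a few missing entries (assumption for illustration).
rng = np.random.default_rng(3)
values = rng.normal(loc=100.0, scale=15.0, size=60)
values[[5, 17, 42]] = np.nan  # simulate missing data

# Drop the missing entries before running the normality test.
clean = values[~np.isnan(values)]
stat, p = shapiro(clean)

print(f"kept {clean.size} of {values.size} values")
print(f"stat={stat:.3f}, p={p:.3f}")
```

With a pandas column, `df["col"].dropna()` plays the same role as the boolean mask here.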
@mukeshkumar-kh2fh 2 years ago
Far better explanation than any other YouTube channel. Awesome work, Aman.
@UnfoldDataScience 2 years ago
Thanks a lot, Mukesh.
@mukeshrar265 4 years ago
Thank you so much, Aman, for making hypothesis testing very simple.
@UnfoldDataScience 4 years ago
My pleasure, Mukesh :)
@k10-w3s 3 years ago
In the chi-square test, must the number of subcategories in feature one equal the number of subcategories in feature two, or can they differ? Example 1: feature 1: pass, fail, grace; feature 2: study, no study, mixed. Example 2: feature 1: pass, fail, grace; feature 2: study, no study. Which one is correct?
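The chi-square test of independence works on any r x c contingency table, so the two features do not need the same number of categories. Here is a minimal sketch for the second layout in the question (3 outcomes by 2 study habits); the counts are hypothetical and were invented for this example.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts. Rows: pass / fail / grace. Columns: study / no study.
observed = np.array([
    [30, 10],
    [8, 22],
    [12, 18],
])

chi2, p_value, dof, expected = chi2_contingency(observed)

# Degrees of freedom are (rows - 1) * (columns - 1), here (3 - 1) * (2 - 1).
print(f"chi2={chi2:.2f}, p={p_value:.4f}, dof={dof}")
print("expected counts under independence:")
print(expected)
```

A 3 x 3 table (the first layout in the question) is handled the same way; only the degrees of freedom change.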
@anuradhabalasubramanian9845 3 years ago
Thanks for a simplified explanation, sir.
@UnfoldDataScience 3 years ago
Welcome.
@randommagic3073 7 months ago
This is what I was looking for, and suddenly it appeared on my feed yesterday. I watched it and am feeling blessed. Thanks for this content. Can I add this project to my resume somehow? If yes, what would the title and description be?
@YashpalNSharma 4 years ago
Very simple and effective summary of various hypothesis tests and statistical thinking. Thank you!
@UnfoldDataScience 4 years ago
Thanks Yashpal :)
@YashpalNSharma 4 years ago
Unfold Data Science I have always had a very basic question. How do we decide on the null hypothesis statement? It could also be the exact opposite of the current statement, and the results would be reversed as well. I think I'm missing something basic but am unsure what it is. Please clarify. Thanks in advance!
@UnfoldDataScience 4 years ago
Very good question; it shows you are not satisfied until you understand the concept well. Coming to the answer, take the t-test as an example. When you go to the documentation of the package, for example the t-test in scipy for Python, you will see it clearly states that the null hypothesis is that the means are equal. This is just one example; for other packages and tools, the null hypothesis is likewise stated in the documentation.
@YashpalNSharma 4 years ago
Unfold Data Science Sure Aman. Will check it out. Thank you for clarifying.
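The point in the reply above, that the test itself fixes the null hypothesis, can be seen with `scipy.stats.ttest_ind`, whose null is that the two independent samples have equal means. This is a minimal sketch on synthetic data; the group means, sizes, and seed are assumptions for illustration and do not come from the video.

```python
import numpy as np
from scipy.stats import ttest_ind

# Synthetic groups whose true means differ by 5 (illustrative values).
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50.0, scale=5.0, size=200)
group_b = rng.normal(loc=55.0, scale=5.0, size=200)

# Null hypothesis of ttest_ind: both samples have the same population mean.
t_stat, p_value = ttest_ind(group_a, group_b)

alpha = 0.05
if p_value < alpha:
    print(f"p={p_value:.3g} < {alpha}: reject the null of equal means")
else:
    print(f"p={p_value:.3g} >= {alpha}: fail to reject the null of equal means")
```

The null is usually the "no difference" or "no effect" statement because that is the case for which the test statistic's distribution is known.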
@jyotigoyal2295 3 years ago
Very nice explanation, sir.
@UnfoldDataScience 3 years ago
Thanks Jyoti.
@nayemhasan5015 2 years ago
Simple and usable. Great video.
@bennurakash 3 years ago
Thank you for the data science notes in Python.
@UnfoldDataScience 3 years ago
Welcome :)
@yogeshbharadwaj6200 2 years ago
Very nice explanation with perfect examples, thanks a lot.
@UnfoldDataScience 2 years ago
Most welcome Yogesh
@santoshvaidya3752 3 years ago
Aman - very good, keep it up
@UnfoldDataScience 3 years ago
Thanks Santosh.
@ramakrishna1008 3 years ago
Great explanation, simple and straight to the point. Can you please share the Jupyter notebook shown in this video?
@letsplay0711 2 years ago
9:32, how did you decide that a value above 0.85 or below -0.85 is correlated? Please respond.
@dnyanjal5471 2 years ago
You have not mentioned how to state which is the null hypothesis and which is the alternate hypothesis. Please explain this.
@shivanshjayara6372 3 years ago
Sir, by saying that the observations are identical, can we also say that their dispersion from the mean is the same? Or does that statement mean it is assumed by default that the data is normally distributed? Is that what the null hypothesis is?
@kartikmeher8061 2 years ago
Thank you 🙏🙏🙏
@UnfoldDataScience 2 years ago
Welcome Kartik
@shreyasb.s3819 4 years ago
Very good explanation. Thanks
@UnfoldDataScience 4 years ago
Glad it was helpful Shreyas!
@ArpitYadav-ws5xe 4 years ago
Excellent Aman. You are excellent. Keep making more videos. Keep it up
@UnfoldDataScience 4 years ago
Thank you so much Arpit 😀
@sudipnarayanchoudhury1112 3 years ago
@Aman, what does the 'stat' output signify? We get two results as the output of the tests: one is the p-value and the other is stat (like stat=0.97 & p=0.0047). What is that stat value?
@harshmohan8419 2 years ago
I am not able to understand: when we have linear regression ML models to calculate R-squared and other parameters, why would we calculate the Spearman coefficient? Why would we test, when the ML model shows lots of parameters such as precision and the confusion matrix? I'm a bit confused; any guidance will be helpful.
@ravneetkaur7278 2 years ago
PLEASE reply to these queries. These misconceptions must be cleared up, as these are foundational concepts of data science.
1. In Pearson correlation you are checking the p-value here, but in the last video you said we DO NOT CHECK the p-value; instead we check the closeness of 2 variables on the scale of -1 to 1.
2. The normality test is just to check the SHAPE, i.e. whether the data is normally distributed or not? What is the USE of the p-value here? How is checking the p-value in Shapiro beneficial? What is the null hypothesis here? I didn't understand.
3. I didn't understand the DIFFERENCE between the Shapiro test and the K^2 normality test. What are the VARIABLE TYPES here: categorical, continuous, etc.?
4. Are Pearson and Spearman just 2 WAYS of checking correlation? Can either be chosen at random to check correlation, or are there SPECIFIC RULES for choosing?
5. At 9:58 you said "value > 0.85, on neg side value below 0.85, is considered big". WHERE are we taking 0.85 from? What is its significance? I didn't understand.
6. Moreover, correlation is applied to CONTINUOUS vs CONTINUOUS data ONLY and measured from -1 to 1, whereas chi-square is applied to CATEGORICAL vs CATEGORICAL data to check the p-value. Then WHY are we COMBINING chi-square and correlation when working on categorical vs categorical? It's confusing.
Thanks in advance!
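On question 4, one standard distinction is that Pearson measures linear association while Spearman measures monotonic association through ranks. The sketch below is not from the video; the data (`y = exp(x)`) is an assumed example picked because it is perfectly monotonic but far from linear.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Assumed example: y grows monotonically with x, but not linearly.
x = np.linspace(1.0, 10.0, 50)
y = np.exp(x)

pearson_r, pearson_p = pearsonr(x, y)
spearman_rho, spearman_p = spearmanr(x, y)

print(f"Pearson r    = {pearson_r:.3f}")    # below 1: relationship is not linear
print(f"Spearman rho = {spearman_rho:.3f}")  # 1: ranks of x and y agree exactly
```

So the choice is not random: Pearson suits roughly linear relationships between continuous variables, while Spearman is the safer choice for monotonic but non-linear relationships, ordinal data, or data with strong outliers.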
@ravanshyam7653 3 years ago
Sir, at exactly 5:13 you said the graph is left skewed, but it is actually right skewed. Am I right?
@UnfoldDataScience 3 years ago
Yes, I sometimes confuse left and right. Maybe you are right, thank you.
@umashankarverma3179 4 years ago
Very good explanation sir 👌
@UnfoldDataScience 4 years ago
Keep watching, Uma.
@techinfo89 1 year ago
Great explanation. How do we test the null hypothesis?
@divyamsaxena3198 3 years ago
Can we have a hypothesis testing practice session on different data, please?
@datascienceworld7041 3 years ago
Basically, the p-value is used to check whether the null hypothesis is true or not? Is that correct?
@UnfoldDataScience 3 years ago
Yes Amrit
@datascienceworld7041 3 years ago
And its aim is to reject the null hypothesis and accept the alternate hypothesis?
@varshakotwal3726 4 months ago
How can we transform a non-normal distribution to normality in SPSS?
@souravbiswas6892 4 years ago
Awesome explanation 👍👍
@UnfoldDataScience 4 years ago
Thanks again Sourav. Your comments are my power engine.
@souravbiswas6892 4 years ago
You are creating the video at 2 am, which clearly shows your dedication and love towards data science. Fully appreciated.
@dr.vinodkamble4618 2 years ago
For which kind of data do we need to do a normality test?
@sagarmestry5514 3 years ago
Aman, I have one doubt: do we have to do hypothesis testing on every problem statement, or only on regression-based ones to check the assumptions?
@UnfoldDataScience 3 years ago
Hi Sagar, ideally yes, to understand more about your data.
@NinadNakhwa 2 years ago
Hi Aman, is it left-skewed data or right-skewed?
@UnfoldDataScience 2 years ago
Hi Ninad, ah, I think I might have said it the other way around. Can you read the previous comments please? I think I have clarified this.
@satyaki44 4 years ago
Great playlist, Aman Bhaiya! Please keep up your good work.
@UnfoldDataScience 4 years ago
Thank you, I will
@geekyprogrammer4831 2 years ago
This content is not widely available. The way you simplified a complicated concept is remarkable!
@UnfoldDataScience 2 years ago
Thanks a lot. Your comments keep me going.
@quamar0313 2 years ago
Thanks bro. Please make a video on what to do after this testing: we got some useful information about the dataset, so what next?
@UnfoldDataScience 2 years ago
As soon as possible
@pramod3469 4 years ago
Hi Aman, please explain where and why we use all these statistical tests in a machine learning model.
@UnfoldDataScience 4 years ago
Answered below.
@rsinh3792 3 years ago
Sir, a reviewer has asked me this question and I don't know how to address it. Can you please guide me? "Use some statistical significance test such as t-test or ANOVA to prove you validate the proposed diagnostic model on patients and quality improvements of your method". I have two datasets. Dataset 1 was used to train the model and dataset 2 was used to validate the trained model. I have trained the ML model, deployed it, validated it on new data, and presented the results. I have understood the question. Shall I apply the statistical test between the performance metrics of the training results and the validation results? Please help me, sir.
@jayashreehv5222 3 years ago
Hi Aman, for correlation, why is a heatmap or correlation table alone not sufficient? Why should it be tested again with p-values or statistical techniques?
@UnfoldDataScience 3 years ago
A heatmap becomes difficult to analyze when you have more features.
@vishalrai2859 3 years ago
Sir, please rearrange the playlist; part 2 is coming before part 11.
@shriyagoswami8452 3 years ago
Can you explain what the stat value is?
@UnfoldDataScience 3 years ago
Stat value: in which part of the video did I speak about it? Just so I can correlate.
@shriyagoswami8452 3 years ago
@UnfoldDataScience stat, p = spearmanr(data1, data2). What is this stat?
@RohanB-xg6vg 3 years ago
What is the stat variable returned by those functions? stat, pvalue = spearmanr(firstsample, secondsample)
@UnfoldDataScience 3 years ago
Please refer here for a detailed explanation: docs.scipy.org/doc/scipy/reference/generated/scipy.stats.spearmanr.html
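For `spearmanr`, the first returned value is the Spearman rank correlation coefficient (between -1 and 1) and the second is the p-value for the null hypothesis of no monotonic association. A minimal sketch with synthetic data follows; `data1` and `data2` reuse the names from the question, and their contents are made up. It also shows that the statistic equals the Pearson correlation computed on the ranks.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# Synthetic, positively related samples (illustrative only).
rng = np.random.default_rng(1)
data1 = rng.normal(size=100)
data2 = data1 + rng.normal(scale=0.5, size=100)

# stat is the correlation coefficient; p is the p-value of the test.
stat, p = spearmanr(data1, data2)

# Spearman's coefficient is Pearson's r applied to the ranks of the data.
rank_corr, _ = pearsonr(rankdata(data1), rankdata(data2))

print(f"stat={stat:.3f}, p={p:.3g}, pearson-on-ranks={rank_corr:.3f}")
```

For other tests the first value is that test's own statistic (for example W for Shapiro-Wilk or chi-square for the chi-square test), so its scale differs from test to test.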
@fahadnasir1605 2 years ago
Thank you for sharing this amazing tutorial. Will you be able to share the CSV file of the data you have used?
@UnfoldDataScience 2 years ago
In the description
@spicytuna08 1 year ago
What does it mean when categorical variables are dependent or independent? I understand it when 2 continuous variables are dependent or independent.
@UnfoldDataScience 1 year ago
Categorical variables will be the dependent (target) variable in classification scenarios.
@vishaldas6346 3 years ago
Hi Aman, I owe you a big one; this is the best explanation ever of hypothesis testing. I have a doubt: suppose I have 2 categorical features and they are dependent. In feature selection, can we drop 1 feature?
@UnfoldDataScience 3 years ago
Thanks Vishal. Yes, for testing one variable at a time.
@ramshaazeemi8851 3 years ago
How do we decide, if p > 0.05, whether the variable is independent or dependent?
@UnfoldDataScience 3 years ago
It depends on which hypothesis we have taken.
@kalaisiva7437 3 years ago
@UnfoldDataScience Sir, please provide some examples for p > 0.05. I am having the same confusion.
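As a hedged illustration of the p > 0.05 case, take the chi-square test of independence, whose null hypothesis is that the two categorical variables are independent. The two tables and the helper `interpret` below are hypothetical and were written for this example; 0.05 is the conventional significance level, not a value fixed by the test.

```python
from scipy.stats import chi2_contingency

def interpret(p_value, alpha=0.05):
    """Hypothetical helper: turn a p-value into a decision about the null."""
    if p_value < alpha:
        return "reject null: evidence that the variables are dependent"
    return "fail to reject null: no evidence of dependence"

# Made-up 2x2 tables of counts.
balanced = [[25, 25], [25, 25]]  # identical proportions in both rows
lopsided = [[40, 10], [10, 40]]  # proportions differ sharply between rows

_, p_balanced, _, _ = chi2_contingency(balanced)
_, p_lopsided, _, _ = chi2_contingency(lopsided)

print(f"balanced: p={p_balanced:.3f} -> {interpret(p_balanced)}")
print(f"lopsided: p={p_lopsided:.3g} -> {interpret(p_lopsided)}")
```

Note that p > 0.05 does not prove independence; it only means the data do not give enough evidence to reject it.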
@parthbisht5597 3 years ago
Sir, what if we want to plot a correlation test on more than 2 independent categorical features? Please answer.
@abhishekgaurav7786 3 years ago
Bro, can we use the correlation function for categorical features?
@UnfoldDataScience 3 years ago
For anything more than 2D, a 3D chart is needed.
@UnfoldDataScience 3 years ago
Not Pearson correlation.
@aa4734 3 years ago
Hi sir, please help me with deep learning and NLP notes, as I'm not finding a clear path.
@UnfoldDataScience 3 years ago
You can find a lot of content on the web.
@chetangondaliya 3 years ago
Thank you so much for the explanation. Can you provide the dataset which you used for this experiment?
@UnfoldDataScience 3 years ago
drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M
@chetangondaliya 3 years ago
@UnfoldDataScience This contains only the Jupyter file. I want the CSV file which you used for reading the data.
@r21061991 3 years ago
Great video. Can you also make a video on the math behind all the tests? Would be great!
@dracula5505 3 years ago
Sir, what do we do when one column is categorical and the other is numerical?
@UnfoldDataScience 3 years ago
ANOVA will work.
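A minimal sketch of the ANOVA suggestion, assuming one categorical column with three levels and one numerical column. The group names, means, and seed are invented for illustration; `scipy.stats.f_oneway` takes one array of numerical values per category, and its null hypothesis is that all group means are equal.

```python
import numpy as np
from scipy.stats import f_oneway

# Synthetic numerical values for three hypothetical category levels.
rng = np.random.default_rng(5)
group_low = rng.normal(loc=10.0, scale=2.0, size=50)
group_mid = rng.normal(loc=12.0, scale=2.0, size=50)
group_high = rng.normal(loc=15.0, scale=2.0, size=50)

# One-way ANOVA: null hypothesis is that all group means are equal.
f_stat, p_value = f_oneway(group_low, group_mid, group_high)

print(f"F={f_stat:.2f}, p={p_value:.3g}")
```

With a DataFrame, the per-category arrays can be built with something like `[g.values for _, g in df.groupby("cat_col")["num_col"]]`, where both column names are placeholders.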
@riteshtripathi8626 3 years ago
Hi, I would appreciate it if you could share the loan_status dataset that you used in hypothesis testing (I checked your Google Drive as well). I searched for the exact one on Kaggle, but it looks like a different version to me. Thanks.
@UnfoldDataScience 3 years ago
Hi Ritesh, I hope you are doing well and staying safe with your near and dear ones. See if you can find the data here: drive.google.com/drive/folders/1XdPbyAc9iWml0fPPNX91Yq3BRwkZAG2M
@lekshmia.k.8264 2 years ago
Thanks a lot for the video, Aman. I have a doubt: what should I do if I have fewer than 25 data points for categorical-categorical variables? If the chi-squared test doesn't give correct values, is there any other test I can use in that case?
@UnfoldDataScience 2 years ago
Fisher's exact test
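A minimal sketch of Fisher's exact test on a small 2x2 table, using `scipy.stats.fisher_exact`. The counts are hypothetical (16 observations in total), chosen only to illustrate the small-sample case raised in the question.

```python
import numpy as np
from scipy.stats import fisher_exact

# Hypothetical small-sample 2x2 contingency table (16 observations).
table = np.array([
    [8, 2],
    [1, 5],
])

# Null hypothesis: the two categorical variables are independent.
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"odds ratio={odds_ratio:.1f}, p={p_value:.4f}")
```

Unlike the chi-square test, this computes the p-value exactly from the hypergeometric distribution, so it does not rely on large expected counts. `fisher_exact` as used here is for 2x2 tables.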
@praneethaluru2601 3 years ago
Sir, can you make a video on why these tests are used in machine learning, which algorithms are affected by them, and how the effect happens?
@UnfoldDataScience 3 years ago
Sure Praneeth, thanks for suggesting. Noted.
@bsarath438 3 months ago
@UnfoldDataScience Sir, please provide a link for the above comment if you have made the video.
@dataartist5195 3 years ago
Hello sir, your playlist is not arranged sequentially.
@UnfoldDataScience 3 years ago
Let me check
@sowmyatushar7487 3 years ago
It is right skewed, not left skewed. In the video you mentioned it as left skewed.
@UnfoldDataScience 3 years ago
Yes, sometimes I get confused, just like left/right on the road. Sorry if I said left skewed.
@Amansingh-tr1cf 3 years ago
u saved my ass.. thanks Aman
@UnfoldDataScience 3 years ago
Cheers Sir Jee :)
@tariqanwar8097 4 years ago
Sir, do more on practical hypothesis testing.
@UnfoldDataScience 4 years ago
Sure Tariq. More explanation in Part 2 is on the way.
@2galacticos 4 years ago
Aman, thank you for the clear and precise explanation of hypothesis testing. I wanted to ask how often you use hypothesis tests in your day-to-day EDA. I have seen many courses where in EDA we only do univariate, bivariate, and multivariate analysis, but no one explains the use of hypothesis testing in EDA.
@UnfoldDataScience 4 years ago
Hi Abhishek, that's a good question. Hypothesis testing is a lot of hard work, hence parts of it are sometimes skipped. However, the more you do, the better you will know your data.
@hoangha6680 5 months ago
The code "from numpy.random import randn" doesn't always generate normally distributed data. I tried it and the test result was "not normally distributed". But thanks for the video anyway.
@UnfoldDataScience 5 months ago
Thanks for the info!
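The observation above is expected behaviour rather than a bug in the generator: at a 5% significance level, a normality test rejects truly normal samples about 5% of the time (a type I error). The sketch below repeats the Shapiro-Wilk test on many standard normal samples; the seed, sample size, and number of trials are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(7)  # arbitrary seed
alpha = 0.05
n_trials = 200

rejections = 0
for _ in range(n_trials):
    sample = rng.standard_normal(100)  # genuinely normal data
    _, p_value = shapiro(sample)
    if p_value < alpha:
        rejections += 1  # false alarm: normal data flagged as non-normal

rejection_rate = rejections / n_trials
print(f"rejected normality in {rejection_rate:.1%} of {n_trials} normal samples")
```

The rejection rate should land near alpha, so an occasional "not normal" verdict on `randn` output is consistent with how the test is designed.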
@akar_excel 5 months ago
What is a normal distribution?
@UnfoldDataScience 5 months ago
A normal distribution, also known as the Gaussian distribution, is a continuous probability distribution characterized by its bell-shaped curve. It is fundamental in statistics and is often used to represent real-valued random variables with a distribution that is symmetric around the mean.
@ragsgags5901 2 years ago
Please share the loan_status.csv file; it is not available in the drive.
@shilashm5691 2 years ago
It is right skewed
@UnfoldDataScience 2 years ago
Ok
@harshkashyap7822 9 months ago
This video is worth as much as a paid live class, and they charge 50K for the entire course.
@samratshimpi5093 4 years ago
It is the best video I have seen so far on hypothesis testing. My name is Ameya Girhe. I have sent you a request on LinkedIn. Could you please accept?
@UnfoldDataScience 4 years ago
Thanks Samrat. I will accept.
@PraveenKumar-kc3ge 3 years ago
Sir, please provide the notebooks for this video.
@sandipansarkar9211 3 years ago
Finished watching
@UnfoldDataScience 3 years ago
Thanks Sandipan for following my videos in the proper order. I am sure you will understand the topics.
@sandipansarkar9211 3 years ago
@UnfoldDataScience I talked with a friend, Sourav Biswas, and he said he knows you. You live in Bangalore (Varthur). I am also in Bangalore (Kadugodi). He is also in Bangalore.
@cocgamingstar6990 1 year ago
It is right-skewed data
@UnfoldDataScience 1 year ago
My mistake. I believe I acknowledged it earlier in the comments.
@sagarlokare5269 1 year ago
Confusing
@UnfoldDataScience 1 year ago
Thanks for the feedback, Sagar. I will have a look at the video.