Median, Mean, Mode, Percentile | Math, Statistics for data science, machine learning

163,541 views

codebasics

Comments: 128
@vikashdas1852 3 years ago
Subscribing to this channel proved to be a lot more helpful than enrolling in college for a degree.
@zishanafzal6671 3 years ago
You are the best teacher and have the best content on data analysis. No need to go to any other channel.
@codebasics 3 years ago
I am happy this was helpful to you.
@georgetzimas6882 3 years ago
2:45 When you added an extra value, you did not sort the values in ascending order: it should be (7000, 7500, 8000) instead of (7000, 8000, 7500).
@HeavvvyRock 3 years ago
Mean, median and mode are also known as 'measures of central tendency'; percentiles are measures of position.
@programmingwithraahim 3 years ago
Yeah bro, I heard about them on Khan Academy.
@belfloretkoriciza5279 2 years ago
Thank you so much Sir you're a good teacher and you're different from others because of the practice you demonstrate
@Damian-lu8sx 3 months ago
Clear, to the point, with real life examples. I've been learning pandas and I decided to do a recap on math. The fact that you provided examples in pandas is the happiest coincidence I've come across this week. Thank you!
@accountingsapayag 2 years ago
As a beginner, I should say this is the best.
@chilledvibes5700 2 years ago
I have no words to say, really awesome series!
@wasimrajamiddya7560 2 years ago
Thank you, Sir, for making such beginner-friendly videos. I really enjoyed them and learned a lot. Please make more videos like this so that we can understand easily. ❤️
@instagramstarstories6682 11 months ago
Tell me Your insta id bro...plz
@Akshay-vq1uv 2 years ago
Your content and examples are great😃. Please don't stop making such easily explained content.
@arunadang7872 3 years ago
This series is a masterpiece. Thank you.
@ankurhalke139 2 years ago
Yeah . So true ...*uck education system
@bilalzubair6843 2 years ago
The best video to understand the concept of removing outliers
@Gadgets-v2w a year ago
In the median example at 2:40, shouldn't we order the values first before deciding which value is the median? Shouldn't the values be ordered like this: 4,000 < 5,000 < 6,000 < 7,000 < 7,500 < 8,000 < 8,000 < 10 million, so that the median is the average of 7,000 and 7,500, which is 7,250?
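The point about sorting can be checked with a minimal sketch using only Python's standard library. The incomes are the eight values quoted in the comment above, deliberately left unsorted; the helper name `median_of` is hypothetical and not taken from the video.

```python
from statistics import median

def median_of(values):
    """Median computed by hand: sort first, then take the middle."""
    ordered = sorted(values)          # the step the comments say was skipped
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the single middle value
        return ordered[mid]
    # even count: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

# Incomes quoted in the comment, in an arbitrary unsorted order.
incomes = [4000, 5000, 6000, 7000, 8000, 7500, 8000, 10_000_000]

by_hand = median_of(incomes)
by_library = median(incomes)          # statistics.median also sorts internally
print(by_hand, by_library)
```

Both calls agree because the library function sorts a copy of the data before picking the middle values, so the order in which the incomes are entered does not matter.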
@MuhammadUmar-px6ij 3 years ago
Before exploring the Codebasics channel, I never had an interest in math and stats. Thanks, bro. Love and respect from Pakistan.
@codebasics 3 years ago
I am happy this was helpful to you.
@ravidawade5178 3 years ago
Sir, please make a video for freshers on a real-life data science project. Your teaching is so simple that everyone can understand it very easily.
@albertology 5 months ago
you earned a sub man!!! what an explanation
@sandeepwalia2430 13 days ago
Hi Dhaval, Greetings! Thanks a lot for creating wonderful videos for everyone; these videos are invaluable. I was watching your Mean, Median, Mode video and found one issue with the example at 2 mins and 47 secs, where you explained the median for an even number of values. By mistake you didn't sort the values: you computed the median as (7000+8000)/2 = 7500, but it should be (7000+7500)/2 = 7250.
@Baburao_Aapte 3 years ago
Your way of teaching is incredible; I love your videos. Whenever anyone asks me where I learned all this, I share the link to your channel with my juniors.
@codebasics 3 years ago
Thanks for sharing! I am happy this was helpful to you.
@kIocuchl2 a year ago
2:43 The values should be sorted, and the median will then equal (7000+7500)/2.
@kakumanusridhanalakshmi3203 a year ago
Ultimate explanation 🎉 Got a good idea of when to use the mean and the median.
@andresfrr100 3 years ago
Hi! At 2:44, for the median you take Tao and Prem, but the values must first be sorted; then Prem is not counted in the median, but Sofia is. So m = (Tao + Sofia)/2?
@parishjain159 2 years ago
Sir your way of teaching is very awesome
@alokbhushan9026 7 months ago
At 3:02, adding Prem to the dataset disturbs the ascending sort order. So the median should really be (7000+7500)/2 = 7250.
@ridayefatima-w5i 5 months ago
percentile_95 = df.price.quantile(0.95) Sir, I used this approach and it gives about 350. If I increase the quantile value, outliers with a very large gap come in, so I removed all values above this threshold.
@kelvinticllahuanacohuachac9562 2 years ago
Besides being educational, this was also an enjoyable video. Thanks a lot, sir.
@sudarshanm.s6736 6 months ago
Sir, how is the median of the data points 7500? After arranging the values in ascending order, the median has to be the average of Tao's and Sofia's incomes, so it will be (7000+7500)/2 = 7250, right?
@sandeepwalia2430 13 days ago
Hi Dhaval, Greetings! Thanks a lot for creating wonderful videos for everyone; these videos are invaluable. I was watching your Mean, Median, Mode video and found one issue with the example at 7 mins and 17 secs, where you explained the 25th percentile of the dataset. Here the position of the 25th percentile is 1.75, which is less than 2 data points. As per my understanding, the value should lie between the first two data points (4750), or be approximately the second data point, which is 5000. Please let me know if my understanding is incorrect. I really appreciate any help you can provide.
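One way to see where 5500 comes from is the linear-interpolation convention, which is the default for pandas' `quantile`. The sketch below is a hand-written illustration of that convention, not the video's code. The seven incomes are an assumption, reconstructed from the percentiles quoted elsewhere in these comments (minimum 4000, median 7000, quartiles 5500 and 7750), and `percentile_linear` is a hypothetical helper name.

```python
def percentile_linear(values, q):
    """Percentile with linear interpolation between the two nearest ranks.

    q is a fraction between 0 and 1. The position is q * (n - 1) on
    0-based sorted data, which is the convention assumed in this sketch.
    """
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)                      # index just below the position
    upper = min(lower + 1, len(ordered) - 1)   # neighbour above, clamped at the end
    fraction = position - lower                # how far between the two neighbours
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Assumed dataset: seven incomes reconstructed from the comments.
incomes = [4000, 5000, 6000, 7000, 7500, 8000, 10_000_000]

p25 = percentile_linear(incomes, 0.25)   # position 1.5: halfway between 5000 and 6000
p50 = percentile_linear(incomes, 0.50)   # position 3.0: exactly 7000
p75 = percentile_linear(incomes, 0.75)   # position 4.5: halfway between 7500 and 8000
print(p25, p50, p75)
```

Under this convention the position is 0.25 * (7 - 1) = 1.5 rather than 0.25 * 7 = 1.75, which is why the result lands between the second and third values. Other conventions exist and give slightly different answers on small datasets.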
@gowthamannarayanan361 5 months ago
Thank you very much for the detailed and nice explanation. I have a question: do we need to remove outliers all the time? What if the salary range is consistent, without an unusually high salary (like Elon Musk in the use case mentioned)?
@aquapisces a year ago
16:04 df.income.iloc[3] = np.nan will work too
@kirankapruwan8892 a year ago
While calculating the median (when the number of data values is even), we need to sort the data values in ascending order.
@pavan2926 3 years ago
Only one word: loved your explanation.
@locu83 2 years ago
Exactly what I wanted a mentor 👍🏻❤️🙂.
@sathesht7532 3 years ago
Hi sir, thanks a lot for your extraordinary teaching. I have learned a lot and did my homework by following your machine learning tutorial. Sir, can you do a video about Generative Adversarial Networks (GANs) for regression prediction?
@shubhampathare4892 a year ago
In the example at 3:00 you haven't sorted the data in ascending order for the median.
@balajib.9561 3 years ago
Sir upload real life data science project 👍😁
@codebasics 3 years ago
On YouTube, search for "codebasics data science project" and you will find my videos. Please watch them.
@harishkannan8023 3 years ago
Beautiful explanation
@codebasics 3 years ago
Glad it was helpful!
@pijushdhar7310 4 months ago
Sir, your quartile calculation seems to be wrong. The formula for the rank of the 25th percentile is 25/100*(7+1), which is 2. This is universally accepted. It means the value should be 5000 only. I really don't know how pandas is also making the same mistake.
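The two answers (5000 and 5500) come from two different quantile conventions, and Python's standard library offers both. The sketch below assumes the same seven incomes reconstructed from the comments; it is an illustration, not the video's code. In `statistics.quantiles`, `method="exclusive"` places the cut at rank p * (n + 1), the formula in the comment above, while `method="inclusive"` uses rank p * (n - 1) + 1, which matches linear interpolation over the observed range.

```python
from statistics import quantiles

# Assumed dataset: seven incomes reconstructed from the comments.
incomes = [4000, 5000, 6000, 7000, 7500, 8000, 10_000_000]

# Rank p * (n + 1): the 25th percentile sits at rank 2, i.e. the second value.
exclusive = quantiles(incomes, n=4, method="exclusive")

# Rank p * (n - 1) + 1: the 25th percentile sits at rank 2.5,
# halfway between the second and third values.
inclusive = quantiles(incomes, n=4, method="inclusive")

print(exclusive)   # quartiles under the (n + 1) convention
print(inclusive)   # quartiles under the (n - 1) convention
```

Neither result is a mistake. They are different definitions that converge as the dataset grows, so the difference is mostly visible on tiny examples like this one.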
@shreyas_._ 3 years ago
One of the best tutorials ❤️🔥
@codebasics 3 years ago
Glad it was helpful!
@iamthebearerofchrist 3 months ago
why is using median better than leaving musk out and getting the average of the rest? is it compulsory for all data to be used?
@siddharudtevaramani1055 3 years ago
Example of Mode is lit 😀
@ridayefatima-w5i 5 months ago
Really great, no rote memorization.
@himanshusemwal1889 3 years ago
Again, great video, Sir. I have a silly doubt. As you said, we can't use the average to fill null values if an outlier has a very large value, like Elon Musk ($10 million), so we take the median to fill NaN values instead. But the NaN values themselves may sit in the middle of the data points. So how do we calculate the median if NaN values are present at those points: median = (nan+nan)/2?
@abhijeetjain2098 3 years ago
maybe you can take the median of non-null values and fill up
@shutterup24-7 3 years ago
I think to take the median of a dataset we first have to rearrange the data in ascending order, which will shift the position of the NaN value!!
@samvhora9076 2 years ago
@shutterup24-7 Yes, that's the first step.
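The suggestion in this thread (take the median of the non-null values, then fill) can be sketched with the standard library alone. The income list below is hypothetical, with one missing entry placed in the middle on purpose. In pandas the equivalent is usually filling with the column's median, which skips NaN by default.

```python
import math
from statistics import median

# Hypothetical income column with one missing entry (NaN) in the middle.
incomes = [4000, 5000, 6000, math.nan, 7000, 8000, 10_000_000]

# Step 1: drop the missing entries before computing the median,
# so NaN never takes part in the sort or the average.
known = [x for x in incomes if not math.isnan(x)]
fill_value = median(known)

# Step 2: put the median back wherever a value was missing.
filled = [fill_value if math.isnan(x) else x for x in incomes]
print(fill_value, filled)
```

Because the NaN is removed before sorting, its original position in the list has no effect on the result.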
@Kennerdoll 11 days ago
14:33 What if you have thousands of data points and you don't know the 100th percentile?
@gauravvats3964 25 days ago
I love maths even though I flunked this subject in college. I have 5+ years of dev experience and want to move to data science. Please share some beginner projects.
@MarinaYt-c2l a month ago
When can we use the median? When we have an outlier.
Use case 1: descriptive analysis (for cars).
Median, odd count: it is the middle value.
Median, even count: add the two middle numbers and divide by 2 (their average).
Use case 2: handling missing values.
4:27 Summary
4:52 Outlier: a data point that is very different from the rest of the data points.
How to remove the outlier? Percentile.
The 50th percentile of this dataset is 7000 (the median): 50% of my data has value < 7000 and 50% of my data has value > 7000.
100th percentile? What is the value for which 100% of the data is <= that value? Answer: 10M.
0th percentile? Obviously 4000.
What is the 25th percentile? ❌ Not median / 2 ❌
🟩 Total number of values = 7. 25% of 7 is 0.25 * 7 = 1.75. This is not a whole number, so it is approximately 2 values. So we pass the first two values, and the 25th percentile is the midpoint between that value and the one after it: (the value before + the value after) / 2 = (5000+6000)/2 = 5500.
What does the 75th percentile mean? What is 75% of the 7 values? 7 * 0.75 = 5.25, so (7500+8000)/2 = 7750.
The range between the 25th and the 75th percentile is the IQR (interquartile range). The normal range is the range between the min and max, but the IQR
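The quartiles in these notes lead directly to an IQR-based outlier check, sketched below with the standard library. The seven incomes are an assumption reconstructed from the numbers in the notes, and the 1.5 × IQR fences are a common rule of thumb that the notes do not mention, so treat the multiplier as an assumption too.

```python
from statistics import quantiles

# Assumed dataset: seven incomes reconstructed from the notes above.
incomes = [4000, 5000, 6000, 7000, 7500, 8000, 10_000_000]

# 'inclusive' reproduces the 5500 and 7750 quartiles from the notes.
q1, _, q3 = quantiles(incomes, n=4, method="inclusive")
iqr = q3 - q1                        # spread of the middle 50% of the data

# Rule-of-thumb fences (assumption): 1.5 * IQR beyond each quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in incomes if x < lower_fence or x > upper_fence]
kept = [x for x in incomes if lower_fence <= x <= upper_fence]
print(iqr, lower_fence, upper_fence, outliers)
```

Unlike the min-to-max range, the IQR is barely affected by the single very large income, which is what makes it useful for flagging that income as an outlier.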
@bhuralal5299 3 years ago
Thanks for making this video; it's very helpful.
@Javeria_jamil00 4 months ago
Sir, how did you remove the outlier at the end?
@unstoppablesaad1978 2 months ago
I am a junior data analyst with less than a year of experience. If I apply for jobs, is it expected of me to be able to code advanced Python functions? Right now I feel that I can only understand code by debugging it; if I try to write similar code, I can't. I know which function does what, and if a problem statement is given I can identify what we should do to achieve the result, but I am unable to implement it. Please give your opinion on this. Coming from a non-IT background, I always have a sense of insecurity that I don't know Python well enough.
@abhinavkumbalwar6837 a year ago
Very informative video.
@Murlik1604 3 years ago
One very basic question: should outlier removal also be applied to labels (the values to be predicted) if outliers exist in those labels?
@architchaudhary1791 3 years ago
No
@annonymous. 2 years ago
Why don't we fill missing values with mode? Mode is the one that appears most but why we use mean and median most of the time?
@soheilpalermo491 3 years ago
Thank you that was very informative content.
@Malayalam_learner 2 months ago
I'm hoping that by the end of covering the whole playlist I'll become a master of data science and then gain knowledge of ML, DL, and LLMs.
@mivaangadewadvlogs 3 years ago
Hi Sir, can we use multiple medians for multiple NaN values like you did in Sofia's case?
@BusinessAnalyticsTV 3 years ago
Awesome learning 🆗😎👍
@codebasics 3 years ago
Glad you enjoyed it
@mimosveta 3 years ago
am I just scatter brain, or did you not include the link to video where you explain how to use iqr to remove outliers? I only see a link to a playlist, but none of them seem to be on that particular topic? EDIT: okay, seems you explained it later in this video, but it really sounded like you had a link for us...
@codebasics 3 years ago
mimosveta, you are right. I forgot to include a link, but I just added it now. Please check the video description.
@d3v487 2 years ago
Hi, I have a dataset with 3 independent categorical features and 5 dependent features: the 10th, 25th, 50th, 75th, and 90th percentiles of annual wage. How can I get the values of annual wage (which is missing) from the 5 percentile columns?
@lathaloganathan4429 a year ago
So, how do we identify that there is an outlier in the dataset? Please clarify.
@momincomputer9967 a year ago
great sir 🥰
@VishalSingh-dv2vg 7 months ago
Sir, what if data is missing at or below the 25% or 75% mark? How do we find the average then? Please reply.
@mvcutube 3 years ago
Thanks for such a nice tutorial
@codebasics 3 years ago
Glad it was helpful!
@cyptowithkelv a year ago
do you have any full course on data analysis?
@iamthebearerofchrist 3 months ago
links to softwares used?
@vidhikapadia9700 3 years ago
What is the difference between the 0.99 and 0.999 quantiles, given that the exercise uses 0.999?
@renuprasadnaidu7554 5 months ago
Sir, could you please add the assignment link?
@ParulBedi 3 years ago
what is the difference between Linear Quantile and Midpoint quantile ??
@SURAJKUMAR-ug4oi 3 years ago
Sir, isn't it possible that Sofia's income was actually very high, in which case the median would not work well?
@ParamitasPotpourri 8 months ago
I'm nearly 50. I completed an MCA from IGNOU and digital marketing from NIIT Imperia. I worked as a software developer and now I'm a digital marketer. If I want to change my career to data science after learning this field, can I get a job in the data science field?
@shantanughode275 3 years ago
Is the amount of statistics required for data science and data analytics the same?
@prathampatel582 a year ago
Why can't we use a trimmed mean?
@arupgorai2320 3 years ago
Sir I want to know which language is very important? Should we start with Java or python
@saikatdutta1991 8 months ago
Consider my data points: 100 100 100 100. Here the 50th percentile, which is 100, is kind of misleading, right? Because two more 100 values sit to the right of the median. So 100% of the data values are equal to the 50th percentile. Can you please explain where I am confused?
@shivasiddharthnarayanan 4 months ago
I am still not sure, so you can double-check with someone else too. In your case, you should consider the mode as your measure and ignore the mean and median.
@_craig_ 3 years ago
Nice video. I would like to suggest a change. 100th percentile doesn't exist, only 99th. In your example, Musk would have to be earning higher than himself to be the 100th percentile.
@friendonymous a year ago
What is the difference between average and mean?
@HitmanBlitz15 3 years ago
Sir can u explain the steps to become a data analyst and skills required for that
@codebasics 3 years ago
On YouTube, search for "codebasics learn data analyst skills" and you will find my videos. Please watch them.
@HitmanBlitz15 3 years ago
@codebasics Thank you, sir.
@sundar6323 3 years ago
Is Careerera a good institute to join as a beginner? I'm a final-year ECE student.
@mayur_variya1219 a year ago
In the case of an even number of data points, you have not sorted them, so the median is wrong.
@universal4334 3 years ago
Suppose the data is like this: 4, 4, 6, 7, 40, 100, 110, 120, 1300. In this case taking the median doesn't make sense, right? The same goes for the mean, since the outlier 1300 is involved. And the mode doesn't make sense either, since 4 just repeats twice. What should we do in this case? Please, anyone, answer me. Could we find a solution from this video?
@codebasics 3 years ago
Taking mode of 4 is perfectly ok because you are looking for a value that is most frequently occurring and 4 is that value. It really depends on what problem you are trying to solve here. Can you suggest what type of dataset this is? You just made up the values and are generally curious about such distribution?
@universal4334 3 years ago
@codebasics I just took it as an example. But just because 4 repeats twice, we can't blindly take 4 to fill the missing value, right? It is far smaller than the other, higher values.
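The three measures discussed in this thread can be computed side by side for the nine values quoted above, using only Python's standard library. This is a quick sketch to make the comparison concrete; it does not settle which measure fits the underlying problem.

```python
from statistics import mean, median, mode

# The values quoted in the thread.
values = [4, 4, 6, 7, 40, 100, 110, 120, 1300]

avg = mean(values)        # pulled upward by the single 1300
mid = median(values)      # the 5th of the 9 sorted values
most_common = mode(values)  # 4, the only value that repeats

print(round(avg, 2), mid, most_common)
```

The mean lands far above most of the values, the median sits at the fifth value, and the mode reflects only the repeated 4. Which of these is a sensible fill value depends on what the column represents, as the reply in the thread points out.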
@pankajjoshi8292 a year ago
Where might the Power BI course be?
@micagar2510 3 years ago
Should we first learn pandas then attempt exercises?
@philtoa334 3 years ago
so clear, thx.
@codebasics 3 years ago
Glad it helped!
@shariqueansari9921 3 years ago
Sir, I need your suggestion. Can you help me ?
@dimpisayed9710 3 years ago
How can i code in Jupyter, just like you.
@ramananagavelli3055 11 months ago
How do you know that your data has outliers?
@catherinezeng4917 3 years ago
Hi, I'm a bit confused by the solution to the exercise. To me, outliers should not simply be removed by percentile. We should exclude rows with 365 availability and 0 reviews, and rows with 0 availability and 0 reviews, because those listings are just "ghost" listings that no one actually rents, or the data is simply inaccurate. Going further, we should probably also clean the data by review date: I see some rows dated 2011, but if we are analyzing the average for this or a recent year, there should be a cutoff year for the data we use. Please let me know your thoughts. Thanks.
@codebasics 3 years ago
Totally agreed with your thoughts here. Percentile is just one of the ways, using common sense simple logic is totally a legit way of treating outliers
@catherinezeng4917 3 years ago
@codebasics Thank you for replying to me so quickly. So if I first apply what I said in my comment and then apply the percentile, is that going to be right, or let's say more accurate? Also, how do we measure the accuracy? Should the mean be close to the 50th percentile? How do we know whether our analysis is good or bad? Thank you so much!
@TradewithSalim a year ago
2:44 median will be 7250
@mrrshaqproduction1255 a month ago
Legend
@sagarhirapara5455 3 years ago
Sir, how can I contact you?
@wallahengineer9989 8 months ago
Those who get the Sandeep Jain sir (GFG) reference, raise your hands 😅😅
@EvolutionTube957 3 years ago
why 0.999 in the exercise ?
@arshad1781 3 years ago
nice
@universal4334 3 years ago
It would have been good if you had taught why the median and mode should not be used in some cases.
@financewithsom485 3 years ago
removing elon from twitter as an outlier is also great
@troubution 3 years ago
The funniest part is if Elon Musk lives in our town😂😂
@codebasics 3 years ago
Ha ha.. yes he is my neighbor ☺️🧐
@Desarrolloestructrur4lñi4322 2 years ago
🙏🏻
@akshitsinghal8590 3 years ago
Sorry sir, you missed one part in the video: first we have to sort the numbers (when the count of numbers is even, while finding the median).
@aravintht3774 3 years ago
1:55
@ankurhalke139 2 years ago
This is legend . Go to hell teachers and education system...
@RH-hv4ir 6 months ago
The video is great, but I didn't like the exercise because it involves more than what was covered in the video.
@Malayalam_learner 2 months ago
I didn't get percentile in first glance
@jayasurya3864 2 years ago
You really wish musk to be your neighbour it seems