Cleaning NBA Stats Data With Python And Pandas: Data Project [part 2 of 3]

13,930 views

Dataquest

Comments: 34
@patrickmurray1988 2 years ago
Thanks for this project. I'm currently working through the Data Analyst in Python path and it's fun seeing the things I'm learning being put to use on projects outside of the lessons.
@SuperSumittanwar 2 years ago
This is the second video I have seen from Dataquest, and it feels like Vik is the rockstar. The way he thinks about and solves data problems is remarkable 🤩
@futureverse9347 1 year ago
As a former professional athlete who is looking to learn data science and statistics, THANK YOU for this great work!
@Data_Man 2 years ago
This was very helpful in applying things I just recently learned into something I have interest in. Plus now I have a dataset to do all kinds of analysis with.
@Dataquestio 2 years ago
Glad to hear it! -Vik
@nicesoundworks7954 2 years ago
Thanks to you and the DQ team. Another DQ student here, working through and enjoying the Data Scientist path. You are doing a great, great job. Again: [ ∞ 🙏 ]
@Dataquestio 2 years ago
Thanks a lot! Glad you're enjoying it :)
@IamDeftly 1 year ago
I'm running into a bit of an issue. After I combine the MVPs and Player data and I try to look at the data by sorting through the "Pts Won" for the MVPs, the data I get in the table has NaN values for everything except for the pts won, pts max and share. I'm not sure what went wrong or how to fix this. Any help?
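One possible cause (an assumption, since the original code isn't shown) is that some player names don't match between the two tables, for example because of a trailing `*`, so an outer merge keeps the MVP row but has no player stats to attach to it. A minimal diagnostic sketch, using made-up player names and numbers and pandas' `indicator=True` option:

```python
import pandas as pd

# Hypothetical miniature versions of the players and MVP tables.
players = pd.DataFrame({
    "Player": ["Player A*", "Player B"],  # note the stray "*" on the first name
    "Year": [2021, 2021],
    "PTS": [26.0, 28.0],
})
mvps = pd.DataFrame({
    "Player": ["Player A", "Player B"],
    "Year": [2021, 2021],
    "Pts Won": [900.0, 500.0],
})

# indicator=True adds a "_merge" column saying where each row came from:
# "both", "left_only" or "right_only".
combined = players.merge(mvps, how="outer", on=["Player", "Year"], indicator=True)

# Rows that failed to match are the ones that end up full of NaN values.
unmatched = combined[combined["_merge"] != "both"]
print(unmatched[["Player", "Year", "_merge"]])

# Cleaning the key column before merging removes the mismatch.
players["Player"] = players["Player"].str.replace("*", "", regex=False)
fixed = players.merge(mvps, how="outer", on=["Player", "Year"], indicator=True)
print(fixed)
```

If `unmatched` is not empty in your own data, comparing the `Player` and `Year` values on the two sides usually shows what needs cleaning.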
@AndresIniestaLujain 2 years ago
How would you have dealt with null values in your dataframe? E.g., in a given season, 3P% is 'null' for a player who had 0 3PA. Would you complete the data, leave it null, or is it context-dependent? If you would complete it, would you replace the null value with a calculation of their career average? If leaving it null, could you still run correlations without the null values affecting the corr values too heavily? Thanks! Really learning a lot from this series.
@Dataquestio 2 years ago
Hi Andres - it is definitely context dependent. It depends on what you'll be doing with the data. In this case, we're predicting who will win MVP. Someone with 0 3P attempts won't get any MVP votes, so we can replace it with 0 or drop the row. If you were trying to calculate a rating for each player, then replacing with career averages would make more sense. When finding correlations, rows with null values usually aren't considered, so you'd still be fine!
@AndresIniestaLujain 2 years ago
@Dataquestio Thanks for your answer! Makes sense. However, I would disagree that someone with 0 3PT attempts won't get MVP votes. Ex: Paint-dominant players like Shaquille O'Neal in 1997-98 :)
@Dataquestio 2 years ago
That's a fair point :) I was thinking modern era!
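The null-handling options discussed in this thread can be sketched in pandas as follows. The data is made up for illustration and the column names are assumptions; which option is right depends on what the data will be used for.

```python
import numpy as np
import pandas as pd

# Hypothetical season stats: player "A" took no threes, so 3P% is null.
stats = pd.DataFrame({
    "Player": ["A", "B", "C", "D"],
    "3PA": [0, 120, 300, 50],
    "3P%": [np.nan, 0.35, 0.40, 0.30],
    "PTS": [20.0, 15.0, 25.0, 10.0],
})

# Option 1: replace the null percentage with 0.
filled = stats.fillna({"3P%": 0})

# Option 2: drop the rows where the percentage is null.
dropped = stats.dropna(subset=["3P%"])

# Correlations skip rows where either value is null, so leaving the
# nulls in place gives the same result as dropping those rows first.
corr_with_nulls = stats["3P%"].corr(stats["PTS"])
corr_dropped = dropped["3P%"].corr(dropped["PTS"])
print(corr_with_nulls, corr_dropped)
```

Filling with 0 does change the correlation, because the filled row then takes part in the calculation, so it is worth deciding on the fill strategy before computing any statistics.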
@afasfafafas 2 years ago
Great project! Looking forward to the ML model!
@Dataquestio 2 years ago
We'll be uploading this later in the week!
@bobbyjordan4532 2 years ago
Great project! By the way, the * means that the player was an All-Star in that specific year!
@Dataquestio 2 years ago
Awesome, thanks! -Vik
@danielowolabi6891 2 years ago
Thanks for this project. Are we supposed to create the CSV file that maps team nicknames to team names ourselves, or can it be accessed somewhere?
@Dataquestio 2 years ago
Hi Daniel - you can find the mapping file here - github.com/dataquestio/project-walkthroughs/blob/master/mvp/nicknames.txt
@justDlight 2 years ago
The nicknames CSV file can be created from the list below. It's a manual process, so you should do it yourself.
ATL Atlanta Hawks
BKN Brooklyn Nets
BOS Boston Celtics
CHA Charlotte Hornets
CHI Chicago Bulls
CLE Cleveland Cavaliers
DAL Dallas Mavericks
DEN Denver Nuggets
DET Detroit Pistons
GSW Golden State Warriors
HOU Houston Rockets
IND Indiana Pacers
LAC Los Angeles Clippers
LAL Los Angeles Lakers
MEM Memphis Grizzlies
MIA Miami Heat
MIL Milwaukee Bucks
MIN Minnesota Timberwolves
NOP New Orleans Pelicans
NYK New York Knicks
OKC Oklahoma City Thunder
ORL Orlando Magic
PHI Philadelphia 76ers
PHX Phoenix Suns
POR Portland Trail Blazers
SAC Sacramento Kings
SAS San Antonio Spurs
TOR Toronto Raptors
UTA Utah Jazz
WAS Washington Wizards
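A minimal sketch of turning a list like the one above into CSV text with Python's standard library. Only the first three entries are included here, and the header names `Abbreviation` and `Name` are assumptions rather than the layout of the file used in the video. The abbreviations also need to match the ones that appear in your own scraped data, which may differ from this list.

```python
import csv
import io

# A few "ABBREVIATION Full Team Name" entries; extend with the rest of the list.
raw = """ATL Atlanta Hawks
BKN Brooklyn Nets
BOS Boston Celtics"""

# Split each line once, at the first space: abbreviation, then full name.
rows = [line.split(" ", 1) for line in raw.splitlines()]

# Write CSV text to an in-memory buffer. To write a real file instead, use
# open("nicknames.csv", "w", newline="") in place of io.StringIO().
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Abbreviation", "Name"])  # assumed header names
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# The same rows as a dictionary, handy for mapping abbreviations to names.
nicknames = dict(rows)
```

With the mapping in a dictionary, a pandas column of abbreviations could be translated with something like `players["Tm"].map(nicknames)`, assuming the abbreviation column is called `Tm`.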
@vivekdwivedi3130 1 year ago
Can anybody tell me where I can get this MVP CSV file?
@naschendani1474 2 years ago
huge thanks
@tarkanh2519 2 years ago
Perfect video...
@romhen233 2 years ago
Hey, I would like to know how to create nicknames.csv.
@Dataquestio 2 years ago
Hi Rom - I started with this list (en.wikipedia.org/wiki/Wikipedia:WikiProject_National_Basketball_Association/National_Basketball_Association_team_abbreviations), then added in some historical team codes/nicknames as well.
@romhen233 2 years ago
Thank you. Can you show or write out how to do it? I'm stuck on it and cannot continue the project. I would really appreciate it.
@tarkanh2519 2 years ago
Hi, how can we find the relevant CSV files? Please help.
@Dataquestio 2 years ago
You can find the 3 csv files from the last part (teams, players, mvps) here - github.com/dataquestio/project-walkthroughs/tree/master/mvp
@vasoochigava5213 2 years ago
Thanks. stats.apply(pd.to_numeric, errors='ignore') doesn't convert objects to integers, and there is no error. To be honest, in future please don't skip the parts you don't showcase (I mean the nicknames CSV). I did the conversion with astype, by the way. I tried to understand the issue by checking the unique values; in the case of age they were definitely integers, but it still wasn't converting with to_numeric.
@Dataquestio 2 years ago
Hi Vaso - pd.to_numeric should convert all numeric columns. You do need to make sure to assign back, though, so `stats=stats.apply....`. You also need to make sure the columns are clean (don't have any non-numeric values in them), otherwise `errors="ignore"` will cause nothing to happen.
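A minimal sketch of the two points in this reply, using a made-up table. Newer pandas releases deprecate `errors="ignore"`, so the hypothetical helper `to_numeric_if_clean` below reproduces the same "convert if possible, otherwise leave alone" behaviour by catching the exception instead.

```python
import pandas as pd

def to_numeric_if_clean(column):
    """Return the column as numbers, or unchanged if any value is non-numeric."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        return column

# Hypothetical scraped table: every value arrives as a string.
stats = pd.DataFrame({
    "Player": ["A", "B"],
    "Age": ["27", "31"],   # clean, so it can be converted
    "G": ["72", "TOT"],    # one stray non-numeric value blocks conversion
})

# Without assigning the result back, stats itself is left untouched.
stats.apply(to_numeric_if_clean)
age_numeric_before = pd.api.types.is_numeric_dtype(stats["Age"])

# Assigning back keeps the converted columns.
stats = stats.apply(to_numeric_if_clean)
age_numeric_after = pd.api.types.is_numeric_dtype(stats["Age"])

print(age_numeric_before, age_numeric_after)
print(stats.dtypes)
```

The `G` column stays as text because of the single `"TOT"` value, which is the "columns must be clean" condition from the reply: one non-numeric entry is enough to leave the whole column unconverted.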
@Speakingmymind365 2 years ago
When we combine teams with the rest of the combined data, I am getting a suffix. Kindly help.
@Dataquestio 2 years ago
Hi - can you please share the error message you are getting, your code, and the code right before/after your code?
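Without the code it's hard to say for sure, but one common reason for unexpected suffixes (an assumption about this case) is that both tables contain a column with the same name that isn't one of the merge keys. pandas then keeps both copies and appends `_x` and `_y`. A small sketch with made-up data:

```python
import pandas as pd

# Hypothetical tables that both contain a "G" (games) column.
combined = pd.DataFrame({
    "Team": ["Team A", "Team B"],
    "Year": [2021, 2021],
    "G": [72, 70],          # games played by the player
})
teams = pd.DataFrame({
    "Team": ["Team A", "Team B"],
    "Year": [2021, 2021],
    "G": [72, 72],          # games played by the team
    "W": [47, 31],
})

# "G" is in both tables but is not a merge key, so pandas keeps both
# copies and names them "G_x" and "G_y".
merged = combined.merge(teams, how="outer", on=["Team", "Year"])
print(merged.columns.tolist())

# The suffixes can be chosen explicitly to make the columns readable.
merged_named = combined.merge(
    teams, how="outer", on=["Team", "Year"], suffixes=("_player", "_team")
)
print(merged_named.columns.tolist())
```

Other ways to avoid the suffixes are to rename or drop the overlapping column in one of the tables before merging.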
@svetlanadolgushina4936 2 years ago
regex = False trick works with "+" as well.
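A short sketch of the `regex=False` trick with made-up values. Both `*` and `+` are special characters in regular expressions, so passing `regex=False` tells pandas to treat them as plain text.

```python
import pandas as pd

# Hypothetical names with stray marker characters.
names = pd.Series(["Player A*", "Player B", "Team C+"])

# regex=False makes the pattern a literal string, so no escaping is needed.
cleaned = (
    names.str.replace("*", "", regex=False)
         .str.replace("+", "", regex=False)
)
print(cleaned.tolist())
```

The alternative is to keep regex matching on and escape the characters, for example with the pattern `r"\*"`.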