Solving real-world data analysis problems with Python Pandas! (Lego dataset analysis)

87,760 views

Keith Galli

Comments: 120
@KeithGalli 2 years ago
Level up your data science skills with courses, projects, and competitions offered by DataCamp! Use my link below and check out the first chapter of any course for FREE! :) datacamp.pxf.io/c/3588040/1012793/13294
@masternobody1896 2 years ago
can you do some google job coding. so how can i get a job
@KeithGalli 2 years ago
Big shout-out to my mom for not throwing away my Legos! She's the real MVP
@bobbyg603 2 years ago
Thanks mom!
@vishwasjajpura796 2 years ago
Finally Keith will build his LEGO
@ocraking 10 months ago
nice Kevin Durant reference
@KenJee_ds 2 years ago
dude, loved the intro!
@KeithGalli 2 years ago
Hahaha thanks man :). Very happy that my mom didn't throw out all of my legos!
@markomarjanovic8348 1 year ago
Absolutely love the raw natural style you are doing, hope everyone else appreciates it too, keep going buddy, you are amazing!
@DataProfessor 2 years ago
Wow the Lego stop motion was awesome!
@ahsanshah1866 2 years ago
Data professor is here 😀
@rafaelmello8194 2 years ago
I'm a beginner in Python and I'm learning a lot from you. You are an awesome teacher. Your pacing and didactics are perfect. Thanks a lot for your effort.
@rksingh1997mp 2 years ago
He’s back baby!!
@alan6506305 2 years ago
God, this is brilliant. I watched the other two videos of yours on Pandas. You are a great teacher and friend. Thank you very much for your hard work and kindness.
@simonvanwijk5178 2 years ago
Man so good to have you back! If it was not for you I would have not gotten a role as a DA as you helped me the most in the beginning.
@lVaNeSsA90 2 years ago
Thanks for being honest while you search for syntax in the beginning. Love this raw, step by step video. I'm using your videos on my project to get inspired ❤️ thanks for being a good tutor 😊
@logannon 2 years ago
Dude, I thought you were dead. Your videos have helped me so much. Glad to see you back!
@leomiao5959 2 years ago
The man is back. The hero is back for us!!
@thebeeskhakis7145 2 years ago
I'm so happy you're back. Your videos helped me get my new job!
@FIBONACCIVEGA 2 years ago
This video has been a true inspiration to continue learning. I'm doing the DataCamp courses since I want to change my field, and I've always liked programming and analyzing data. But I didn't know if I could use what I had learned in real life. Now I know that everything I have learned is what is used in real-life data analysis. Saludos
@amansorout.6779 2 years ago
Happy to see you back, fighting with something serious, you are not alone.
@weitingteng3241 2 years ago
Great great and great to see you back
@stratascratch 2 years ago
Good to see you’re back!
@JW-pu1uk 2 years ago
I really like the thought process in these videos. It's very raw, and really will translate well to an actual work project.
@PaYaMv2 2 years ago
Good to have you back my dude! Loooooooved this!
@danielsantoyo2640 2 years ago
I'm so happy to see you are back! Pandas and NumPy tutorials would be great!!! I'm currently trying to learn Pandas and NumPy for data analytics and this video was super interesting!!! Thanks Keith, keep going, you are doing great 💯
@Omzodijacky 2 years ago
Man , I'm happy you are back ! you were truly missed
@H99x2 2 years ago
These type of videos are your strengths! Great tutorial and explanation Keith
@YunusFidan_ 2 years ago
Good to see you uploading again!!
@cyrilodoi6868 2 years ago
So good to have you back man! 💯
@terrytas13 2 years ago
Welcome back Keith, so good to see your face again. Stay well my friend!
@KeithGalli 2 years ago
Glad to be back!! :)
@qalinlekhaliif5518 2 years ago
Thanks a lot man. Your videos are helpful and entertaining as well. We appreciate your great work.
@Sensei10238 2 years ago
Finally back! It helped me a lot in learning python! Thank you so much!
@ben-tiki 2 years ago
Another great video Keith! Glad to see you back. Awesome that you got to work with DataCamp. If you can, please make a video on OpenAI; it would be awesome. I've been using their API and it's awesome.
@itsReshad 2 years ago
Love the great content! Please don't stop! You have an impeccable way of teaching, it's amazing.
@tuandino6990 2 years ago
I've been waiting for this
@MashiroRedo 2 years ago
Waited so long! Thank you
@ocraking 10 months ago
Dude, you ROCK
@terrytas13 2 years ago
Love the introduction!!!
@Viralvlogvideos 2 years ago
welcome back to your first tutorial after long back :P
@kartikeyasharma9908 2 years ago
Hi Keith, loving the video tutorials!
@dharshankumar2522 2 years ago
Keith is back...yeahhhh
@1990andstillgoing 2 years ago
props for sharing your knowledge man, its really easy to understand and apply what you're doing (Y)
@Magmatic91 2 years ago
Did this project on DataCamp. Was a lot of fun.
@rafaelcastellarmartinez3498 2 years ago
Hi Keith, I just tried to do the project with you and I got that Star Wars was not the most popular theme in 2004 (Harry Potter) and 2017 (Super Heroes). Weird that the DataCamp test said OK, but I did the math manually and Harry Potter was the most popular in 2004. Thanks for your videos. A student from Colombia, Latin America!
@adelekeemmanuel4917 1 year ago
OMG... I just did the exercise myself and I discovered the same thing too... Came to check the video but I'm seeing something else.
@manfungnewmanyu1426 2 years ago
Yeah!!! Your tutorial is great and helped me so much in my AI master's course.
@putyah 2 years ago
Awesome video. Small detail: in the new era answer you typed the value in by hand. It would be nicer to drop every row where the top theme is Star Wars, then assign the remaining year to the variable. That way the variable is dynamic, so the answer would still be correct when the dataset changes.
@KeithGalli 2 years ago
Good suggestion! I agree that would be a better way to go about it :)
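The suggestion in this thread could be sketched roughly as follows. This is an illustration on made-up counts, not the code from the video; the frame name `licensed_sets` and the column names `year` and `parent_theme` are assumptions modelled on the project data.

```python
import pandas as pd

# Toy stand-in for the merged, licensed-only sets table (made-up rows).
licensed_sets = pd.DataFrame({
    "year": [2015, 2015, 2015, 2016, 2016, 2016, 2017, 2017, 2017],
    "parent_theme": [
        "Star Wars", "Star Wars", "Super Heroes",
        "Star Wars", "Star Wars", "Super Heroes",
        "Super Heroes", "Super Heroes", "Star Wars",
    ],
})

# Count sets per (year, theme), then keep the largest theme for each year.
counts = (
    licensed_sets.groupby(["year", "parent_theme"])
    .size()
    .reset_index(name="n_sets")
    .sort_values(["year", "n_sets"], ascending=[True, False])
)
top_per_year = counts.drop_duplicates("year")

# Drop every year won by Star Wars; what remains is derived from the data
# instead of being typed in by hand.
not_star_wars = top_per_year[top_per_year["parent_theme"] != "Star Wars"]
new_era = int(not_star_wars["year"].iloc[0])
print(new_era)
```

If more than one year survives the filter, `iloc[0]` picks the earliest one here; whether that is the right choice depends on how the question is worded.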
@lucaspioli7970 2 years ago
Love your videos! Keep going
@sanjeetlal1873 2 years ago
Legend's back❤️
@baggid6257 2 years ago
He is back~!
@jongcheulkim7284 2 years ago
Thank you, sir. I had lots of fun^^
@azrmuradl6420 2 years ago
Please provide more such kind of videos, or as you always do, give us tips about how we can find such kind of real world ds projects online.
@ДимитърСираков-щ7ы 2 years ago
keep up the good work!
@tuandino6990 2 years ago
Question 2:
    theme_count_by_year = licensed_lego_set.groupby('year')['parent_theme'].value_counts().unstack()
    theme_count_by_year.fillna(0, inplace=True)
    theme_count_by_year = pd.DataFrame.transpose(theme_count_by_year)
Or you can use the pivot_table function. This approach gives you a data frame that's easy to plot (as a heatmap), so the high counts pop out.
@tuandino6990 2 years ago
@Josh Yorko nice
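The `pivot_table` alternative mentioned in this thread might look like the sketch below. The rows are invented, and the `set_num` column is an assumption about what is available to count; the frame name `licensed_lego_set` is taken from the comment.

```python
import pandas as pd

# Made-up rows; the real table has one row per licensed set.
licensed_lego_set = pd.DataFrame({
    "year": [1999, 1999, 2000, 2000, 2000],
    "parent_theme": ["Star Wars", "Star Wars", "Star Wars",
                     "Harry Potter", "Harry Potter"],
    "set_num": ["a", "b", "c", "d", "e"],
})

# Themes as rows, years as columns, number of sets as values.
theme_count_by_year = licensed_lego_set.pivot_table(
    index="parent_theme",
    columns="year",
    values="set_num",
    aggfunc="count",
    fill_value=0,  # themes with no sets in a year get 0 instead of NaN
)
print(theme_count_by_year)
```

A table in this shape can be handed to a heatmap plotting function directly, which is the use case the comment describes.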
@davida99 2 years ago
Yoooo love the vids
@codewithkarthik7136 2 years ago
nice video keith
@aditiparashar9171 1 year ago
you are freakingly smart!
@kotharidhruv75 2 years ago
w8ing fr more such videos
@kirubaselvi6754 2 years ago
Keith, Pytorch tutorial please
@KeithGalli 2 years ago
I definitely want to! I need to spend considerable time reviewing and building up my own PyTorch skills before I make a tutorial on it.
@freddy4videos 2 years ago
thank you, much love
@ChileHeroico 2 years ago
keep doing more videos pls :D
@rodrigo100kk 2 years ago
This dude is cool, this channel too.
@Levy957 2 years ago
that task #2 was really hard to do alone
@merterisen 2 years ago
16:52 how did you change 'Star wars' text immediately?
@KeithGalli 2 years ago
Lol that was just video editing xD.
@admonitoring-pi9os 9 months ago
Hello there. I hope you are well. I am a little late with this comment because this video is already more than 2 years old, but since I have started learning Python now, it's the right time for me. Where can I find the code you explained in the video? No code is available in the project files at the GitHub link provided.
@guisande 2 years ago
Hey Keith, I'm torn between going into data science or cybersecurity. I love both, but I kind of need to make money by now. Do you think I can earn money in a short time in data science? Working as a freelancer or supporting small companies... Edit: I'm glad that you came back. Really love your videos.
@adeshmishra1671 2 years ago
Go for Cybersecurity brother, Since difficulty level is medium.. But while earning 💰 you can also learn data scientist!!
@ratchakoon 2 years ago
themes.csv, which you provided on GitHub, does not have an 'is_licensed' field. Is the 'parent_id' field the same as the 'is_licensed' field?
@KeithGalli 2 years ago
A little confusing, but you want to use parent_themes.csv, not themes.csv !!
@ratchakoon 2 years ago
@@KeithGalli Thank you
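To make the file mix-up in this thread concrete, here is a minimal sketch of attaching `is_licensed` from the parent themes table to the sets table. The two small frames are invented stand-ins for the CSV files (in practice one would load them with `pd.read_csv`), and the `id` values and column layout are assumptions.

```python
import pandas as pd

# Made-up miniature versions of the two files; the real CSVs have more rows.
lego_sets = pd.DataFrame({
    "set_num": ["s1", "s2", "s3"],
    "parent_theme": ["Star Wars", "Town", "Star Wars"],
})
parent_themes = pd.DataFrame({
    "id": [158, 50],  # hypothetical ids
    "name": ["Star Wars", "Town"],
    "is_licensed": [True, False],
})

# Match each set's parent_theme against the theme name to bring in is_licensed.
merged = lego_sets.merge(
    parent_themes, left_on="parent_theme", right_on="name", how="left"
)

# Keep only the sets whose parent theme is licensed.
licensed = merged[merged["is_licensed"]]
print(licensed[["set_num", "parent_theme", "is_licensed"]])
```

Checking `merged.columns` right after loading is a quick way to notice that the wrong file was read, since `is_licensed` would be missing.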
@raghavgoyal3324 2 years ago
please upload a project every week
@KeithGalli 2 years ago
I'll try my best!
@damarbowo 2 years ago
Can I see your membership playlist? I can't find that playlist
@KeithGalli 2 years ago
Hmm I'm not sure what you are asking to see, can you clarify?
@damarbowo 2 years ago
@@KeithGalli You have membership benefits. One of the benefits is access to playlists or videos for members. Do you have an example of a video or playlist for members who join your channel? Hope you understand.
@KeithGalli 2 years ago
I just started my memberships last week so I haven't posted any exclusive videos there yet. To get an idea of the types of content I'll post there, check out these videos kzbin.info/www/bejne/p5-2d2uPlrWrbZo kzbin.info/www/bejne/hZyooIN_hNypnsk
@damarbowo 2 years ago
@@KeithGalli I'll wait Keith. Regards
@KeithGalli 2 years ago
Sounds good!
@shahoftrading 2 years ago
Question: when you merge using left_on and right_on, we get the merged df. In the merged df, under parent_theme, why are most if not all of the values "Legoland", with all IDs 411? Also, how do we check the full tabular data: print(df)?
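One way to look into a question like this is to count the values rather than rely on the shortened preview that pandas prints for large frames. The sketch below uses an invented `merged` frame; it only shows inspection calls and does not claim to explain what the commenter saw.

```python
import pandas as pd

# Invented stand-in for the merged frame from the video.
merged = pd.DataFrame({
    "set_num": ["s1", "s2", "s3", "s4"],
    "parent_theme": ["Legoland", "Legoland", "Star Wars", "Town"],
    "id": [411, 411, 158, 50],
})

# How many rows does each theme really have?
theme_counts = merged["parent_theme"].value_counts()
print(theme_counts)

# Render every row and column as text instead of the shortened preview.
full_text = merged.to_string()
print(full_text)

# Or lift the display limits temporarily for a plain print.
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    print(merged)
```

For a frame with thousands of rows, `value_counts()` is usually more informative than printing everything.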
@baburamchaudhary159 2 years ago
In line [99], i.e. .groupby(['year', 'parent_theme']), and in the next line, .drop_duplicates(['year']): since we have already grouped by 'year' and 'parent_theme' (I think it groups unique year and parent_theme pairs), why do we need to drop duplicates by 'year'?
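A small sketch may help with this question. Grouping by both columns yields one row per (year, theme) pair, so a year still appears once for every theme released in it; sorting by count and then dropping duplicate years keeps only the top theme per year. The data and the `n_sets` column name are made up for illustration.

```python
import pandas as pd

licensed_sets = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001, 2001, 2001],
    "parent_theme": ["Star Wars", "Star Wars", "Harry Potter",
                     "Harry Potter", "Harry Potter", "Star Wars"],
})

# One row per (year, theme) pair: each year still appears once per theme.
counts = (
    licensed_sets.groupby(["year", "parent_theme"])
    .size()
    .reset_index(name="n_sets")
)
rows_per_year = counts["year"].value_counts()

# Sort so the biggest theme comes first, then keep the first row of each year.
top_per_year = (
    counts.sort_values("n_sets", ascending=False)
    .drop_duplicates("year")
    .sort_values("year")
)
top_theme_by_year = top_per_year.set_index("year")["parent_theme"].to_dict()
print(top_theme_by_year)
```

Without the `drop_duplicates("year")` step, the result would still list every theme for every year.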
@БулатМиннуллин-р8щ 2 years ago
why didn't you use .agg?
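For readers wondering what the `.agg` route might look like, here is a hedged sketch using named aggregation. It is not taken from the video; the `num_parts` column and the output names `n_sets` and `total_parts` are assumptions added for illustration.

```python
import pandas as pd

licensed_sets = pd.DataFrame({
    "year": [2000, 2000, 2000, 2001],
    "parent_theme": ["Star Wars", "Star Wars", "Harry Potter", "Star Wars"],
    "set_num": ["a", "b", "c", "d"],
    "num_parts": [100, 250, 80, 300],  # assumed column
})

# Named aggregation: output_column=(input_column, function).
summary = (
    licensed_sets.groupby(["year", "parent_theme"])
    .agg(n_sets=("set_num", "count"), total_parts=("num_parts", "sum"))
    .reset_index()
)
print(summary)
```

The main benefit over a plain `.count()` is that several statistics can be computed in one pass and each gets an explicit column name.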
@gopikaprasad8607 1 year ago
How to export the for loops result into excel?? Please reply
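A common pattern for this, sketched under the assumption that each loop iteration produces one row of results: collect the rows in a list, build a DataFrame once at the end, and write it with `DataFrame.to_excel`. The function names and the squared-value example are hypothetical.

```python
import pandas as pd


def collect_results(values):
    """Run the loop and gather one dict per iteration."""
    rows = []
    for value in values:
        rows.append({"input": value, "squared": value ** 2})
    return pd.DataFrame(rows)


def save_to_excel(frame, path):
    # DataFrame.to_excel needs an Excel engine such as openpyxl installed.
    frame.to_excel(path, index=False)


results = collect_results([1, 2, 3])
print(results)

# Example call (commented out so the sketch runs without openpyxl):
# save_to_excel(results, "loop_results.xlsx")
```

Building the frame once after the loop is generally cheaper than appending to a DataFrame on every iteration.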
@gersonchadijunior7499 2 years ago
Hey Keith, I love your videos so much. I've been learning Pandas with you since your Pokémon video, but I feel that the last answer is not accurate and that the right year should in fact be 2006, because it was the year with the fewest Star Wars sets released. Can I send you my code somehow?
@ElianMrl 2 years ago
Hey guys, would it be a good idea to use Datacamp projects in my resume?
@nitiknayyar7659 2 years ago
Damn I also started this project on Datacamp.
@alkiviadessavoullis2021 2 years ago
does anyone know why when I press continue or start project the Python Use python ... code checks gets highlighted pink and I can't work on the project ?
@zeasammy7572 2 years ago
Does DataCamp have video learning platform?
@KeithGalli 2 years ago
The typical structure of classes is short videos that overview the concepts and then a bunch of interactive problems with a code editor to drill down the technical side of those concepts.
@sabbirahmed8012 2 years ago
Hello Keith, can you please mention some resource to master natural language processing?
@KeithGalli 2 years ago
Hey! I actually did a PyCon lecture on NLP. That should be pretty helpful: kzbin.info/www/bejne/rKqymIqerLqgm8U
@clayherz_ 1 year ago
If I solve the second question with this code, is it wrong?
    counted_2 = licensed_sets.groupby(["year", "parent_theme"])[["is_licensed"]].count()
    counted_2 = counted_2.reset_index().sort_values("is_licensed", ascending=False)
    counted_2.drop_duplicates("year").sort_values("year", ascending=True)
@letsjoinhands 2 years ago
Hello again Keith. For Q#2 I am getting a different result for new_era using the code below. lego_all_lic is the DF containing all licensed Lego set themes, with shape (1179 x 8), and it has been grouped by year to form lego_all_lic_yr. The rest of the code I have written is quite simple to understand. It looks as if I have made a big mistake in the aggregation, but I can't seem to locate it.
    lego_all_lic_yr = pd.DataFrame(lego_all_lic.groupby(by = ['year', 'parent_theme'], axis = 0).agg(Parent_Theme = ('set_num', 'count')))
    lego_all_lic_yr.reset_index( inplace = True)
    lego_all_lic_yr.replace(to_replace = [theme for theme in lego_all_lic_yr['parent_theme'] if theme != 'Star Wars'], value = 'Others', inplace = True)
    lego_all_lic_yr = pd.DataFrame(lego_all_lic_yr.groupby(by = ['year', 'parent_theme'], axis = 0).agg(Parent_Theme = ('Parent_Theme', 'sum')))
    lego_all_lic_yr
When you look at the result, it shows that 2006 was the first year in which Star Wars lost to other themes in terms of the sets released in that year.
@letsjoinhands 2 years ago
OK, so I basically misunderstood the question. It wasn't about Star Wars themed sets vs. all the rest, but rather about the year in which Star Wars lost out to some other individual theme. Got the correct answer using:
    lego_all_lic_yr = pd.DataFrame(lego_all_lic.groupby(by = ['year', 'parent_theme'], axis = 0).agg(Parent_Theme = ('set_num', 'count')))
    lego_all_lic_yr.reset_index( inplace = True)
    lego_all_lic_yr = pd.DataFrame(lego_all_lic_yr.groupby(by = ['year', 'parent_theme'], axis = 0).agg(Parent_Theme = ('Parent_Theme', 'sum')))
    lego_all_lic_yr = lego_all_lic_yr.sort_values(by = ['year','Parent_Theme'], ascending = False)
    lego_all_lic_yr.head(50)
@manu93ize 2 years ago
bro Can you do a tutorial on data cleaning with Pyspark with real world example.
@mufasao6776 2 years ago
I see that you posted some of your hidden videos. Thank you.
@rabinmainali3373 2 years ago
I did it the following way (question 2):
1. I counted all licensed sets released each year.
2. Then I counted only the Star Wars sets released each year.
3. Then I calculated the ratio of step 2 to step 1.
Is that OK? By the way, the result is also 2017 for me.
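The three steps described in this comment can be condensed into a per-year share, sketched below on invented rows. The frame and column names are assumptions, and the 50% threshold is one possible reading of the approach rather than something stated in the comment.

```python
import pandas as pd

licensed_sets = pd.DataFrame({
    "year": [2016] * 4 + [2017] * 5,
    "parent_theme": [
        "Star Wars", "Star Wars", "Star Wars", "Super Heroes",
        "Star Wars", "Star Wars", "Super Heroes", "Super Heroes", "Super Heroes",
    ],
})

# Boolean flag per row; its mean within a year is the Star Wars share,
# i.e. (Star Wars sets that year) / (all licensed sets that year).
is_star_wars = licensed_sets["parent_theme"].eq("Star Wars")
star_wars_share = is_star_wars.groupby(licensed_sets["year"]).mean()
print(star_wars_share)

# Years in which Star Wars made up less than half of the licensed sets.
below_half = star_wars_share[star_wars_share.lt(0.5)].index.tolist()
print(below_half)
```

A share below one half does not by itself mean that another single theme overtook Star Wars, so this criterion and the "top theme per year" criterion can give different years on the real data.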
@Silly_Duck_Guy 1 year ago
keith moment
@letsjoinhands 2 years ago
Hi Keith! This is how I solved Q#1. Please let me know whether, in your opinion, this is bad coding practice, acceptable, or good. I first made a function called is_lic:
    def is_lic(df_1, df_2):
        df_1['is_licensed'] = bool
        theme_1 = list(df_1['parent_theme'])
        theme_2 = list(df_2['name'])
        lic_status = list(df_2['is_licensed'])
        for i, s in enumerate(theme_1):
            for r, t in enumerate(theme_2):
                if s == t:
                    df_1['is_licensed'][i] = lic_status[r]
Then:
    is_lic(lego_sets, lego_themes)
Then:
    all_themes = [ ]
    for r in lego_sets.itertuples():
        all_themes.append([ r[6], r[1], r[7] ])
Then:
    all_lic_themes = [x for [x, y, z] in all_themes if y is not np.NaN and z == True]
    star_wars = [theme for theme in all_lic_themes if theme == 'Star Wars']
    the_force = int(len(star_wars)/len(all_lic_themes) * 100)
the_force = 51%
@KeithGalli 2 years ago
So my biggest recommendation based on your code is to be more explicit with how you name your variables. So instead of "df_1" & "df_2" you might name those dataframes "parent_themes_df" & "lego_sets_df" respectively. Furthermore it would be better to name variables "i" & "s" something like "parent_theme_index" & "parent_theme_value". These types of changes will make your code more readable. Functionally, everything looks sound though. Nice work!
@letsjoinhands 2 years ago
@@KeithGalli Thanks a bunch Keith. In retrospect, when I think about how you solved this question in the video, I realise that you were using pandas built-in methods the whole time. So yes, we could use a smattering of plain Python to do this (like I did), but using the library's built-in methods would be simpler and more advantageous most of the time. Is that correct?
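As a rough illustration of the point raised in this thread, the nested loops could be replaced by a merge and a boolean mask. This is a sketch on invented data, not the code from the video; dropping rows with a missing `set_num` is my reading of the NaN check in the original comment. It also sidesteps the chained assignment `df_1['is_licensed'][i] = ...`, a pattern the pandas documentation advises against.

```python
import pandas as pd

# Invented miniature tables with descriptive names.
lego_sets = pd.DataFrame({
    "set_num": ["s1", "s2", None, "s4", "s5"],
    "parent_theme": ["Star Wars", "Star Wars", "Star Wars", "Super Heroes", "Town"],
})
parent_themes = pd.DataFrame({
    "name": ["Star Wars", "Super Heroes", "Town"],
    "is_licensed": [True, True, False],
})

# The merge does the lookup that the nested loops performed row by row.
merged = lego_sets.merge(parent_themes, left_on="parent_theme", right_on="name")

# Keep licensed sets that have a set number.
licensed = merged[merged["is_licensed"]].dropna(subset=["set_num"])

# The mean of a boolean column is the fraction of True values.
star_wars_share = (licensed["parent_theme"] == "Star Wars").mean()
the_force = int(star_wars_share * 100)
print(the_force)
```

On a table with thousands of rows this vectorised form is typically much faster than looping, and it is shorter to read.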
@igor-xadrezxadrez8541 2 years ago
Hey, there's a red dot on your nose.
@KeithGalli 2 years ago
I got in a fight playing hockey!
@Viralvlogvideos 2 years ago
Big nose :P
@AbhishekSharma-hy4nl 2 years ago
Bro what happened to your nose😟?
@KeithGalli 2 years ago
Got into a little fight playing ice hockey! We won the game though so it's cool xD