Posted a new "Solving real world data science tasks" video! Check it out here: kzbin.info/www/bejne/e6jKqmB9bsigmMk
@Trazynn 4 years ago
This is awesome. Learning Python is so much easier when there's something tangible and grounded to work towards.
@colorways518 4 years ago
Hi Keith! I am getting an error after this line. CODE: for file in files: current_data = pd.read_csv(path + "/" + file) ERROR: ParserError: Error tokenizing data. C error: Expected 1 fields in line 4, saw 2. Can you please help me solve this error? I tried to find a solution online but didn't find any.
@larrywang1983 4 years ago
@@colorways518 Just thinking out loud: can't we get the kind of info below from Amazon tools like Jungle Scout, Helium10, or Sellics? We are Amazon sellers; do we also need to go through Python and data science for Amazon? There are third-party SaaS plug-ins that answer these questions. Correct me if I am wrong. - What was the best month for sales? How much was earned that month?
@ismaeelaileru4612 4 years ago
For the problem of finding the city with the highest sales, we ran into an ordering problem while plotting the cities. I think we can also use result.index as our xticks. That way the labels come straight from the DataFrame in the right order, rather than using df.unique() and rearranging.
@rodrigo100kk 4 years ago
This red warning displays because you didn't make a copy of the original DataFrame. Make a copy and the warning goes away.
@justapugontheinternet 1 year ago
As a programmer/data analyst/systems administrator I can safely say that this is exactly how we solve problems in real life. Good job!
@pasha7293 1 year ago
You wouldn't have watched this video if you were.
@justapugontheinternet 1 year ago
@Pasha people who think they know it all are a bore. 🙄 You could always learn something new from other people, it never hurts to learn new perspectives. Good luck with that mindset. I learn everyday. 😌
@saugatjarif8272 1 year ago
@@justapugontheinternet love your mindset 🎉🎉🎉🎉
@akhildharimisetty2767 26 days ago
I was waiting for this type of comment, an assurance from data analysts. Thank you so much, and thank you so much, Keith.
@terrymaverick580 4 years ago
The best part was watching someone google the answer and seeing how they implement the solution instead of just acting like they know everything. Man, your tutorials are the best and down to earth.
@Amir-tv4nn 2 years ago
Hahaha, you think this kid knows what he is doing? For your information, we all google, no matter what position we hold. 🤣 We built websites for a reason: to look back at them when needed. Google provides a faster search than going to the source and looking through it. Get your mind straight about Google 🤣 This kid is clearly looking around for code he has already written, and you are assuming googling is a bad example for a programmer 😂 That tells me you expect movie-style hackers, hahaha. Come to reality.
@dragonmateX 2 years ago
It honestly makes it feel more real, like, I am studying data science now and I google stuff all the time, the fact that even someone well versed in data science still googles stuff constantly is reassuring.
@Amir-tv4nn 2 years ago
@@dragonmateX People who work at Google google stuff 😂 Get back to reality about what Google is meant for 🤣
@buak809 2 years ago
@@Amir-tv4nn and? what the fuck is your problem? so far you didn't write anything valuable here
@Diabolic9595 1 year ago
@@Amir-tv4nn Come to reality. Man, come to reality. Could you please come to reality? Btw you should come to reality
@kyledawes9593 3 years ago
As a business major with very limited internship experience, I am teaching myself python and data analytics from scratch. This video is literal gold to me because this is one of the few that actually shows the entire wrangling process! Thanks for the great vid!
@vilw4739 3 years ago
If I use only fd=pd.read_csv("./Sales_Data/Sales_April_2019.csv") I get a file-not-found error. I have to use the whole path starting from the C drive. How does he not get an error?
@ashiksrinivas 3 years ago
@@vilw4739 He is using Jupyter Notebook, where a relative path is resolved against the folder the notebook lives in, so with the data folder next to the notebook you can import the file by simply running fd=pd.read_csv("./Sales_Data/Sales_April_2019.csv"). If you're using a local Python IDE like PyCharm or VS Code, the working directory may be a different folder, and then you need to specify the whole path, like fd=pd.read_csv("C:/Data Science/Sales_Data/Sales_April_2019.csv").
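A relative path such as "./Sales_Data/..." is resolved against the current working directory of the running process, which is a likely reason the same line works in one setup and fails in another. The sketch below uses only the standard library; `resolve_data_path` is a hypothetical helper introduced for illustration, and the file name is the one quoted in the thread.

```python
from pathlib import Path


def resolve_data_path(relative_path, base_dir=None):
    """Return an absolute path for relative_path.

    base_dir defaults to the current working directory, which is what
    pd.read_csv("./Sales_Data/...") implicitly relies on.
    """
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    return (base / relative_path).resolve()


# Where relative paths are resolved from in this process:
print("Working directory:", Path.cwd())

# File name taken from the comment thread; it may not exist on your machine.
csv_path = resolve_data_path("Sales_Data/Sales_April_2019.csv")
print("Would read:", csv_path, "| exists:", csv_path.exists())
```

If the printed working directory is not the folder that contains Sales_Data, either move the data there or pass the absolute path.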
@vilw4739 3 years ago
@@ashiksrinivas Thank you
@muhsintabatabayee8592 2 years ago
@@vilw4739 Did you ever figure it out? I'm getting the same error.
@vilw4739 2 years ago
@@muhsintabatabayee8592 They should be in the same folder. Otherwise you need to put the whole path.
@billyjorrosh9394 4 years ago
"I don't know how to do it, but I know how to google it." This guy knows how things go in the real world haha
@thanhnhando3070 4 years ago
Googling is, indeed, one of the most important skills for coding.
@indexima6517 3 years ago
Hahaha! We invite you to take a look at our videos which deal with the same topics :)
@carlurbananimals 3 years ago
He's very fast too. I would need to know it already, because once I go to Google I'm there for 4 hours :/
@samirvinchurkar8226 3 years ago
I follow the exact same process, be it R, MATLAB, or Python.
@samirvinchurkar8226 3 years ago
@@carlurbananimals that's coz your question isn't exactly right ;)
@edric7552 2 years ago
Hi Keith, I feel obligated to personally thank everyone that helps in pursuing my data career and of course, you included. I've used your project (and learned a LOT) and modify/add codes here and there with my own styling for my online portfolio. Moreover, you're a fantastic teacher and you deserve all the credits you should get for helping others like me. Thank you for doing this, may God return the favor and always bless you. Rock on Keith!
@KeithGalli 2 years ago
Thank you so much for the kind words! :)
@KeithGalli 5 years ago
Video Timeline!
0:00 - Intro
1:22 - Downloading the Data
2:57 - Getting started with the code (Jupyter Notebook)
Task #1: Merging 12 csvs into a single dataframe (3:35)
4:25 - Read single CSV file
5:44 - List all files in a directory
7:06 - Concatenating files
11:00 - Reading in Updated dataframe
Task #2: Add a Month column (12:48)
14:12 - Parse string in Pandas cell (.str)
Cleaning our data!
17:31 - Drop NaN values from df
21:25 - Remove rows based on condition
Task #3: Add a sales column (24:58)
25:58 - Another way to convert a column to numeric (ints & floats)
Question #1: What was the best month for sales? (29:20)
30:35 - Visualizing our results with bar chart in matplotlib
Question #2: What city sold the most product? (34:17)
35:32 - Add a city column
36:10 - Using the .apply() method (super useful!!)
40:35 - Why do we use the lambda x ?
40:57 - Dropping a column
46:45 - Answering the question (using groupby)
47:34 - Plotting our results
Question #3: What time should we display advertisements to maximize the likelihood of purchases? (52:13)
53:16 - Using to_datetime() method
56:01 - Creating hour & minute columns
58:17 - Matplotlib line graph to plot our results
1:00:15 - Interpreting our results
Question #4: What products are most often sold together? (1:02:17)
1:03:31 - Finding duplicate values in our DataFrame
1:05:43 - Use transform() method to join values from two rows into a single row
1:08:00 - Dropping rows with duplicate values
1:09:39 - Counting pairs of products (itertools, collections)
Question #5: What product sold the most? Why do you think it did? (1:14:04)
1:15:28 - Graphing data
1:18:41 - Overlaying a second Y-axis on existing chart
1:23:41 - Interpreting our results
Thanks for watching! If you enjoyed, please consider subscribing :).
@ANKITRAJ-fe8dh 5 years ago
Hey, machine learning would be awesome
@luuminhvuong 4 years ago
I have very big data in xlsx format. read_excel takes forever...
@mberoakoko24 4 years ago
I am on holiday and have started data science for fun to see what the buzz is all about. I have to say I love it, and I would appreciate it if you'd upload more videos like this. I have learnt a TON.
@kulpreetsingh9064 4 years ago
Hey man, are you gonna do more such videos anytime soon?
@mohammedyounis7207 4 years ago
Thank you so much, it is very useful to me
@Jordanptheone 10 months ago
Watching this 4 years after you published it, and you're still a legend ! Thank you !!!
@KeithGalli 10 months ago
Thank you for watching and the kind words!!
@mid_paulownia 4 years ago
This is the most practical Python tutorial video I've ever watched.
@sathirasilva4958 3 years ago
Great tutorial! 55:00 When parsing a column into datetime, specifying the format manually will decrease the execution time significantly: all_data['Order Date'] = pd.to_datetime(all_data['Order Date'], format='%m/%d/%y %H:%M')
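The tip above can be tried on a tiny stand-in for the 'Order Date' column. This is a minimal sketch with made-up sample values; the format string is the one quoted in the comment, and the Hour and Minute columns mirror the 56:01 step of the timeline.

```python
import pandas as pd

# Tiny stand-in for the real data (values are made up for illustration).
all_data = pd.DataFrame({"Order Date": ["04/19/19 08:46", "12/30/19 00:01"]})

# Passing an explicit format means pandas does not have to infer it.
all_data["Order Date"] = pd.to_datetime(
    all_data["Order Date"], format="%m/%d/%y %H:%M"
)

# Once the column is a real datetime, the .dt accessor exposes its parts.
all_data["Hour"] = all_data["Order Date"].dt.hour
all_data["Minute"] = all_data["Order Date"].dt.minute
print(all_data)
```

How much time this saves depends on the pandas version and the size of the data, so it is worth timing both variants on your own machine.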
@rotan90 1 year ago
on google colab it was like 30 sec vs 2 sec. Great tip !
@anthonygonsalvis121 4 years ago
Love how this cool dude researches solutions on the fly and explains things as he goes even when he commits minor unforced errors. He is so relatable. His other tutorials on Pandas, Numpy, Matplotlib, etc. are equally helpful. I wish him all the success and hope that he continues to share his knowledge for decades to come.
@chineduezeofor2481 4 years ago
He's such a GREAT tutor!!!
@indrajeetsinghyadav876 2 years ago
Agreed totally relatable and helpful videos for beginners giving them a chance to know what error can happen due to what syntax errors. Thanks for the informative guide.
@matty5ps444 2 years ago
just to add to what most people are saying, this is in my opinion the best way to do a tutorial. you showed me that even though im a super beginner and not long coming out of learning basic python things im able to pick up something really easily while realising that i dont have to feel bad thinking everyone else is better than me and that even experienced programmers google stuff and actually are not gods sitting on pedestals acting like they are better than us haha. great work
@helmialfath9897 5 years ago
This situation so realistic. The mistakes, the solving.. great video!
@Pidamoussouma 4 years ago
Yes liked it ..it was so realistic
@ЧернійЮрійМиколайович 4 years ago
is this sarcasm?
@ipshie 4 years ago
Юрій Черній pretty sure no it's not
@billyjorrosh9394 4 years ago
He not only teaches us about pandas but also gives us the confidence that "if this guy could be so successful in data science, then why shouldn't I?"
@89DerChristian 2 years ago
@@ЧернійЮрійМиколайович no
@akosasuke5128 2 years ago
I get the feeling in this video that you know more than you're letting on but you're just trying to make things as basic as possible and I love it. I hope to teach others in this same manner. God bless you
@Yayaloy9 4 years ago
At 50:10, for anyone who wants to use .unique(): when you calculate the sales for each city, add a .reset_index(). It turns City back into a regular column, so the bars line up with their labels.
cityy=all_data.groupby("City").sum().reset_index()
Then do the rest like him. You can also sort the result (descending here); just follow the rest of his instructions.
cityy=all_data.groupby("City").sum().reset_index().sort_values("Sales",ascending=False)
xxx=cityy["City"].unique()
plt.bar(xxx,cityy["Sales"])
plt.ylabel("$$$")
plt.xlabel("Cities")
plt.xticks(xxx, rotation='vertical', size=8)
plt.show()
@smackedup7657 1 year ago
thanks a lot
@rezwanmehedad2095 1 year ago
Unfortunately, I am getting a ValueError. Any idea how I can solve this? ValueError: shape mismatch: objects cannot be broadcast to a single shape. Mismatch is between arg 0 with shape (10,) and arg 1 with shape (12,). I haven't found a proper answer on Google, or maybe I'm not expert enough to understand it :p
@mclovin7300 2 years ago
Dude!! You are awesome at teaching data science. You make the world better.
@KeithGalli 2 years ago
Thanks mclovin!
@ujjawaljani6731 4 years ago
He is like my friend who teaches one day before exams. 😂😅
@ziephk 4 years ago
omg!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! i searched for so much "a day in a life of a data science" thinking they would show a glimpse of reality. and this is the best portrayal AND simultaneously one of the best tutorial video. YOU ARE A LEGEND!!!!
@H99x2 3 years ago
Dude, this is by far one of the best real-life tutorials on YT. Subbed for more like this!
@cusescholar3582 8 months ago
This is the best data science class on the net (that I have seen, of course). We are solving real problems, using google, and working with datasets that require a lot of preprocessing. Perfect.
@olajiireolajide 3 years ago
Love how realistic and down to earth all your videos are! Makes data analysis way more approachable. What a guy!
@alfiomarra9207 2 years ago
I definitely prefer to watch your tutorials instead of netflix...I love this format, thanks man 😊
@kepenge 1 year ago
It's been three years since the video was posted. For anyone watching now, as I am: one way of getting month names for the Month column is to convert Order Date with pd.to_datetime and then use dt.month_name(). One other thing to remember is to clean the data before starting the analysis.
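A minimal sketch of that suggestion, assuming the 'Order Date' strings follow the month/day/two-digit-year layout used elsewhere in this thread. The sample rows are made up, and the 'Month Name' column is an illustrative addition rather than something from the video.

```python
import pandas as pd

# Made-up sample rows in the same date layout as the sales data.
all_data = pd.DataFrame({"Order Date": ["01/22/19 21:25", "04/07/19 22:30"]})

# Convert once, then derive both the month number and the month name.
order_dates = pd.to_datetime(all_data["Order Date"], format="%m/%d/%y %H:%M")
all_data["Month"] = order_dates.dt.month
all_data["Month Name"] = order_dates.dt.month_name()
print(all_data)
```

Rows with missing or malformed dates need to be dropped first, otherwise the conversion raises an error; that is the cleaning step the comment refers to.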
@abdulqadirtinwala1296 4 years ago
Dude, I have literally never seen anyone solving real-world problems on YouTube. Your way of teaching is quite impressive. Many YouTubers just showcase basic problems. But hats off to you!!!
@sushiplatter5540 3 years ago
Keith, you're literally the most underrated and one of the best teachers on youtube. This exercise cleared most of my doubts about Data Science and i fell in love with it because of you. Thank you so much for this, you're the best!
@Account-fi1cu 4 years ago
Great tutorial! Thank you for sharing. At 50:26, for cities: you can always use the index values from the 'results' DataFrame, cities = results.index.values, instead of a for loop.
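Several comments in this thread point out that the grouped result already carries its labels in the right order. A small sketch with made-up city totals shows why taking labels from the index is safer than calling .unique() on the original column; the plotting call is left as a comment so the example runs without matplotlib.

```python
import pandas as pd

# Made-up rows; note that Boston appears before Austin in the raw data.
all_data = pd.DataFrame({
    "City": ["Boston (MA)", "Austin (TX)", "Boston (MA)", "Dallas (TX)"],
    "Sales": [100.0, 50.0, 25.0, 75.0],
})

results = all_data.groupby("City")["Sales"].sum()

cities = results.index.tolist()  # labels, in the grouped (sorted) order
totals = results.tolist()        # bar heights, in the same order
# e.g. plt.bar(cities, totals); plt.xticks(rotation="vertical")

# .unique() keeps first-appearance order, which need not match the groups.
appearance_order = all_data["City"].unique().tolist()
print(cities)
print(appearance_order)
```

Because labels and heights come from the same Series, they cannot drift out of step with each other.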
@FrancisBaconthe3rd 4 years ago
Didn't watch more than a few minutes since I already know how to do most of this stuff but loved how the dude straight up tells us to google it. SO TRUE!!! I've had professors who tell me the same thing. Thumbs up.
@МаксимБазанов-и9э 4 years ago
Content of this quality deserves far more recognition. Thank you!
@hamishdosiad5764 2 years ago
Mate, you're a legend! Not only did I learn matplotlib and pandas, but now I know my Pokemon too. Tip of the hat!
@anubhkumar8824 5 years ago
34:34 Pro tip: go to command mode (press Esc) and press 'b' to make cells below current cell or 'a' to make cells above
@KeithGalli 5 years ago
Thanks for the tips! Love when people comment helpful stuff like this :). Just started using command mode to easily switch cells from code to markdown, will have to add these two commands to the arsenal as well!
@FlyingMonkeis 4 years ago
k and j will move focus to the cell above or below, and you can pair these with Shift to extend the selection and then press Shift+M to merge the highlighted cells. So Shift+J followed by Shift+M will merge the current cell with the one below it. 'dd' will delete a cell also! (These bindings are very vim-like.)
@christopherlyons7613 4 years ago
I think the original is right: in Jupyter's command mode, 'a' makes a cell above and 'b' makes a cell below.
@羅俊洪 7 months ago
I just entered the data analysis field, and it's amazing that this video was made 4 years ago already! Thanks for making this. I learned your skills and your problem-solving approach. Appreciated!
@MicahJohns 4 years ago
23:39 That duplication was because of the header rows in each of the files. I've dealt with this a lot. You would have had to exclude those header rows from each file before concatenating them all together to resolve this. Great video course man, thank you for making all of that content.
@vertik3895 3 years ago
I just did what he did and all I am getting is the header rows, what's the solution?
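@oscardyremyhr5948 3 years ago
@@vertik3895 Load the first df as normal and the following dfs with the header row skipped and the first df's column names reused, e.g. pd.read_csv('file2.csv', skiprows=1, header=None, names=first_df.columns), before concat.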
@eduardosa9658 2 years ago
@@vertik3895 The solution is to call read_csv(..., header=None) on each iteration.
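Another way to deal with stray header rows is to filter them out after loading, in the spirit of the "Remove rows based on condition" step at 21:25 in the timeline. The sketch below is an assumption-laden illustration, not the code from the video: the sample text is made up, and it compares against the literal column name rather than a string prefix.

```python
import io

import pandas as pd

# Made-up stand-in for a merged file that contains a repeated header row.
raw = io.StringIO(
    "Order ID,Product,Order Date\n"
    "1,USB-C Charging Cable,04/19/19 08:46\n"
    "Order ID,Product,Order Date\n"
    "2,Wired Headphones,04/12/19 14:38\n"
)
all_data = pd.read_csv(raw)

# A repeated header shows up as a data row holding the column names.
is_header_row = all_data["Order Date"] == "Order Date"
all_data = all_data[~is_header_row].copy()

# With the text rows gone, the ID column can be converted to integers.
all_data["Order ID"] = all_data["Order ID"].astype(int)
print(all_data)
```

Note that pd.read_csv with its default settings already consumes each file's own first line as the header, so this filter only matters when header lines are repeated inside the data.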
@Josh-di2ig 4 years ago
you're the best. not only are you teaching people how to use python Pandas lib, but you're also teaching the type of hat you should be wearing when solving real world problems! kudos x 10000
@dawnfantasy 4 years ago
50:47 cities = result.Sales.keys() works as expected. great tutorial, tks!
@AndyRhye 4 years ago
Man, I really like your style. Firstly, because you take real-world problems and not some primitive stuff like some other bloggers; secondly, because you encourage your viewers to search for solutions themselves; and thirdly, because you show how to find a solution to a certain problem on the Internet. Please keep doing similar videos! With best wishes and sincere appreciation from Ukraine.
@ijbarraza 4 years ago
As a new learner of python I found this to be one of the best videos on youtube for beginners. How he managed to deal with the problems and solve them on the go (not knowing it all, but knowing how to consult google for the right answer). Way to go! Loved the approach and how easy you made it look
@katherinenavarrohansen2748 3 years ago
I write from Denmark, but I'm Chilean, I followed all the steps and really everything is very clear, I loved your explanations of each task and each question
@hoiying-chan 4 years ago
Your assignments are harder than Coursera's. I'm actually learning something. Major thanks all the way from Holland! 🙏
@andre__442 2 years ago
if every human being on earth had the will and disposition to teach like Keith... the world would be a 99% better place
@royvivat113 4 years ago
This is the most informative video I've ever seen on what data science actually is! I keep looking for actual applications and I loved seeing your thought process, comments, and method of asking and answering questions.
@florenthoti9101 4 years ago
As a beginner in Data science with Python, I find you as the best youtuber in this field. Good Job!
@SaulOjeda 3 years ago
This video was amazing. I can't believe I actually sat through the whole thing past my bedtime.
@exploringwithdave5926 3 years ago
If you are a coder, there is no such thing as "bedtime". Just, awake, and not awake.
@hajarja4512 10 months ago
Hey Keith, thank you so much for this video. Concerning the 4th question, 'What products are most often sold together?', I had a similar approach and got the same order of grouped products when I counted the values using .value_counts(). However, the values themselves were different! Here is my approach:
order_grouped = months_purchase.groupby('Order ID')
def concatenate_strings(x):
    return x.str.cat(sep=',')
products = pd.DataFrame(order_grouped['Product'].agg(concatenate_strings))
combined_items = products[products['Product'].str.contains(',')]
combined_items.value_counts().head(10)
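One plausible reason for the different numbers, offered as a guess rather than a diagnosis: counting identical comma-joined strings only matches orders whose whole basket is the same, whereas the pair counting at 1:09:39 (itertools, collections) also counts a pair when it occurs inside a larger order. The sketch below contrasts the two on made-up orders.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Made-up orders: order 2 contains the iPhone/cable pair plus a third item.
orders = pd.DataFrame({
    "Order ID": [1, 1, 2, 2, 2, 3],
    "Product": [
        "iPhone", "Lightning Charging Cable",
        "iPhone", "Lightning Charging Cable", "Wired Headphones",
        "USB-C Charging Cable",
    ],
})

# One comma-joined string per order, keeping multi-item orders only.
baskets = orders.groupby("Order ID")["Product"].agg(",".join)
baskets = baskets[baskets.str.contains(",")]

# Approach 1: count identical baskets (whole-order matches only).
basket_counts = baskets.value_counts()

# Approach 2: count every pair of products inside each basket.
pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(basket.split(","), 2))

print(basket_counts)
print(pair_counts.most_common(3))
```

Here the exact-basket count for the iPhone and cable pair is 1, while the pair count is 2, because the three-item order contributes to the second approach but not to the first.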
@oluwadamilaretijani1777 2 years ago
Your courses are very great as you delve into practical content. Your course helped me to pass data analysis test in Turing. Thank you so much
@akosasuke5128 2 years ago
Congrats oludamire, I'm guessing you're a Nigerian. I'm a Nigerian too and recently got into Exploratory Data Analysis through the udacity Nanodegree program. I'm currently on my second project which is an Investigation of WeRateDogs Twitter dataset. I think I have learnt a thing or two so far. Do you think I'm ready for Turin?..i hear it's like going to the big leagues lol.
@mikeyu6347 1 year ago
I was absolutely blown away by the fantastic lectures. The best teacher I've ever had!
@JoaoOliveira-wh1tp 4 years ago
Great video. Just a few suggestions: At 4:25, os.listdir('./') already returns a list, so wrapping it in [file for file in os.listdir(...)] is redundant. At 40:50 you don't need the lambda function, even if you want to access a cell's content. If you simply pass a reference to a function, each cell value is passed to it as the argument. Example:
def modify(a):
    return 'CHANGED ' + a + ' CHANGED'
df['Column'].apply(modify) # modify without parentheses is the reference to the function.
@mahermonirify 4 years ago
Could you please help: why am I getting a path error when I try to use os.listdir, but not when I open a specific file to read?
@enaba 6 months ago
@@mahermonirify Hello, I'm getting a path error too. Can you please tell me how to resolve it?
@Magmatic91 4 years ago
I love how this guy is explaining, I really enjoyed learning from you.
@manhaabdellah2682 2 years ago
I'm new to data analysis. My instructor always tells us to search our questions on Google and get help from Stack Overflow. I didn't understand that until now, and I got stuck on my second project, a sales analysis. This helped me big time!!! I'm so thankful to you for showing all those shortcuts. The date-time split had such long, tricky code online.
@kafaayari 3 years ago
When passing a function to apply, you could have just passed the function name, there's no need to do apply(lambda x:get_city(x)). This is just enough and better => apply(get_city)
@MattHuisman 2 years ago
Came here to make sure someone said this! As long as the function you pass only takes a single argument. Otherwise lambda x: my_func(x, other_arg)
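Both cases from this exchange can be sketched side by side. The two helper functions and the sample address are assumptions written for illustration; the video's own get_city may be implemented differently.

```python
import pandas as pd


def get_city(address):
    """Return the city from a 'street, city, state zip' string."""
    return address.split(",")[1].strip()


def get_city_with_state(address, sep=" "):
    """Return 'City (ST)'; sep is an extra argument beyond the cell value."""
    parts = address.split(",")
    city = parts[1].strip()
    state = parts[2].split()[0]
    return f"{city}{sep}({state})"


df = pd.DataFrame({"Purchase Address": ["917 1st St, Dallas, TX 75001"]})

# One-argument function: pass the function itself, no lambda needed.
df["City"] = df["Purchase Address"].apply(get_city)

# Extra argument: a lambda (or apply's own keyword passing) supplies it.
df["City (State)"] = df["Purchase Address"].apply(
    lambda x: get_city_with_state(x, sep=" ")
)
print(df)
```

Series.apply also forwards extra keyword arguments to the function, so `.apply(get_city_with_state, sep=" ")` is an equivalent spelling of the last call.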
@Eren53-z5b 4 months ago
The best thing I have done today is finding this man. I was eating and chilling, saw the thumbnail of this video with pandas in the name, and thought, let's see for 5 minutes what he has to say. But believe me guys, I have already watched up to 1:02:57 of the video, and it gets more interesting as it goes towards the end. Kudos to his technique ❤❤❤
@Random_dudebro 5 years ago
I just finished your two videos demonstrating numpy and pandas, finally feeling a good grasp of python basics (y) Thank you for everything you do!
@it_is_ni 3 years ago
The dataset contains January data for both 2019 and 2020, so the grouping by month doesn't work, because you only look at the month, not the year. Stopgap solution: also slice the year off the date string. Proper solution: convert the date string to an actual datetime, then group by month with pd.Grouper. I suggest putting a card or a note there so others aren't confused. Thanks for the video though!
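The year mix-up described above can be reproduced on three made-up rows. Instead of pd.Grouper, this sketch groups on a year-month period via dt.to_period("M"), a swapped-in technique chosen because it reads the same across pandas versions; the sample data and column values are assumptions.

```python
import pandas as pd

# Made-up rows: two orders in January 2019 and one in January 2020.
all_data = pd.DataFrame({
    "Order Date": ["01/05/19 10:00", "01/20/19 11:30", "01/01/20 03:07"],
    "Sales": [10.0, 20.0, 5.0],
})
all_data["Order Date"] = pd.to_datetime(
    all_data["Order Date"], format="%m/%d/%y %H:%M"
)

# Month number only: both Januaries collapse into one bucket.
by_month_only = all_data.groupby(all_data["Order Date"].dt.month)["Sales"].sum()

# Year-month period: January 2019 and January 2020 stay separate.
by_year_month = all_data.groupby(
    all_data["Order Date"].dt.to_period("M")
)["Sales"].sum()

print(by_month_only)
print(by_year_month)
```

With the period-based grouping, the stray 2020 orders show up as their own small bucket instead of inflating January 2019.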
@devmrin 4 years ago
Hands down one of the most useful I've seen. Insights galore. Thank you!
@jack.1. 4 years ago
Thank you, there are tons of brilliant programmers on youtube but only a few programmers who are good communicators and teachers.
@OK-Computer 4 years ago
Great video! At the beginning it is much more concise to concatenate all the CSV files into one like this (it is easiest to put the IPython notebook and the CSV files in the same directory): files=[f for f in os.listdir("./") if f.endswith('.csv')] df=pd.concat(pd.read_csv(i) for i in files) THAT'S IT!
@muhammadbashirmuhammad5529 4 years ago
That's better, thanks
@subho1766 4 years ago
monthly_dataframes = [pd.read_csv(file) for file in glob.glob(filePath + "*.csv")]
merged_dataframe = pd.concat(monthly_dataframes)
@bartproffitt5240 3 years ago
Thank you so much, I have been battling "no such directory" all morning.
@jeisonsanchez4842 2 years ago
Also consider adding a condition to skip the first row of each subsequent file - to avoid duplicate headers.
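The one-liners in this thread can be wrapped into a small function and tried on throwaway files. In this sketch `load_monthly_csvs` is a hypothetical helper, and the file names and contents are made up; it uses pathlib's glob in place of os.listdir or the glob module.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_monthly_csvs(folder):
    """Read every *.csv file in folder and stack them into one DataFrame."""
    csv_files = sorted(Path(folder).glob("*.csv"))
    frames = [pd.read_csv(path) for path in csv_files]
    # ignore_index gives the merged frame a clean 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


# Demo with two small CSV files and one non-CSV file that gets skipped.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Sales_April_2019.csv").write_text("Order ID,Product\n1,iPhone\n")
    Path(tmp, "Sales_May_2019.csv").write_text(
        "Order ID,Product\n2,Macbook Pro Laptop\n"
    )
    Path(tmp, "notes.txt").write_text("not a csv")
    all_data = load_monthly_csvs(tmp)

print(all_data)
```

Each read_csv call treats the first line of its file as the header, so the column names are not repeated as data rows in the merged result. An empty folder would make pd.concat raise an error, which is worth guarding against in real use.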
@kelvingitari 2 years ago
Best data analysis video I have watched so far! I also love how most people in the comment sections have outlined alternative ways of approaching some of the tasks.
@yaswanthfinds 4 years ago
So nice! I was searching for this kind of tutorial. It has real-time mistakes and solutions. I hope you do this kind of video regularly.
@GhizlaneBOUSKRI 4 years ago
The first time I let the ads play on a YouTube video, because I wanted to watch every second of it. Many thanks Keith, you're just amazing!
@KeithGalli 4 years ago
I appreciate the kind words! Glad you enjoyed :)
@leec8977 2 years ago
1:09:28 You can use df=df.groupby('Order ID')['Product'].apply(','.join) instead of those three lines. Thanks for this video, it was great for me.
@DarshanMalu 4 years ago
You are awesome! Thanks for patiently explaining everything, also teaching how to google what you want! Thanks man!
@williamhendro9177 3 years ago
i don't know, man. i think this is one of the very best channels in all platforms (not only youtube)
@Scratchmex 4 years ago
22:00 I think it is more reliable to parse the date column as a datetime type to avoid all these problems.
@stevejuso 4 years ago
pd.to_datetime did not work for me on this data. How did you use it? I get an error
@SiIentFire 3 years ago
@@stevejuso Really late reply, but just in case it helps someone: you can tell the read_csv function to read columns as dates by passing parse_dates=['col1', 'col2'] for any number of columns. You can tell it to use the European format with dayfirst=True, and if you need a specific format you can use date_parser to give your own parser. So in my case it was: df = pd.read_csv('filepath', parse_dates=[datecols], dayfirst=True) to get the columns I needed parsed in European date format. One key thing is that it converts the dates to pandas Timestamps, but they are interchangeable with Python datetimes almost all of the time. They can also be converted with .apply(lambda x: x.to_pydatetime()) if you need.
@Data-2-d3k 7 months ago
This is the most practical Python tutorial video I've ever watched. Thanks for sharing!
@rezap1356 4 years ago
The best graph type for correlation is 'scatter graph', looks like a constellation. Great video Keith. Thanks.
@ryanmugo4206 8 months ago
I would give this guy a 10/10... truly understood everything
@francescofaccia 4 years ago
Hi Keith, you're great! Thanks to you we can be introduced to a hell of a lot of useful pandas tools! Keep up the good work!
@ahmetsenol6104 1 year ago
I even liked the name of the video. Straight to the point. I said "YESS IVE BEEN LOOKING FOR THIS" perfect. Thanks.
@karimkhatib8569 3 years ago
Really interesting to go through the entire process, including looking up solutions and solving errors!
@vickyzhang820 2 years ago
Sooooo fantastic!!! This is definitely the best Data Project video I've seen on KZbin!
@abhishek_raj 3 years ago
Keith: I am gonna snatch the first two digits and make it the month. The data: Hold my NaNs !
@andymartinez5407 3 years ago
Hi guys, for this task (30:35 - Visualizing our results with bar chart in matplotlib), instead of the datetime library I used the calendar library. This code changes the month number to the month name:
import matplotlib.pyplot as plt
import calendar
results = all_data.groupby('Month').sum()['Sales']
months_int = range(1,13)
months = []
for month in months_int:
    months.append(calendar.month_name[month])
plt.figure(figsize=(10,6))
plt.bar(months,results)
plt.xticks(months)
plt.xticks(rotation=90)
plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
plt.ylabel('Sales in USD ($)')
plt.xlabel('Month')
plt.show()
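The month-name lookup in that snippet can be tried on its own, without pandas or matplotlib. A minimal sketch using only the standard library; the `short_months` list is an illustrative extra for tighter axis labels.

```python
import calendar

# calendar.month_name[0] is an empty string, so the range starts at 1.
months = [calendar.month_name[m] for m in range(1, 13)]

# Abbreviated names can be easier to fit under a bar chart.
short_months = [calendar.month_abbr[m] for m in range(1, 13)]

print(months)
print(short_months)
```

The names follow the current locale, so they are English only under an English or default locale.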
@Abdullahkbc 2 years ago
You are great Keith. You are doing it in a manner that most students can understand better.
@jenn6997 4 years ago
You are always so passionate and enthusiastic even if there're errors haha :) Love your positive attitude! Look forward to more great videos!! :)
@masthanjinostra2981 4 years ago
I get tensed like in hell..
@geekyprogrammer4831 3 years ago
he purposely introduced those errors for us to have real-life problem-solving experience :)
@mrzorak4532 1 year ago
At 1:13:50 an interesting question to ask is: 'for the pair of items most often sold together, which third item is most commonly bought with those two?', and then add the promotion to that third item.
@rafacardenas8783 4 years ago
great job Keith!, keep up with the walk-through-style tutorials, hands on is the best and even better when you have the feedback.
@stefanlasek3256 4 years ago
Honestly, one of the best videos I have seen. From mistakes to how to look for answers to little tips & tricks. You have got a new subscriber in me.
@vikram3297 4 years ago
At 32:15 you created the months list passed to plt.bar() out of thin air. In the current scenario our data comes out sorted by month, so there is no issue; otherwise it would have plotted Sales against the wrong month. Instead I tried this. Please let me know if I'm wrong about it: all_data.groupby('Month')['Daily Sale'].sum().plot(kind='bar') plt.show()
@naishkiteboarder 4 years ago
The groupby function sorts by month, I think, so the keys will be 1 through 12, the same as the new months variable.
@naishkiteboarder 4 years ago
Monthss = [month for month, df in All_Data.groupby('Month')]
@iunknown563 3 years ago
All the errors that were driving me nuts are resurfacing here and being handled nicely! Such a treat :)!
@JohnnyRottenest 4 years ago
50:00, use result.index as x values and x ticks.
@jasonwong8315 4 years ago
yes that would be easier.
@RSUtsha 2 years ago
This video is really good, not only the solutions but the process of getting to the solutions shown is what makes it so good...!
@berkayozkan2631 3 years ago
I love how he freaks out whenever there is a small warning lol
@muradnurmammadov 1 year ago
Keep uploading such videos. These are more needed than tutorials
@a.yashwanth 4 years ago
Checking the length of the DataFrame is enough to verify the merge, instead of saving it to a CSV file and inspecting that.
@dp6736 1 year ago
Hi Keith, Even after three years, this video is very useful. You are very good at explaining the concepts. Thank you very much
@omrieliyahulevy7985 4 years ago
Great tutorial, I've learned a lot! A suggestion for your first question, the best month for sales: instead of creating the extra 'month' and 'sales' columns, we can use the pandas resample method, which does the group-by-month for us. Just like with groupby, we close it with sum and we get the same table! all_data.resample('M', on='Order Date').sum().sort_values(by='Price Each', ascending=False)
@Yayaloy9 4 years ago
But here's the problem: Order Date is not a datetime type, so you have to convert it first. all_data["Order Date"]= pd.to_datetime(all_data["Order Date"], format="%m/%d/%y %H:%M")
@prashanthchandroth8108 4 years ago
Keith - your data creation script is great and thank you.
@Doorshlak 4 years ago
This channel is the best thing I've encountered in a while. Thank you for helping the desperate ;-; Would do 5 likes if I could
@RobTaylor1985 1 year ago
This was my first time using Python with data and you made it awesome! I really enjoyed the realness of using the internet to solve the problems.
@arnopisspot5115 4 years ago
this video was super interesting. I can certainly watch 10 more of these!
@Tales.of.Irshad 2 years ago
great experience when dealing with this video. Loved the way you used the online search. Need more videos please
@ng4logic 4 years ago
58:22 I heard that
@diogoledermann7393 4 years ago
LOOOOLLLL
@0xfsec 4 years ago
Great video Keith! I'm using import glob to get all the CSVs, set up like this:
sales_files = glob.glob('./Sales_Data/Sales_*.csv')
df_list = [pd.read_csv(files) for files in sales_files]
data_list = pd.concat(df_list)
And then the rest follows your code. For the warning at 1:26:07 you just have to add .copy() at the end of the line that creates the df variable.
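The .copy() remedy mentioned here, and in an earlier comment about the red warning, can be sketched as follows. The data is made up, and the filter-then-transform steps are modeled on the 1:03:31 and 1:05:43 entries of the timeline rather than copied from the video.

```python
import pandas as pd

# Made-up orders; order 1 has two products, order 2 has one.
all_data = pd.DataFrame({
    "Order ID": [1, 1, 2],
    "Product": ["iPhone", "Wired Headphones", "Macbook Pro Laptop"],
})

# Filtering produces a subset that pandas may treat as tied to all_data.
# An explicit copy makes the later column assignment unambiguous.
df = all_data[all_data["Order ID"].duplicated(keep=False)].copy()

# Join the products of each order into one comma-separated string.
df["Grouped"] = df.groupby("Order ID")["Product"].transform(
    lambda x: ",".join(x)
)
print(df)
```

Because df is an independent copy, adding the Grouped column leaves all_data untouched.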
@KeithGalli 2 years ago
I'm launching a data analytics bootcamp! goto.masterschool.com/5wn3sw Some highlights of the program: - Fully remote (with flexible working hours) - No tuition fees until after you land a job in tech - Open to applicants anywhere in the world! This is a 7-month long program kicking off in June. To learn more and get your application started, click the link above ⬆
@anthonycampos4673 2 years ago
Cool, greetings to you from Lima, Perú
@mrzorak4532 1 year ago
At 1:25:00, you can also ask what the mean quantity ordered is for each product and set it next to the total quantity ordered. That would answer the question in a more meaningful way than the price: people who order batteries buy on average 1.5 batteries per order, while people who buy Macbooks usually buy just one.
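That comparison can be sketched with a single groupby. The rows below are made up, so the numbers only illustrate the shape of the result, and the output column names are hypothetical labels chosen for this example.

```python
import pandas as pd

# Made-up order lines; column names follow the ones used in the video.
all_data = pd.DataFrame({
    "Product": [
        "AAA Batteries (4-pack)", "AAA Batteries (4-pack)",
        "Macbook Pro Laptop",
    ],
    "Quantity Ordered": [1, 2, 1],
    "Price Each": [2.99, 2.99, 1700.0],
})

# Total quantity, mean quantity per order line, and mean price per product.
per_product = all_data.groupby("Product").agg(
    total_quantity=("Quantity Ordered", "sum"),
    mean_quantity_per_line=("Quantity Ordered", "mean"),
    mean_price=("Price Each", "mean"),
)
print(per_product)
```

Putting the three columns side by side makes it easier to see whether a product sells a lot because many orders include it or because each order contains several units.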
@nishantbanjade920 4 years ago
I like the way you say at every mistake: "AAAAh, what did I do" lol :D xD
@Jack-xy4fy 4 years ago
hahaa it made me laugh because i do the exact same thing
@9t4j 5 months ago
One of the best demos I have seen! The fact that it is 5 years old and still relevant is amazing! I followed it to the end and was amazed! The only thing that did not show the same results for me is question 4! I could not get it right even with copy-paste. Otherwise 5 stars :D
@JulioSerratos 1 year ago
Really good job; it shows us real day-to-day problem solving. I'm sure most of us solve problems this way: googling, trial and error. I do not understand why interviewers expect you to know everything about the language, algorithms and syntax.
@Frdy12345 4 years ago
At 40:37, you don't need to use the lambda function when you use the apply method. df['City'] = df['Purchase Address'].apply(get_city) Also to plot the best city without any issue one can use; df.groupby('City').Sales.sum().plot(kind='bar')
@tanmaysinghi1868 3 years ago
I would greatly appreciate another similar video with a new project and some newer formulas and features, maybe understanding heatmaps, creating more complex functions, etc. Thanks again for this.
@arpangoyal7337 2 years ago
LOVED the entire video and how raw it was, along with his explanation!