Clean Excel Data with Python and Pandas - 5 Minute Python Scripts - Full Code Along Walkthrough

  Рет қаралды 60,606

Derrick Sherrill

Derrick Sherrill

5 жыл бұрын

In this video we'll cover the basics of how to clean your excel data using python.
We'll cover how we can load in excel files, change or modify their current cells to meet your requirements, and then rewrite back to a new excel file.
Kite helps fund the channel, thanks for checking them out and supporting me --
⭐ Kite is a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. www.kite.com/g...
Thanks so much for all the support!! You all are far too good to me. 330+ Subscribers!!! Thank you all so much.
Here's the GitHub link:
github.com/Der...
If you have any suggestions for the next video please let me know!
Until next time!
*****************************************************************
Code from this tutorial and all my others can be found on my GitHub:
github.com/Der...
Check out my website:
www.derrickshe...
If you liked the video - please hit the like button. It means more than you know. Thanks for watching!!
Useful Links
-----------------------------------------------------------------------------------------------------------------
Python Download:
www.python.org...
(Remember Python 3 is the future!)
I use Atom Text Editor for all my tutorials
Atom Text Editor:
atom.io/
Packages I often use in Python tutorials:
-Pandas
pandas.pydata....
-Numpy
www.numpy.org/
-xlrd
xlrd.readthedo...
-TensorFlow
www.tensorflow...
-Matplotlib
matplotlib.org/
-Django Framework
www.djangoproj...
-Beautiful Soup
www.crummy.com...
(Install through Terminal $pip3 install ....)
Other Useful Services sometimes featured:
-Amazon Web Services (AWS)
aws.amazon.com/
-Microsoft Azure
azure.microsof...
-Google Cloud
cloud.google.com/
-Juypter Notebooks
jupyter.org/
Always looking for suggestions on what video to make next -- leave me a comment with your project! Happy Coding!

Пікірлер: 95
@davestark3261
@davestark3261 5 жыл бұрын
Such a huge fan of the 'bitesize' format of these videos. Clear instructions, excellent explanations. Keep it up!!
@dlutherc
@dlutherc 4 жыл бұрын
It's so much easier to learn the content when the important information is not separated by a lot of talk. These videos have helped tremendously, you do a great job!
@koontzrob22
@koontzrob22 5 жыл бұрын
Really Awesome Video! It would be pretty cool if you did like a part 2 where you clean excel data (data with way more errors, random empty rows, wrong data types, misspellings etc) with python.
@anthonyrojas9989
@anthonyrojas9989 Жыл бұрын
This is my new mentor man, the simplicity and clear explanation is on point.
@josephtran1500
@josephtran1500 5 жыл бұрын
sheet1['First Name'] = sheet1['First Name, Last Name].map(lambda x: x.split(',')[0] sheet1['Last Name'] = sheet1['First Name, Last Name].map(lambda x: x.split(' ')[1] pandas is built on top of numpy which supports vectorized operations. No need to write the for loop. You can just call the column and .map() with a lambda expression inside
@CodeWithDerrick
@CodeWithDerrick 5 жыл бұрын
Totally agree! Lambda functions are just difficult to teach effectively to the level of audience I’m targeting. 😬 without a doubt though yours is less code and faster than the for loop.
@rverm1000
@rverm1000 2 жыл бұрын
Cool taking machine learning now . This really helps
@AcceptableBread
@AcceptableBread 4 жыл бұрын
These tutorials are some of the best I have ever seen! Your ability to clearly and concisely explain the concepts is exceptional. Looking forward to seeing more.
@mimichui1
@mimichui1 5 жыл бұрын
Stumble to your channel and found these 5-min tutorials. Almost watched every single one. Thank you so much! Looking forward to watching more 5-min videos.
@MacTheDJCom
@MacTheDJCom 4 жыл бұрын
These 5 minute tutorials are life!
@keagankemp6275
@keagankemp6275 2 жыл бұрын
Wow starting my python journey and came across your channel, needless to say all this is a good find.
@zayamadin
@zayamadin 2 жыл бұрын
Thank You! That's AWSOME!!!!
@createyourlifestylenow
@createyourlifestylenow 5 жыл бұрын
great, they way you explain it is very easy to follow for beginners
@nethsz
@nethsz 3 жыл бұрын
Thanks, I am beginer, it's really usefull, just need to install -xlrd and -xlsxwriter at first.
@TeverRus
@TeverRus 3 жыл бұрын
My man, you are a genuis! Thank you so much! I'm going to use it at work on Monday :) Cheers!
@Regc10
@Regc10 4 жыл бұрын
what a legend! Love your tutorials :)
@vilw4739
@vilw4739 2 жыл бұрын
Thanks so much for this,this might seem simple but can trouble you while dealing with huge datset.I was stuck for few days now after watching the video got an idea of what to do and what was i doing wrong!!
@big_cheese2162
@big_cheese2162 4 жыл бұрын
Thanks very much for all the great content, has been fantastic both for my work & coding in my spare time. Keep it up!
@udaracperera
@udaracperera 4 жыл бұрын
Huge fan bother keep it up!!! Thanks
@vincenzo3292
@vincenzo3292 5 жыл бұрын
Great tutorial, thanks. Can use this at work. Like the Milan entry - definitely go to Italy.
@alessandroformiconi6242
@alessandroformiconi6242 4 жыл бұрын
Hi Derrick thank you, very good work and ALL IN 7 MINUTES, that's great for learning ... i have programmed in Java for years, but Python is so funny!
@matony19
@matony19 3 жыл бұрын
awesome video! hope you continue
@santoshgujar5237
@santoshgujar5237 2 жыл бұрын
Thank you, Sir
@xujerry3762
@xujerry3762 4 жыл бұрын
it is good lessons of your video, It makes me learn more about python. nice to keep in touch with you.
@skyblue021
@skyblue021 5 жыл бұрын
You rock Dude, thanks for your great work!
@dagudelo88
@dagudelo88 5 жыл бұрын
Very usefull and also a very clear explanation. keep it up :)
@johndunn6253
@johndunn6253 5 жыл бұрын
Fantastic, thanks for making these!
@AshokKumar-eu4dd
@AshokKumar-eu4dd 5 жыл бұрын
Hi Sherrill, i wanted to learn python course. while searching videos in KZbin,i found ur channel and feels this right channel for beginner. i am basically excel user and not aware about python and any programmer . could you please advise me how to start my carrier in python and also request you to post basic videos from scratch in excel user. it would be appreciate if you can help me
@quantum7401
@quantum7401 5 жыл бұрын
Very nice, reminds me of a good lecture.
@josephtortolano786
@josephtortolano786 2 жыл бұрын
Hey nice video I am working a lot with VBA at work and I was wondering why would you do this into python if you could easily do it in VBA ? Is it faster or more reliable with python ? Also can you show cases where python might awser problems that can't be done in VBA. That would be a very nice video in order to see differences and limits in both languages. Keep the good work 👏
@dantedt3931
@dantedt3931 5 жыл бұрын
Awesome videos!Thanks!
@stephang5671
@stephang5671 3 жыл бұрын
Great video, on spot and no 'superivised typing' for us. At the moment I'm fighting with date-Time-fields. You could add a column 'Birth date' with dates in different formats ('200-12-30', '1998-11-28 05:25:59', '01.01.1985' (European)) and clean it up in a way that I can apply conditions on it (for e.g. find the over 18 year old). Just if you need ideas :-)
@meryemgazanayi5665
@meryemgazanayi5665 Жыл бұрын
thank u sooooo much
@nathanliu4018
@nathanliu4018 4 жыл бұрын
Love it!!
@martin-xq7te
@martin-xq7te 4 жыл бұрын
Great tutorial Derrick, how about a frequency tutorial showing how to display the count of names or numbers in an excel sheet.
@CodeWithDerrick
@CodeWithDerrick 4 жыл бұрын
Hey Martin, thanks for the kind words! Would the groupby method work for what you’re thinking, or are you thinking displaying the counts of items across the entire worksheet (not just the count of of a name in a single column but all columns)?
@osmankhaled4565
@osmankhaled4565 4 жыл бұрын
Excellent
@deepakgiya
@deepakgiya 5 жыл бұрын
Do you have any tutorials on search and replace column values in excel Reqmts: 1) Search a pattern in rows and then delete the row 2) Search a pattern in rows and then replace it with new pattern 3) Search a pattern in rows and remove that pattern and leave the rest in the rows 4) Delete a column based on search pattern 5) Save to new file
@everydayhappy965
@everydayhappy965 3 жыл бұрын
hi, I am wondering do you have a good way to import a number of excel sheets but do not have to do the type in the import name many times. thanks
@ThanhTruong-sf3pc
@ThanhTruong-sf3pc 3 жыл бұрын
What's the fuck ? This video is really clear with few minutes ♥ Thanks god bring me to here
@robertcliffort2354
@robertcliffort2354 2 жыл бұрын
great.
@thenickrodriquez
@thenickrodriquez 3 жыл бұрын
at the 4:01 timestamp, what is the 1 for in the For loop?
@dhananjaykansal8097
@dhananjaykansal8097 4 жыл бұрын
Lovely!
@skytell
@skytell 3 жыл бұрын
After you outputted new data, column A in the excel display 0 thru 10 as row number - how do you get rid of that row number on the output of the excel file?
@CodingIsFun
@CodingIsFun 3 жыл бұрын
sheet1.to_excel('output.xlsx', index=False)
@RS-el7iu
@RS-el7iu 4 жыл бұрын
thanks 4d very nice explanation .... how can we write the sheet onto the same excel file but in a different sheet?
@vincentsvlog1761
@vincentsvlog1761 2 жыл бұрын
Hello Derrick, Thanks for the fantastic video. I'm curious that is there any AI function to do so?
@Devil7288
@Devil7288 5 ай бұрын
Bro please make a full course python in excel
@liestyaq
@liestyaq 4 жыл бұрын
How can i preprocessing in text mining using excel file? I am a beginner .
@marc10uae
@marc10uae 5 жыл бұрын
Great tutorial - thanks for it.. but what is the advantage of doing this in python, vs direct in excel itself with the upper command and just adding the words with comma to a new column
@BiancaAguglia
@BiancaAguglia 5 жыл бұрын
I think Python is the better choice when your clean-up tasks are more complicated than the one Derrick showed (and when you're more comfortable using Python than you are using Excel. 😁) For example, if the Excel spreadsheet has messy text that needs to be cleaned up using regular expressions. Another example is when you have to apply more complex functions to certain entries in the spreadsheet. Python is very powerful and it's worth learning, but I've seen Excel experts who can automate many spreadsheet operations simply by using Excel (and VBA). So, if you're already a pro in Excel, you might not see an improvement in your workflow by using Python. Personally, I recommend learning Python because it can help you far beyond cleaning up Excel spreadsheets. 😊
@wendzbrand
@wendzbrand 2 жыл бұрын
the advantage of doing this in python is when you are cleaning a big set of data that excel could not handle.
@ajith.p481
@ajith.p481 5 жыл бұрын
Your are good, how to put pivot table and delimit the column content?
@fishtheory7529
@fishtheory7529 3 жыл бұрын
Having trouble pulling up the Excel file in the command prompt. It states that there is no such file or directory. I have tried using the full path as well as setting the folder holding my excel files as the working directory. Unsure what the problem is.
@leechinheng7908
@leechinheng7908 4 жыл бұрын
Derrick, I am curious why you don't install "scripts" package in atom? It seems troublesome to run the script with the command "python3 xxx".
@nicolaimartin7279
@nicolaimartin7279 5 жыл бұрын
great thx
@the_randomguy7989
@the_randomguy7989 3 жыл бұрын
Where will be that output file gets saved? My previous one is not updated also Please help
@adityacodz3121
@adityacodz3121 3 жыл бұрын
Which IDE are you using?
@jakeg9711
@jakeg9711 4 жыл бұрын
Is it possible to remove time stamps and change date formats (both US and Europe date formats in same column) of my excel data within python?
@kavankailey506
@kavankailey506 2 ай бұрын
how can do the next given question from my assignment, if anyone can suggest please • The size of these data sets is quite large. The weather data is provided in xlsx format and will need to be cleaned up and converted to a suitable format before you can use it in your program - you should discard any data that you don’t need to reduce the amount of time it will take to train your models. • The data set includes data from 2015 - 2021 inclusive, but 2021 does not contain the full year. Your predictions should be for the year 2022.
@patalpunuoma
@patalpunuoma 3 жыл бұрын
Thank you for the video really helpful. But it works only if every cell has a first name and last name. If there isn't the last name in the cell script breaks. (ValueError: not enough values to unpack (expected 2, got 1) Tried to write if statement but still doesn't work properly. What would be the solution? :)
@Frankenstein786
@Frankenstein786 2 жыл бұрын
You should look up exception handling, try except pass
@alinajaved2165
@alinajaved2165 2 жыл бұрын
how to automate the data cleanup with python?
@andrewc2174
@andrewc2174 4 жыл бұрын
I'm trying to do the command prompt in your video but when I enter what you wrote it doesn't do anything. I'm on windows, is there another step?
@manishsrivastava5611
@manishsrivastava5611 3 жыл бұрын
I am getting this error .. 'ValueError: Neither the `x` nor `y` variable appears to be numeric". help
@sean7258
@sean7258 2 жыл бұрын
What if you had over 1000 rows of unknown names and multiple names contained either (Prof, Mr, Ms, Fr) at the start or (Jr, Bsc) at the end. Because that's my problem at the moment and I'm stumped......
@limitless4766
@limitless4766 4 жыл бұрын
Am getting error each time I try to import panda , but I have already installed the panda module
@mcnamarachiwaye6359
@mcnamarachiwaye6359 4 жыл бұрын
hie how do i sum up excel dataframe ($320), $350
@belaidmabrouk29
@belaidmabrouk29 5 жыл бұрын
Hi derrick thank you so much for your effort . I have 1 question do pandas lib creat chart or graphic data assimilation! If no which is the most suitable lib on python
@CodeWithDerrick
@CodeWithDerrick 5 жыл бұрын
Hey Belaid, Thanks for your kind words! There are a couple useful ones. Pandas does have a data visualization already built in called pandas Visualization. It's built on the package of Matplotlib. Plotly is another cool one to check out too if you need interactive graphing!
@belaidmabrouk29
@belaidmabrouk29 5 жыл бұрын
@@CodeWithDerrick thank you so mach derrick, i will check it soon and give you my feedback. Thx and Have a nice day
@youssefahmad9112
@youssefahmad9112 5 жыл бұрын
Great video.. thanks.. I have a question please.. How can add (or append) rows to an existing Excel ?? Do you have a video about this ??
@CodeWithDerrick
@CodeWithDerrick 5 жыл бұрын
Thanks for the kind words!! Where is the data that you’re adding to the sheet? We can append how we did in this video, merge the two together, and several other things. I’m happy to do an example with more specifics! 😀
@youssefahmad9112
@youssefahmad9112 5 жыл бұрын
@@CodeWithDerrick Well.. the data are generated at the same code, like the Average of some values.
@youssefahmad9112
@youssefahmad9112 5 жыл бұрын
@@CodeWithDerrick And to be more specific, I'm trying to create an Excel sheet - using python of course - that contains: Student name and his result in an exam that he did it in the python program. Till now I can create one sheet for each student, but I need one sheet for all the student . 😀 Hope I didn't talk very much 😅, And I would be grateful if you helped me. 💛
@lilycheong3832
@lilycheong3832 4 жыл бұрын
questions is that if i want to spilt the file data into by ',' for eg '1234567890, FOOD, 10/UNIT, QTY 300', i want to spilt by '1234567890' , 'FOOD', ' 10/UNIT', 'QTY 300' , how can we do this ?
@RalphMartinez007
@RalphMartinez007 4 жыл бұрын
How do you export a finished clean data to an existing sheet?
@CodingIsFun
@CodingIsFun 3 жыл бұрын
This answer on Stackoverflow might help you out: stackoverflow.com/questions/42370977/how-to-save-a-new-sheet-in-an-existing-excel-file-using-pandas/42371251
@BAL31m89
@BAL31m89 3 жыл бұрын
hey i have a query, suppose we have 10 columns, and 100.. of rows. Columns will be like Date, Customer, city name, and then further are the Sales, price, items and some bunch of other columns, now suppose i have some abnormal value in the sales , price or any other cells in my excel data . Here is my query now. 1- I want to remove or replace that specific cell data but not whole row as other parameters are correct in the other columns. like may be i have some issue in my 'Item' column for that specific date. Also may be the wrong value is for one of the customer on that date, but fine for other. 2- I want to delete the whole row where i find any abnormal value for any specific column. 3- How to get back my final output sheet for both cases. please make a video for this case
@vipin_optimistic179
@vipin_optimistic179 3 жыл бұрын
Excel sheet se data kase nikalte hai
@comptegmail273
@comptegmail273 2 жыл бұрын
Hello sir, thank you so much for the tutorial. I'm actually stuck since my source in a CSV file. Except that sadly the file I'm working is extremely complex with indefinete columns since my main columns are repeated everyday based on the date. I've been stuck on this problem since over a week. Is there a way I could reach out to you and have your mail to maybe help solve this problem? Thanks a lot in advance.
@Frankenstein786
@Frankenstein786 2 жыл бұрын
You could try transposing your data frame when you initialize the program. That should swap the rows and columns and then you could index the date.
@djsanell
@djsanell 4 жыл бұрын
@Derrick - I have a little problem. Whenever I export that to Excel, is not working. Is exporting in the same state with First Name and Last name into one cell. In Terminal is working fine with split between two but when I do export to xlsx format this functions is not working anymore. I use to export df.to_excel("Clean data.xlsx") ---- Anyone from here could help me with that problem? Many thanks all!
@AI_CANISTER
@AI_CANISTER 3 жыл бұрын
please install openpyxl. it will work
@marcosfilho1815
@marcosfilho1815 4 жыл бұрын
Which IDE is it?
@johnbrady2930
@johnbrady2930 2 жыл бұрын
What would happen if somebody had a middle name (Joe Pat Murphy) or the surname is double barrelled (Moran Dylan)
@nishchitjain1
@nishchitjain1 5 жыл бұрын
How do I get Pandas ?
@markross7231
@markross7231 3 жыл бұрын
Is that Excel in 365 your using on the Mac??
@barrowmusics
@barrowmusics 4 жыл бұрын
Please Derick can i get your mail, need to send you a file
@pomicsaviox9971
@pomicsaviox9971 4 жыл бұрын
How to split below Column value Column 058-10-1312 The Little Rascals Split as Column1 - 058-10-1312 Column2 - The Little Rascals
@thehardikbhatia
@thehardikbhatia 4 жыл бұрын
Contanct me , i want to work on a project with you
@vipin_optimistic179
@vipin_optimistic179 3 жыл бұрын
My very big problem
@Bozon_Higgsa
@Bozon_Higgsa 2 жыл бұрын
...
This Is Why Python Data Classes Are Awesome
22:19
ArjanCodes
Рет қаралды 800 М.
25 Nooby Pandas Coding Mistakes You Should NEVER make.
11:30
Rob Mulla
Рет қаралды 266 М.
WHO CAN RUN FASTER?
00:23
Zhong
Рет қаралды 38 МЛН
Doing This Instead Of Studying.. 😳
00:12
Jojo Sim
Рет қаралды 32 МЛН
Replace Excel Vlookup with Python - Five Minute Python Scripts
6:33
Derrick Sherrill
Рет қаралды 228 М.
Automate Excel Work with Python and Pandas
21:29
John Watson Rooney
Рет қаралды 94 М.
Run Python Code From Excel with VBA
14:32
NeuralNine
Рет қаралды 35 М.
Automate Excel using Python | Excel Hacks with Pandas
16:47
SATSifaction
Рет қаралды 283 М.
Real World Data Cleaning in Python Pandas (Step By Step)
40:01
Ryan & Matt Data Science
Рет қаралды 66 М.
How I'd Learn AI (If I Had to Start Over)
15:04
Thu Vu data analytics
Рет қаралды 775 М.
Pandas for Data Science in 20 Minutes | Python Crash Course
23:06
Nicholas Renotte
Рет қаралды 120 М.
ГОТОВЫЙ ПК с OZON за 5000 рублей
20:24
Ремонтяш
Рет қаралды 287 М.
Kumanda İle Bilgisayarı Yönetmek #shorts
0:29
Osman Kabadayı
Рет қаралды 4 МЛН
iPhone VS Samsung🤯
1:00
Skinnycomics
Рет қаралды 11 МЛН
Что делать если в телефон попала вода?
0:17
Лена Тропоцел
Рет қаралды 4,5 МЛН
Nokia 3310 top
0:20
YT 𝒯𝒾𝓂𝓉𝒾𝓀
Рет қаралды 5 МЛН
Электронный звонок #shorts
0:26
TheBestBike
Рет қаралды 210 М.