Selecting columns when reading a CSV into pandas

14,101 views

Python and Pandas with Reuven Lerner

1 day ago

Are you reading a CSV file into pandas? Your analysis might not require all of the columns - and you can save a lot of memory by selecting only those that you need. In this video, I show you how to select specific columns from a CSV file, either by name or by position.
Jupyter notebooks for my YouTube videos are all at: github.com/reu...
And don't forget to subscribe to "Better Developers," with free, weekly articles about Python: BetterDevelope...
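As a minimal sketch of what the description refers to: `pd.read_csv` accepts a `usecols` argument that takes either column names or zero-based positions. The sample data below is made up for illustration and stands in for a real CSV file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file on disk.
csv_text = """name,city,age,salary
Alice,Paris,30,5000
Bob,Berlin,25,4200
"""

# Select columns by name: only "name" and "salary" are loaded.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["name", "salary"])

# Select columns by zero-based position: the first and third columns.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2])

print(by_name.columns.tolist())
print(by_position.columns.tolist())
```

Columns left out of `usecols` are never stored in the data frame, which is where the memory saving comes from.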

Comments: 15
@KunjaBihariKrishna 1 year ago
I really fell in love with CSV files. I recently started learning Python to handle transaction data and capital gains calculation with pandas.
@ReuvenLerner 1 year ago
CSV isn't the best format, but it's definitely the most popular. Glad you're using Python for this sort of thing, and that you enjoyed the video!
@KunjaBihariKrishna 1 year ago
@@ReuvenLerner I'm getting by on YouTubers and some ChatGPT (who seems drunk half the time). But all the time I used to spend on video games is now spent on programming, because it's apparently scratching whatever itch the games were scratching. I've got quite a fun project lined up. I basically need to create a sort of persistent dataframe that I (probably) need to save locally. I need to save the average price of several thousand different cryptocurrencies for each day, going back to the date the currency was listed. I first worked on connecting a Google Sheet directly to the Coingecko API (which has that data), but quickly realized that each transaction would then send a request to the API, which would take hours considering the rate limit. So I basically have to set up a script that runs for several days or weeks and slowly puts together a local database of cryptocurrency prices. I suppose that could be appended to a CSV, and pandas could then load that into a dataframe in the scripts where I calculate the $ value of a list of transactions.
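One possible way to sketch the "append to a CSV, then load it with pandas" idea from this comment is shown below. The helper name `append_prices`, the column names, and the price values are all hypothetical placeholders, and the file goes into a temporary directory purely for demonstration.

```python
import os
import tempfile

import pandas as pd


def append_prices(path, rows):
    """Append rows (a list of dicts) to the CSV at path, writing the header only once."""
    is_new_file = not os.path.exists(path)
    pd.DataFrame(rows).to_csv(path, mode="a", header=is_new_file, index=False)


# Hypothetical location and made-up values, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "prices.csv")
append_prices(path, [{"date": "2024-01-01", "coin": "btc", "price": 42000.0}])
append_prices(path, [{"date": "2024-01-02", "coin": "btc", "price": 43000.0}])

# Later, in the analysis script, load the accumulated file into a data frame.
prices = pd.read_csv(path, parse_dates=["date"])
print(prices)
```

Writing the header only when the file does not yet exist keeps repeated appends from scattering header lines through the data.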
@ReuvenLerner 1 year ago
@@KunjaBihariKrishna If you can get the data, then you can create a data frame and analyze it - and it doesn't matter whether it's weather data, crypto rates, or the price of oil. That's the amazing thing about this technology; it works with anything you can throw at it, and the limits are typically going to be size (i.e., how much can fit in memory), time (i.e., how long it takes to process), and your understanding. And yes, it is addictive!
@KunjaBihariKrishna 1 year ago
@@ReuvenLerner Thanks for your reply. I'm going to try it first with a small list of currencies, so I can deal with size limitations later. I have so many basics to figure out. For example, I've managed to create a script that aggregates interest payments into one payment per day (which cuts the total number of transactions by 80%). However, after using groupby, I'm left with a dataframe that only contains the columns that were relevant to the aggregation process, and I haven't figured out how to put the original CSV back together. I wonder if, after the aggregation, I need to create a new dataframe with the same header, put the aggregated transactions in there (while keeping the column positions correct), and then merge it with the original dataframe? But then I'd also have to remove each aggregated row from the original first, I guess. Oh well, I'm just thinking out loud. I'm obviously very new at this. Watching these videos saves me a lot of trial and error.
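One way to approach the problem described in this comment, sketched under assumptions: split off the interest rows, aggregate them while carrying the other columns along with `"first"`, then concatenate the result with the untouched rows. The column names (`date`, `type`, `currency`, `amount`) and the values are hypothetical, not taken from the commenter's data.

```python
import pandas as pd

# Hypothetical transactions; the real data will have different columns.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02"],
    "type": ["interest", "interest", "trade", "interest"],
    "currency": ["USD", "USD", "USD", "USD"],
    "amount": [1.0, 2.0, 100.0, 4.0],
})

is_interest = df["type"] == "interest"

# Aggregate only the interest rows: sum the amounts and keep the first
# value of every other column so that no column is lost.
daily_interest = (
    df[is_interest]
    .groupby(["date", "currency"], as_index=False)
    .agg({"type": "first", "amount": "sum"})
)

# Restore the original column order, then recombine with the other rows.
result = pd.concat(
    [df[~is_interest], daily_interest[df.columns]],
    ignore_index=True,
).sort_values("date", kind="stable", ignore_index=True)

print(result)
```

Selecting `df[~is_interest]` takes care of removing the aggregated rows from the original, so no separate deletion step is needed.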
@KT-oz1md 10 months ago
Thank you so much! This is what I've been looking for
@ReuvenLerner 10 months ago
Great to hear -- glad I could help!
@shaswatachowdhury9032 7 months ago
Awesome! Thank you very much!
@ReuvenLerner 7 months ago
My pleasure!
@juhakumpula8070 1 year ago
If you're getting "Usecols do not match columns, columns expected but not found", just add sep=';' (or whatever your separator is).
@ReuvenLerner 1 year ago
Yes, the separator not matching is one of the most common problems I encounter.
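A small sketch of the situation in this exchange, using made-up semicolon-separated data: with the default comma separator, pandas sees the whole header as a single column name, so the names in `usecols` cannot be found; passing `sep=";"` fixes it.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data.
csv_text = "name;city;age\nAlice;Paris;30\nBob;Berlin;25\n"

# With the default separator, the header is read as one column,
# so the requested column names are not found.
try:
    pd.read_csv(io.StringIO(csv_text), usecols=["name", "age"])
except ValueError as err:
    print("Failed with the default separator:", err)

# Naming the real separator lets usecols match the columns.
df = pd.read_csv(io.StringIO(csv_text), sep=";", usecols=["name", "age"])
print(df)
```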
@aliosaid1374 1 year ago
How can I choose only a range of columns, such as 5 to 9? I tried 5:9, but it doesn't work.
@ReuvenLerner 1 year ago
The documentation says that you need to pass a list-like object. 5:9 is translated into a slice object when it's inside of [], so that won't work. But range(5,9) might work - or at worst, list(range(5,9)). I'll look into this more, and maybe I'll have a video about it in a few days!
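Here is a sketch of the `list(range(...))` approach from the reply, on made-up data with twelve columns named `c0` to `c11`. Note that `range(5, 9)` covers positions 5 through 8; to include position 9 as well, it would need to be `range(5, 10)`.

```python
import io

import pandas as pd

# Hypothetical data: a header row and one data row with 12 columns.
header = ",".join(f"c{i}" for i in range(12))
row = ",".join(str(i * 10) for i in range(12))
csv_text = f"{header}\n{row}\n"

# Zero-based positions 5, 6, 7 and 8 (the end of a range is exclusive).
df = pd.read_csv(io.StringIO(csv_text), usecols=list(range(5, 9)))

print(df.columns.tolist())
```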
@hadikarimi2818 2 years ago
Great explanation. I have precipitation values in all columns. How can I select all the columns that start with D_? There are 7,400 of them, one per date (for example: D_20001201).
@ReuvenLerner 2 years ago
If you have a list of columns, then you can get only those columns from a data frame. So if you say

    mycols = ['D1', 'D2', 'D3']
    df[mycols]

you'll get all rows in df, but only the columns D1, D2, and D3. Given that, one way to get all columns that start with D_ could be:

    mycols = [one_colname for one_colname in df.columns if one_colname.startswith("D_")]
    df[mycols]
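As an alternative sketch (not something the reply itself proposes), `usecols` also accepts a function that is called with each column name, so the `D_` columns can be selected while the file is being read. The station and elevation columns and all values below are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical data with two D_ columns and two unrelated columns.
csv_text = (
    "station,D_20001201,D_20001202,elevation\n"
    "A,0.0,1.5,120\n"
    "B,2.3,0.0,340\n"
)

# usecols can be a callable: a column is kept when it returns True.
d_only = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda colname: colname.startswith("D_"),
)

# The same selection on an already loaded data frame, using the
# list comprehension from the reply above.
df = pd.read_csv(io.StringIO(csv_text))
mycols = [one_colname for one_colname in df.columns if one_colname.startswith("D_")]

print(d_only.columns.tolist())
print(df[mycols].columns.tolist())
```

Filtering at read time avoids loading the unwanted columns at all, which matters more as the file grows.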