Selecting rows in Pandas using .loc and lambda

996 views

Python and Pandas with Reuven Lerner

9 months ago

.loc provides us with many ways to select rows and columns from a Pandas data frame. But did you know that you can use lambda to select rows, as well? This can come in handy in a variety of cases, including when you have a long, complex chain of methods in your query. In this video, I show you the basics of using lambda to select specific rows - including an explanation of what "lambda" does for the uninitiated.
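As a minimal sketch of the idea (not the code from the video): the data frame below is made up for illustration, with a passenger_count column borrowed from one of the comments on this video. It compares the usual boolean-series selection with the lambda form.

```python
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame({
    "passenger_count": [1, 5, 2, 6, 3],
    "total_amount": [10.5, 42.0, 8.0, 30.25, 15.0],
})

# Conventional selection: build a boolean series, then pass it to .loc
by_mask = df.loc[df["passenger_count"] > 4]

# Lambda selection: .loc calls the function with the data frame
# and uses the boolean series it returns
by_lambda = df.loc[lambda df_: df_["passenger_count"] > 4]

print(by_lambda)
```

Both forms should select the same rows; the lambda version simply defers building the boolean series until .loc runs.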

Comments: 13
@eleanortay7351 7 months ago
Great thanks for this amazing video. I learnt a great deal and have more confidence in using lambda now. 😊🙏
@ReuvenLerner 7 months ago
Excellent! I'm delighted to hear it.
@goodmanshawnhuang 12 days ago
great video, thanks for sharing.
@goodmanshawnhuang 12 days ago
I prefer "lambda row: row['passenger_count'] > 4" because the lambda parameter means the "row" of dataframe, what do you think?
@ReuvenLerner 12 days ago
@goodmanshawnhuang Hmm, I hadn't ever thought of doing it that way. A very interesting idea! I'll need to chew on it a bit more, but it might show up in the future...
@jaredclaypoole135 9 months ago
Thanks for the video. I didn't know you could pass a function to '.loc'. What you said in a comment below about the lambda technique's utility when chaining multiple '.loc' (or presumably other types of manipulations) together makes sense. Otherwise, at least to me, the lambda technique seemed exactly the same as a regular '.loc' expression, albeit with a 'lambda df_:' in front and the dataframe variable name replaced by 'df_'. There's another reason I could see myself using this in practice: if my dataframe's variable name is long, it's inconvenient to repeat it inside of the '.loc' call. My convention is to do a quick 'df = my_long_df_name' just before, but maybe the lambda technique is more elegant.
@ReuvenLerner 9 months ago
Yes, good point!
@MrAstonmartin78 9 months ago
Sorry if I've misunderstood: what does lambda do in loc? Does it go row by row inside df? And is 'df_' here just an unnamed variable for the function?
@ReuvenLerner 9 months ago
Normally, you could use a boolean series (such as the one you get from a comparison), and put that inside of .loc to select rows. But you can also, and instead, use lambda to do the same thing. Why would you want to do this? I hinted at the reason in the video, but wasn't clear enough. There are basically two reasons: (1) it can sometimes be cleaner and more elegant, and (2) more importantly, when you're chaining methods, there might not be any variable that you can reference to create a boolean series. Using lambda lets you use whatever the method chain has given you, even after multiple transformations, and still select only certain parts from it.
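To make the method-chaining case concrete, here is a small sketch under assumed data: the per_passenger column is hypothetical and exists only inside the chain, so there is no variable outside the chain from which to build a boolean series.

```python
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame({
    "passenger_count": [1, 5, 2, 6, 3],
    "total_amount": [10.0, 40.0, 8.0, 30.0, 15.0],
})

result = (
    df
    # Add a column that exists only on the intermediate data frame
    .assign(per_passenger=lambda df_: df_["total_amount"] / df_["passenger_count"])
    # df_ here is whatever the chain has produced so far,
    # so the new column is available for filtering
    .loc[lambda df_: df_["per_passenger"] > 5]
)

print(result)
```

Writing df["per_passenger"] > 5 in the last step would fail, because the original df never gets that column; the lambda sidesteps that by working on the intermediate result.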
@RahulJain-fk7bu 9 months ago
Hi. Why isn't it giving the error "df_ is not defined"? Kindly guide me, thanks.
@ReuvenLerner 9 months ago
df_ is a parameter in the lambda, meaning that it's a local variable within the function. So df_ exists within the function, but as soon as the function's stack frame exits, the variable disappears as well.
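A short sketch of that scoping point, using made-up data and a hypothetical named function (more_than_four) as the equivalent of the lambda:

```python
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame({"passenger_count": [1, 5, 6]})

# df_ is only a parameter name; it exists while the lambda runs
selected = df.loc[lambda df_: df_["passenger_count"] > 4]

# The same thing written as a named function, to show that
# df_ is an ordinary function parameter
def more_than_four(df_):
    return df_["passenger_count"] > 4

selected_named = df.loc[more_than_four]

# Outside the function, the name df_ was never defined
try:
    df_
    leaked = True
except NameError:
    leaked = False

print(leaked)
```

The name df_ is a convention, not a requirement; any valid parameter name would behave the same way.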
@RahulJain-fk7bu 9 months ago
@ReuvenLerner OK, but that lambda function has not taken any argument?
@ReuvenLerner 9 months ago
@RahulJain-fk7bu Yes, it did. When you pass a lambda to .loc, Pandas calls the lambda with the data frame as its argument (unlike df.apply with axis=1, which passes each row, one at a time). And the lambda that I wrote does indeed take an argument; that's what the df_ is right after the word "lambda".
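For .loc specifically, the argument handed to the function is the data frame itself. One way to check this is a sketch like the following, where picker and the calls list are hypothetical helpers that record the type of whatever .loc passes in:

```python
import pandas as pd

# Hypothetical data, invented for this example
df = pd.DataFrame({"passenger_count": [1, 5, 6]})

calls = []  # records the type of each argument the function receives

def picker(df_):
    # Note what .loc passed in, then return the boolean series
    calls.append(type(df_).__name__)
    return df_["passenger_count"] > 4

selected = df.loc[picker]

print(calls)
print(selected)
```

If the function were being called per row, the recorded types would be Series rather than DataFrame.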