Why I Switched From Pandas to Polars | TDE Workshop

10,184 views

The Data Entrepreneurs

A day ago

Comments: 19
@ИльяОжмегов 1 year ago
Great Job Ben! 👍
@feifa13 1 year ago
Thanks Ilia!
@smellypunks 4 months ago
It is a shame that the lazy API is so entangled with the rest of the API. It might be nice to write generic code that can switch on the lazy API with a single change. I don't like the idea of having to rewrite the whole codebase to switch between lazy and eager, and I question whether that was a good design decision by Polars. Side note: please always upload videos in 1080p.
@ShawhinTalebi 4 months ago
Here's my solution: cmd+f "scan_" replace with "read_" 😂 P.S. I'm on Mac
@spikeydude114 11 months ago
Although I see the benefits of Polars, I haven't hit enough obstacles with Pandas in my workflows. I don't deal with datasets that exceed memory, and I think I can currently extend my memory limit using Dask ... but I'm looking forward to the development of Polars and will likely adopt it once it has more support!
@virushk 11 months ago
Same situation here. I find Pandas and Dask to be sufficient tools for my workflows
@JOHNSMITH-ve3rq 10 months ago
ChatGPT knows pandas much better. For exploratory work that's probably not an issue. But if you're shipping something to prod and want to keep it very fast and minimise system resources, then Polars seems a better choice.
@samuelswatson 10 months ago
To me the appeal is the coherence of the API and the superior execution model. But the ecosystem disadvantages associated with using a much less popular library are substantial.
@signoc1964 10 months ago
@samuelswatson But Polars has a to_pandas() method, so the disadvantages are easily overcome. It's more that if you're doing simple things, it's unnecessary to bring in Polars. We replaced a lot of advanced ELT (not ETL) with Polars: 16,000 lines of SQL code, with the main transforms done in Polars instead. For this task it's excellent and translated really well, and a lot of stuff is easier to do in Polars than in SQL, for example. Doing the same in pandas is a nightmare. Translating advanced SQL code to pandas is a hard job.
@samuelswatson 10 months ago
@signoc1964 That seems to me to be the best use case for Polars (replacing complex SQL in transformation pipelines, especially because of its composability), so it's cool to hear another testimonial for its success in that context.
@DarrenSaw 1 year ago
Pandas is a massive mess. It's very easy to write very poor code in Pandas, but writing it well is not that intuitive. Matt Harrison has written some great stuff, but it's not that easy to learn. Polars is way better and improving all the time. It's much easier to write and way quicker. The lazy API is a thing of beauty.
@TheDataEntrepreneurs 1 year ago
I'm looking forward to using Polars more in my own workflow -Shaw
@MartyAckerman310 11 months ago
I agree, Pandas' learning curve was steeper for me than R's. But I've kind of settled on a consistent workflow (.loc[:, ['col']] instead of ['col'], and dot-chaining) that minimizes the surprises.
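The workflow mentioned above might look something like this sketch. The data and the column names `col`, `other`, and `doubled` are made up; the point is that `df['col']` returns a Series while `df.loc[:, ['col']]` always returns a DataFrame, which keeps a method chain predictable.

```python
import pandas as pd

df = pd.DataFrame({"col": [3, 1, 2], "other": ["x", "y", "z"]})

as_series = df["col"]           # a Series: 1-D, with different methods
as_frame = df.loc[:, ["col"]]   # a DataFrame: same type as the input

# Dot-chaining: each step returns a new DataFrame, nothing is mutated in place.
result = (
    df.loc[:, ["col"]]
    .assign(doubled=lambda d: d["col"] * 2)   # add a derived column
    .query("doubled > 2")                     # keep rows where doubled > 2
    .sort_values("col")
    .reset_index(drop=True)
)
```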
@signoc1964 10 months ago
One problem with Polars, though, is that "pandas" developers tend to write "polars" code the way they write pandas code. To some extent that is possible, which sets a bad example for others, and there are a couple of those around. Polars then behaves like pandas, executing serially instead of in parallel.
@ravishmahajan9314 10 months ago
Can Polars replace PySpark or Hadoop?
@TheDataEntrepreneurs 10 months ago
Good question. Here’s a response from Ben: “I’m not entirely sure tbh. I'm pretty sure PySpark is more scalable (e.g. > 1 TB of data), but Polars is better for data processing on your local machine (e.g. < 1 TB). I don't think Polars has as much for distributed computing yet as PySpark does, whereas that is pretty much what PySpark was built for afaik.”
@Dmitrii-q6p 5 months ago
man, we can read. why read everything on the screen?