Projection and Predicate pushdown in Apache Parquet

2,732 views

Learn Data with Mark

1 day ago

Comments: 10
@pawarbi4675, 1 year ago
Enjoying the videos Mark, keep going. One suggestion: slow down a bit and highlight/explain the code a bit (not in detail) so we get context.
@learndatawithmark, 1 year ago
Thanks and I'll try! Although I have been told to slow down for many years and I'm clearly not great at doing that!
@galsl, 11 months ago
Can you provide examples where predicate pushdown is not possible with Parquet?
@dhaval1489, 1 year ago
Can you do a tutorial on DuckDB + Ibis? I am totally new to databases. I am familiar with Excel, pandas and Polars, and am just starting out.
@learndatawithmark, 1 year ago
Hi! Thanks for your comment. I haven't used Ibis before, but I'm gonna take a look and will try to make an intro tutorial 🙂
@DerekMahar, 1 year ago
Where may I find pqrs?
@learndatawithmark, 1 year ago
Sorry forgot to put the link. Here we go: github.com/manojkarthick/pqrs
@michaelhunger6160, 1 year ago
Does the pandas API also support predicate pushdown?
@learndatawithmark, 1 year ago
It does projection pushdown via the columns parameter (pandas.pydata.org/docs/reference/api/pandas.read_parquet.html). And I think if you use the pyarrow engine you get predicate pushdown too. Pandas delegates the Parquet reading to fastparquet or pyarrow under the covers; it doesn't have a reader itself as far as I'm aware.
@jeevan88888, 1 year ago
@learndatawithmark great insights Mark, just slow down the explanation in the main part of the code.