Near Real Time Data Warehousing with Apache Spark and Delta Lake - Jasper Groot (Eventbrite)

12,327 views

Databricks


Comments: 4
@AshikaUmanga 4 years ago
About the SCD2 part: if the "base" table is very large, does it have to load all the data into Spark memory for the merge?
@AshikaUmanga 4 years ago
So each Parquet file in Delta Lake stores only one record?
@saulofurtado 4 years ago
I would like to know too. Do you already have the answer?