Ask Databricks about Medallion Architecture Best Practices with Franco Patano

3,688 views

Comments

@alexanderparakhnevich5663 1 year ago

Amazing conversation. Every question discussed could have come from me... we are now facing the same things in our refactoring project. Thanks a lot!

@carloscantu75 1 year ago

Very useful, thanks for sharing!

@NeumsFor9 1 year ago

I have the exact same track as Franco, and I fully applaud Franco's championing of this. Keep it up!

@danhorus 1 year ago

Awesome discussion! Congrats :)

@NeumsFor9 1 year ago

In the best pipelines, quantify the happy path, the schema-on-read error paths, the schema-on-write error paths, the inferred data types, and the actual data types. In the newer era, it is also good to quantify how much the schema has evolved, and even to quantify and persist how the data quality statistics move as a file or data pipeline changes.
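
One way to picture the counting described in the comment above is the minimal Python sketch below. It is an illustration only: the `DECLARED` schema, the `classify` helper, the sample rows, and the rule for what counts as a schema-on-read versus schema-on-write error are all assumptions, not anything stated in the video or the comment.

```python
from collections import Counter

# Hypothetical declared schema: column name -> Python type standing in for a column type.
DECLARED = {"id": int, "amount": float}

def classify(row):
    """Return the path a raw row (a dict of strings) takes through the pipeline."""
    if set(row) != set(DECLARED):
        # Assumption: a column mismatch is treated as a schema-on-read error.
        return "schema_on_read_error"
    try:
        for name, cast in DECLARED.items():
            cast(row[name])
    except ValueError:
        # Assumption: a value that cannot be cast is treated as a schema-on-write error.
        return "schema_on_write_error"
    return "happy_path"

# Made-up sample input.
rows = [
    {"id": "1", "amount": "9.50"},
    {"id": "2", "amount": "n/a"},
    {"id": "3"},
    {"id": "4", "amount": "1.25"},
]

# Count how many rows took each path; these counts are what would be persisted per run.
stats = Counter(classify(r) for r in rows)
```

Persisting `stats` for every run would make it possible to compare the counts over time as a file or pipeline changes.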

@sravankumar1767 1 year ago

Superb 🎉 👌

@fb-gu2er 1 year ago

Genuine question: why use dbt for ETL when you have Spark? I can't grasp why people do that.

@fb-gu2er 1 year ago

Schema evolution is all good until it infers the wrong types, like when you want a decimal and it gives you a double. I prefer rigid schemas until I have no choice.
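
To illustrate why a double in place of a decimal matters, here is a small sketch in plain Python. It assumes Python's `float` as a stand-in for a double column and `decimal.Decimal` as a stand-in for a decimal column; the sample amounts are made up, and no Spark or Delta API is involved.

```python
from decimal import Decimal

# Three line items of 0.10 each, as they might arrive in a text file.
raw_amounts = ["0.10", "0.10", "0.10"]

# What an inferred double column would hold: binary floating point.
as_double = sum(float(x) for x in raw_amounts)

# What a declared decimal column would hold: exact base-10 values.
as_decimal = sum(Decimal(x) for x in raw_amounts)

# Compare each total against the value a person would expect, 0.30.
double_matches = (as_double == 0.3)
decimal_matches = (as_decimal == Decimal("0.30"))
```

With these inputs `double_matches` is `False` and `decimal_matches` is `True`, which is the kind of silent drift a rigid, declared schema avoids for monetary columns.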

@fb-gu2er 1 year ago

I ended up creating my own framework/app that evolves my schemas by releasing tables and views, similar to Flyway. People tend to embrace new technologies blindly, and sometimes old is good. Whenever I can, I totally prefer the rigid model. Automatic schema evolution is not modeling. You need constraints, table properties and such. Schema evolution will never be as smart as a design put together by a DBA.
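
The commenter's framework is not shown, so the following is only a rough sketch of the general Flyway-style idea: versioned releases applied in order and recorded in a history table. It uses Python's built-in `sqlite3` with an in-memory database; the `MIGRATIONS` list, the `schema_history` table, and the `migrate` function are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical ordered releases: (version, DDL). Table and view names are illustrative.
MIGRATIONS = [
    (1, "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount TEXT NOT NULL)"),
    (2, "ALTER TABLE orders ADD COLUMN currency TEXT NOT NULL DEFAULT 'USD'"),
    (3, "CREATE VIEW orders_usd AS SELECT id, amount FROM orders WHERE currency = 'USD'"),
]

def migrate(conn, migrations):
    """Apply every migration not yet recorded; return the versions applied in this run."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT version FROM schema_history")}
    applied = []
    for version, ddl in sorted(migrations):
        if version in done:
            continue  # already released, so skip it
        conn.execute(ddl)
        conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")
first_run = migrate(conn, MIGRATIONS)   # applies versions 1, 2 and 3
second_run = migrate(conn, MIGRATIONS)  # nothing left to apply
```

Every schema change here is an explicit, reviewed statement with its constraints spelled out, which is the contrast with automatic schema evolution that the comment draws.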