Too many data pipelines produce outputs with no regard for the quality of the data they generate. In this presentation, Sandy Ryza (lead engineer on the open-source Dagster project) argues that the orchestration layer is uniquely placed to understand, define, and act on data quality checks.
Join Sandy for a deep dive into why data quality checks should be first-class citizens in any data platform, and why they are a natural fit for delivering high-quality data to the organization, reducing computing costs, and improving the observability and debugging of your data pipelines.
00:00 On data pipelines and data quality
02:29 What are data quality checks?
06:15 Defining data quality checks
10:59 Executing data quality checks
13:02 Observing data quality checks
15:12 Alerting on data quality checks
16:36 Asset check patterns
21:31 Conclusion
You can explore data quality checks with a free 30-day trial of Dagster hosted on Dagster Cloud: dagster.io/lp/dagster-cloud-trial