Guaranteeing Data Quality SLAs with Deequ & Databand

Databricks

As the importance of data grows and its connection to business value becomes more direct, data engineering teams are increasingly adopting service level agreements (SLAs) for how they deliver data, covering new factors like data freshness, completeness, and accuracy.
In this session we’ll discuss how to use Deequ, a data quality library purpose-built for Spark, to develop a data monitoring and QA system that helps you meet the SLAs you have guaranteed to analytics users, data scientists, and other business stakeholders. We’ll cover how to use Deequ to create quality checks that report metrics and enforce rules on data arrivals, schemas, distributions, and custom metrics, and how to visualize, trend, and alert on those metrics using pipeline observability tools. We’ll also discuss common challenges that teams face when setting up data quality logging infrastructure, along with best practices for adoption.
The examples are drawn from common pipeline types: machine learning, data transformation, and replication (for instance, moving data from S3 to Delta Lake).
With these tools, you’ll be able to create more stable, reliable pipelines that your business can depend on.
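To make the idea of checks that "report metrics and enforce rules" concrete, here is a minimal sketch in plain Python. It is not Deequ's API and does not use Spark; it only illustrates the pattern of computing a metric and then asserting a rule against it. The column names (`user_id`, `loaded_at`), the thresholds, and the helper names are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CheckResult:
    name: str      # label of the metric being checked
    value: float   # the reported metric
    passed: bool   # whether the rule on that metric held


def completeness(rows, column):
    """Fraction of rows where `column` is present and not None."""
    if not rows:
        return 0.0
    non_null = sum(1 for r in rows if r.get(column) is not None)
    return non_null / len(rows)


def freshness_hours(rows, ts_column, now):
    """Hours between `now` and the most recent timestamp in `ts_column`."""
    stamps = [r[ts_column] for r in rows if r.get(ts_column) is not None]
    if not stamps:
        return float("inf")  # no data has arrived at all
    return (now - max(stamps)).total_seconds() / 3600.0


def run_checks(rows, now):
    """Compute each metric, then apply its rule (hypothetical SLA thresholds)."""
    metrics = [
        ("row_count", float(len(rows)), lambda v: v >= 3),
        ("completeness(user_id)", completeness(rows, "user_id"), lambda v: v >= 0.95),
        ("freshness_hours(loaded_at)", freshness_hours(rows, "loaded_at", now), lambda v: v <= 24),
    ]
    return [CheckResult(name, value, rule(value)) for name, value, rule in metrics]


# Made-up sample batch: one missing user_id, newest record two hours old.
now = datetime(2021, 6, 1, 12, 0, tzinfo=timezone.utc)
rows = [
    {"user_id": 1, "loaded_at": now - timedelta(hours=2)},
    {"user_id": None, "loaded_at": now - timedelta(hours=3)},
    {"user_id": 3, "loaded_at": now - timedelta(hours=5)},
    {"user_id": 4, "loaded_at": now - timedelta(hours=30)},
]
results = run_checks(rows, now)
for r in results:
    print(f"{r.name}: value={r.value:.2f} passed={r.passed}")
```

Keeping the metric value separate from the pass/fail verdict is what allows the same numbers to be trended and alerted on later, rather than only failing a pipeline run.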

Comments: 3
@vikasmikado 3 years ago
Good implementation!
@nikitamadhyan8852 1 year ago
I am getting an error while executing AnalysisRunner: 'JavaPackage' object is not callable. Could anyone please help here?
@D77798 2 years ago
Hopla, somewhat good, yet I'd be glad if you could show the failed checks and the rows!