Data quality for big datasets

34 views

Data Science Festival

15 days ago

A talk by Akshay Dineshkumar Jain from Innovate UK.
The talk covers the automated data quality checks that large organisations run to verify the reliability of big datasets in real time, using data profiling and machine learning techniques. The demo will use the open source library Deequ, the Spark framework, and reporting and notification tools to surface data issues proactively. I will walk through an example of a framework I developed at Amazon and Visa to validate customer-facing data, and show how it integrates with notification tools driven by statistical methods.
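To make the idea concrete, here is a minimal sketch in plain Python of what profiling-based checks with a statistical alert might look like. It is not the framework from the talk and it does not use Deequ's or Spark's API. The function names, the column names, and the z-score threshold of 3 are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def completeness(rows, column):
    """Fraction of rows whose value in `column` is not None."""
    if not rows:
        return 0.0
    return sum(r.get(column) is not None for r in rows) / len(rows)

def uniqueness(rows, column):
    """Fraction of non-null values in `column` that occur exactly once."""
    values = [r[column] for r in rows if r.get(column) is not None]
    if not values:
        return 0.0
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return sum(1 for c in counts.values() if c == 1) / len(values)

def is_anomalous(history, current, z_threshold=3.0):
    """Flag `current` if it lies more than `z_threshold` standard
    deviations from the mean of previously observed metric values."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold

def run_checks(rows, key_column, metric_column, metric_history):
    """Return a list of alert messages (hypothetical notification payloads)."""
    alerts = []
    if completeness(rows, key_column) < 1.0:
        alerts.append(f"{key_column}: null values found")
    if uniqueness(rows, key_column) < 1.0:
        alerts.append(f"{key_column}: duplicate values found")
    current = completeness(rows, metric_column)
    if is_anomalous(metric_history, current):
        alerts.append(f"{metric_column}: completeness {current:.2f} deviates from history")
    return alerts

# Made-up sample batch and metric history, for illustration only.
batch = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 2, "amount": None},
    {"customer_id": 2, "amount": 12.0},
    {"customer_id": None, "amount": 11.0},
]
past_completeness = [0.99, 0.98, 1.0, 0.99]

for alert in run_checks(batch, "customer_id", "amount", past_completeness):
    print(alert)
```

In a real pipeline the metrics would be computed by a distributed engine such as Spark, and the alerts would be sent to a notification tool rather than printed.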
Technical Level: Technical practitioner
This session was part of the Data Science Festival MayDay event 2024. Find out more at datasciencefestival.com/event...
The Data Science Festival is the place for data-driven people to come together, share cutting-edge ideas, and solve real-world problems. We run monthly events, meet-ups, and the biggest free-to-attend data festivals in the UK. Join the community at datasciencefestival.com/
