Apache Hudi is an open-source data lake storage engine that provides a unified view of data for both batch and streaming applications. Hudi can be used to build data lakes that are both scalable and fault-tolerant.
In this video, we will discuss the following topics:
- What is Apache Hudi?
- How does Hudi work?
- What are the benefits of using Hudi?
- How do you build a data lake using Hudi?
This video is for anyone who is interested in learning more about Apache Hudi or building a data lake.
00:00 Getting started
02:06 About Datacouch
09:18 Agenda
10:07 Data Warehouses vs Data Lakes vs Data Lakehouses
13:00 What is a Data Lakehouse?
15:38 What does a Lakehouse look like?
18:33 Why build a Lakehouse?
22:40 Table Formats
29:52 What is Apache Hudi?
33:40 Advantages of Hudi
37:39 How does Hudi work?
39:31 Hudi Timeline
42:52 Hudi Tables and Types
50:49 Hudi Table Services
56:13 Hudi Query Types
1:01:36 Types of Keys in Hudi
1:03:17 Indexing in Hudi
1:06:25 Write Operation Types
1:08:51 Hudi DeltaStreamer
1:11:31 Data Serving
1:13:44 Demo: Working with Apache Hudi
#apache #hudi #datacouch #meetup #session #knowledgesharing #dcknowledgesharing #aiminds #allthingsdata