Evolving Data Pipelines at Scale

1,598 views

Data Council

Days ago

Comments: 4
@jasonkhaihoang781 2 months ago
I have one question. This demo and approach work backward from prod to dev. However, what if I want to start something totally new in dev and then promote it to prod, such as adding a new ingested data source and new transformation models? Also, how do you switch the views if the ingested data is not yet available in prod? I think that for ingested data you cannot use the concept of the Virtual Layer; it will work only with "T" (transformation) models. Finally, what if I have more than two environments, say dev/test/preprod/prod? I like the concept of allowing developers to work on production data directly to remove the gap between dev and prod. However, there are some concerns that I cannot get my head around, so I still do not have the confidence to start using SQLMesh.
@1988YUVAL 8 months ago
Very interesting presentation. Looks like a very well thought out solution for managing data transformations. I wonder if it will take off like dbt.
@tratkotratkov126 8 months ago
Great, very much needed and promising project! However, it is not quite clear what you mean when you talk about data versioning (DV) - do you version the data as LakeFS does, or are you just versioning the source code that produces this data? Also, I find the diagrams in the presentation (Virtual/Physical layers) confusing and not easy to grasp at first glance. It would be nice if, in the next iteration, you used some real-world/practical entities to describe the demo objects, like customer, product, sales etc., instead of just “source”, and wrapped the demo in a quick story like “Meet Alex, the data engineer at TechCorp, a rapidly growing tech company. Alex is responsible for managing the company’s data pipelines, ensuring that data from various sources is clean, consistent, and available for analysis” etc. You get the idea. Finally, I would suggest you switch the sequence and the time you spend on the theory and the demo part - show your fantastic open source project demo first and how easy it is to implement the 3 concepts in a meaningful story, then after each segment just mention the theoretical part, but don’t allow the theory to consume 75% of your presentation unless you want to be considered one of the many Data Governance “gurus” who are presenting on this channel. Wishing you all good luck with this fantastic project!
@jasonkhaihoang781 2 months ago
Agree. Putting the demo first may avoid some confusion from the beginning.