Rounding out the lakehouse with the GOLD layer in Azure Synapse

13,815 views

Guy in a Cube

A day ago

Comments: 25
@dbalkin777 2 years ago
I love all of the Azure videos, but what is the entry point for those of us who are primarily PBI developers? Is there a road map to learning all of this? Are you planning on creating a How-To to get started in this world?
@FrostSpike 2 years ago
03:27 The data isn't "autonomous"! The "A" in ACID represents "Atomicity" rather than "Automicity". This just means that a set of changes to the data either completes fully, or not at all i.e. the "transaction" wrapping the changes is "rolled back" in its entirety on an error. Rather than changes partially occurring, the consistency of the database is maintained by reverting any mutating operations already done in the transaction prior to the error.
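To make the atomicity point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not Delta Lake or Synapse code (Delta Lake gets the same all-or-nothing behaviour through its transaction log rather than a database engine), and the `accounts` table, the `transfer` helper, and the overdraft rule are all hypothetical, made up for illustration.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as a single atomic transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back
        # every change made inside it if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Error after the first write: the debit above must be undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

Calling `transfer(conn, "alice", "bob", 500)` fails partway through, and `balances(conn)` afterwards shows both accounts untouched: the debit that had already been applied is reverted along with the rest of the transaction.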
@AessamL 2 years ago
@Stijn thx for this video series, I kinda hope you continue writing on your blog as well. I can only see part 1 for now.
@edgards 2 years ago
Patrick's breathing very close to the mic! Awesome video regardless!
@tangerinekpopper1868 1 year ago
Noticed that too. Lol
@yogeshnikam8064 2 years ago
Great content! Will it be available in dedicated SQL pool as well?
@scottytarvan9523 2 years ago
I think that there is some confusion here (maybe on my part). You mention that the lake database is not a persistent store of data but a metadata model over a file system; agreed on that. But then you mention that it supports ACID transactions even though data is not persisted in the gold layer?
I would describe this a little differently: the silver layer is persisted via the Delta format, allowing ACID transactions over a file system. This is the main feature of Delta Lake that enables the lakehouse architecture. The gold layer is a metadata model (also called virtual tables) that queries the silver layer at runtime using the serverless SQL pool. The virtual tables have been created in such a way that they simulate facts and dimensions in a star schema.
A couple of problems:
- You created a fake SCD type 2 table in your gold layer. This needs to be implemented properly in the silver layer, which is not a trivial task.
- You would generally need to generate surrogate keys to join facts and dimensions, either by implementing your own algorithm or by using the built-in Delta identity column. Once again, not a trivial task when you look up foreign keys.
- You need to consider partition pruning between facts and dimensions for query performance.
- You often need a calendar dimension for date calculations.
- You need to compare performance and cost between the Synapse serverless SQL pool and the Databricks SQL endpoint (now strangely called a data warehouse). They can both be used to create virtual tables/views and be attached to Power BI.
I appreciate the video, but this is far from a real-world working application that can scale and has considered performance and cost factors.
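As a rough illustration of why SCD type 2 with surrogate keys takes more than a view, here is a minimal in-memory sketch in plain Python. It is not Delta Lake or Synapse code and not the approach shown in the video; `DimRow`, `apply_scd2`, `lookup_sk`, and the customer/city sample data are all hypothetical. A changed attribute closes the current row and opens a new one under a fresh surrogate key, and a fact row looks up the key that was valid on its own date.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count
from typing import Optional


@dataclass
class DimRow:
    customer_sk: int          # surrogate key generated by the load
    customer_id: str          # business (natural) key from the source
    city: str                 # attribute whose history is tracked
    valid_from: date
    valid_to: Optional[date]  # None means the row is still open
    is_current: bool


def apply_scd2(dim, source, load_date, next_sk):
    """Apply one batch of {customer_id: city} to the dimension in place."""
    current = {r.customer_id: r for r in dim if r.is_current}
    for customer_id, city in source.items():
        existing = current.get(customer_id)
        if existing is not None and existing.city == city:
            continue  # unchanged: keep the current row as is
        if existing is not None:
            # Changed: close the old version instead of overwriting it.
            existing.valid_to = load_date
            existing.is_current = False
        # New or changed: open a new version with a fresh surrogate key.
        dim.append(DimRow(next(next_sk), customer_id, city, load_date, None, True))
    return dim


def lookup_sk(dim, customer_id, as_of):
    """Return the surrogate key that was valid for `customer_id` on `as_of`."""
    for r in dim:
        if (r.customer_id == customer_id and r.valid_from <= as_of
                and (r.valid_to is None or as_of < r.valid_to)):
            return r.customer_sk
    return None


# Two hypothetical loads: customer C1 moves between them, C2 does not.
dim = []
sk = count(1)
apply_scd2(dim, {"C1": "Ghent", "C2": "Austin"}, date(2022, 1, 1), sk)
apply_scd2(dim, {"C1": "Brussels", "C2": "Austin"}, date(2022, 6, 1), sk)
```

After the second load the dimension holds three rows: two versions of C1 and one of C2. A fact dated March 2022 resolves C1 to the first version and a fact dated July 2022 resolves to the second, which is the foreign-key lookup the comment calls non-trivial. A real implementation would also have to handle deletes, late-arriving data, and concurrent writers.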
@anonimoi4957 2 years ago
All good, but this breathing sound doesn't let me focus enough
@gabrielmorais7312 2 years ago
True!
@bitips 1 year ago
I notice that in all videos from Guy in a cube.
@kevinfrank7044 2 years ago
Has that SCD video been created yet? I'm looking for a good video on how to create a type 2 SCD with Databricks in Azure Synapse.
@lkassen 1 year ago
This is bananas! 🤣
@bunnihilator 1 year ago
The drop table if exists gives a Hive error about an illegal argument exception (null path)
@MrRittick 1 year ago
I am a big fan of your posts. However, you have been saying "Automicity/Autonomous" for the ACID properties; it should be Atomicity. That means a transaction happens as a single unit (atom): it either commits or rolls back. Great work! Cheers
@attapon56 2 years ago
May I ask why the Delta Lake tooling doesn't come with a GUI like other pipeline tools, or use SQL as its main language instead of Python classes? Is there any reason behind this? In my view, moving data across pools seems very common and should have been simpler and more automated.
@danielmudie605 2 years ago
Maybe it's a stupid question... but if the gold layer tables aren't persistent how can I add indexes to tune them for specific queries? Or is there another "platinum" layer we'll be introduced to in another video?
@B-Luv 2 years ago
Disclaimer, I'm just learning this too. But do you need to tune it for specific queries? Is your flow not to pull all of the data from gold into the Power BI service?
@NirmalRohan-og4oe 2 months ago
We want more videos in this series
@jkehannon 2 years ago
omg the breathing
@bitips 1 year ago
I notice that in all videos from Guy in a cube.
@kel78v2 5 months ago
These medallion steps were done from Azure Synapse. Would have loved to see how a lakehouse is used instead.
@radekou 2 years ago
I hope you're giving the product team a hard time for that "col" at 3:55 - seeing schema is not yet read properly when using Delta format :)
@stijnwynants7307 2 years ago
It should be fixed :)
@Vikasptl07 2 years ago
Microsoft is catching up with databricks on delta part😛
@joshbennett5908 2 years ago
Creepy breathing