Microsoft Fabric: Inspecting 28 MILLION row dataset in Bronze Lakehouse - Part 2

6,869 views

endjin


Comments
@endjin 6 months ago
Thank you for watching! If you enjoyed this episode, please hit like 👍, subscribe, and turn notifications on 🔔. It helps us more than you know. 🙏
@ThiagoOliveira-fj9me 2 months ago
That is brilliant -- thanks for making this series of such quality on Microsoft Fabric.
@rajavarman4657 1 year ago
Nice representation of the medallion architecture using Microsoft Fabric!
@1974mkc 1 year ago
Excellent Demonstration. Will look forward to the upcoming episodes. Many Thanks
@endjin 1 year ago
Thank you! There's a lot more to come!
@phoenixoo7 1 year ago
Very nice presentation... Looking forward to the upcoming episodes...
@endjin 1 year ago
Thank you! Next episode out next week!
@MuhammadZeeshanMalik-u4u 1 year ago
That's a great video. However, I still don't understand why there is a separate semantic layer. It seems like a presentation layer, not a semantic one. Could you clear up this confusion, please? Looking forward to your response, as I am about to implement it.
@endjin 1 year ago
Thanks for the kind words! To address your question: Power BI has effectively branded itself as a "semantic layer" for a number of years now. The "semantic layer" captures domain logic, user-friendly naming, calculations and relationships, which goes above and beyond what you can generally do in your upstream data store. The beauty of Fabric and Direct Lake is that you don't actually need to import a copy of the "Gold" data into the semantic layer - it's the same copy of the data, queried directly. What the semantic layer is still useful for is augmenting the Gold layer with additional metadata - column/table renames, additional table relationships and measures. Think of Gold as your "serving" layer: you've processed the data into a structure where you don't have to do much in the downstream BI layer to get it into the correct shape (oftentimes your M queries will do nothing more than point to the tables in the data store). All you then need to focus on is adding the finishing touches required for the end user to consume it.
@KickersKaiser 10 months ago
Thank you for your video! How would you recommend organizing the 3 lakehouses in terms of data governance and data access? 3 separate workspaces or one workspace? If one workspace, how do you organize data access and governance?
@endjin 6 months ago
We would default to one workspace containing all Lakehouses, with a Dev, Test and Prod version of that workspace. A separate workspace per Bronze, Silver and Gold layer would increase maintenance complexity, especially once you multiply it by Dev/Test/Prod.
W.r.t. data access - this one's tough to answer, because it depends entirely on your security requirements. Fabric offers workspace-level roles, artifact-level permissions and, more recently, Data Access Roles (blog.fabric.microsoft.com/en-us/blog/9046/). In our experience, if you're developing an Enterprise solution, developers tend to have workspace-level roles, while end users tend to have artifact-level permissions or more granular Data Access Roles within an artifact. E.g. an end user would be given read-only access to the "Gold" Lakehouse. For Self-Service solutions (managed or business-led), you might want to loosen your restrictions so that users can create items within a workspace, in which case they'd need a workspace-level role. Or you might just want to provide write access to existing artifacts, in which case you'd use artifact-level permissions.
My point is that it depends, and there are currently quite a few ways to implement security. I would suggest mapping out your roles/personas, understanding what each persona needs to be able to do, and then mapping that to Fabric's permissions model.
From a data governance perspective, as ever, it's all about consistency. There are various features in Fabric to help with governance and discoverability (e.g. Domains, Workspace Contact Lists, Certification/Endorsement, Information Protection (Sensitivity Labels), Purview Hub in Fabric, Purview Compliance Portal (for auditing), Metadata scanning). I recommend you read through this article: learn.microsoft.com/en-us/fabric/governance/governance-compliance-overview
Hope this helps!
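For anyone who wants to try the persona-mapping exercise suggested above, here is a minimal sketch in Python. It is only a way of writing the mapping down, not a Fabric API: the persona names, target names and access descriptions are hypothetical, and the three scopes simply mirror the three permission levels mentioned in the reply.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """The three permission levels named in the reply above."""
    WORKSPACE_ROLE = "workspace-level role"
    ITEM_PERMISSION = "artifact-level permission"
    DATA_ACCESS_ROLE = "data access role"


@dataclass(frozen=True)
class Grant:
    persona: str   # hypothetical persona name
    scope: Scope
    target: str    # hypothetical workspace or item name
    access: str    # free-text description, not a Fabric API value


# Illustrative mapping only; replace with your own personas and requirements.
GRANTS = [
    Grant("developer", Scope.WORKSPACE_ROLE, "Dev workspace", "create and edit items"),
    Grant("end user", Scope.ITEM_PERMISSION, "Gold Lakehouse", "read-only"),
    Grant("self-service author", Scope.WORKSPACE_ROLE, "Self-service workspace", "create items"),
    Grant("self-service editor", Scope.ITEM_PERMISSION, "Existing items", "write"),
]


def grants_for(persona: str) -> list[Grant]:
    """Return every grant recorded for one persona."""
    return [g for g in GRANTS if g.persona == persona]


def personas_with_workspace_roles() -> list[str]:
    """List the personas that need a workspace-level role, sorted by name."""
    return sorted({g.persona for g in GRANTS if g.scope is Scope.WORKSPACE_ROLE})
```

Listing the grants this way makes it easy to spot personas that end up with a workspace-level role when an artifact-level permission would do.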
@endjin 6 months ago
Part 8 - Good Notebook Development Practices - is now available: kzbin.info/www/bejne/i6q2Z42LnM1gjcU
@clyderodrigo9183 10 months ago
👍👍
@endjin 6 months ago
Thanks!
@raviv5109 1 year ago
Really great! Could you include links to the previous parts and also mention it in the title? That way we know it is a series and that there are other parts. Thx!
@endjin 1 year ago
There should be a link to the previous part in the description, but thanks for the suggestion
@MucahitKatirci 8 months ago
Thanks
@endjin 6 months ago
There should be a new video dropping soon, seeing that you've binged everything so far!
@michaelmurgado 1 year ago
Would you mind sharing the Visio template used?
@endjin 1 year ago
Yes, we'll share that soon! If you subscribe to our blog, we'll post when it's available: endjin.com/rss.xml
@ChrisDowns88 1 year ago
@endjin That would be super useful! Has this been released yet?
@endjin 1 year ago
@ChrisDowns88 Not yet. Barry is working on it.
@mnhworkdev1652 1 year ago
I'm not too familiar with Visio, but I know it has a components library. Where did you find all of the icons and the HM Land Registry logo?
@endjin 1 year ago
You can download Microsoft Fabric icons from learn.microsoft.com/en-us/fabric/get-started/icons
@GuillaumeBerthier 1 year ago
@endjin Thanks for this video series, it's very interesting! Is there any advantage to splitting Bronze and Silver across 2 different Fabric Lakehouse items? Why not use 1 single Fabric item, where the Files zone is the Bronze layer and the (managed) Tables zone is the Silver layer? PS: thanks for the Fabric item icons link. Would you mind sharing your End2End Demo architecture Visio file in the description section as a reference for this video?
@endjin 1 year ago
One reason is that some folks do want to store their `raw` data in a queryable format (i.e. Delta tables). If `Bronze` shared a Lakehouse with the `Silver` layer and we created formal tables for the raw `Bronze` data, we'd have `Bronze` and `Silver` tables intermixed, which is a bit of an organizational headache in a single Lakehouse (since there's no way to group the tables other than by a table naming convention - i.e. there's no concept of a "schema" in a Lakehouse).
There's also a clear separation of data when using separate Lakehouses, which means different security provisions can be put in place for each zone. You might not want people to have access to `raw` (i.e. `Bronze`) data, but you might want them to be able to access `Silver` data. Having everything in a single Lakehouse would make this a little tricky.
W.r.t. the Visio diagram - yes, we'll share that soon! If you subscribe to our blog, we'll publish a post when it's available: endjin.com/rss.xml
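To illustrate the "table naming convention" workaround mentioned above, here is a rough sketch in plain Python. It assumes a `<layer>_<entity>` prefix convention, which is one possible convention rather than anything Fabric prescribes; the helper names and the `price_paid` entity are hypothetical.

```python
from collections import defaultdict

# Assumed layer prefixes for the naming convention (hypothetical choice).
LAYERS = ("bronze", "silver", "gold")


def table_name(layer: str, entity: str) -> str:
    """Build a table name such as 'silver_price_paid' from a layer and an entity."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"{layer}_{entity}"


def group_by_layer(table_names):
    """Group table names by their layer prefix; anything else goes to 'unlabelled'."""
    groups = defaultdict(list)
    for name in table_names:
        prefix, _, _ = name.partition("_")
        groups[prefix if prefix in LAYERS else "unlabelled"].append(name)
    return dict(groups)
```

With everything in one Lakehouse, the prefix is the only thing that keeps the layers apart, which is the organizational headache described above.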
@GuillaumeBerthier 1 year ago
@endjin I did subscribe to your YT channel and RSS feed, but I didn't see the Visio diagram. Any chance I just missed it? Thanks😜
@andreanneee1995 8 months ago
Can you share the Visio diagram?
@endjin 6 months ago
Ed's planning to release it once he gets to the end of the series.
@brianmunyao 1 year ago
What are the pros and cons of using a lakehouse for each piece of the medallion architecture vs using a single lakehouse and a file folder for each piece?
@endjin 1 year ago
Hi, thanks for the comment! This question has been asked by other commenters too, so we'll just reshare our answer. The reality is that there's no one-size-fits-all approach to architecting solutions in Fabric. When it comes to your specific question about how the Medallion architecture maps to Lakehouses in Fabric: we default to one Lakehouse per layer. That's mainly because of two things - organizational flexibility and security flexibility.
If we start combining the layers into a single Lakehouse, we lose flexibility in how we organize the managed Delta tables that we create. That's because within a single Lakehouse there's no way to group related tables other than by using a table naming convention - i.e. there's no equivalent of a T-SQL "SCHEMA" in a Lakehouse, nor is there a notion of sub-folders. Generally in the Silver and Gold layers we're creating Tables as outputs, rather than Files, so this flexibility is useful for separating the Tables of one layer from those of the next. If you're purely dealing with "Files" then there's less of an obvious benefit to creating separate Lakehouses. But I would question why Silver/Gold datasets are being stored in the Files section rather than the Tables section - the Tables section is written in Delta format and heavily optimized for reporting purposes.
From a security perspective, there's a clear separation of data when using separate Lakehouses, which means different security provisions can be put in place for each layer. You might not want people to have access to `raw` (i.e. `Bronze`) data, but you might want them to be able to access `Silver` data. Having everything in a single Lakehouse would make this a little tricky.
In reality, though, "it depends". If you have a really simple use-case, one Lakehouse could be sufficient. I've also seen people suggest combining "Bronze and Silver" into a single Lakehouse and "Gold" into a separate Lakehouse, since in "Bronze" you're usually only dealing with "Files", and therefore you may as well utilize the "Tables" section for your "Silver" layer. You just need to determine what works best for your use-case, factoring in the above points alongside other concerns such as data residency and cost management/chargeback. And it's likely that, even internally, your Lakehouse architecture will differ from one project to the next. Remember that in Fabric we have the power of "Shortcuts", which allow us to seamlessly combine data from other Lakehouses. So you can't go far wrong whatever architecture you choose!
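As a rough illustration of the two layouts discussed above (one Lakehouse per layer versus "Bronze and Silver" combined with a separate "Gold"), the sketch below builds the storage location each layer would use. The Lakehouse names, workspace name and entity name are hypothetical, and the URI shape reflects my understanding of OneLake's ABFS-style addressing, so check it against the OneLake documentation before relying on it.

```python
# Assumed OneLake endpoint for ABFS-style URIs (verify against the OneLake docs).
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"

# layer -> (Lakehouse item name, Lakehouse section); all names are hypothetical.
PER_LAYER = {
    "bronze": ("Bronze", "Files"),
    "silver": ("Silver", "Tables"),
    "gold": ("Gold", "Tables"),
}
COMBINED = {
    "bronze": ("BronzeSilver", "Files"),
    "silver": ("BronzeSilver", "Tables"),
    "gold": ("Gold", "Tables"),
}


def onelake_uri(workspace: str, layout: dict, layer: str, name: str) -> str:
    """Build the location of a file folder or table for one layer of a layout."""
    lakehouse, section = layout[layer]
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/{section}/{name}"


def lakehouses(layout: dict) -> list[str]:
    """List the distinct Lakehouse items a layout needs, sorted by name."""
    return sorted({lakehouse for lakehouse, _ in layout.values()})
```

Keeping the layout in one dictionary means the notebooks only ever ask for a layer, so switching from one layout to the other later is a one-line change.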
@shivaog007 10 months ago
Can you share the architecture link here?
@endjin 6 months ago
I think the plan is to release some assets once the series is complete.