Azure Purview - First Look at Dataset Lineage

  14,317 views

Advancing Analytics

Days ago

Comments: 37
@6Sloth9 3 years ago
Thanks Simon. Your channel and all your videos do a great job and are a big help for the community. It always brightens my day when I get a notification for a new video on your channel. Minor "complaint": the moving background is a bit distracting when you're presenting the slides. A static one would be better. Keep up the good work.
@ijanrichowe 3 years ago
Thank you, Simon. You were indeed born to do this! Thanks for all you've taught me over the years! And good luck in this venture! 🤗
@saurabhkp89 4 years ago
Thanks Simon, I look forward to the Atlas API lineage push for Spark and Delta datasets.
@HarmonicsOfNature 3 years ago
Good demos and explanations. Thanks! 1. One area I'd be interested in is how Purview helps with the collaborative development of the Business Glossary itself: getting a cadre of business people from different parts of the organisation to agree on definitions, calculation and aggregation logic, etc. How do you define stakeholders for a domain or a system and notify them of a definitional proposal that affects them? 2. Also, for the relationships between Business Terms: can you create custom relationship types, such as Synonym, Type-Of, Calculated-From, if those don't exist out of the box? 3. Lastly, can those relationship types have properties, so you could capture the "weighting" or importance of a factor for a Rule, and thereby use Purview not just as a Glossary, Ontology and Data Catalog but also as a repository of Rule logic, such as for scoring and weighting the importance of business terms on each other?
@chinmoysarangi9399 4 years ago
Thanks a lot Simon for your videos on Azure Purview. They helped me immensely. However, as you rightly pointed out, it does not detect Databricks notebooks; it only supports the Copy Data operation. Wish they had a wee bit more options than just Copy Data. Thanks a lot for your videos, and I will wait for the next one.
@sunilgidwani6321 3 years ago
Excellent Simon, really interesting video. I'm curious to see how the lineage works in Databricks & stored procs... waiting for your next video.
@woutvaeremans7312 4 years ago
Thank you Simon, a really interesting video. I'm curious to see if you can make that lineage work in Databricks...
@AdvancingAnalytics 4 years ago
My main enemy is time... but we'll see how I go :)
@andersboje 4 years ago
I just created an integration. This is what the end result looks like. Screenshot: www.evernote.com/l/ALVDHvuwFr1LMJfsBRLo7KRLS8d6hFdOimY
@surojitsengupta5162 2 years ago
How do we deal with integrating Delta tables on Databricks with Azure Purview?
@Sangeethsasidharanak 4 years ago
Great video... Could you please create a video on the right way to do unit testing in a Databricks notebook, without using Databricks Connect?
@AdvancingAnalytics 4 years ago
I'll have a think - honestly if there are any functions we want to wrap with unit testing, we pull that out into an isolated python wheel with unit tests baked into the build pipeline, rather than within the notebook. But if I think of a good pattern, I'll pop a video up :)
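A minimal sketch of the pattern described in the reply above, assuming a hypothetical helper `clean_column_name` that would live in a packaged module (the wheel) rather than in the notebook. The function, its behaviour and the test cases are illustrative only and are not taken from the video; the tests are written in the plain-assert style that a runner such as pytest could collect during a build.

```python
import re


def clean_column_name(name: str) -> str:
    """Hypothetical helper: normalise a raw column name to snake_case."""
    # Replace every run of non-alphanumeric characters with one underscore.
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    # Drop leading/trailing underscores and lower-case the result.
    return cleaned.strip("_").lower()


def test_spaces_and_symbols_become_underscores():
    assert clean_column_name(" Order ID (GBP) ") == "order_id_gbp"


def test_already_clean_name_is_unchanged():
    assert clean_column_name("customer_id") == "customer_id"
```

The notebook would then only import and call the function, so the logic is tested without needing a cluster or Databricks Connect.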
@lwhieldon1 2 years ago
How do I access the Data Factory embedded in Azure Synapse? I don't see the pipelines I create there available in Purview.
@abcoura 2 years ago
How about the same test with a Synapse notebook, to see if the lineage appears?
@marcocaviezel2672 4 years ago
Hi Simon, great video, thanks a lot! Have you tried using Dataflows in Data Factory to see if it can detect transformations? Maybe this could work since it is also in the ADF environment.
@AdvancingAnalytics 4 years ago
Hey - not tried yet, but Dataflows are on the list of the initial supported lineage objects so I'm assuming they work out of the box (dangerous assumption, I know!). There's a more detailed breakdown of what's currently supported by the Purview ADF integration here - docs.microsoft.com/en-us/azure/purview/how-to-link-azure-data-factory
@MortenHannibalsenOlsen 4 years ago
@AdvancingAnalytics I had the exact same thought as Marco. It would be a huge boon for Dataflows in ADF if it means that full lineage is documented automatically in Purview. Hope you get around to testing that as well.
@QuanNguyen-wx9ls 3 years ago
Hi Simon, thanks for the very informative video! How does lineage work, though, for other movements of data, like say movement of data from a storage system into Power BI for visualisation, or through SSIS? I can't seem to see a way to associate those data movements.
@AdvancingAnalytics 3 years ago
So if SSIS is run via ADF, it should be able to scrape it via the ADF linking; if it's an on-prem/virtual machine SSIS then it's not covered. Power BI pushes a load of info over, but I have NO idea what it does with storage system files. I'll look at Power BI scraping soon, see what I can find! Simon
@yadhurammc 4 years ago
Hi Simon, have you tried checking what details we can get from a Power BI dataset perspective? It would be nice if we could get details similar to what we get from DMVs ;) Like tables in a Power BI dataset, measure details, etc.
@AdvancingAnalytics 4 years ago
I've not had a play yet, but the docs & screenshots talk about being able to track lineage into Power BI datasets, the reports that use them etc. I'll need to do a little setting up to get a reasonable demo working then we'll see how it looks end to end!
@ravilobo7977 3 years ago
Red hexagon animation is a bit distracting. Would it be possible to remove that?
@anildangol 4 years ago
Great! How about transformations happening inside stored procedures? I don't think it can detect that.
@AdvancingAnalytics 4 years ago
Yep, that won't be detected currently, but it's mentioned a lot in the docs so I'm curious how that will look. Worst case, we should be able to push lineage connections manually through the API!
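As a rough illustration of what pushing a lineage connection manually could look like, the sketch below only builds the JSON body, on the assumption that Purview accepts the Apache Atlas v2 entity format in which a `Process` entity carries `inputs` and `outputs` references (the body one would POST to an `/api/atlas/v2/entity/bulk` style endpoint). The type name `azure_sql_table`, every qualified name and the procedure name are hypothetical placeholders, and no request is actually sent.

```python
import json


def dataset_ref(type_name: str, qualified_name: str) -> dict:
    """Reference an existing catalog entity by its type and qualifiedName."""
    return {
        "typeName": type_name,
        "uniqueAttributes": {"qualifiedName": qualified_name},
    }


def build_process_entity(name: str, qualified_name: str,
                         inputs: list, outputs: list) -> dict:
    """Build an Atlas-style Process entity linking inputs to outputs."""
    return {
        "typeName": "Process",
        "guid": "-1",  # negative placeholder guid; the server assigns the real one
        "attributes": {
            "name": name,
            "qualifiedName": qualified_name,
            "inputs": inputs,
            "outputs": outputs,
        },
    }


# Hypothetical tables read and written by a stored procedure.
source = dataset_ref("azure_sql_table", "mssql://example-server/db/dbo/raw_sales")
target = dataset_ref("azure_sql_table", "mssql://example-server/db/dbo/clean_sales")
process = build_process_entity(
    "usp_clean_sales", "example://procs/usp_clean_sales", [source], [target]
)

# Bulk-style request body: a list of entities to create or update.
payload = {"entities": [process]}
print(json.dumps(payload, indent=2))
```

The input and output references would need to match the qualified names of entities already scanned into the catalog for the lineage graph to join up.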
@preetmistry 4 years ago
Hi Simon, how can we fetch data from the Purview catalogue using the REST API? I tried basic REST, but I am interested in fetching data from a scan result. Is it possible?
@AdvancingAnalytics 4 years ago
You can bulk extract entities, or get the most up to date info for a single entity, definitely. There doesn't look to be an API for the results of a single scan (nor can you dig into the outcome of a specific scan in the GUI). I guess you could bulk extract entities and filter by their updated date to figure out what changed after a scan was run? The full swagger definition is available through the docs here: docs.microsoft.com/en-us/azure/purview/tutorial-using-rest-apis#view-the-rest-apis-documentation
@preetmistry 4 years ago
@AdvancingAnalytics Can you please give an example using an API URL to extract bulk data?
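One possible shape for this, as a sketch under assumptions rather than a confirmed Purview recipe: the Apache Atlas v2 REST API exposes a bulk read at `/api/atlas/v2/entity/bulk` that takes repeated `guid` query parameters, and Atlas entities carry an `updateTime` in epoch milliseconds. The account host, the guids and the sample entities below are hypothetical, and the code only builds the URL and filters already-fetched entities; it makes no network call and leaves authentication out.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def bulk_entity_url(base_url: str, guids: list) -> str:
    """Build a bulk-read URL with one `guid` query parameter per entity."""
    query = urlencode([("guid", g) for g in guids])
    return f"{base_url}/api/atlas/v2/entity/bulk?{query}"


def updated_since(entities: list, cutoff: datetime) -> list:
    """Keep entities whose updateTime (epoch ms) is at or after the cutoff."""
    cutoff_ms = int(cutoff.timestamp() * 1000)
    return [e for e in entities if e.get("updateTime", 0) >= cutoff_ms]


# Hypothetical account endpoint and guids.
url = bulk_entity_url(
    "https://example-account.catalog.purview.azure.com", ["guid-1", "guid-2"]
)

# Hypothetical entities, shaped like the "entities" list of a bulk response.
entities = [
    {"guid": "guid-1", "updateTime": 1609459201000},
    {"guid": "guid-2", "updateTime": 1600000000000},
]
scan_started = datetime(2021, 1, 1, tzinfo=timezone.utc)
changed = updated_since(entities, scan_started)
```

Filtering on the update time is the workaround suggested in the reply above for approximating "what changed after a scan", since no per-scan results API was apparent at the time.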
@LHSlash 3 years ago
It doesn't pick up views using the data, right? So if I use a view in a Power BI report it doesn't show me the lineage to the source data... that seems quite critical before this is useful (both for views and stored procedures).
@AdvancingAnalytics 3 years ago
Yeah that's right, so views & procs would currently need manual lineage logging. Hoping this is a fairly easy one for them to sort out, given database objects have lineage metadata! Simon
@LHSlash 3 years ago
@AdvancingAnalytics Exactly, thanks Simon!
@williamsniekers 2 years ago
Thank you for this insightful video. One comment: please remove the "moving parts" in the background that serve no purpose. I suppose it's just me, but those are constantly pulling my attention away 🙂
@AndreasBergstedt 4 years ago
Hey Simon, !=1st today. I get these questions all day long from my customers and some of the local sales and CSA teams. The Atlas API is definitely supported and there are a few Python Git repos with example code out there for Purview, and if you are after Spark libraries then... Merry Christmas?! pypi.org/project/pyapacheatlas/#description I can't disclose what is in the backlog and in the pipeline, but it is worth noting that, as you say, things being mentioned in docs.microsoft.com etc. is a good indication of what is to come. Cheers / Andreas
@AdvancingAnalytics 3 years ago
Whoops - this had fallen into the "spam" comments! Thanks Andreas, super useful. I've seen quite a few python libraries for working with Purview, haven't looked at a native spark one yet! I'll have a play! Simon