Advancing Spark - Implementing Row Level Security in Databricks

Views: 7,540

Advancing Analytics


RLS, or Row Level Security, is another one of those "maturity features" that is often used as an argument to demonstrate how lake-based platforms are still behind the more mature relational data stores. However, from Databricks Runtime 7.3, we now have a solution!
This week, Simon looks into the is_member() function and how we can use it to implement performant row-level security within a lake-based data model! This has HUGE impacts on how successfully we can model warehouses within a lake structure, and is a great thing to see!
More info on those Dynamic View Functions here: docs.databrick...
And as always, don't forget to Like & Subscribe, and stop by our site to see if we can help you on your Data Lakehouse journey - www.advancinganalytics.co.uk
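A minimal sketch of the kind of dynamic view the description refers to, written as a small Python helper that only builds the SQL text. The object names (sales_secure, sales_raw, country) are hypothetical, and passing a column to is_member() is an assumption based on the is_member(country) usage mentioned in the comments below, not something confirmed here from the docs.

```python
def build_rls_view_sql(view_name: str, source_table: str, group_column: str) -> str:
    """Return a CREATE VIEW statement that keeps only the rows whose
    group_column names a group the querying user belongs to."""
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT *\n"
        f"FROM {source_table}\n"
        f"WHERE is_member({group_column})"
    )


# Hypothetical table, view and column names, for illustration only.
rls_sql = build_rls_view_sql("sales_secure", "sales_raw", "country")
print(rls_sql)
```

Users would then be granted access to the view rather than the underlying table, so that every query passes through the is_member() filter.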

Comments: 22
@Obizzy8 3 years ago
Very useful feature for an enterprise lakehouse approach! Thanks for the constant great content 👍
@nickhurt8416 3 years ago
Awesome new capability - thanks for sharing!
@drummerboi4eva 3 years ago
Nice stuff!! Super encouraging to see that performance doesn't suffer when using RLS in Databricks
@gulamsardar7799 3 years ago
Thanks again Simon, keeping up to date with Spark has become so simple because of you. Though I am a SQL expert, I am struggling a little with Scala. Can you please point me to some hands-on courses that would help me learn Scala? Thanks in advance!!
@film-masti-777 1 year ago
This is good but very basic level. Is there any advanced use case you can present, please? One which includes CLS, table level, RBAC etc.
@Monsalvo888 3 years ago
Really useful, thanks!
@SAMSARAN2108 1 year ago
Thanks for sharing the RLS concepts in Databricks in this video. I have a requirement: I will create an Azure AD group (SalesAPAC) for a Sales domain / NAM region combination at the Power BI level, and users in this group should have access only to the Sales-related workspace and its reports, with only NAM region data in the reports/visualizations. I need to apply the same logic in Azure Databricks by passing the user ID, domain and region into Databricks, so the user should be able to see only Sales-related tables/views/other objects and fetch only NAM data from those tables. As per this video, it seems we need to create a new group for RLS. Is there any way to sync Azure AD, use it inside Azure Databricks and define the RLS logic based on the Azure AD group? Regards, Saravanan.S
@bbrocks5530 7 months ago
Did you get the solution?
@SAMSARAN2108 7 months ago
@@bbrocks5530 Actually our organisation didn't move towards Azure Databricks, but we may require this for Snowflake. Thanks for reaching out.
@viveksomvanshi3767 3 years ago
As always, very well summarized. Thanks Simon. Does this mean that if we get it working at enterprise level then we don't need any sort of OLAP engine, i.e. SQL DB or warehouse? Also, in general, if I can achieve similar objectives through schemas and views in SQL, what are the advantages of RLS, given that organization structures are generally more complex than those demonstrated in any RLS concept?
@AdvancingAnalytics 3 years ago
I'd hedge my bets more - "in some circumstances, Databricks can function as your OLAP", not absolutely every case, depending on user volumes, query frequency, latency requirements etc, it's a complex question for a yes/no answer! And complexity is what it's about with RLS vs Views too. If I have three company segments, managing that via different views is easy, if I have 100, and they frequently change and evolve, that's a huge headache in code maintenance. Also affects who can do the change - a support team can easily add new groups, add/remove members, but asking support teams to define and maintain views (and apply the relevant security to the object!) is harder. As always, loads of different ways to approach it, it's just another tool in our belt to design the right security model for the problem. Simon
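To make the maintenance argument above concrete, here is a rough sketch that counts the statements someone has to write and look after under each approach. It only builds SQL strings; the table, view and group names are hypothetical, and the GRANT statements follow the general shape of Databricks table access control syntax as an assumption rather than a verified script.

```python
from typing import List


def static_view_statements(segments: List[str], table: str) -> List[str]:
    """One view plus one grant per segment: code changes whenever segments change."""
    statements = []
    for segment in segments:
        view = f"{table}_{segment.lower()}"
        statements.append(
            f"CREATE OR REPLACE VIEW {view} AS "
            f"SELECT * FROM {table} WHERE segment = '{segment}'"
        )
        statements.append(f"GRANT SELECT ON VIEW {view} TO `{segment}`")
    return statements


def dynamic_view_statements(table: str, all_users_group: str) -> List[str]:
    """One view and one grant in total: new segments only need new groups."""
    view = f"{table}_secure"
    return [
        f"CREATE OR REPLACE VIEW {view} AS "
        f"SELECT * FROM {table} WHERE is_member(segment)",
        f"GRANT SELECT ON VIEW {view} TO `{all_users_group}`",
    ]


segments = ["UK", "FR", "DE"]  # hypothetical company segments
static_sql = static_view_statements(segments, "sales")
dynamic_sql = dynamic_view_statements("sales", "analysts")
print(len(static_sql), "statements vs", len(dynamic_sql))
```

With 100 segments the static approach grows to 200 statements, while the dynamic one stays at two and the ongoing work moves to group membership, which a support team can manage.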
@l_combo 3 years ago
Thanks for sharing, this is a great start. How do you see this scaling to n dimensions, e.g. member of a country (shown), member of a business unit, member of a role etc.? Otherwise I suspect the more traditional security makes more sense on the layer where the data is being analysed / viewed, such as BI.
@AdvancingAnalytics 3 years ago
Yeah, it scales as far as you can create groups to back it up. There's a fair bit of potential given you can now create & map users to groups through the API, so a little python utility can do the user mapping for you... but you're right, when you get into decent numbers of different roles, it'll become fairly difficult to look after. That said, the alternative to the "is_member()" function looks at the current user instead, so you could change it to a full, many-to-many user table that does the security, giving you full flexibility inside your model - it's slightly more of a pain to implement though :)
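A minimal sketch of the many-to-many user table idea from the reply above, simulated in plain Python so the filtering logic is easy to follow. The AccessGrant table, the column names and the sample users are all hypothetical; the SQL string shows one way the same logic might look as a view using current_user(), and is an untested assumption.

```python
from typing import Dict, List, NamedTuple


class AccessGrant(NamedTuple):
    """One row of a hypothetical user-to-dimension mapping table."""
    user_name: str
    country: str
    business_unit: str


def visible_rows(rows: List[Dict], grants: List[AccessGrant], current_user: str) -> List[Dict]:
    """Keep only rows whose (country, business_unit) pair is granted to the user."""
    allowed = {(g.country, g.business_unit) for g in grants if g.user_name == current_user}
    return [row for row in rows if (row["country"], row["business_unit"]) in allowed]


# Roughly equivalent view definition (assumed names, not run against Databricks).
MAPPING_VIEW_SQL = """
CREATE OR REPLACE VIEW sales_secure AS
SELECT s.*
FROM sales_raw s
JOIN user_access a
  ON a.country = s.country AND a.business_unit = s.business_unit
WHERE a.user_name = current_user()
"""

grants = [
    AccessGrant("alice@example.com", "UK", "Retail"),
    AccessGrant("alice@example.com", "FR", "Retail"),
    AccessGrant("bob@example.com", "UK", "Wholesale"),
]
sales = [
    {"country": "UK", "business_unit": "Retail", "amount": 100},
    {"country": "UK", "business_unit": "Wholesale", "amount": 250},
    {"country": "FR", "business_unit": "Retail", "amount": 75},
]
alice_rows = visible_rows(sales, grants, "alice@example.com")
bob_rows = visible_rows(sales, grants, "bob@example.com")
```

Adding another dimension (a role, say) means adding a column to the mapping table and the join condition, instead of creating a group for every combination.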
@zycbrasil2618 3 years ago
Hi Simon.. This uses data object privileges, right? Does it support column-level security?
@nickhurt8416 3 years ago
Yes see docs.microsoft.com/en-us/azure/databricks/security/access-control/table-acls/object-privileges#column-level-permissions
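As a rough illustration of column-level security through a view, the helper below builds a CASE expression that shows a column only to members of a given group and a fixed mask to everyone else. The column, group, view and table names are hypothetical, and the helper only produces SQL text; it does not run anything against Databricks.

```python
def masked_column(column: str, group: str, mask: str = "REDACTED") -> str:
    """Return a SELECT-list expression that reveals `column` only to `group`."""
    return (
        f"CASE WHEN is_member('{group}') THEN {column} "
        f"ELSE '{mask}' END AS {column}"
    )


def build_masked_view_sql(view_name: str, source_table: str,
                          plain_columns: list, masked: dict) -> str:
    """Build a view with some columns passed through and others masked by group."""
    select_list = list(plain_columns)
    select_list += [masked_column(col, grp) for col, grp in masked.items()]
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT {', '.join(select_list)}\n"
        f"FROM {source_table}"
    )


# Hypothetical names: only members of 'auditors' see the real email value.
masked_sql = build_masked_view_sql(
    "customers_secure", "customers_raw",
    plain_columns=["customer_id", "country"],
    masked={"email": "auditors"},
)
print(masked_sql)
```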
@mohdshoaib3296 3 years ago
is_member() is not working on a global temp view.. any guidance? Thx
@AdvancingAnalytics 3 years ago
Never tried it on a global temp view, it's worth getting in touch with Databricks to discuss the use case. Worst case... just save it as a persisted view? :)
@umarhussain9334 3 years ago
Awesome, how expensive is this compared to an analysis service with RLS (say S1)
@AdvancingAnalytics 3 years ago
If you take a blunt example of an S1 AAS and a 2-worker Databricks cluster, both ~25 GB RAM, then it's around £1,100 for AAS compared to £1,246 for Databricks. But that assumes they're both turned on - Databricks has much better scaling and - the killer advantage - doesn't need to hold all the data, so it can use this technique over masses of data stored cheaply in the lake. If used right, I'd say Databricks is the much cheaper option.
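A back-of-the-envelope sketch of the "assumes they're both turned on" point, using only the two figures quoted above. The billing period is not stated in the reply, and the linear scaling of Databricks cost with uptime is a simplifying assumption that ignores storage, start-up time and pricing tiers.

```python
AAS_COST = 1100.0                   # quoted S1 AAS figure, in GBP
DATABRICKS_ALWAYS_ON_COST = 1246.0  # quoted 2-worker cluster figure, in GBP


def databricks_cost(uptime_fraction: float) -> float:
    """Cluster cost if it only runs for a fraction of the period (assumed linear)."""
    if not 0.0 <= uptime_fraction <= 1.0:
        raise ValueError("uptime_fraction must be between 0 and 1")
    return DATABRICKS_ALWAYS_ON_COST * uptime_fraction


# Uptime fraction below which the cluster is cheaper than the always-on AAS figure.
break_even = AAS_COST / DATABRICKS_ALWAYS_ON_COST

half_time_cost = databricks_cost(0.5)
print(f"break-even uptime: {break_even:.1%}, cost at 50% uptime: £{half_time_cost:.0f}")
```

Under these assumptions, a cluster that is switched off a little over a tenth of the time already costs less than the AAS figure.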
@umarhussain9334 3 years ago
@@AdvancingAnalytics My thoughts exactly. With properly partitioned data using the schema of your choice, this becomes so much more flexible. Throw in the power of Python and this becomes a good sell. Thanks for the video, very helpful
@BitaRastgar 3 years ago
It is nice, but it is not dynamic. If you create a view/table using is_member(country) and then remove a user from a group, that user will still have access to all the data they could see when the view/table was created! It would have been nice if, when a user is removed AFTER view/table creation, that user could see only what he/she is allowed to see right now. By the way, is this part of SQL Analytics also?
@gran_turing 3 years ago
That's exactly how it works: the filtering happens dynamically at query time, not at view creation time, based on the user querying the data. This is part of the security model that is used with SQL Analytics as well as Table ACL clusters.
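A small pure-Python simulation of that behaviour: the "view" is modelled as a function that checks group membership every time it is queried, so removing a user from a group takes effect on the next query. The groups, users and rows are hypothetical, and the local is_member() here is a stand-in for the Databricks function, not its implementation.

```python
# Hypothetical group membership, keyed by group name.
group_members = {"UK": {"alice", "bob"}, "FR": {"alice"}}

rows = [
    {"country": "UK", "amount": 100},
    {"country": "FR", "amount": 200},
]


def is_member(user: str, group: str) -> bool:
    """Stand-in for is_member(): looks up membership at call time."""
    return user in group_members.get(group, set())


def query_view(user: str) -> list:
    """The 'view': the membership filter is evaluated on every query."""
    return [row for row in rows if is_member(user, row["country"])]


before = query_view("bob")          # bob is in the UK group at this point
group_members["UK"].discard("bob")  # membership changes after the view exists
after = query_view("bob")           # the same view now returns nothing for bob
```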