Advancing Spark - Dynamic Data Decryption

3,844 views

Advancing Analytics

The ability to control whether a user sees the encrypted or decrypted values of a record is incredibly powerful for meeting data protection requirements. There are established patterns from the warehousing world that have been very tricky to adopt in the Lakehouse approach... until now.
In this video, Simon walks through the new aes_encrypt() and aes_decrypt() functions in Databricks Runtime 10.3, before showing how they can be combined with row-level security to dynamically display decrypted data only to those with the relevant access.
As always, if you need help establishing a next-generation Data Lakehouse architecture, get in touch with Advancing Analytics.
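
As a rough illustration of the pattern described above, the sketch below builds the SQL for a view that decrypts a column only for members of the group that owns the key. The table, view, and column names (customers_encrypted, decryption_keys, group_name, key, key_id, id) are hypothetical and not taken from Simon's notebook. aes_decrypt and is_member are Databricks SQL functions; in a notebook the resulting string would presumably be passed to spark.sql().

```python
def build_decrypting_view(view: str, table: str, key_table: str, column: str) -> str:
    """Build SQL for a view that shows `column` decrypted only to key-group members.

    All object and column names are hypothetical placeholders.
    """
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT\n"
        f"  t.id,\n"
        f"  CASE\n"
        # Members of the key's group get the plaintext back as a string.
        f"    WHEN is_member(k.group_name)\n"
        f"      THEN CAST(aes_decrypt(t.{column}, k.key) AS STRING)\n"
        # Everyone else only sees the base64-encoded ciphertext.
        f"    ELSE base64(t.{column})\n"
        f"  END AS {column}\n"
        f"FROM {table} AS t\n"
        f"JOIN {key_table} AS k ON k.key_id = t.key_id"
    )


view_sql = build_decrypting_view(
    view="customers_secure",
    table="customers_encrypted",
    key_table="decryption_keys",
    column="email",
)
print(view_sql)
```

The lookup here is keyed on a hypothetical key_id; a per-customer key would swap that join for one on the customer identifier.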

Comments: 22
@VishwajeetPol 2 years ago
Thank you, Simon, for sharing the video. Honestly, I refer to your channel to find out what was launched in the latest DBR release rather than going through the Databricks articles. It was definitely informative for understanding RLS with encryption and decryption on the fly.
@drummerboi4eva 2 years ago
Great video, Simon. Row-level security is simple but effective, and encryption is really the need of the hour with GDPR use cases.
@briancuster7355 2 years ago
I agree with the other comments in that this is monumental!
@MajdiSAADANI 1 year ago
Hello Simon, thank you for your video. Could you please show how you generated your encryption key? I tried get_random_bytes(32), and when I pass the key to the Spark native function aes_encrypt I receive this: `aes_encrypt`/`aes_decrypt` is invalid: expects a binary value. Thanks
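
The error text in the comment above is cut off, so the cause is not certain. One possibility (an assumption) is that the key reaches SQL as text rather than as a binary value of 16, 24, or 32 bytes, for example when a Python bytes object is interpolated into a query string. A minimal sketch using only the standard library: generate the key, hex-encode it, and let Spark SQL's unhex() turn it back into binary. The sample plaintext is a placeholder, and the query is only built here, not run.

```python
import secrets

# 32 random bytes gives a 256-bit AES key.
key = secrets.token_bytes(32)

# Hex is a safe text form for embedding the key in a SQL string.
key_hex = key.hex()

# unhex() converts the hex text back into a BINARY value inside Spark SQL.
encrypt_sql = (
    "SELECT base64(aes_encrypt('some text', "
    f"unhex('{key_hex}'))) AS encrypted"
)

# Print only the lengths; a real key should not be echoed to notebook output.
print(len(key), len(key_hex))
```

Inlining the key like this is for illustration only; a real key would normally live in a secret store or a tightly controlled table.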
@Vishal-q8e 4 months ago
How can I use aes_encrypt for boolean or integer data types and get the encrypted values in the same format?
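
aes_encrypt accepts a string or binary input and returns a binary value, so the ciphertext does not keep an integer or boolean type. One workaround, sketched here with hypothetical column and key names, is to cast to STRING before encrypting and cast back to the original type after decrypting. The helpers below only build SQL expressions; they do not run them.

```python
def encrypt_expr(column: str, key_expr: str) -> str:
    """SQL expression that encrypts a non-string column by casting it to STRING first."""
    return f"aes_encrypt(CAST({column} AS STRING), {key_expr})"


def decrypt_expr(column: str, key_expr: str, sql_type: str) -> str:
    """SQL expression that decrypts a column and restores its original SQL type."""
    return f"CAST(CAST(aes_decrypt({column}, {key_expr}) AS STRING) AS {sql_type})"


# Hypothetical column names and key reference, for illustration only.
print(encrypt_expr("age", "k.key"))
print(decrypt_expr("age_encrypted", "k.key", "INT"))
```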
@rajdeepsinghborana2409 2 years ago
Informative ❤️
@dmitryanoshin8004 2 years ago
Great video! Can you please make a video about local IDE development for Databricks? It works fine with a single notebook using databricks-connect, but it does not work when calling other notebooks in a Repo. Maybe we should use a wheel package for this. Thank you!
@hubert_dudek 2 years ago
I think it is better to use Databricks or Azure Key Vault for keys (better than a lookup table), and then the rest of the logic as in Simon's video.
@AdvancingAnalytics 2 years ago
Oh if you have a single decryption key, certainly. If you need a different key per record, that would get awkward to code without preemptively looking up keys, or writing a very slow UDF!
@hubert_dudek 2 years ago
@AdvancingAnalytics it would be useful if dbutils.credentials were supported through SQL functions (other dbutils as well), so that the implementation would be less ugly :-)
@becavas 2 years ago
@AdvancingAnalytics nice, but how do you read a secret scope (encryption key) from SQL?
@becavas 2 years ago
Nice. Is the cipher algorithm AES-256?
@guilleromero3762 2 years ago
Super interesting! But wouldn't it be better to store the decryption keys in Azure Key Vault in order to gather all sensitive data in it? Thanks in advance, I'm a big fan of your videos!
@ivantang5795 2 years ago
Thinking out loud here: AKV has rate limiting and will likely throttle retrieval requests if we were to store those encryption keys in AKV for a fairly large table.
@singhrakeshr 2 years ago
Unless a unique encryption key is used per customer, this encrypt/decrypt solution can't be used for GDPR. Why would you throw away the key for a group if only one customer needs to be removed? Also, the storage layer will still have customer info even if access via the Delta table is restricted or obfuscated, so the data needs to be deleted from everywhere.
@AdvancingAnalytics 2 years ago
Well... yes, in this simple example I used group for the decryption; if you were using this for customer info, your lookup would be on CustomerID or something similar? The data in storage is encrypted, so the storage layer is safe, not the Delta layer? Give it a try :)
@zycbrasil2618 2 years ago
@AdvancingAnalytics yes, encrypted at rest and in transit.
@paulheadey265 1 year ago
I was thinking the same thing. It seems a bit of an anti-pattern, but perhaps you can automate generating each unique key with Pycrypto.
@aragornguan7692 2 years ago
Great video Simon, thanks! Can I have your notebook somewhere please?
@norbertczulewicz1695 2 years ago
You can easily restore the key from the decrypto table's history.
@AdvancingAnalytics 2 years ago
Yep, of course. If you're using Delta for security-style workloads, you need to have a vacuum policy that reflects that. With the right process and controls it's easily manageable. Simon