This talk discusses managed-commits, a new commit protocol for Delta Lake that moves the source of commit atomicity from the object store to an external commit owner (e.g., HMS, Unity Catalog, or Glue). This gives more flexibility in how transactions are performed and lays the foundation for advanced features such as multi-statement transactions. Delta was originally built on the premise that cloud storage is the source of truth. However, cloud storage has limited primitives for atomicity; more specifically, object stores lack the means to perform atomic commits for more than a single write/statement. The managed-commits protocol aims to address the following: multi-table, multi-statement transactions; reliable commit semantics even when the underlying object store lacks put-if-absent semantics (e.g., S3); and data governance overwrite operations.
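To make the idea of an external commit owner concrete, here is a minimal sketch in Python. It is not the Delta Lake API or the protocol presented in the talk: the class `InMemoryCommitOwner` and its methods `latest_version`, `commit`, and `commit_many` are hypothetical names invented for illustration, and an in-memory lock stands in for whatever atomicity a real catalog would provide. The sketch only shows the general technique: the owner, rather than a put-if-absent write to the object store, decides which writer wins a given table version, and because one party arbitrates every table it can also accept or reject writes to several tables as a unit.

```python
import threading


class InMemoryCommitOwner:
    """Hypothetical stand-in for an external commit owner (e.g., a catalog).

    Assumption: a single in-process lock models the owner's atomicity.
    """

    def __init__(self):
        self._lock = threading.Lock()
        # table name -> list of committed payloads; list index = version
        self._commits = {}

    def latest_version(self, table):
        """Return the latest committed version, or -1 if there is none."""
        with self._lock:
            return len(self._commits.get(table, [])) - 1

    def commit(self, table, expected_version, payload):
        """Accept the commit only if expected_version is still unclaimed."""
        with self._lock:
            log = self._commits.setdefault(table, [])
            if len(log) != expected_version:
                return False  # another writer already claimed this version
            log.append(payload)
            return True

    def commit_many(self, writes):
        """All-or-nothing commit across tables.

        writes maps table name -> (expected_version, payload).
        """
        with self._lock:
            # Validate every table first so a conflict leaves no partial state.
            for table, (expected, _) in writes.items():
                if len(self._commits.get(table, [])) != expected:
                    return False
            for table, (_, payload) in writes.items():
                self._commits.setdefault(table, []).append(payload)
            return True


owner = InMemoryCommitOwner()
first = owner.commit("orders", 0, "add a.parquet")       # wins version 0
second = owner.commit("orders", 0, "add b.parquet")      # loses: version 0 is taken
both = owner.commit_many({
    "orders": (1, "add c.parquet"),
    "customers": (0, "add d.parquet"),
})                                                        # both tables advance together
stale = owner.commit_many({
    "orders": (1, "add e.parquet"),                       # stale expected version
    "customers": (1, "add f.parquet"),
})                                                        # rejected; nothing is written
```

In this sketch the losing writer would re-read the latest version and retry, and the rejected multi-table commit leaves both tables unchanged.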
Talk By: Prakhar Jain, Staff Software Engineer, Databricks
Here's more to explore:
Rise of the Data Lakehouse: dbricks.co/3NH...
Lakehouse Fundamentals Training: dbricks.co/44a...
Connect with us: Website: databricks.com