Advancing Spark - Databricks Runtime 7.2 & Delta Cloning

1,511 views

Advancing Analytics

Days ago

What? It somehow passed us by that not one but TWO new Databricks Runtimes have been released since the Spark & AI Summit. With Databricks Runtime 7.1 now live, and Runtime 7.2 already available in Beta, Simon takes a look through the new features, including the vast potential of the new Delta Clone functionality!
If you're hearing "table clone" and thinking "well that's not very exciting", you might want to watch and see just how useful it can be! Don't forget to Like & Subscribe while you're there!
UPDATES - Since recording we've validated the following:
1 - CLONE is now available in SQL, Python, Scala and Java!
2 - DEEP CLONE takes a copy of the active/current files, not the full history!
The Databricks runtime docs can be found here:
docs.databricks.com/release-n...
docs.databricks.com/release-n...
And don't forget to check out our Databricks Training, over at:
www.advancinganalytics.co.uk/...

Comments: 6
@etech-ej2yj 1 year ago
shallow clone seems quite useful actually, perhaps there's a way to configure the clone to disallow any vacuums 🤔 thanks for the video!
@surbhisingi5214 3 years ago
Hi Simon - I really enjoy watching your videos on Databricks + Spark and have learnt a lot about the features available within Databricks. Currently I am exploring Docker in Databricks. Could you please create a video on how Docker can be used within Databricks and all the features which are available? Thanks in advance!
@ynwtint 4 years ago
What, essentially, is the difference between Delta Clone and a normal CREATE TABLE AS (CTAS) for a new table?
@AdvancingAnalytics 4 years ago
Hey both - there's a bit of clarity in the clone syntax blog: "Cloning a table is not the same as Create Table As Select or CTAS. A clone copies the metadata of the source table in addition to the data. Cloning also has simpler syntax: you don’t need to specify partitioning, format, invariants, nullability and so on as they are taken from the source table." So - it's different to CTAS in that it copies over the transaction log details and a load of the metadata in there, and it shortcuts having to provide a load of settings for the new table. That said, time travel back through the source's history does not work on the new table - it looks to copy over only the current files, and your transaction log will have a single "CLONE" activity, rather than the actual history. That's when we're talking about DEEP CLONE, that is - SHALLOW will have other implications! Simon
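To make the syntax difference in the reply above concrete, here is a minimal Python sketch that only builds the two kinds of statement as strings. The helper names (`clone_sql`, `ctas_sql`) and the table names are hypothetical, and nothing is executed here; in a Databricks notebook you would typically pass the resulting string to `spark.sql(...)`.

```python
from typing import Optional


def clone_sql(target: str, source: str, deep: bool = True,
              version: Optional[int] = None) -> str:
    """Build a Delta CLONE statement. Partitioning, format and other
    table settings are not spelled out: the clone takes them from the source."""
    kind = "DEEP" if deep else "SHALLOW"
    stmt = f"CREATE TABLE {target} {kind} CLONE {source}"
    if version is not None:
        # Clone the source as it looked at a specific table version.
        stmt += f" VERSION AS OF {version}"
    return stmt


def ctas_sql(target: str, source: str,
             partition_by: Optional[str] = None) -> str:
    """Build a CTAS statement. Format and partitioning must be restated
    by hand, because CTAS does not copy the source table's metadata."""
    stmt = f"CREATE TABLE {target} USING DELTA"
    if partition_by is not None:
        stmt += f" PARTITIONED BY ({partition_by})"
    stmt += f" AS SELECT * FROM {source}"
    return stmt


# Hypothetical table names, for illustration only.
deep_stmt = clone_sql("sales_dev", "sales")
shallow_stmt = clone_sql("sales_dev", "sales", deep=False, version=3)
ctas_stmt = ctas_sql("sales_copy", "sales", partition_by="region")
```

The point of the comparison is that the clone statement carries no table settings at all, while the CTAS statement has to repeat whatever should survive from the source.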
@sid0000009 4 years ago
Shallow cloning - if we can already access a Delta table using versioning, I'm not sure what additional advantage we get from accessing the same data through a new cloned table. Deep cloning - it doesn't copy the entire data history of a Delta table, which seems a bit disappointing (it copies the data as of the version specified, or the latest Delta snapshot).
@AdvancingAnalytics 4 years ago
So Shallow Cloning - it's all about changing the data at the same time. So we can make two entirely different changes to the data and compare, without having to copy the whole thing. If we tried this with versioning, we'd have to make the first change, undo that change then make the second. Gives a whole safety net about trying new things in a stable environment. Might be a bit niche, sure, but certainly solves a few problems I've seen. And yeah, Deep cloning not taking the history, I guess it's one way or the other. If it took the full history I'd complain that it takes redundant files! Would be good to have a history/no history option with Deep to give the flexibility. Either way, definitely has a couple of uses. Simon
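The "two different changes, then compare" idea can be sketched with a toy in-memory model. This is not how Delta is implemented: `ToyTable` and `STORAGE` are made-up names, and a "file" here is just a list of integers. It only illustrates that a shallow clone copies file references rather than data, so each clone can diverge without touching the source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Toy stand-in for shared cloud storage: file name -> rows in that file.
STORAGE: Dict[str, List[int]] = {}


@dataclass
class ToyTable:
    # The table is just a list of references to files in STORAGE.
    files: List[str] = field(default_factory=list)

    def rows(self) -> List[int]:
        return [row for name in self.files for row in STORAGE[name]]

    def shallow_clone(self) -> "ToyTable":
        # Copy only the file references; no data is duplicated.
        return ToyTable(files=list(self.files))

    def rewrite(self, new_file: str, transform: Callable[[int], int]) -> None:
        # Copy-on-write: write a new file, then repoint this table at it.
        # The original files are left untouched for other tables.
        STORAGE[new_file] = [transform(row) for row in self.rows()]
        self.files = [new_file]


STORAGE["part-0"] = [1, 2, 3]
source = ToyTable(files=["part-0"])

# Two independent experiments against the same underlying data.
experiment_a = source.shallow_clone()
experiment_b = source.shallow_clone()
experiment_a.rewrite("a-0", lambda row: row * 10)
experiment_b.rewrite("b-0", lambda row: row + 1)

# Both results exist side by side and can be compared directly.
diff = [a - b for a, b in zip(experiment_a.rows(), experiment_b.rows())]
```

With versioning alone, the same comparison would need the first change to be applied, undone, and then the second applied, as described in the reply above.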