Keep it up! A useful video might be an explanation of how to set up a data lakehouse versus a traditional SQL Server, using mostly the Serverless SQL pool. I know you've done some ETL videos in the past, but I'm still missing an end-to-end video, meaning from source through the bronze, silver, and gold layers to Power BI. Another big one is how you deal with deletes in this architecture (a rough sketch of one approach follows this thread).
@DatahaiBI A year ago
Nice idea, might be a longer video though and definitely something I'd do
@MDevion A year ago
@@DatahaiBI would be very appreciated 👍
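On the deletes question above, one common pattern in a Delta-based lakehouse is to carry a delete flag from the source (e.g. a CDC-style extract) and apply it with a MERGE when loading the silver layer. A minimal Spark SQL sketch, with hypothetical table and column names:

```sql
-- Spark SQL in a Synapse notebook; silver.customers is a Delta table.
-- bronze_updates carries an is_deleted flag from the source extract.
MERGE INTO silver.customers AS t
USING bronze_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED AND s.is_deleted = 1 THEN DELETE         -- remove deleted rows
WHEN MATCHED THEN UPDATE SET *                        -- apply changes
WHEN NOT MATCHED AND s.is_deleted = 0 THEN INSERT *;  -- add new rows
```

An alternative is a soft delete, keeping the row but flagging it, which downstream views or Power BI can then filter out.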
@germanareta7267 A year ago
Great video. Thanks.
@baklava2tummy 4 months ago
What I don’t understand, however, is why you would create the lake database in the Serverless SQL pools (i.e. not in the Spark notebook). Love your videos btw!
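For context, a Lake Database and its tables are typically defined on the Spark side, and the metadata is then synced automatically so the same tables become queryable from the Serverless SQL endpoint. A minimal sketch, assuming hypothetical names and storage paths:

```sql
-- Spark SQL in a Synapse notebook cell.
CREATE DATABASE IF NOT EXISTS lakedb;

-- Register an existing Delta folder as a table in the Lake Database.
CREATE TABLE IF NOT EXISTS lakedb.customers
USING DELTA
LOCATION 'abfss://data@mystorage.dfs.core.windows.net/silver/customers';
-- Once synced, the table can be queried from Serverless
-- (e.g. SELECT * FROM lakedb.dbo.customers) without creating anything there.
```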
@ManavMishra-by6ls A year ago
Hi, in a Data Flow, adding a new column to an existing Delta (Parquet) sink with upsert gives an error. Can someone please help with this?
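If the error comes from the new column not yet existing in the Delta target, one workaround outside the Data Flow is to run the upsert in a Spark notebook with Delta's automatic schema evolution enabled, which adds the new column during the MERGE. A hedged sketch with hypothetical names:

```sql
-- Spark SQL; allow MERGE to add source columns missing from the target.
SET spark.databricks.delta.schema.autoMerge.enabled = true;

MERGE INTO silver.sales AS t
USING staged_sales AS s
  ON t.sale_id = s.sale_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```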
@emanabela A year ago
One other annoying behaviour I observed between Apache Spark pools and Serverless SQL pools relates to the shared metadata table schema, specifically to the string data types. If I create a table based on Delta files within a Lake DB and define a varchar column with a length of 100 characters, when I explore that same table within the Serverless SQL pool the varchar length is lost and is rounded up to a length of 8000. This is very inconvenient, as we'd have to create views within Serverless that redefine the proper length (a sketch of such a view follows this thread). This defeats the purpose of having a Lake Database in the first place, at least to a certain extent. Hope MS will rectify this in the near future. Mind you, this is not the case when the Lake DB table is based on Parquet files, in which case the varchar length is persisted in the shared metadata between both pools.
@DatahaiBI A year ago
Good point, thanks. If I had my way I'd just have a single database type for both Spark and Serverless, and keep everything consistent.
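For reference, the workaround mentioned above (a view in Serverless that restores the intended length) might look like the following sketch; all names and paths are hypothetical:

```sql
-- T-SQL in a Serverless SQL database.
CREATE VIEW dbo.vw_Customers
AS
SELECT
    CustomerId,
    CAST(CustomerName AS VARCHAR(100)) AS CustomerName  -- restore varchar(100)
FROM OPENROWSET(
    BULK 'https://mystorage.dfs.core.windows.net/data/silver/customers/',
    FORMAT = 'DELTA'
) AS src;
```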