Great video, I was not aware of the filepath function.
@avikbasu8943 2 years ago
It would be very helpful if you had a session on Delta table partitions, fetching data through the serverless SQL pool, and creating external tables over partitioned Delta tables.
@stijnwynants7307 2 years ago
Hi Avik! We will definitely put it on the list! We will be covering other parts of the Synapse stack in the near future!
@avikbasu8943 2 years ago
@stijnwynants7307 Thank you! I'm looking forward to an episode on external tables over partitioned Delta tables. 👍
@godhasowjanyakandrakota4642 1 year ago
Can partitioning be done only on range values? Is it possible to partition a table so that all rows where the partition column equals a given value go into one partition? Example: market_id = {1, 2, 3, 4, 5}, with data for market_id = 1 in one partition, data for market_id = 2 in another, and so on. Is this scenario possible with Synapse tables?
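For folder-based partitioning in the data lake, like the layout shown in the video, each distinct value of the partition column typically gets its own folder rather than a range. Below is a minimal plain-Python sketch of that idea, assuming Hive-style `column=value` folder names. The `partition_path` and `partition_rows` helpers and the sample rows are hypothetical, and this is not Synapse or Spark API code:

```python
from collections import defaultdict


def partition_path(column, value):
    """Build a Hive-style folder name such as 'market_id=1' (assumed convention)."""
    return f"{column}={value}"


def partition_rows(rows, column):
    """Group rows by the exact value of `column`: one folder per distinct value."""
    folders = defaultdict(list)
    for row in rows:
        folders[partition_path(column, row[column])].append(row)
    return dict(folders)


# Hypothetical sample data, for illustration only.
rows = [
    {"market_id": 1, "sales": 10},
    {"market_id": 2, "sales": 20},
    {"market_id": 1, "sales": 30},
]

layout = partition_rows(rows, "market_id")
for folder in sorted(layout):
    print(folder, len(layout[folder]))
```

Every row with the same `market_id` ends up under the same folder name, which is the equality-based behaviour the question describes.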
@germanareta7267 2 years ago
Hi, I am interested in the Spark way of generating partitioned tables. Great video.
@stijnwynants7307 2 years ago
Hi! We are planning a video on this as well! Stay tuned!
@AzureSynapse 2 years ago
You can find a simple example of using Spark to partition data in the data lake in the second episode of our Azure Synapse + Power BI Datamart series: kzbin.info/www/bejne/jn7CZpaobd2SeLs. Around 5'00'', Pawel demonstrates a piece of PySpark code that writes a partitioned DataFrame to the lake.
@germanareta7267 2 years ago
Thanks.
@KnowsomeLife 1 month ago
I thought partitioning was a logical division of data, but according to this video, since the files of each partition sit in a different folder, it seems like it is actually a physical partitioning of the data. Am I correct?
@rezcan 2 years ago
Is Spark the only way to generate partitioned tables? Can I use SQL and ADF to generate partitioned tables?
@stijnwynants7307 2 years ago
Hi Reza, you will need to initialize the tables in Spark, where you can use any of the languages available in Synapse notebooks. If you are more used to SQL, take a look at Spark SQL. If you want to know how to initialize them, check out this video with Patrick: kzbin.info/www/bejne/aZbJamtrlpWJm8k
@BernardoRomeroC 2 years ago
I see the advantages of partitioning by date, when your queries are date-related. But is it also more efficient if I run queries on other attributes?
@stijnwynants7307 2 years ago
Hi Bernardo, partitioning on date lets the engine eliminate data based on date (if your partition key is the date). Queries that do not filter on the date key will not benefit from the partitioning. You could add another partitioning level below it, or you could use OPTIMIZE with Z-Order to organize those files so that queries on other attributes benefit as well. (We will be doing a video on Z-Order soon.)
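The data elimination described in this reply can be sketched in plain Python: a filter on the partition column lets the reader skip whole folders, while a filter on any other attribute still has to open every folder. The folder names, the `folders_to_scan` and `query` helpers, and the sample layout are all hypothetical; real engines do this inside the query planner.

```python
# Hypothetical lake layout: one folder per year, each holding a few rows.
lake = {
    "year=2009": [{"year": 2009, "city": "Ghent"}],
    "year=2010": [{"year": 2010, "city": "Oslo"}, {"year": 2010, "city": "Ghent"}],
    "year=2011": [{"year": 2011, "city": "Oslo"}],
}


def folders_to_scan(lake, column, value, partition_column="year"):
    """Return the folders a filter `column == value` has to read."""
    if column == partition_column:
        # Filter is on the partition column: only the matching folder is needed.
        wanted = f"{partition_column}={value}"
        return [folder for folder in lake if folder == wanted]
    # Filter is on another attribute: folder names say nothing, so read them all.
    return list(lake)


def query(lake, column, value):
    """Read only the required folders, then filter the rows inside them."""
    folders = folders_to_scan(lake, column, value)
    rows = [row for folder in folders for row in lake[folder] if row[column] == value]
    return folders, rows


date_folders, date_rows = query(lake, "year", 2010)
city_folders, city_rows = query(lake, "city", "Ghent")
print(len(date_folders), "folder(s) read for the year filter")
print(len(city_folders), "folder(s) read for the city filter")
```

Both queries return two rows here, but the year filter reads one folder and the city filter reads all three, which is why queries on non-partition attributes see no benefit from the date partitioning alone.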
@my2004rt 1 year ago
Excellent. However, if the file path was cast to a smallint, why is the year 2010 a string?
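One possible reason, offered as an assumption rather than a statement about the query in the video: a value read out of a folder path starts out as text, and it only becomes a number at the point where it is explicitly cast, so a comparison made on the uncast value would still use a string such as '2010'. A small Python sketch of that distinction, with a hypothetical path and helper name:

```python
import re


def year_from_path(path):
    """Extract the year segment from a 'year=YYYY' folder name; the result is text."""
    match = re.search(r"year=(\d{4})", path)
    if match is None:
        raise ValueError(f"no year segment in {path!r}")
    return match.group(1)


path = "sales/year=2010/part-0000.parquet"  # hypothetical path, for illustration only

raw_year = year_from_path(path)   # still a string taken from the path
cast_year = int(raw_year)         # numeric only after an explicit cast

print(type(raw_year).__name__, type(cast_year).__name__)
print(raw_year == "2010", cast_year == 2010)
```

Filtering on the raw path value needs a string literal, while the cast value is what shows up as a number in the output column.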