Great insights, Sean and Chris. I love seeing how the connection with Fabric lets users process large datasets more easily. BTW, the visuals in the video make it so much easier to grasp the details. Excellent job, Sean!
@Untethered365 5 months ago
@@RafsanHuseynov thank you! It's a hairy topic, and it was an adventure to dig into the details of what was happening "under the hood".
@jaredpritchard4220 4 months ago
... I was watching on a Friday... Great overview. Loved the short, sharp, but deep details, plus screen clips popping up showing more context / cues.
@Untethered365 4 months ago
Thank you! It took a while to make, but we really wanted it to pop.
@nithinkv1377 3 months ago
I found that storing the data in Parquet files works better, especially for data with special characters such as address data, where CSVs cause all sorts of data-shifting issues. Another point to note: if you purchase Fabric compute, you pay for the whole compute at once as per your subscription, whereas with Synapse the cost is calculated only when you actually run pipelines that cause a Spark pool to start. Last but not least, Parquet files gave on average 6 to 10 times compression compared to the same data in CSV in my use cases.
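To illustrate the kind of shifting described above, here is a minimal sketch using only Python's standard library. The sample address row is made up for illustration, and the "naive" writer is an assumption about how such exports can go wrong, not a description of any specific tool. Parquet sidesteps the problem because it stores typed columns rather than delimited text.

```python
import csv
import io

# Hypothetical record: the address contains both a comma and a line break.
row = ["1001", "Acme Ltd", "12 High St, Suite 4\nSpringfield", "US"]

# Naive export: join with commas, no quoting or escaping.
naive_line = ",".join(row)

# Naive import: split on newlines, then on commas.
naive_rows = [line.split(",") for line in naive_line.split("\n")]

# The single 4-column record is now two rows with shifted columns.
print(naive_rows)

# A properly quoted CSV round-trips, but only if every producer and
# consumer in the pipeline honors the quoting rules.
buf = io.StringIO()
csv.writer(buf).writerow(row)
parsed = next(csv.reader(io.StringIO(buf.getvalue(), newline="")))
print(parsed == row)
```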
@Untethered365 3 months ago
@@nithinkv1377 Excellent points, thank you! The one about Fabric compute pricing in particular is a great one. So the Fabric compute pricing plus the Dataverse storage is what went into your costs? Without a doubt, Parquet > CSV.
@nithinkv1377 3 months ago
@@Untethered365 Yes. Also, from a Fabric compute cost perspective, 1-year reserved compute is about 40% cheaper than monthly pay-as-you-go.
@Untethered365 3 months ago
@@nithinkv1377 Is the 1-year reserved compute a Fabric or a Synapse piece? I was trying to clarify which was cheaper here.
@nithinkv1377 3 months ago
@@Untethered365 The 1-year reserved compute was for Fabric. With Synapse you only get charged for the compute that you use, but the Azure server pools do take on average 2 to 4 minutes to start up. So overall I think we pay less for Synapse. But if you have constant workloads, Fabric reserved compute might work out cheaper. As an example, for the month of July our total resource group cost was $656, of which $239 was ADF, $226 was Storage, $94 was bandwidth, $64 was Synapse and $33 was Microsoft Defender. If we switch over to Fabric, the cost of ADF and Synapse would be covered under the Fabric compute cost, which for an F8 would be $694 a month with 1-year reserved compute, plus $26 for 1 TB of storage. Considering the numbers, if we don't exceed our current compute requirements Synapse will be cheaper, and if we exceed them, Fabric is cheaper. I guess there is a tipping point, based on the requirements, at which Fabric works out cheaper. But keep in mind that you have to pay for Fabric up front whether you use the compute or not, unlike Databricks or ADF/Synapse, which are purely consumption-based for compute too, from what I understand.
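The tipping point mentioned above can be roughed out with the July figures from this comment. This is only a sketch: it assumes the ADF and Synapse charges scale linearly with workload, compares compute only (storage, bandwidth and Defender are left out), and the variable and function names are made up for illustration.

```python
# July costs in USD, taken from the comment above.
current = {
    "ADF": 239,
    "Storage": 226,
    "Bandwidth": 94,
    "Synapse": 64,
    "Microsoft Defender": 33,
}
total = sum(current.values())

# Consumption-based compute that Fabric would replace.
consumption_compute = current["ADF"] + current["Synapse"]

# Fabric F8 with 1-year reservation, per month (figure from the comment).
fabric_f8_reserved = 694


def cheaper_option(consumption_cost, fabric_cost=fabric_f8_reserved):
    """Return which compute option costs less for a given monthly workload."""
    return "Fabric" if consumption_cost > fabric_cost else "Synapse/ADF"


# Assumption: consumption cost grows linearly with workload.
break_even_multiplier = fabric_f8_reserved / consumption_compute

print(f"Total July cost: ${total}")
print(f"Consumption compute: ${consumption_compute}")
print(f"Fabric F8 breaks even at about {break_even_multiplier:.2f}x the July workload")
print(cheaper_option(consumption_compute))
```

Under those assumptions, the consumption-based setup stays cheaper until the ADF and Synapse bill roughly doubles; beyond that the fixed F8 price wins, provided the F8 capacity can actually handle the larger workload.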
@SangeethSudharaka 2 months ago
Thanks for this. I'm looking into doing something similar. If you don't mind me asking, what's the Fabric SKU level you used for this? Also, did you still have to pay for the Dataverse or just the Fabric?