Thanks for the follow up, i was really confused when I was reviewing my models. To people asking about performance. If you're having to use filter within your measures over a large dataset then I would make a calculated column that returns 1 if the logic is met, then have a measure that just sums the calc column. If you are only doing arithmetic then the calc column doesn't make much sense, unless you are going to have the value presented in a raw data table in your report. With a raw data table, the measure will execute over each row, and you will possibly exceed resources or have performance issues. As Marco is pointing out, it's all a balance of storage, performance, and customer requirements. Sometimes an extra column can't be done if the table is too large.
@praveenpatil2823 жыл бұрын
Awesome Video. Marco, I'm your fan after seeing this. You do a much better job explaining than Alberto. Productive 12 mins of my life.
@princecena5 жыл бұрын
Thanks Marco and Patrick for the insights...really nice seeing both of you together
@yigitkadioglu92104 жыл бұрын
Simple and effective, thank you Patrick & Marco
@MAbdullah473 жыл бұрын
Thank you Marco for this useful introduction
@hasnain2803 ай бұрын
Thank you Marco & Patrick very Helpful👍
@guillermochez55314 жыл бұрын
An effective engine data to pack; fabulous, only the necessary. Thanks Marco and Patrick for the power of vertipaq and the session with this data sample.
@SamuelRoshan5 жыл бұрын
Mind blown! Thank you Patrick and Marco!
@tomaskavicky3 жыл бұрын
great help guys! thanks
@DIGITAL_COOKING5 жыл бұрын
First nice t-shirt Second, your videos with Marco are a great good one, Patrick 👍👍👍
@TestPBI Жыл бұрын
Hello, thank you for the video. In the last portion of the video, by using SUMX in a measure instead of a calculated column - granted, the model size is reduced. But by doing this, won't you put additional strain on the formula engine, and user interactions that use the measure with SUMX be significantly slower?
@marcmaurice89424 жыл бұрын
Hi! Is there an easy way to see which columns in the data model are not used? Thanks!
@alpesh417rt3 жыл бұрын
great value in video - implemented !!!!
@mathew96655 жыл бұрын
This is one of the most important item when building a report thank you
@nikolaidedenkov84144 жыл бұрын
Thank you guys, great as usual!
@noahhadro82132 жыл бұрын
Is there a way to use PowerShell to export the vpax file from the service automatically?
@JorgeSegarra5 жыл бұрын
Fantastic job, guys, thank you!
@kurtdevotto5 жыл бұрын
Hi there, how did you get de Advance mode in Dax Studio?
@arklur31935 жыл бұрын
You need to enable it in the Options, as Marco said --> imgur.com/GnFxG1c
@robsonnascimento59355 жыл бұрын
Awesome, thank you for this video!!!
@rapatel16422 жыл бұрын
Great video.
@PedroCabraldaCamara4 жыл бұрын
awesome video!!!!! Thank you so much guys!!!
@GuyInACube4 жыл бұрын
Thanks for watching! 👊
@EricaDyson5 жыл бұрын
Great session. Nicely explained (for us lesser mortals) .. more please!
@arklur31935 жыл бұрын
What I'm thinking of is, could you mention a few regular use-cases where you would sacrifice memory for using a "calculated column" rather than an iteration function? Let's say summing up 5, 10, 20, 50 columns, or if the calculated column's logic is just "complex" (I know-I know, "What is complex?" :D), using conditions, multiplications, etc. How can you decide not to use a measure, but rather store the value in a column, so you can just sum on that, meaking the measure much more faster and less resource intensive. Thanks in advanced.
@WmRod5 жыл бұрын
I had a similar question. There has to be a trade off somewhere between file size and execution speed
@marcorusso74725 жыл бұрын
It depends. For example: with 10 billions rows a column with Line Amount could be 30% faster than multiplying the two columns. However, the storage for the additional column could be 20-30 GB. In a report, the difference can be 1-2 seconds faster, something like 3 seconds instead of 4.5 seconds. Memory is not free, CPU is not free. Customer choice, but the real issue is that you cannot continue to add memory, so you have to make choices. However, if you have 10 million rows, the difference is smaller in percentage, it could be actually faster and in any case the difference (if any) would be in milliseconds. So I would always save memory in those cases. For Power BI, it's a no brainer: save memory - you cannot have 10 billions rows.
@arklur31935 жыл бұрын
@@marcorusso7472 Thanks for the detailed answer, I'll keep it in mind!
@Prodoshdutta5 жыл бұрын
Great Optimization
@luti46895 жыл бұрын
Awesome video! Managed to get a recent report from 200 down to 40. Seeing the impact of a single column is really useful. Do I really need it, can I use a measure instead of a column? It really makes a difference :) Thanks!
@rukiatukijancollier60615 жыл бұрын
Very helpful! Thank you
@GHRichardson5 жыл бұрын
I have what I think could be a common problem to many: My millions of rows of data have a unique identifier in them. I need to keep the identifier despite its terrible cardinality :-(. However, it's worse that that. I need to add a string prefix to the unique Id field because it ends up being a lengthy URL which I display as a link icon in the UI within a table. This makes the problem much worse because the size of the column and its unique values prevent any significant compression and the file becomes huge. Is there something that I can do to improve this in the DAX instead? Maybe in a similar way to your SUMX function described in the video? My concern would be that a measure that 'calculates' the eventual URL string would get run for every row in my display table whenever a filter was changed and that would take several seconds...
@WordBaron3 жыл бұрын
I second this comment. This video kind of frustrated me a little because how can we remove IDs when we want our users to be able to identify and return to the records at the Order or OrderLine level of granularity? Is this for only summary level models that don't require drill-down? And I agree that it's even worse in most datamodels because of the hyperlinks that are created for these IDs to be used on the querystring because it's not only one column for each ID, but now two because of the URL. These ID and URL-to-ID-record columns take up the vast majority of space in our model. After thinking about this some more, I was able to make the hyperlink into a measure instead of a calculated column www.screencast.com/t/9o4sT9jV1 which saves a little bit of space, but not nearly as much as I was expecting based upon what DAX Studio showed. Also, you can't just get rid of an ID that is used to link tables together can you? Granted the SUMX function was a true size-saver in that you didn't sacrifice any functionality but could still save some space, but I left feeling that the ID removal was completely impractical. Maybe I'm just missing something... I actually hope that I am.
@nandukrishnavs5 жыл бұрын
Informative 👍
@WmRod5 жыл бұрын
What happens to rendering speed when executing the new measure? Maybe it doesn't matter so much in this example, but in general, I would think you would embed compution heavy formulas in a calculated column, especially if there are dependencies.
@marcorusso74725 жыл бұрын
It depends - see similar comments above
@ianstuart25055 жыл бұрын
Nice tips, thank you. I think there is a balance to be struck as well though... I generally advise creating calculated columns as far back in the process as possible - ideally on the database prior to connecting Power BI. This is to centralise the "single version of the truth" and minimise maintenance. This is an approach I would stick with unless performance proved to be an issue. Thoughts?
@marcorusso74725 жыл бұрын
Correct - but the memory consumption described in the video is not caused by the calculated column, it is caused by the cardinality of the column that produces low compression.
@MAbdullah473 жыл бұрын
But the Order Number if it is the foriegn key how you deleted it?
@kasunmanchanayake84645 жыл бұрын
Wow marco and Patric together
@GregKramerTenaciousData5 жыл бұрын
nicely done fellas....for what it's worth, 'yo Marco' is much more approachable than 'professor Marco' :-)
@Lixia1234 жыл бұрын
love it!
@vog51975 жыл бұрын
Thanks for the video, always love the performance optimization videos! I was just wondering though; is it always better to replace calculated columns with measures? Isn't a calculated column calculated once (on load of the initial report), whereas measures are recalculated with constantly when you navigate the visuals pane? I'd imagine there would be some cases where I'd rather want to wait a bit longer initially to have snappier performance navigating the dashboard once loaded.
@dbszepesi5 жыл бұрын
Calculated columns are not compressed with as much efficiency as the regular data due to when they are created in the build process. Therefore from a compression standpoint, they are not preferred.
@arklur31935 жыл бұрын
@@dbszepesi This is very specific to calculated columns, if you create it in PQ, you are "fine". Your model will be still bigger, obviously, but the engine can compress your model therefore making it smaller compared to a DAX calculated column.
@marcorusso74725 жыл бұрын
@@arklur3193 Not really, the biggest difference in compression is the number of unique values of the column, which is bigger in Line Amount rather than Quantity / Net Price if you consider this example. The difference between computed columns and calculated columns is minimal and relevant only for columns that have a small number of unique values (up to 10/100, certainly not 1000 or more).
@JackOfTrades125 жыл бұрын
If you're having to use filter within your measures over a large dataset then I would make a calculated column that returns 1 if the logic is met, then have a measure that just sums the calc column. If you are only doing arithmetic then the calc column doesn't make much sense, unless you are going to have the value presented in a raw data table in your report. With a raw data table, the measure will execute over each row, and you will possibly exceed resources or have performance issues.
@spilledgraphics4 жыл бұрын
Amazing!!!!!
@alexbehrmann95575 жыл бұрын
Do the same rules apply with direct query models?
@marcorusso74725 жыл бұрын
No, DirectQuery doesn't import data and you cannot use aggregations for measures that have row-level calculations like in this demo.
@malikakunseitova70883 жыл бұрын
Amazing
@nicolassavignat60915 жыл бұрын
Thanks all
@stephenbrincat71245 жыл бұрын
Following Marco's instructions but cannot view result, i.e the VertiPaq Analyzer Preview is not showing....
@marcorusso74725 жыл бұрын
Please read this thread and check possible solutions or post more details: github.com/DaxStudio/DaxStudio/issues/235
@ianpollard41385 жыл бұрын
I had the same problem. But then downloaded and installed 2.9.5 from ci.appveyor.com/project/DarrenGosbell/daxstudio/builds/28512060/artifacts and it worked as expected.
@DarkOri3nt5 жыл бұрын
So basically Marco recommends not using calculated cols unless completely necessary because of their memory consumption. What's to stop you from computing this in the source and importing it in ready to use. I would assume you can use this against tabular models aswell ? 12 millions rows is not big in pbi.
@arklur31935 жыл бұрын
If you create your calculated column in PQ rather than DAX, your modell will be still bigger but not just that big as if using DAX. And yes, 12M rows is not that much, but this is just a video explaining this stuff, you only need to worry about these if your model is much bigger.
@marcorusso74725 жыл бұрын
@@arklur3193 See comment above - the difference would be minimal in this case, computed columns in PQ are equally expensive to calculated columns if there is a medium-high cardinality.
@ashutoshready5 жыл бұрын
Hey, How to enable Vertipaq Analyzer Preview ?
@DreeKun5 жыл бұрын
Quoting a comment from below: "I had the same problem. But then downloaded and installed 2.9.5 from ci.appveyor.com/project/DarrenGosbell/daxstudio/builds/28512060/artifacts and it worked as expected."