3 ways to REDUCE DATA In Power BI Desktop (No Premium needed!)

58,549 views

Guy in a Cube

1 day ago

Comments: 65
@hatimali4945 3 years ago
Your videos have really helped me grow my Power BI practice in my organisation. Thank you so much 😀
@dinoben9527 3 years ago
I have been using date parameters for a proof of concept for my client and it works very well. It allowed me to avoid technical limitations of the relational database in use at the organization. Once in the Service, all the data was pulled. I didn't know there was a simple way to limit the number of rows directly via Power Query. I'll test that now. Thank you for the useful information.
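For reference, the row-limiting idea discussed here can be expressed in Power Query M roughly like this. It is only a sketch: the RowLimit parameter and the server, database, and table names are assumptions, not the exact query from the video.

    let
        Source = Sql.Database("myserver", "mydb"),                      // hypothetical source
        Sales  = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        // While developing, keep only the first RowLimit rows; 0 means load everything
        Result = if RowLimit = 0 then Sales else Table.FirstN(Sales, RowLimit)
    in
        Result

When query folding is available, this "keep top rows" step is typically translated into a TOP clause at the source, so the limit is applied by the database rather than after import.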
@SlothyUK 3 years ago
Thanks so much... spent so much time working in the editor making small changes and waiting for them to apply on 6MM rows... cheers
@ayouberrarhbi8912 3 years ago
I was looking for this trick for a long time. Hours and hours wasted waiting for refreshes. Thank you!
@getusama 2 years ago
That is some great advice - thank you. I was also wondering about the third way, using start and end dates: can you dynamically set the values so the end date is always the current date and the start date is, say, three months back? That way, once you have published the report and set up the schedule, the start and end date parameters in the Power BI Service would always use the current range (which will be different every day).
@ruelformales5184 1 year ago
I built on your question, because you are correct that you need the latest data (a rough M sketch of the rolling-date idea follows below):
1. Create a parameter called "Date" which will act as your start date (don't worry about the end date).
2. In your transformation, pick a date column, choose "Is After", and use your Date parameter - let's say a date 3 months ago.
3. Publish your PBIX, then change your parameter date as far back as you need (e.g. 5 years ago).
4. Let the heavy lifting happen in the Power BI Service - you should get the data from 5 years ago up to today.
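A rough M sketch of the dynamic variant asked about above - the start date is computed at refresh time rather than typed into a parameter. The column, server, and table names are assumptions, and OrderDate is assumed to be typed as Date:

    let
        Source    = Sql.Database("myserver", "mydb"),
        Orders    = Source{[Schema = "dbo", Item = "FactOrders"]}[Data],
        // Start date rolls with every refresh: three months before "today"
        StartDate = Date.AddMonths(Date.From(DateTime.LocalNow()), -3),
        Recent    = Table.SelectRows(Orders, each [OrderDate] >= StartDate)
    in
        Recent

A simple comparison like this usually still folds back to a relational source, so the filter runs at the database.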
@miguelguillenpaz 1 year ago
Hi Patrick, thank you so much. I had been looking for this information.
@mcnater 3 years ago
Big problem with 2 and 3 at our org: we have millions of rows, and those queries run in full before they ever get to the steps that limit the data. So it's still a huge pain on Desktop to sit there and wait. We also don't have access to views, so that's out the window. I have made some parameters that switch between "ROWNUM < 100" and "TOP 100" depending on the server, and that works OK, but it's still kind of a pain.
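One way the TOP/ROWNUM switch described here could look in M, with an assumed text parameter SourceType and number parameter DevRowLimit; server and table names are made up, and hand-written SQL like this generally stops later steps from folding:

    let
        LimitedSql =
            if SourceType = "Oracle"
            then "SELECT * FROM SALES WHERE ROWNUM <= " & Number.ToText(DevRowLimit)
            else "SELECT TOP " & Number.ToText(DevRowLimit) & " * FROM dbo.Sales",
        // Pick the connector that matches the parameterized source
        Source =
            if SourceType = "Oracle"
            then Oracle.Database("myhost:1521/orcl", [Query = LimitedSql])
            else Sql.Database("myserver", "mydb", [Query = LimitedSql])
    in
        Source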
@akhilannan 3 years ago
I think query folding should work here, which should push the filter back to the source directly.
@mcnater 3 years ago
@akhilannan Most of our code is done with custom SQL from the start, so I don't think so in that case.
@HenriqueAzvdo 2 years ago
What did you do? I have the same problem: I limit it by date (start date and end date), and first it loads the entire query, which takes hours and hours.
@cowpoke98 1 year ago
Bingo. The video is a little misleading because it says "limit the # of rows being brought back from the data source", but that's not what these steps are doing: the filter is applied AFTER all of that data is queried. Every time you modify a table or anything and "Apply changes", you have to sit through all that data being brought in again. There needs to be a simple way to limit the # of rows brought over on import.
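One workaround sketch (not what the video shows): push the limit into a native query so only the limited rows ever reach Desktop. The names are assumptions, and the EnableFolding option requires a reasonably recent Power Query version.

    let
        Source  = Sql.Database("myserver", "mydb"),
        // The TOP clause runs at the database, so the full table is never imported
        Limited = Value.NativeQuery(
                      Source,
                      "SELECT TOP 100000 * FROM dbo.FactSales",
                      null,
                      [EnableFolding = true])
    in
        Limited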
@mcnater 1 year ago
100% correct @cowpoke98
@intelabhinav 3 years ago
I love the fact that you can select which table you want to load - the filtered one or the unfiltered one. I am using this feature to test the load with fewer rows.
@samiphoenix5268 3 years ago
Another easy way to reduce dataset size without reducing the number of rows is to convert decimal values to whole numbers (assuming you don't need the additional accuracy).
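A minimal sketch of that data-type change in Power Query. The file path and the SalesAmount column are assumptions; casting to a whole number lets VertiPaq compress the column much better:

    let
        Workbook = Excel.Workbook(File.Contents("C:\data\sales.xlsx"), true),  // hypothetical file
        Sales    = Workbook{[Item = "Sales", Kind = "Sheet"]}[Data],
        // Decimal -> whole number; the rounding drops the precision you said you don't need
        Typed    = Table.TransformColumnTypes(Sales, {{"SalesAmount", Int64.Type}})
    in
        Typed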
@Acheiropoietos 3 years ago
Yes, I discovered this recently - who needs pennies with a bottom line in the millions?
@frankcook5884 3 years ago
@Acheiropoietos Finance
@jaypollock78 3 years ago
Definitely! I was able to cut my dataset size down by 40% by doing this.
@D_ytAcct 2 years ago
Thank you for this - I'm going to try to set up some refresh parameters now on the report I'm building. It links to Salesforce, but I only need to see one of the territories/divisions, so hopefully this will help reduce the thousands of lines it pulls through!
@D_ytAcct 2 years ago
It worked! Thank goodness - that's definitely reduced the strain on this!
@TitusRex 3 years ago
I was using exactly this approach (a parameter); I'm glad I'm not the only one.
@otakarveleba7202 3 years ago
Hi Patrick, great video as always. I totally forgot about query folding, as I'm usually writing custom SQL queries that Power BI can't fold anyway... so I usually inject my parameter into the SQL query (in the WHERE clause) directly. Unfortunately, this approach makes the SQL query quite hard for humans to read (it adds lots of code like #(lf) instead of new lines, etc.). Thanks for bringing up the possibility of folding our queries in an intuitive way.
@apekshaawaji185 3 years ago
Thank you so much, Patrick 😀 This is really helpful! Would love to see videos on customer attrition analysis. Cheers 👍
@vpwl 3 years ago
Another amazing video! Great job Patrick! I always wondered how models with large data sets were created and published without a powerful machine.
@pabeader1941 3 years ago
OOOOOHHH! If only I were allowed to use the Service! So many things would be possible. Great content, and I will keep it in mind if I ever get them to change their minds.
@premjitchowdhury262 2 years ago
This channel is making Power BI no. 1 in the world.
@vpwl 3 years ago
Question: if my model includes multiple tables and datasets, would I need to create separate parameters for each table? Would I be able to adjust them all in the Service?
@sands7779 3 years ago
Really useful tips that will help people working from home over slow broadband through a VPN to develop and deploy reports with less time spent waiting for rows to load. Thanks Patrick 😊
@tymccurdy8932 2 years ago
We've done something similar with parameter values. Instead, we set up two dataflow entities to work off of: a full set of the data and a sample set, named Source & Source_Sample. Then we can just parameterize the source. So you start with the parameter pointing to Source_Sample, and when it's all done and published, you update the parameter to Source. I like this approach, as it standardizes the process for all report builders in our company and we can ensure they're getting a good distribution of the data within the sample. Report writers just toss a couple of standard lines into the Advanced Editor (copy/paste) and they're all set up. We don't need to rely on them finding janky ways to get their own sample set.
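A rough sketch of that Source / Source_Sample switch against a dataflow. The workspace, dataflow, and entity names plus the UseSample logical parameter are all assumptions, and the navigation field names may differ from what the connector generates in your tenant:

    let
        EntityName = if UseSample then "Source_Sample" else "Source",
        // Navigate workspace -> dataflow -> entity; swap in your own names or IDs
        Workspace  = PowerBI.Dataflows(null){[workspaceName = "Finance"]}[Data],
        Dataflow   = Workspace{[dataflowName = "Sales data"]}[Data],
        Entity     = Dataflow{[entity = EntityName]}[Data]
    in
        Entity

The idea is to publish with UseSample = true and then flip the parameter to false in the Service once the report is done.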
@KateMikhailova 2 years ago
Thanks for the great idea (the 2nd way, with a parameter)! I can't use it as-is because Report Server (on-premises) doesn't have the parameter option for scheduled refresh. But (as always) an xlsx file with the number of rows, sitting in a network folder, works well :) Following Matthew's "upstream" maxim, I included the number of rows from the xlsx file in the SELECT part of the query. Now I know that "first 100 rows" translates differently for a MS SQL source and an Oracle source. Interesting experience!
@runexn5940 9 months ago
Q: I had a question about using the first tip with incremental refresh. The table I want to implement incremental refresh on takes too long to load into my Desktop file, so I use a stored procedure to load only a small subset of the data in the Desktop file; then, after implementing incremental refresh on it, I publish it, change the stored procedure to bring in all the data, and refresh in the Service. This method worked before without incremental refresh, but now the initial refresh fails after some 8-10 hours and I am not sure what the exact cause may be. Is it because I am using a stored procedure with incremental refresh? Or do I have to load all the data into the file before publishing it for the initial refresh to work?
@watkinson8230 3 years ago
Hey Patrick, does this work if you are appending multiple CSVs and then applying various transformations? I.e., would the filter on the final table be enough to limit the refresh, or do I need to do it on each source import?
@naveenkumarm.r4692 3 years ago
Hi Patrick, I need a solution for monitoring scheduled refreshes across multiple workspaces. Can you please make a video on this?
@AlbertoTrujillo 3 years ago
Great video, Patrick!
@marounsader318 3 years ago
Hi Matt, are you running the refresh on a laptop or a server? The refresh of the 500k rows is pretty fast. If it's on a laptop, can you please share the specs? Even 100k rows takes a lot of time on my work laptop.
@sumitchoubey3859 3 years ago
Informative session
@muralichiyan 2 years ago
Nice, but the end user wants to download a normal export without the reduced data. How do we do that?
@arnaudcampestre 1 year ago
Would this first method help to publish a PBIX file that is above 1 GB?
@Carito_k 11 months ago
If I make a parameter to select only 100 rows, does it affect my final result?
@aliramadan7425 3 years ago
Hello Patrick. Interesting stuff. Do you have any advice on optimizing performance between Power BI Desktop and AWS Athena using an ODBC connection? Would the reduce-rows option you showed be a good option in the case of 150+ million rows? Appreciate your feedback.
@pankajsonawane3958 3 years ago
Hi Patrick, huge fan of yours. I am stuck with a time slicer. I have a static time column which has values from 12:00:00 AM to 12:59:59 PM in the time table. I want to create a time slicer that looks like a range slicer. If I use the "Between" option in the slicer, I don't get time values to select in the boxes; I see date values there. My goal is to filter the page by a time range. Please help!
@lestatelvampir69 3 years ago
How would you configure the editor when the source is not a single file but a folder? I can't see how to do it :(
@saikirankusumanchi4522 3 years ago
This helps. Thanks for the trick!
@singhraunak09 3 years ago
Wow superb video ❤️
@haydehkaveh3047 3 years ago
I have adjusted the parameter to 100 rows, but the cloud service does not let me refresh because of a credential password issue. Any thoughts?
@akhileshkumar_gupta 3 years ago
Patrick - please start a step-by-step tutorial series for becoming a Power BI pro. I have looked through all your uploaded videos and can't find a tutorial series on Power BI. If you already have all the modules published that would take someone to a professional level, please arrange them in order so one can follow them and become a pro.
@felipefrancisco01 3 years ago
That's awesome, man.
@marymurphy9014 2 years ago
Can the parameter(s) be a list of 500 or so employee IDs? The only thing I have to limit the 6M+ record pull is the employee ID or employee email. I don't need all 125,000 employees - just our little 500+ person group.
@AbouAli01006300091 1 year ago
The 3rd method is the most interesting, BUT what if I want the complete period (5 years, for example) with a reduced dataset? The workaround I used was (sketched in M below):
1. Sort the dataset by the date column.
2. Create an index column.
3. Add a helper column in which I divide the index values using Number.Mod (say by 10, which means I keep only one row out of every 10).
4. Filter the helper column to keep only the 0 values.
BINGO: I reduced my data to a 10% sample of the total volume and kept the whole date range.
Advantage: having sample data across the entire time range.
Disadvantage: step 1 above means the full dataset is loaded in the first step :(
I hope someone can achieve the same goal without loading the full dataset.
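A sketch of those steps in M (column, server, and table names are assumptions); as the comment notes, the sort still forces the full dataset to be read:

    let
        Source  = Sql.Database("myserver", "mydb"),
        Sales   = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
        Sorted  = Table.Sort(Sales, {{"OrderDate", Order.Ascending}}),
        Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
        // Keep one row in every ten: index mod 10 = 0
        Sampled = Table.SelectRows(Indexed, each Number.Mod([Index], 10) = 0),
        Result  = Table.RemoveColumns(Sampled, {"Index"})
    in
        Result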
@ronvdlaan1 3 years ago
Hello Patrick, I set this up and it works fine with a value in the parameter, but when I set the parameter to 0, all my previous steps in the Query Designer are lost. What could my problem be?
@LandscapeInMotion 3 years ago
What about joining tables with millions of rows? Performance? Normally you'd need SSAS...
@GuyInACube 3 years ago
SSAS Tabular is the same VertiPaq engine as Power BI. It works :)
@meeradominic9094 3 years ago
Hi, I'm looking at changing an existing report from ODBC to the Power BI PostgreSQL connector. The source is PostgreSQL. After I installed the connector on my local machine, I could successfully connect to the DB by changing properties in the Advanced Editor of the query. But when I published the report to the Service, it does not work. Could you help with it?
@sayelichakraborty7222 1 year ago
Hi Patrick, I am having a weird error. When I put NumberOfRows in the Advanced Editor and click OK, it shows the error "The import NumberOfRows matches no exports. Did you miss a module reference?" Please help!
@ahmadbadalov8489 1 year ago
This didn't work for me, because I was calculating the last 30 days, filtered from 1 Jan 23; on 1 Jan 23 it doesn't show correctly on the Service.
@haydehkaveh3047 3 years ago
Even now that the credential password is working, modifying the parameter still does not refresh my report.
@rahulmaurya9444 3 years ago
Can we apply a custom "last one month" filter in the Advanced Editor?
@Bharath_PBI 3 years ago
Yes
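A minimal M sketch of such a filter; the OrderDate column and the source names are assumptions, and Date.IsInPreviousNMonths keeps the previous calendar month (a rolling 30-day window would use Date.AddMonths against today instead):

    let
        Source    = Sql.Database("myserver", "mydb"),
        Orders    = Source{[Schema = "dbo", Item = "FactOrders"]}[Data],
        // Keep rows whose OrderDate falls in the previous calendar month
        LastMonth = Table.SelectRows(Orders, each Date.IsInPreviousNMonths([OrderDate], 1))
    in
        LastMonth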
@Kaiyening 3 years ago
Dat Power BI yellow Apple Watch strap is glorious! 😆 Also I’m anxious to upvote this video as it’s 666 likes now ._.
@CuddleStories 3 years ago
Will Smith knows Power BI?!?
@GurpreetSingh-jg9ll 3 years ago
👌🏼👍🏼