Create custom keys for your Power BI relationships

  Рет қаралды 115,418

Guy in a Cube

Guy in a Cube

Күн бұрын

Пікірлер: 213
@GuyInACube
@GuyInACube 4 жыл бұрын
Note that doing this approach could result in the underlying table being queried multiple times on your data source to satisfy the multiple Power Query queries.
@danneubauer6474
@danneubauer6474 4 жыл бұрын
Could you create one query of the source in power query and then "refer" to it multiple times rather than copy? Would this reduce the number of times the source is queried? I've done this many times for consistency sake, but am not sure if it results in the source being hit multiple times. I usually then disable load of the "source" query so it does not become a table in my data model.
@subanark
@subanark 4 жыл бұрын
@@danneubauer6474That won't help. Each query is evaluated in isolation. Also for cleaner reference to just a single column use: MyTable[[MyColumn]]. This way you don't have to convert list to table vs MyTable[MyColumn]
@krishorrocks639
@krishorrocks639 2 жыл бұрын
From what I can tell the only way to truly avoid this is with premium capacity or premium per user dataflows which then do allow you to do true query chaining with the results of each query stored to disk for use as a source by downstream queries.
@MrGeneralLedger
@MrGeneralLedger 4 жыл бұрын
Rather than creating a new item "Not Supplied", I use "" or "". That way those items appear first in an ascending sorted list or a slicer. Plus they stand out when looking at many rows in a table. Now that it is clear they exist in my data, I can take steps to address them.
@GuyInACube
@GuyInACube 4 жыл бұрын
Thanks for sharing that John! nice trick to get them to the top.
@akshaychalke8122
@akshaychalke8122 Жыл бұрын
AA😂😂😊😅😊😅,p
@akshaychalke8122
@akshaychalke8122 Жыл бұрын
S
@akshaychalke8122
@akshaychalke8122 Жыл бұрын
,,,zx,,,,,zzzz**
@zachj1217
@zachj1217 4 жыл бұрын
Dude I can't tell you how much your videos have helped me. I inherited a mess of a database in my new position and had no one to really learn from. You rock at teaching
@GuyInACube
@GuyInACube 4 жыл бұрын
That's awesome to hear. Thanks for watching! 👊
@rushmuzik
@rushmuzik 2 жыл бұрын
What I like is how the star schema changes how you look at you data. It organizes your thought. Segments your perspective.
@justapasserby69420
@justapasserby69420 2 жыл бұрын
Thanks for this, now I know how to narrow my fact tables down
@MzDorsey
@MzDorsey 8 ай бұрын
so thank you for this video. there was not too much talking, but only the amount to provide support for the steps. i love how you broke the task down into easy to follow steps and explaining why it was done that way. 🙂
@BrenSunshine
@BrenSunshine 3 жыл бұрын
Hi Patrick! I had to create IDs and I did a very similar process but instead of 'right click -> add as a new query', I've duplicated the entire table ('right click on main table-> duplicate') and from them I've performed exact same steps that you, what is the difference? Thanks for creating such great videos!
@malk5434
@malk5434 2 жыл бұрын
Dude, you just saved my life! I was looking for it and all the results I got were like "how to replace ID with the name", and I wanted the opposite! You just got a subscriber from Brazil! Great content! Cheers!
@arnohoedelmans
@arnohoedelmans 3 жыл бұрын
Thanx Patrick, when comparing this method with using Dax combine values is there a performance difference?
@samuelmanseau6905
@samuelmanseau6905 4 жыл бұрын
Thanks ! Excellent advice at 6:45 !
@tashaparker5157
@tashaparker5157 Жыл бұрын
Awesome!! I am going to try this method! Thank you for walking through it.
@jlmacalisang
@jlmacalisang 3 жыл бұрын
Again, I went back here to refreshen up the ideas you got here Patrick. It really helped me a lot with my stuff! How about using this method in Import mode of connection then the data is updated, Does the other table will also be updated together with your other keys? Thanks a lot man!
@inkerisandberg1114
@inkerisandberg1114 3 жыл бұрын
I’m just learning Power BI and your videos are so helpful and fun to watch! Thank you so much!
@MrAszpic
@MrAszpic 4 жыл бұрын
Thanks for confirming I wasn't that nuts when doing exactly this. The merge step can take a LONG time for bigger tables though. Nice video!
@GuyInACube
@GuyInACube 4 жыл бұрын
Indeed
@hukumka2601
@hukumka2601 2 жыл бұрын
That is why I doubt if it really useful to use that technique. I see huge cons (long merging in PQ) and tiny pros (a little bit more usability). If we speak about a small model, there should not be any noticable difference in productivity, but if we consider a big one, we could merging in PQ would be painful. So why should we do all of that and pay more then receive? Or when this approach really matters and helps?
@tomoleusz
@tomoleusz Жыл бұрын
@@hukumka2601 I have the same struggle currently, as I need to decide which approach to take for generating relationship keys (create integers vs concatenated keys). I like integer approach however it slows down refresh time significantly. Luckily I have a premium PBI capacity so I am considering moving most of the data transformations to data flows.
@toyotakande5608
@toyotakande5608 3 жыл бұрын
Hi Patrick , I love your explanation very much , actually im beginner , pls help me below , i want to lookup one particular product in another table , but that product was booked by two different customer , finally it was sold to one customer , how to create a relationship for this from one table to another table
@clarasdk
@clarasdk 3 жыл бұрын
Is having the relation on ID giving a better performance than just having it on airlinename in the airline helper table? Is the performance gain of this worth the performance overheat you mention for generating the keys? I would (until I saw this video) just have made the link on airline name....
@amarkhaliq641
@amarkhaliq641 3 жыл бұрын
This video helped me a lot thanks was getting low percentages merging tables
@alexrosen8762
@alexrosen8762 2 жыл бұрын
Really useful tutorial for messy data. Thanks!
@EST1865
@EST1865 3 жыл бұрын
Brilliant way to generate look up tables. Thank you
@mako5708
@mako5708 4 жыл бұрын
Maybe a newbie question, but still. I come from SAP BW world. How to ensure a new index will be automatically created and a new entry will be automatically added into this Airline dimension table when a new unique Airline name appears in source data (excel, csv, table,...).?
@wmfexcel
@wmfexcel 4 жыл бұрын
Hi Patrick, thanks for the video. I have one question: Why we don't use AirlineName directly as the "Key"? We can skip the step of merge and it should be faster. Isn't it? Or I miss anything?
@GuyInACube
@GuyInACube 4 жыл бұрын
Yeah it was just the example that was used. Definitely different ways you can do it.
@wmfexcel
@wmfexcel 4 жыл бұрын
Guy in a Cube thanks for confirming! 🤗
@nimaiahluwalia5678
@nimaiahluwalia5678 4 жыл бұрын
Nice video, do we have other methods to remove many to many in power bi?
@felixsaint-gelais-nault3028
@felixsaint-gelais-nault3028 Жыл бұрын
I do this also, but instead of merge I do a transform with my buffered table. So if I have multiple columns, it's one step. I usually do TblID[ID]{List.PositionOf(TblID[Element], [ThingToReplace]} I don't know if the merges would be faster
@fabiovanroon1524
@fabiovanroon1524 2 жыл бұрын
Great video! If you get additional data, let’s say, with a new airline, will the refresh process take care of everything? Meaning add the ID to your airline table?
@Silverlythia
@Silverlythia 4 жыл бұрын
Great video, I do this when I want to split up a column that has multiple values, such as a tags column that would have a list of tags delimited by semi-colon. That way the user can select a single tag and see all matching rows that have that tag. Question: Why duplicate instead of reference if you are doing multiple columns?
@luisalejandrorodriguezcamp9516
@luisalejandrorodriguezcamp9516 4 жыл бұрын
Hi Zoe, as far as I know, you cannot merge referenced queries, only duplicated ones
@Silverlythia
@Silverlythia 4 жыл бұрын
Thank you Luis! That makes sense
@forallyoutuber2724
@forallyoutuber2724 2 жыл бұрын
Thank you for giving good information
@TracyOsimowicz
@TracyOsimowicz Жыл бұрын
How do you add the ID back into the fact table if you want to avoid (merging for query load time reasons)?
@PlatinumDragonProductions999
@PlatinumDragonProductions999 3 жыл бұрын
Excellent video! Exactly what I needed with no unnecessary filler. As a budding data engineer, this was a huge help! You are both a scholar and a gentleman! :-D
@malakaclothing4336
@malakaclothing4336 6 ай бұрын
Thanks man, one hour looking for this
@stevenjeppesen4563
@stevenjeppesen4563 Жыл бұрын
Just what I needed. Thank you
@markhenderson3771
@markhenderson3771 2 жыл бұрын
Great job, Patrick. This is helpful. I am going to use this technique in my dataflows so that it doesn't slow down the refresh. My question is about CamelCase. I heard (from the Tabular Editor Best Practices Analyzer) that CamelCase is not best practice. Why do people say that and what do you think?
@evelic
@evelic 2 жыл бұрын
How would this work with multiple columns. Columns example: cost center, cost center mapping and period.
@sambasiva907
@sambasiva907 2 жыл бұрын
Hello All, Is there a method of automating the process of Merging two tabular model ? I am using manual method in BISM normalizer
@johnconroy3917
@johnconroy3917 4 жыл бұрын
Hi, Patrick, if new Airline is added to original table will be it auto added to new query Thanks John
@yvesAnguilet
@yvesAnguilet 3 ай бұрын
you just help me a lot
@chelliebradshaw8721
@chelliebradshaw8721 5 ай бұрын
This was so helpful! Now I’m trying to add more columns from my two fact tables to the new tables 😅 without my PK’s yet and having some difficulty 😢
@atharvapawar7
@atharvapawar7 2 жыл бұрын
Thank you so much ! This helped me a lot.
@sureshful
@sureshful 2 ай бұрын
Does it work if new rows get added in dimension table and fact table. Will the new ids automatically get mapped?
@nelsonma4711
@nelsonma4711 4 жыл бұрын
Good Stuff Patrick!!
@GuyInACube
@GuyInACube 4 жыл бұрын
Appreciate that Nelson! 👊
@g986
@g986 4 жыл бұрын
Come back to the UK Nelson 😊
@nelsonma4711
@nelsonma4711 4 жыл бұрын
Wolfstar eheh I will eventually! But for now I’m enjoying this February’s - almost summer weather - in Lisbon :)
@Marc-gu4dh
@Marc-gu4dh Жыл бұрын
Can you still use this method if the incoming values for Airline are constantly changing? (e.g., new airlines are regularly being added to your original table)
@zx600e93
@zx600e93 3 жыл бұрын
Thanks, I actually used this yesterday and your steps worked like a champ, YA HIGH FIVE! Thanks for explaining what trim does too, that tool is very helpful. Going to put my gloves back on, clean clean clean data haha good thing I have a janitorial degree from the Corp. haha
@mohammedimran4257
@mohammedimran4257 4 жыл бұрын
Thank you so much for the video. How to connect Dynamic folder(File name changes day to day) ? Extracting the data from new file through refresh is getting failed?
@bcippitelli
@bcippitelli 4 жыл бұрын
Thanks dude. I didn't know about this method which doesn't use the "duplicate function". Much easier!
@GuyInACube
@GuyInACube 4 жыл бұрын
Love it! 👊
@ThinkwithLex
@ThinkwithLex Жыл бұрын
I have been searching for you. Great video🥰🥰 thanks a lot
@TainuiaKid1973
@TainuiaKid1973 3 жыл бұрын
Great video, Patrick!
@nicolaimller9791
@nicolaimller9791 2 жыл бұрын
Hey Patrick, great video! Do the Airline query update the names when new ones is added in your ERP system?
@chelliebradshaw8721
@chelliebradshaw8721 5 ай бұрын
This was so helpful! Now I’m trying to add more columns from my two fact tables to the new tables 😅 without my PK’s yet.
@biexbr
@biexbr 4 жыл бұрын
Yooow! Hi Patrick. You asked if I would make this in a diferant way. Yes I would. Until 5:21 I make just like you. But after, I wold add a custom step as "Table.Buffer()", and I wouldn't "add as new query". I wold make a Reference, rename this new query as "Arlines", remove other coluns, and make the same you do until 6:41. Ok, ok calm down you are thinkin "But Daniel, this would make a 'A cyclic reference ' and this won't work". So, to resolve this I make a new Reference to the 1st query, rename to "fData" (or something) and then I Merge Query with "Arlines". To end I would hide my 1st table from my data model. So let me justify, I wold do all this work because if I find out that I forgot to make a repalce, I would just need to make this new step in one Query. The way you did if my file change just a little bit, I would need to change twice. You think that this would gain or loss process time? (sorry possible spelling errors, I don't speak english very well)
@biexbr
@biexbr 4 жыл бұрын
Oh, and maybe I would add a new step to Capitalized Each Word with Text.Proper.
@nico1z95
@nico1z95 4 жыл бұрын
@@biexbr I still can't understand what does table.buffer() does. When I update the list with a new different row (ie: New airline) the lookup does not update. so after merging the rows became null.
@Mahi_RSV
@Mahi_RSV 2 ай бұрын
Thanks a ton GuyInACube! Super useful video here :)
@anirbna
@anirbna 4 жыл бұрын
Great video Patrick. Such a clean way to create lookup table and join. I have 2 related questions. 1. If I need to join 2 tables on multiple columns (e.g. composite keys), do I create a lookup table with those columns from the 1 side of the 1:N relationship? 2. If I need to join on BETWEEN clause, e.g. table1.date between table2.startdt and table2.enddt, what would be the best approach? Thanks in advance.
@TainuiaKid1973
@TainuiaKid1973 3 жыл бұрын
@ANIRBAN PAL, did you solve your two challenges?
@spacial7777
@spacial7777 Жыл бұрын
Is there a way to automate this , i have 40 tables I need to move from the flat-file
@9zQx86LT
@9zQx86LT 2 жыл бұрын
hey Patrick... would about joining on alphanumeric keys with Tpye as "any" ?
@ArtificialFertilizer
@ArtificialFertilizer 3 жыл бұрын
The problem id that with any big dataset this method will make the data refresh terribly slow.
@harmonizewithme
@harmonizewithme 4 жыл бұрын
How does this affect performance?
@ZachRenwickData
@ZachRenwickData 4 жыл бұрын
Your report should be faster when interacting with visuals. This is because joins are more performant when using integer data types instead of text strings (especially for bigger models with millions of records). On the negative side, it can make your dataset refreshes slower because of the additional steps needed to create these keys.
@ClubSoundsForever
@ClubSoundsForever 3 жыл бұрын
Will a relationship between two integers in PowerBI perform faster than a relationship between two strings? In SQL I would say yes, but for Power BI - I don't know.
@uzmarat
@uzmarat 4 жыл бұрын
Cool video, thanks! Do you have a video about caveats of joining on strings? Tnx!
@GuyInACube
@GuyInACube 4 жыл бұрын
We do not. We should definitely do something about strings. Lots of things to consider.
@ShijuKattarkandy
@ShijuKattarkandy 3 жыл бұрын
Hey Patrick, what if new airlines get added to the flat table? Will it update the index table? Or do we have to do all this process again?
@krishorrocks639
@krishorrocks639 2 жыл бұрын
All the IDs get regenerated with each refresh so new values shouldn't present a problem. Because of this you definitely don't want to take a dependency on ID values in your reports since "US Air" could = 1 today but = 2 tomorrow. In dimensional modeling parlance these are "Surrogate Keys" and should never be exposed to users. It is best practice to hide surrogate ID columns in the model.
@bradentilley2238
@bradentilley2238 2 жыл бұрын
@@krishorrocks639 Thanks for commenting on this. I’m new in this field and looking for further clarification. If I do want to depend on an ID from refresh to refresh, how is this typically done?
@MiguelMartinez-sh8gz
@MiguelMartinez-sh8gz 3 жыл бұрын
Great video, thanks for all the help
@sahiladya8473
@sahiladya8473 3 жыл бұрын
What if we have to consider multiple columns for this approach? Also, if we have to use a table as global filters across different sheets developed using different table. I.e. star schema.
@danneubauer6474
@danneubauer6474 4 жыл бұрын
I use this method all the time, works very well! Cool to see you guys use the same methods! What's the best practice for troubleshooting the data once you've broken everything out? For example, if you need to sift through that fact table by airline name, it become rather tedious to go back and forth between the tables matching keys. Worse yet, if you have multiple dimensions that are filtering the fact table, it can be difficult to identify the proper keys to look through the fact table. If the source is a relational database, this could be done in the database, but in this situation, the source is a CSV or other file, so that type of out-of-Power BI querying is not possible. Thanks!
@krynnadin
@krynnadin 4 жыл бұрын
I usually create a table visual in report space with the columns I need, and just add some slicers for the dimensions I care about QCing. Then I browse the data in report space rather than in query space.
@alep1186
@alep1186 Жыл бұрын
thanks this helped a lot!
@joaquinmorris6845
@joaquinmorris6845 2 жыл бұрын
This was perfect. Thank you!!
@alexiusgalloway5951
@alexiusgalloway5951 2 жыл бұрын
Your instructions are certainly stepping stones towards becoming "a big deal", keep 'em comin'
@GuyInACube
@GuyInACube 2 жыл бұрын
BAM! 👊 Thanks Alexius!
@johngriffiths4636
@johngriffiths4636 4 жыл бұрын
Nice video. I like these quick an useful data wrangling type videos. Please keep them up.
@nataschalaros4334
@nataschalaros4334 2 жыл бұрын
This is great! Thank you so much! You guys make this fun to learn. Keep up the good work!
@pchidambaram9137
@pchidambaram9137 4 жыл бұрын
Hey Patrick, I want to use a parameter to filter Top N output of a matrix ; using the parameter slicer. Could you please show me how? Thanks
@johnmclean1288
@johnmclean1288 3 жыл бұрын
Very helpful - thank you!
@zygfrodo
@zygfrodo 4 жыл бұрын
First of all, mapping tables are cool:) When your end user is someone who knows how to use PBI, then this method may come in handy to clean up the main table, however most people on the end of the chain possibly just know the frontend and only things they will change will be the filters. Therefore there is no need to create an artificial (in this particular example) mapping table. Nevertheless great video, I really enjoy your content
@impala4641
@impala4641 4 жыл бұрын
I get what you say. But will this improve performance or size of the data? Will that be a valid reason to create this kind of tables?
@krynnadin
@krynnadin 4 жыл бұрын
@@impala4641 I find star schemas useful for speeding up report performance, however when one needs to build your star schema from your fact table, this can really reduce refresh performance, so one needs to balance these two points. If you're scheduling refreshes you might be able to offload refresh performance to off peak hours. Win win.
@TainuiaKid1973
@TainuiaKid1973 3 жыл бұрын
@@impala4641 it’ll reduce the RAM required to hold the data model, and also make some DAX calculations easier. Power BI is designed to use Star Schemas.
@operacionsql8578
@operacionsql8578 3 жыл бұрын
I really liked the way of subtly taking performance into consideration (Look for int), is there a visual form of execution plan? What would be the equivalent of SQL Execution Plan to use with Power BI? Exists?
@diamonddas
@diamonddas 4 жыл бұрын
Hi Patrick can please do a video on aggregation , i have created an aggregated table using fax and want to create a dynamic filter for a column not included in the aggregation table
@hiolka
@hiolka 4 жыл бұрын
What happens if with time a new airline name appears in the source file? Will it be added automatically to the "Airline" table? Or will it result in an airline name without a related key?
@mondaynighthockeyleague
@mondaynighthockeyleague 3 жыл бұрын
Added automatically. PBI will import the data then follow the transformations, one if which will create the new key
@morinho96
@morinho96 2 жыл бұрын
Hi Patrick ! love the video ! Just one question that I couldn't find answer to : does matching on Id's rather than Airplane name improves the performance of the model ? Thanks à lot
@barttitulaerexcelbart9400
@barttitulaerexcelbart9400 4 жыл бұрын
Patrick, very cool video.. Normaly you receive multiple tables and have to do something with it, now you give an example if you receive one big (=wide) table. Thank you for this interesting example.
@maximabr648
@maximabr648 Жыл бұрын
This video helped me a LOT, thank you so much
@RJ-yf3qs
@RJ-yf3qs 3 жыл бұрын
Love every your video, huge help to me. Thank you so much!
@MucahitKatirci
@MucahitKatirci 2 жыл бұрын
Thanks for the video. It was really helpful.
@infips00
@infips00 4 жыл бұрын
Patrick, I am using your solution but I am facing performance problems when I join (inner join) both tables by the text key column (number of rows 1.000.000 aprox). Thanks
@leozaraterdz
@leozaraterdz Жыл бұрын
What if I want to make two columns as my primary key, I mean, instead of doing just Airline Name as a Primary Key, I want Primary Key and Claim Site, both of them as my primary key?
@jameskarchut3089
@jameskarchut3089 4 жыл бұрын
That is seriously good stuff! I've been thinking about something similar and now I have the ultimate solution to make this work. My only more-burning question at the moment is how do I get one of those Power BI coffee mugs... lusting after that!
@abeybrams2366
@abeybrams2366 Жыл бұрын
Very dope video. It really helped me
@allejulien645
@allejulien645 4 жыл бұрын
You don't even imagine how I learn looking your videos, thousands thanks for your great job
@seanclark7727
@seanclark7727 4 жыл бұрын
You must be a mind reading Jedi. I needed this as I am currently doing a similar approach through much more convoluted methods of duplicating tables and removing columns to get down to a basic ID Table. Your method will save me much time and my mind and emotional state are very appreciative. Have a question of how I can possibly have my company's IT department give me access to view the relationships they have built and had created through our data warehouse. Currently, they have many many tables with similar or exact names of different columns and attributes. It's making me go through a process of trying out different combinations figure out where they have pulled the data from and what relationships they have built between the two.I relate it to shooting a target with an arrow, in the dark, blindfolded, and with one arm tied behind my back. I'm not lazy, I just want to be efficient and not waste my time with the guessing game approach I find myself in.
@johngriffiths4636
@johngriffiths4636 4 жыл бұрын
ask them. Develop a relationship with someone in ICT.
@doclorianrin7543
@doclorianrin7543 4 жыл бұрын
Question. So when you make these tables with distinct values. What happens when you refresh the data model from an updated CSV or in my case a SQL Query that is returning new values every time you refresh?
@imashiah
@imashiah 3 жыл бұрын
Hello Mr. Rin- just came across your comment. I see there was no reply. Very curious of the answer. What are your thoughts now, bec I think your comment may very likely be valid?
@doclorianrin7543
@doclorianrin7543 3 жыл бұрын
@@imashiah I never got an answer. I suspect if any value in the original table that exists that does not have a unique identifier in the created table will return a "null" value in the original table column that has the key/unique identifier. It is the only logical result. What he doesn't say in this video is it comes with the assumption that the data you are associating unique identifiers to is static.
@omkarnikte1402
@omkarnikte1402 4 жыл бұрын
Is there any way i can have only one APPLY button for multiple Slicers/filters? Such that my visualisations don't get updated every time i select a filter. Thanks
@patrickkakou4649
@patrickkakou4649 4 жыл бұрын
I don't know if there's a way for that. But you could add an apply button for each slicer. Go to File --> Options and Settings --> Options --> scroll down to Current File --> Query reduction and voila. Hope this helps
@adamk6553
@adamk6553 4 жыл бұрын
Nice one Patrick! QQ, assuming I am using this method with a DW what happen if new data (Airline in your case) get added? Would the index capture the new lines? Cheers.
@Elkhamasi
@Elkhamasi 4 жыл бұрын
I second this question :)
@johngriffiths4636
@johngriffiths4636 4 жыл бұрын
@@Elkhamasi your DW should already have an index. But, if you reference the query rather than duplicate, it should be dynamic.
@b123kotey
@b123kotey 4 жыл бұрын
If I generate my dimension from my fact table why would I need to do this? Is there still an issue? Great videos!!!
@mondaynighthockeyleague
@mondaynighthockeyleague 3 жыл бұрын
Your fact table may not include a key for your dimension table. If you dont have a key, then u need to join dimension using the text field which is sloooowww compared to int join.
@amirulaizuddin5400
@amirulaizuddin5400 4 жыл бұрын
sorry i didnt get the part when we have multiple columns. What do u mean by duplicate them and then merge? is it like that?
@johngriffiths4636
@johngriffiths4636 4 жыл бұрын
I think he was saying ,,,, If you needed to create a combined column key to create a unique id. You would duplicate the query, remove all but the columns you need to create a unique id, then merge the multiple columns, then create the index, then merge the duplicate query back to the original table, so you would have a merged column unique ID and an index in the original table now, then remove the merge column from the original table just leaving the index. Hope this makes sense.
@gaillaguet
@gaillaguet 4 жыл бұрын
Hey Patrick, your videos are just awesome. Thank you so much for such easy to understand and accurate explanations ! Great Job
@GuyInACube
@GuyInACube 4 жыл бұрын
Appreciate that! Thanks for watching 👊
@extensionshk2314
@extensionshk2314 4 жыл бұрын
What if the fact table has a new Airline name, would that be automatically added when refresh or we have to change the code in PowerQuery again ? Thanks
@barttitulaerexcelbart9400
@barttitulaerexcelbart9400 4 жыл бұрын
Yes I think so, Bu because of the sorting step, all the airlines would get a different index nr. But that should not be a problem.
@Prapti706
@Prapti706 4 жыл бұрын
Hi - why not just connect the airline name from the dimension table to the airline name in your fact table since its a 1 to many instead of adding an index column ? Is it a performance thing?
@paulsimon9938
@paulsimon9938 2 жыл бұрын
There are several aspects. There is the issue of ease of use for the user of having fewer fields in the main table. Having the dimension aka lookup table is good to drive slicers. If you look at the main table, the airline name which is quite a long string is going to be stored on every record. By comparison what Guy in a Cube has done means that he is only storing a relatively short integer against each record so the record size is smaller. In a relational database this is definitely the right thing to do. However, the database behind Power BI is not relational but some version of a columnar database. This may very well apply this sort of transformation behind the scenes when the data is stored in the database. I mainly work with the multi-dimensional database called TM1 which does exactly that, so there is no need to derive numeric keys as the database does this internally. However, with the columnar database behind Power BI - SSAS Tabular, it is still going to have to do a join operation and any join operation is going to be faster used fixed length numeric keys, than variable length strings. However, there is a trade off between the time needed to do all this transformation when loading the data, vs the improved query response time. It therefore depends on what you are trying to do. If you need up to the minute reporting on operational data then speed of loading may be more important. However, in most cases you are looking for trends over a period of time and therefore you can load overnight, and query during the day so optimising query speed is more important. There is also the other issue of adding data from two different systems. This is common when companies merge. So you might have two different insurance companies merging, and both will probably have codes for airlines and airline names. The codes might already be numeric ids, but you cannot guarantee that one company used 5 to be British Airways while the other used 5 to be United Airlines, therefore you need to allocate your own unique id to avoid this homonym situation.
@billydavenport5828
@billydavenport5828 3 жыл бұрын
Would be REALLY cool if power query had an automated way of doing this. Right click on column -> "Create Unique Dim Table" and it does this all automatically.
@kaulpelly
@kaulpelly 4 жыл бұрын
Came looking for exactly this. Great stuff!!
@norpriest521
@norpriest521 4 жыл бұрын
looking for what? Surrogate key?
@EffnShaShinko
@EffnShaShinko 4 жыл бұрын
This is incredibly helpful. Thank you so much!
@GuyInACube
@GuyInACube 4 жыл бұрын
Happy to help. Thanks for watching! 👊
@mangeshmehendale4139
@mangeshmehendale4139 3 жыл бұрын
This is a beautiful and relevant video Patrick. I've often found myself thinking about this business case where the dimension data is long text strings and doing joins on such dimensions is fraught with uncertainty at the best of times. There is one use case I find myself thinking about, which the video does not address so I'm pinging to understand how you have thought about this. Imagine the flat file had different values for "TWA", "Transworld Airlines", "Trans World Airlines". This technique would create a different custom key for each of these entries - in reality however, these should point to the same key. Therefore, using this technique in power query will not cover this particular use case. Up in my head, the only way to do this is through manual intervention where the key is inserted through a manual scan of the table to ensure that "TWA", "Transworld Airlines" and "Trans World Airlines" all point to the same key. Short question - is there a way to reject this "lazy" technique and become more "efficient"??!!!?!
@johnadair4979
@johnadair4979 2 жыл бұрын
I'm sure you've already found your answer, but a creating a Transformation table would solve that problem. Power Query's documentation would show you how to do that.
@nico1z95
@nico1z95 4 жыл бұрын
Hello Patrick, I am a new subscriber. QQ, if new data Airline gets added? How will the index capture those new rows? Thanks!
@johngriffiths4636
@johngriffiths4636 4 жыл бұрын
Use a referenced instead of duplicate query
@BernatAgulloRosello
@BernatAgulloRosello 3 жыл бұрын
From data optimization (and even analysis) perspective it does not make much sense to make a single column dimension, does it? PowerBI already does dictionary compression at its vertipaq storage engine. For 2 or more columns though it might make it easier for the user to use later on, for sure.
@hk_200k
@hk_200k 9 ай бұрын
The most beautiful part is that it makes that column disappear from original table!
@davidcadman3562
@davidcadman3562 4 жыл бұрын
Thanks Patrick, really great as always. QQ at 2.54 the column profiler shows 24 distinct and 6 unique. 6 unique? Could you explain what that means please?
@brianxyz
@brianxyz 4 жыл бұрын
I believe it means those six values appear only one time each.
@davidcadman3562
@davidcadman3562 4 жыл бұрын
@@brianxyz Ah thanks - single occurences. I suppose that's useful sometimes. Seems odd (to me) to give that number the same prominence as Distinct values.
@jonprendergast7009
@jonprendergast7009 2 жыл бұрын
This is beautiful
@johnmatta9577
@johnmatta9577 4 жыл бұрын
Great one Patrick..I'd add one more step to yours and hide the airline id from the transactions table
3 REASONS to use a single dataset for your Power BI reports
8:55
Guy in a Cube
Рет қаралды 158 М.
🎈🎈🎈😲 #tiktok #shorts
0:28
Byungari 병아리언니
Рет қаралды 4,5 МЛН
How to have fun with a child 🤣 Food wrap frame! #shorts
0:21
BadaBOOM!
Рет қаралды 17 МЛН
Neovim Plugin From Scratch: How to Iterate (Part 4)
21:25
TJ DeVries
Рет қаралды 2,2 М.
The Excel Tool That Does What PivotTables Can’t (File Included)
13:59
MyOnlineTrainingHub
Рет қаралды 177 М.
Bidirectional relationships and ambiguity
14:02
SQLBI
Рет қаралды 111 М.
Looking at Power BI Many to Many
8:47
Guy in a Cube
Рет қаралды 162 М.
STOP publishing your Power BI report until you do these 5 things!
10:15
Power BI Tutorial | From Flat File To Data Model
10:27
Guy in a Cube
Рет қаралды 159 М.
2 ways to reduce your Power BI dataset size and speed up refresh
17:24
Many to many issues and alternatives in Power BI
19:57
David Benaim
Рет қаралды 15 М.
🎈🎈🎈😲 #tiktok #shorts
0:28
Byungari 병아리언니
Рет қаралды 4,5 МЛН