This is the best demo - nobody explains that the original values need to be removed from the master data table. This is not obvious to newbies. I am working with non financial data so need to add my own measures. Thank you!!!
@Oli9995 Жыл бұрын
Omg yes. I've been looking for this. How to convert denormalised table into facts and dimensions.
@oscarmacho2798 Жыл бұрын
You always need to add your measures
@RedCouchChess9 ай бұрын
I hope your day is filled with laughter and joy. Thank you for the service you provide to us unqualified data analysts :)
@Transformita20 күн бұрын
02/01/2025 - Sign up and join the waitlist ⭐ Launching soon - www.powerbischool.com/ ✅Real world datasets ✅ Datasets found in all videos ✅ Potfolio/dashboard review & feedback ✅ Power BI Training - DAX, Design, Story Telling ✅ Open Community
@MegaAbgt Жыл бұрын
What a fast and elegant tutorial. I try to find a power bi job but very hard to find one as a newbie.
@saru8773 Жыл бұрын
Same
@alimirazimzadeh544 Жыл бұрын
I have seen over 100 videos, and this is my favorite
@houseandhobby69712 жыл бұрын
New terminology for classical db concepts. I'm so old. We just covered the concept of normalizing a database in this video. The original term for dim table is parent table. For fact table, child table was the original term, but a specific type of child table that stores observations rather than child entities (in turn pointing to parent entities). Foreign key term still lives on. I would have used Excel (old school, manual, slower). Dim tables are about entities. Fact tables about data observations of the entities. Good normalization results in a hierarchy of dim tables (entities) thanks to common attributes across entities. Nice vid, just chatting some here. Thx
@raideryvs559511 ай бұрын
In a star schema, the dimensions generally undergo denormalization. For instance, consider an employee dimension where employees are associated with a department. In a normalized data model, distinct tables for employees and departments exist, connected through a foreign key relationship. However, in a pure star schema, only the employee table is present, and department data is duplicated for each employee. This optimization enhances data retrieval speed, and the additional storage overhead is deemed acceptable, especially when contrasted with the space consumed by the fact table.
@amid7757 Жыл бұрын
The best demo I've ever listenned. Thank you so muchh! God bless ya!
@krishnakantapanda6835 Жыл бұрын
That was super super super helpful.... finally I could confidently say that I have clearly understood fact vs dim, foreign vs primary key and star schema concept❤
@kristinajensen711 Жыл бұрын
Thank you, and just for this video but for the whole channel. I stumbled upon it yesterday (looking for non-pretensions explanation of variables in DAX measures) and I am a fan! I watched all your videos, some more than once - and you brought back some joy , enthusiasm and HOPE that I could understand and learn Power BI. Thank you a million again and please keep up the fantastic job!
@fahadamir63499 ай бұрын
superb brother, MashaAllah you did a great job its really helpful
@fcrz-id9vb2 ай бұрын
The best explanation I have ever heard of it. Well done mate
@Transformita9 күн бұрын
Thanks. Appreciate it 👍🏼
@MarkGoodridge-f6d Жыл бұрын
Thank you Power BI Guy, this video is helping more than you can imagine :)
@asaravana670110 ай бұрын
Very insightful information sir thank you now i am getting some clarity on this concepts
@ishneetsinghcheema672711 ай бұрын
You are the mannn!! Thanks so much. Wish you all the best for futures releases.
@Transformita10 ай бұрын
Thank you. Glad it helped!
@Tonezatl2 жыл бұрын
Excellent job, you just gave me the best lecture and gift of the month...thanks a lot bro
@Transformita2 жыл бұрын
You're most welcome! Glad it was of benefit. I've released similar videos and topics, hopefully they help. Don't forget to like and subscribe. It's appreciated! 🙂
@cparker44862 жыл бұрын
Good explanation and instruction on how to carry this out. Two things: (1) Why is it important to do this? How does it improve performance? It appears the file still has to load the same amount of data. (2) Was it important to move the shipping method into a separate table because it was only one column?
@user-re5uv2mm2h2 жыл бұрын
Same question. Hoping he would response. Our team is migrating reports from Tableau to PowerBi. When I create my data source in Tableau, I usually create one table before putting it in Tableau, but now in PowerBI they are telling me to separate the columns first. Is it better to separate the facts and dimension than create the visualization on one table? Would that greatly affect the performance? I imagine I will just need to do that if I will be using two different data source though. Please enlighten us.
@Transformita Жыл бұрын
@@user-re5uv2mm2h Sorry for the delayed response. In summary, you're massively reducing the size of the dataset by splitting your dimensions. Let me explain. Lets say you have one wide table with 1 million rows with each repersenting a sale made. You have the following columns product name | sub-category | location |store name etc. The product name, location etc. is going to be repeated 1 million times as strings (text). .In general with relationtional databases, int type (numbers) are far more compressed then strings. e.g. The product "phone" was sold and appears 300,000 times in your sales table when it's one table. By splitting your product detail into it's own table, you have 3 rows, each repersenting your 3 products - "Phone", "Laptop", "Tablet" and an ID for each. After this has been modelled, your fact table now has the number "1" appear 300,000 times not the word "phone". From a performance perspective this is significantly more compressed Hope that helps.
@demirdrama22482 күн бұрын
9:53 Why don't you use Product ID as the index? What's the logic? Because I would think there could be the same name for products, but assuming the product name is unique. Why did you insert your index and eliminate the existing Product ID? Cheers
@solomonidowu82843 жыл бұрын
Hi. You did a wonderful job here. Thank a lot.
@aziserak4743Ай бұрын
thank you for best explanation
@Transformita10 күн бұрын
Glad it was useful!
@dixitmca2 жыл бұрын
Best Demo ..thank you.
@noneB9748 ай бұрын
How would you achieve same results for a model with 2 fact tables where you have to make a common dimension from both to use that common dimension for slicing?
@tlee70282 жыл бұрын
Thank you! This is what I was looking for !
@ouaildjaidja14474 ай бұрын
Awesome ! Thank you
@hardik73852 жыл бұрын
Amazing crisp and informative
@charanmarc53769 ай бұрын
Thanks Man! Appreciate it !!!
@cumeh8212 жыл бұрын
Thank you for this video. So helpful.
@Renotalks Жыл бұрын
Thanks man appreciate your efforts
@nathanmutala29412 жыл бұрын
merci, ce video m'a beaucoup aidé.
@fmufafa Жыл бұрын
You are just the best.
@dixitmca2 жыл бұрын
please keep posting wonderful videos
@bensix64962 жыл бұрын
Very helpful thanks - question though by creating the dimension table on the fly, when I have a new country if I don’t refresh my DimCountry first the new Country Index doesn’t flow back in the source data. Is there a way to sequence or am I missing something? Thanks in advance
@felixfranke2838 Жыл бұрын
That is exactly my question as well...
@cecegaming111910 ай бұрын
Thank you for this demo, but I have a question. When I create a fact table, it causes a lot of duplicates, and I cannot load it into Power BI. It's taking a lot of time. I've tried removing duplicates in the dimension, but the issue persists. What could be the solution? Thank you.
@sidharthmandal99572 жыл бұрын
Hey man, really loved this tutorial. Learned a lot. Can i have the data set plz? I want to have handson on it.
@PersonalityDev-T2 жыл бұрын
Thanks much . very good explanation
@bchi492 жыл бұрын
Thank you. Just was what I was looking for. I have one to many table relationships, I can do the same on the many side.
@Transformita2 жыл бұрын
Glad it was helpful! If you like my content, consider subscribing 🙏, I'll have more videos on this topic that go into bridging tables etc. for modelling.
@PaoloTroia-h3e8 ай бұрын
If I have two datasets, should I create two star Schema? If yes, how can I manage Dim tables and Facts tables from both datasets?
@TheCantonese Жыл бұрын
[16:02] Oscar Wong Best practice is 1 to * (dimension table to fact table) this is 100% sure the best way BUT question it's possible to have * to * or * to 1 (between dimension table and the fact table) in a BI star or snowflake model?
@subhashisnayak8192 жыл бұрын
Amazing you are.. i subscribed for more videos
@DmytriiKutsiianov6 ай бұрын
I would really appreciate if someone could help me fix the following issue: When I click the "Close&Apply" button in Power Query Editor after adding the dimnsion tables, all the data gets loaded into Power BI with no errors. However, when I look at my dimension tables in data view in Power BI the order of the ROWS is changed. For example, in Power Query Editor, index 1 in the SensorName_Dim table corresponded to "Engine Speed" and index 2 corresponded to "Coolant Temp". After loading data into power BI the order of the sensor names got messed up so that Engine Speed no longer corresponded to index 1 and Coolant Temp no longer corresponded to index 2 . Does someone know how to fix this issue and preserve the order of rows when loading data into Power BI?
@kartikeyatiwari66788 ай бұрын
Awesome 🎉
@JonathanHansen-inmk7 ай бұрын
Great demo. I really appreciated your clear explanations of how to create dim tables. I did have a question about grouping by zip and city though. If you remove duplicates on postal code you could potentially lose some cities since there are some postal codes with multiple cities in them. Can you remove duplicates bas on city and postal code?
@akthar32 жыл бұрын
its a great video showing how the the context data is moved to a dimension table where a key is created in the dimension table and added to the fact table as a foreign key. A question of concern , should i be doing that to flattened sql server table (the main raw data source) with over 10 million rows ? - this is how the data comes to us from another team - 1 single flat table s that needs to split into several fact tables and dimensions. As I plan to do all etl in the sql server , should i materialise the fact and dimension tables (create physical static tables ) or just create views ? Should i be creating the key in the dimension table or view and then updating it or joining it to the fact table ? The processing may become a little too much for the Azure sql server capacity ? I hope my question is not too confusing . Thanks you
@JackWebb-pd3dy2 жыл бұрын
Thankyou for the video, really well explained and great demonstration of the model in effect. At the end you mention a custom date table but then video cuts out - do you have one explaining this or a video you'd recommend?
@abhisheksural9098Ай бұрын
So while designing data modelling my approach will be first analysis column of fact table then moving to the dimension table?
@Transformita10 күн бұрын
I would start with identifying the facts
@akkialekhya76382 жыл бұрын
This is so helpful to me.tq u
@awosusiomowaare8696 Жыл бұрын
would the row_ID be in the dimension or in the fact table?
@kalliduspre-sales179910 ай бұрын
Hey, great video Power BI Guy. Right at the end you say you need to create a custom date table. Do you have a video which covers this and explains why you need that?
@Transformita10 ай бұрын
Not yet!, I’ll make one soon 👍🏼
@nidhisingh49739 ай бұрын
Thank you so much this video. Got a good understanding of building fact and dimension tables from raw data. Just want to know, do we create dimension and fact tables directly in power bi in real world project?
@donharrold13755 ай бұрын
Clever approach
@NewmanOkereafor3 ай бұрын
Is there any reason you removed the alphanumeric Product ID column from the model?
@Transformita3 ай бұрын
@@NewmanOkereafor just to demonstrate
@SSRealEstates832 жыл бұрын
Hi ...After duplicating the main table is it mandatory to remove the steps till end from applied steps.what will happen if we won't delete that steps from the applied steps .thank you
@Transformita2 жыл бұрын
I suggest doing so to improve refresh times. Each applied step has to be computed therefore by not removing it, you are proccessing far more data in your transformations
@zakariaelaaliji7747 Жыл бұрын
Thank you, but what if we want to bring data from multiple sources and formats ??1!
@SDDEditzPro2 жыл бұрын
We can use reference option instead of duplicate for creating dimension tables. Right?
@akthar32 жыл бұрын
if your facts are descriptive comments (eg 500 characters) along with numerical values and measures associated to the comments ..do you keep them both in the same fact table ?
@cesarbarreradeleon4323 Жыл бұрын
where can i get the data simple ?
@mikesmith-nh1jw8 ай бұрын
Great video just one question. Wouldnt it be better to 'reference' the source data or cleaned up copy of the source data rather than 'duplicate' as then you wouldnt have to remove the applied steps later ?
@ryanjohnston96427 ай бұрын
Very interesting data model! I'm new to PowerBI and have a follow-up question... When you add the index column to a DIM query, if the source data changes such that a new ZIP Code is added (or a new client), will these index columns not change when you refresh the source data for the query? Will these index values not change and perform as relative-references? I'm concerned that the data will get mixed around when the DIM table merges into the FACT table mid-query. Or am I overthinking this? Will these relative references update safely so that everything merges correctly? Or maybe the issue is that everybody using the fact table needs to understand that Client ID# is a relative reference that can't be relied upon to stay the same from one cycle of reporting to the next.
@mojeburrahman83884 ай бұрын
The sample data is not attached...??
@oluwatuyiayomide53852 жыл бұрын
Some of the data in the fact table already have unique ID, do we need to start creating unique ID again with index number
@Transformita2 жыл бұрын
No you would not need to re-create them. I did so as an example for this video. Relational databases will have these already
@oluwatuyiayomide53852 жыл бұрын
@@Transformita Thank you
@HoaNguyen-ji5fz2 жыл бұрын
I love it. Thanks
@karmasakal9752 Жыл бұрын
Is it normal that my row entry number has increased by adding up relationships ? My dataset does not allow me to proceed exactly the way you did. I created unique IDs for all my entries in the dataset and merged the unique IDs with the duplicated unique IDs. Is it possible that the way I did it does not make sense logically and somehow increased my row number ? My data does not allow me to remove duplicates for a certain characteristic or I am mistaken idk. Thanks for the support guys I would be glad to explain my problem if there is some interest on solving the issue.
@SSRealEstates832 жыл бұрын
I have a question here,when we duplicate,remove other columns, creating inder...then after when we merge...the number of rows will get increase or not???please answer.... Thank you
@shimamoayeri87782 жыл бұрын
verrrry good. thanks
@novandc9223 Жыл бұрын
thankyou good explanation
@derbysutter2144 Жыл бұрын
Hey, I have a pressing question. Can this method be used with multiple fact tables? Or just with 1?
@Transformita Жыл бұрын
yes it can be. This is the ideal, but in the real world you can get multiple fact tables. Guyinthecube have done a vid on this -kzbin.info/www/bejne/ip_cg6aFabN9hsU
@ztymedina Жыл бұрын
@@Transformita Hi can you show details on how to do it? because i've seen the video but it does not show steps by steps.. let say if i have 4 different fact tables, then how i'm supposed to do? Do i need to consolidate all fact tables first, then delete duplicate value to create a dimension table? or how is it exactly?
@felixfranke2838 Жыл бұрын
Excellently explained, thank you. Question: If in my original data, which I refresh, a new customer shows up, or a new shipping method: How will that be related to the correct line in the facts table? We do not have an index for that. Wouldn't it be better to use data which are in the original source to connect the tables?
@Transformita Жыл бұрын
Thanks. - Because your referencing tables, PBI is dynamic and the new ID's will get populated and reflect in the fact table when it's refreshed. It's not static. - The example in this video is purely to show the concept of a fact and dim table. In the real world these ID'would already be at source where they should be and without a doubt always use those.
@souzaeq Жыл бұрын
Question: For the next step, I create a flat custom date for the dates interval I have and use it as DimTable (date, year, quarter, month, day) linking Dimdate to wich date in orders, shipping or order date? I assume Order that comes first, right? These would be the next step you said? Really nice video, thanks for sharing.
@Transformita Жыл бұрын
Are you referring to a custom date table and a relationship between the two. You can create many relationships between the dates. But only one will be active and the others inactive. Look into active and inactive relationships.
@souzaeq Жыл бұрын
@@Transformita many thanks! Do you have videos on that?
@vishalmane3139 Жыл бұрын
Whats the difference between custom column and calculated column?
@Transformita Жыл бұрын
Custom column is in Power Query, calculated column in in the normal desktop view. Functionality wise they are the same, the former using M in Power Query and DAX in Desktop
@saadfayyaz894 Жыл бұрын
Love from Pakistan😊
@busta00797 ай бұрын
Thank you. Can you explain relationships in football data please
@vishalmane3139 Жыл бұрын
creating an index column was necessary ?
@Transformita Жыл бұрын
Yes it’s necessary/recommended. Real world databases will have these and the correct term would be primary, foreign, surrogate keys
@vishalmane3139 Жыл бұрын
@@Transformita dont u normalize the tables ?
@ericaleverson94302 жыл бұрын
Wouldn't we run into issues using the customer name as a unique column when different customers might have the same name?
@Transformita2 жыл бұрын
Yes you would, you would need some sort of unique identifier in that case. Real world databases have identifiers/keys for customers etc.
@gagansingh34812 жыл бұрын
Could you please make a video on Market Basket Analysis....
@dakota34362 жыл бұрын
Awesome, thank you! I have one question. If you were to update the data from your original source file by copy/pasting over an older export or changing the source file, all of the dims you created would still function correct?
@Transformita2 жыл бұрын
Yes, as long as the column headers/structure is the same. The actual values in the cells don't matter. Think of Power BI as columns and Excel as cells.
@rasmusbendtsen59522 жыл бұрын
Thank you for a great video. I have a question. When creating the costumerID and removeing the duplicates, how would you handle it, if two people from different costumer segments, happen to have the same name?
@raghavverma1202 жыл бұрын
In the real word scenario.. already there would be a customer table with unique customer id/ email id/ username.. Since it was a flatfile, so it was implied that customers cannot have same name.. else instead of customer name, username would be there
@raghavverma1202 жыл бұрын
Else.. just dont delete duplicates.. just use a key to join tables
@Transformita2 жыл бұрын
@@raghavverma120 Exactly this.
@S3Kglitches2 жыл бұрын
best example
@raghavverma1202 жыл бұрын
Can you please share this flat file
@jamesadam71133 жыл бұрын
Thank you for this! I've watched other videos and they've not shown how to create a key / index column and I've been using the oroginal columns as a key. Is there any benefit to using a index column over the original columns with relationships.
@Transformita3 жыл бұрын
- It's faster. A JOIN on an integer is much quicker than a JOIN on a string field or combination of fields. It's more efficient to compare integers than strings. 2 - It's simpler. It's much easier to map relations based on a single numeric field than on a combination of other fields of varying data types. 3 - It's data-independent. If you match on the ID you don't need to worry about the relation changing. If you match on a name, what do you do if their name changes (i.e. marriage)? If you match on an address, what if someone moves? 4 - It's more efficient If you cluster on an (auto incrementing) int field, you reduce fragmentation and reduce overall size of the data set. This also simplifies indexes needed to cover your relations.
@pravarshakollu76462 жыл бұрын
can we create dimensions tables and fact table in power bi?
@Transformita2 жыл бұрын
That's what I'm doing in the video. General rule of thumb - Do transformations as close to source/upstream as possible.
@pravarshakollu76462 жыл бұрын
@@Transformita how to identify fact tables and dimensions in our given data
@ankurarora7632 Жыл бұрын
Please share the dataset
@vladx3539 Жыл бұрын
what about the date?
@vishalmane3139 Жыл бұрын
Dude if u can have a vfieo on normalize the table that will be good too upto 3nf.. alot of youtube vdieos ending up confusing more.
@itspurelypassionate2 ай бұрын
Thank you! It appears our TA doesn't know anything so I have to do everything by myself
@Transformita10 күн бұрын
lol I've been there before
@eddarelmbuluyo87023 жыл бұрын
Great Sir. it's possible to have the file, the dataset? thanks
@sidharthmandal99572 жыл бұрын
Did you get the data set? If yes, can you also send me ?
@vishalmane3139 Жыл бұрын
what if we have multiple tables
@Transformita Жыл бұрын
Hey, I'm not sure I'm understanding your question. There's multiple tables in this example.
@vishalmane3139 Жыл бұрын
@@Transformita im not able to break a dataset in dim and fact tables. Ive inported many dataset instead of one
@adeelharal71062 жыл бұрын
Please share dataset file
@mar259478 ай бұрын
Short n sweet! not 2 hrs long lol
@souzaeq Жыл бұрын
I have a fact that is built this way: each event takes a different quantity of days. Like event 1 takes 6 days, so the sale is for the interval of six days. Do I need to break that into daily revenue and cost by dividing the total amount by 6 and then transforming a daily and revenue? In case I need to divide the value to split among the days, wich time dax or transformation in power query should I use like to divide the total revenue by the number of days and then create a line for each recurrence. Like: USD 100/ 4 days, is USD 25/ day, how can I automatically generate starting from current information as 4/jan/2023 (end date) with 100 USD to 1/jan/2023 with USD 25, 2/jan/2023 with USD 25, 3/jan/2023 with USD 25 and 4/jan/2023 with USD 25 ?
@leandro05057 ай бұрын
Where can i get the example file?
@richardpearcephotography599 Жыл бұрын
What happens if you have two or more unrelated customers with the same name? John Smith vs John Smith.
@Transformita Жыл бұрын
In a real world scenario you tend to have the data structured from a source system/db that has a unique identifier already. When the two accounts sign up to x organisation, they would get an accountid for instance.
@user-re5uv2mm2h2 жыл бұрын
Why didn't you create a DimDate?
@Transformita2 жыл бұрын
I forgot in this video if I'm honest 😅 you should always have a seperate date table in Power BI
@vishalmane3139Ай бұрын
where is the dataset man
@Transformita10 күн бұрын
this is one of my older videos. I no longer have the dataset unfortunately. My newer videos have all files available in the description and then my community @ PowerBISchool.com
@vishalmane313910 күн бұрын
@@Transformita can u make a video on multiple tables in power bi nt just one and hw to use them?
@St8ofmind1234563 ай бұрын
Look forward to the day all this is done by AI. Great vid though.