Want to build a reliable, modern data architecture without the mess? Here’s a free checklist to help you → bit.ly/kds-checklist
@fury007133 ай бұрын
In our company, when we have fields with low cardinality (values less than 30), we’ve created a common codemap table for those types of fields. We then reference that codemap table to a dimension table instead of creating a separate dim_type table for each type.
@Milhouse77BS3 ай бұрын
Aka Junk Dimension
@andrewh69513 ай бұрын
Great video and a fun wee problem to solve that comes up a lot! In my current role's DWH we have a lot of dimensions upon dimensions. From what I can tell in the problem statement, product type is related to product. In a STAR schema, I'd probably keep product type + related fields in the product dim table. If we had to split it out, I'd make a Snowflake style Dim table. In application, I'd go 3NF, for analytics I'd either go snowflake or preserve Star by keeping the product type information in the product table to reduce join clauses. The product ordered is a facet of your transaction, so product id column makes sense in the fact. The product type, however, is a facet of a product, not a transaction. The problem with having a dim table for product type that joins onto the fact table (and not the product dim table) is that the fact table now has to store information about how product and product type are related. If the relationship changes (i.e. a product changes type) this now has to be accounted for in the fact table, where the relationship between those two entities should be ideally managed by the connections between dimensions (that ideally have some historicization element or SCD2 in your pipeline). Adding dimensions to the star should be reserved for entities that are unrelated to each other, this is a case where two dimensions are linked by a grouping concept; the fact table should not have any say in the relationship between them.
@KerneelsRoos2 ай бұрын
I agree with this and the video's statements and opinion here. What I'd like to add is to have an efficient code for product type in the dim product, as well as a textual description so that the product_type_code could efficiently be used in large queries, and can always be described more by a simple lookup table (or just the textual description in the dim product itself, which would make it live alongside the historic trail of products). However, in the event that the product type starts to be applicable to additional facts, or if another dimension appears: let's say dim packaging which also depends on the very same product type, then it might be time to create a dim product type in my opinion.
@Milhouse77BS3 ай бұрын
Well said. You laid out the “it depends” well.
@stoianandreimircea15093 ай бұрын
Thank you from the bottom of my heart are so well done that I believe they should be presenting at the university level or I don’t know. I am just mind blown how well you express some of the tanks and the things that we grapple with every single day. Thank you so much for your videos, they mean a lot. I am now searching if you have a paid course but I’ll tell my manager to pay for.
@KahanDataSolutions3 ай бұрын
Really appreciate the kind words & support! I encourage you to come check out The Modern Data Community (www.ModernDataCommunity.com)
@adam67bree3 ай бұрын
Great video as always! Just joined the Modern Data Community :)
@KahanDataSolutions3 ай бұрын
Awesome! See you inside!
@darshakkumarranpariya8673 ай бұрын
Any books or sources you recommend to learn about data modeling?? Thank you so much for this type of content 🙏 Really appreciated!!
@pseudounknow55593 ай бұрын
I really appreciate your videos 💯
@InstantBI3 ай бұрын
Just FYI Michael. I have published my BI4ALL Base models. YT is censoring how to get them but I am sure you will work it out...
@raram58663 ай бұрын
In your proposed solution the product id would not be unique anymore. How would you go about that?