Data Mart Design - Populate Dimensions and Fact Tables

  Рет қаралды 71,714

Clinton Daniel

Clinton Daniel

Күн бұрын

Пікірлер
@poptrt
@poptrt 3 жыл бұрын
Thank you so much, I looked for just this kind of walkthrough, tutorial on this subject; you spoke so clearly and directly and stayed on topic that it was easy to follow. Thanks again, great job, I like the usage of staging tables I use that alot as well!!
@arturobrunocm
@arturobrunocm Жыл бұрын
Thank you for all the information shared, I’ve just finished the course project. Thanks!!
@philippr.3946
@philippr.3946 3 жыл бұрын
Great Job on this video I am trained at SSMS, SSIS and SSAS seperately but putting it all together to get a DW from a source was a missing link for me. Keep up the great work :-)
@ronaldmudenyo6281
@ronaldmudenyo6281 3 жыл бұрын
Best video ever. You just helped me complete my project. Thanks for the video
@JanUnitra
@JanUnitra 2 жыл бұрын
is it a good way of doing warehousing ? I always hear that the staging should be raw, no calculation, no changing etc. but you added surrogate keys ?
@jesflo4
@jesflo4 3 жыл бұрын
Very clear and straightfoward, thank you for sharing your knowledge!
@pedrobarrocas8673
@pedrobarrocas8673 2 жыл бұрын
You are the man, thank you so much! You are saving Panda Jardins
@Jackie.1001
@Jackie.1001 3 жыл бұрын
Best tutorial of data marts in youtube
@jahangirkabir9271
@jahangirkabir9271 6 жыл бұрын
Great video! thank you. Please continue uploading.
@clintondaniel5285
@clintondaniel5285 5 жыл бұрын
I've added the source code, source data, and documentation for creating the Crime Data Mart on GitHub here: github.com/cedanie2/ISM-4402
@mohamedkhalilmadhkour9416
@mohamedkhalilmadhkour9416 3 жыл бұрын
thanks a lot
@sandeeppai25
@sandeeppai25 3 жыл бұрын
Hi, thanks for the tutorials. Can we include business logic or calculation in ETL while loading data from staging to fact ?
@clintondaniel5285
@clintondaniel5285 3 жыл бұрын
Yes. For example, if you had a specific business formula, such as an interest rate calculation, you can calculate this value as a form of pre-processing during the ETL. If the pre-processing calculations are really complicated you can load the data in a staging table, then execute calculations using SQL or a scripting language (such as C# or Python). In other words, if there is a lot of computing involved, you do not want to perform this in one single ETL process. The calculated data can then be saved to another "secondary" staging table temporarily. Then, you can move the data from the "secondary" staging table to the final FACT table. I do this type of work a lot in staging tables. This would add some separation of pre-processed computing vs. the normal ETL processes.
@sandeeppai25
@sandeeppai25 3 жыл бұрын
@@clintondaniel5285 Thank you for quick response helped in clarifying doubt.
@learnenglishwithcumaali2883
@learnenglishwithcumaali2883 8 ай бұрын
Why would you keep all the values in a single FactTable? Is not a FactTable suppose to keep only agg values in order to limit the storage and the cots of DWH?
@sakshamjinsi6731
@sakshamjinsi6731 4 жыл бұрын
Hey, the relationship between state and city and all of these facts and dimensional tables, can these be considered part of a star schema design?
@clintondaniel5285
@clintondaniel5285 4 жыл бұрын
Yes, absolutely.
@clintondaniel5285
@clintondaniel5285 4 жыл бұрын
Also, depending on whether you build a hierarchy between state and city you can also consider a snow flake design
@quinnhogan545
@quinnhogan545 Жыл бұрын
Hello, great video. I know it's been a while, but can you perhaps speak on how you would take your data from a staging db and insert it into your fact table after all your dimensions are built, but this time you do not have any IDs for the dimensions in your staging table?
@clintondaniel5285
@clintondaniel5285 Жыл бұрын
I will be posting an updated version of the data mart design process using SQLite later this year. There are a lot of examples where the data mart design is useful for localized analysis/data science projects rather than enterprise scalable or service oriented deployment. Instead of using a Microsoft tool, I will be showing how all of the data engineering (ETL) is done using Python and SQLite. I'll see if I can include a part of the FACT table example where the ID is not available in a staging table.
@abrammakgalemele3491
@abrammakgalemele3491 5 жыл бұрын
Tx Clinton. You are super! Spot on like Clint Eastwood's shooting!
@joyo2122
@joyo2122 3 жыл бұрын
you normally dont update staging tables or merge any id
@HumairDawood
@HumairDawood 4 жыл бұрын
Quick questsions, we made Dimensions in previous videos and made tables to layout our data correctly and all those tables were empty. My question is how did you import the CSV file for Staging table in the beginning.
@clintondaniel5285
@clintondaniel5285 4 жыл бұрын
Take a look here: kzbin.info/www/bejne/jofQpGOJdpJpjtU
@HumairDawood
@HumairDawood 4 жыл бұрын
@@clintondaniel5285Thank you Clinton. Currently doing my Coursework from your videos and it took me almost 200 hours of videos lol. I still need to do OLAP CUBE and run some Queries such as; Show the top 5 Facilities and the total weight in TotalTonnes for each material in each of the four quarters e.g. April-June, July-September, October- December and January-March. ii. Rank the top 10 Materials and the TotalTonnes of waste processed by each of these for the 2016-17 year and for each of the two half-years April- September and October-March. For the second half year period show the number of places each Authority has moved (up/down the ranking) since the first half year period. iii. For each Authority show the WasteStreamType, TotalTonnes reported for the year, order the output in descending order of TotalTonnes. iv. Develop a chart that shows the monthly TotalTonnes processed for each OutputProcessType. Your answer should allow selection of the outputProcessType and the Authority in which the waste originated. Till Now I have loaded my tables USING your videos but still way to go. Ty so much for your videos
@josuearana8756
@josuearana8756 3 жыл бұрын
Hi, thanks for you vídeo si very good and I appreciate your time here. I would like to see how build without etl tool but you show all those queries and also Understood very well thanks
@MasterCamus
@MasterCamus 4 жыл бұрын
Great video. What’s the video 1?
@sandipbaidya2839
@sandipbaidya2839 3 жыл бұрын
I really enjoyed watching this video ::)
@anhngo9039
@anhngo9039 6 жыл бұрын
this is spot on. thank you so much!
@godfreynyirenda470
@godfreynyirenda470 5 жыл бұрын
Thanks Clint, very well explained. Do you have follow up videos on multi dimensional analysis?
@ericsos101
@ericsos101 3 жыл бұрын
Hi do you have the dataset?
@chamathhasthuka
@chamathhasthuka 3 жыл бұрын
Thank you Sir. Could you please update more videos. And one suggestion please create playlists
@datafreak911
@datafreak911 3 жыл бұрын
Thanks for this nice video, would you do another one for how to do this project from beginning
@ericsos101
@ericsos101 3 жыл бұрын
Such a wonderful project approach on data mart.Can you please share the dataset because i would love to follow through on hand on ...it will be much appreciated./.Thx again in advance
@asasas617
@asasas617 3 жыл бұрын
Thanks for the detailed video
@mayankchaubey5917
@mayankchaubey5917 4 жыл бұрын
For newly create Data this looks good. Can you let us know how it will work for updated records like Type1 and Type2 Attributes
@clintondaniel5285
@clintondaniel5285 4 жыл бұрын
Good question. This is a common operation sometimes called an "UPSERT". Essentially, you check if data associated with an ID has changed. If it has, you execute an UPDATE. If the ID does not exist, then you INSERT. But, that's easy for me to just say vs. demonstrate. I plan on doing a video that addresses this scenario sometime in the future.
@brianaelia
@brianaelia 3 жыл бұрын
@@clintondaniel5285 is this more like a merge command in sql?
@clintondaniel5285
@clintondaniel5285 3 жыл бұрын
@@brianaelia Not really, there is a high cost to the MERGE command depending on the number of records. This is more of a methodology to track and update records.
@timovskyt8520
@timovskyt8520 6 жыл бұрын
Thank you, helped a lot, very grateful, any videos on updating the fact table through querying in siss. Eg example if you were to add to the datawarehouse a new dataset and had to insert new fields into the fact table and then update that in visual studio. Is this difficult?
@tincanpete
@tincanpete 2 жыл бұрын
That was really helpful, thank you!
@astralvolt6309
@astralvolt6309 7 ай бұрын
Can you share me your previous videos
@muhammadtanveerislam5998
@muhammadtanveerislam5998 5 жыл бұрын
Can you give the data source link?
@Ryanspjut
@Ryanspjut 4 жыл бұрын
Question -- I noticed that in your fact table you have each crime category (MurderAndNonEgligentManslaughter, ForcibleRape, Robbery, AggravatedAssault, etc.) each in their own column. Would you agree that this data set would be better if it were stored unpivoted? So that there was a new column called "CrimeCategory" and that column was populated with the values of MurderAndNonEgligentManslaughter, ForcibleRape, Robbery, AggravatedAssault, etc., along with another column called "NumberOfCrimes"? This way a user could ask a question like "how many total crimes were committed?" and the system could aggregate a single column (NumberOfCrimes). I understand that this data set is just being used for illustration, but maybe you can touch on when to convert columns to a single dimension and when to leave them as individual columns. Thanks.
@clintondaniel5285
@clintondaniel5285 4 жыл бұрын
Typically, in my experience, you would want to store the measurable FACTs as separate columns because they can have a different context. Yes, they are all crimes. But, someone may want to establish their own context on each column and would require the fields to be separated. If the context of a data analyst is to Pivot the columns so that they all fall under "CrimeCategory" and "NumberOfCrimes", then this could be easily done in a reporting tool such as Excel, Tableau, or PowerBI. But, there really is no right or wrong answer to your question. I typically architect my data marts to account for maximum context and allow for a user to define their own contextual rules.
@Ryanspjut
@Ryanspjut 4 жыл бұрын
@@clintondaniel5285 Great info!
@jitenderreddy8605
@jitenderreddy8605 4 жыл бұрын
Hey Daniel !! Its a nice video can you please share the Database and queries to practice.
@loinguyen9331
@loinguyen9331 9 ай бұрын
thank you so much
@ajeyarjun8718
@ajeyarjun8718 Жыл бұрын
This is Great
@ouaildjaidja1447
@ouaildjaidja1447 3 жыл бұрын
thank you so much !
@roya75
@roya75 5 жыл бұрын
thank you this was very helpful
@chihhowbong8460
@chihhowbong8460 5 жыл бұрын
Very good.
@tanat2247
@tanat2247 4 жыл бұрын
thanks you,
@vayunandu
@vayunandu 4 жыл бұрын
Neat!
@keerthihulagur55
@keerthihulagur55 4 жыл бұрын
Nice
@rohitnath5545
@rohitnath5545 Жыл бұрын
amazing snowflake schema video. no BS
Data Mart Design - ETL Process to Populate Staging Table
25:42
Clinton Daniel
Рет қаралды 40 М.
Dimensional Modeling
53:54
Bryan Cafferky
Рет қаралды 174 М.
Мясо вегана? 🧐 @Whatthefshow
01:01
История одного вокалиста
Рет қаралды 7 МЛН
When you have a very capricious child 😂😘👍
00:16
Like Asiya
Рет қаралды 18 МЛН
Сестра обхитрила!
00:17
Victoria Portfolio
Рет қаралды 958 М.
“Don’t stop the chances.”
00:44
ISSEI / いっせい
Рет қаралды 62 МЛН
Learn Database Normalization - 1NF, 2NF, 3NF, 4NF, 5NF
28:34
Decomplexify
Рет қаралды 2,2 МЛН
129 How do you load a fact table
32:14
Learn SSIS
Рет қаралды 10 М.
Data Mart Design - Table Construction
15:04
Clinton Daniel
Рет қаралды 17 М.
Database Sharding and Partitioning
23:53
Arpit Bhayani
Рет қаралды 107 М.
Which Database Model to Choose?
24:38
High-Performance Programming
Рет қаралды 67 М.
Dimensional Modeling - Declaring Dimensions
55:32
Pragmatic Works
Рет қаралды 24 М.
Mastering ETL Test Automation: Verifying Table Existence in Schemas with Python
30:24
SRTechnologies Ravindra Reddy
Рет қаралды 9 М.
SSIS Design Patterns for Loading a Data Warehouse
1:01:14
Pragmatic Works
Рет қаралды 42 М.
Designing Your Data Warehouse from the Ground Up
1:01:21
Pragmatic Works
Рет қаралды 157 М.
Мясо вегана? 🧐 @Whatthefshow
01:01
История одного вокалиста
Рет қаралды 7 МЛН