Hi Clinton, this is really helpful. one suggestion: it would be nice to add Video # number to follow one after another video.
@bridgetmukuhi48104 жыл бұрын
Hello Clinton. Your lectures are very helpful for my lectures they are very easy to follow especially in this corona situation. Thank you
@manishaggarwal76004 жыл бұрын
Your videos are awesome Clinton. I have seen LinkedIn Learning videos also but you have done better. And yes please number them then the whole tutorial will be great!! Once again thanks for sharing your knowledge.
@asmegetachew252811 ай бұрын
Thank you Clinton!
@ASDAvlogs3 жыл бұрын
Why are we performing conversion while loading into staging?
@ilikedigitallogic50594 жыл бұрын
Thanks for your transfer knowledge👍
@durgaprasadvallabhineni23224 жыл бұрын
Very good explanation
@clintondaniel52855 жыл бұрын
I've added the source code, source data, and documentation for creating the Crime Data Mart on GitHub here: github.com/cedanie2/ISM-4402
@ericsos1013 жыл бұрын
Can you please share the dataset you used?.. Thanks
@alphar854 жыл бұрын
Hey Daniel, i am looking to watch all your videos on sql ETL but am trying to figure out which one comes first 😉. Thank you for making those videos
@clintondaniel52854 жыл бұрын
If you look at the videos included in my channel, identify the video names that start with "Data Mart Design: ...". Then, you will see a sequence in the names.
@alphar854 жыл бұрын
@@clintondaniel5285 I am going through all of them. Can i ask you please ? does the Snowflake and Star Schema come before we extract the data from the source to the destination ? at what stage, do we need to design either snowflake or star schema ?
@clintondaniel52854 жыл бұрын
@@alphar85 Yes, you should be designing your data mart with a schema before you begin the ETL process.
@alphar854 жыл бұрын
@@clintondaniel5285 Thank you Daniel for helping me. can i also ask you? how would you consolidate data sources (CVS, excel, files) within your organisation into one spource of truth? would you ask your organisation to switch to Access, for instance, to enter data in order to avoid redundancy ? if os, what about the old files in excel sheets ?
@maheshbabu40845 жыл бұрын
how can i proceed if the number of data source files are more than one?
@clintondaniel52855 жыл бұрын
I've added the source code, source data, and documentation for creating the Crime Data Mart on GitHub here: github.com/cedanie2/ISM-4402. I'll see if I have time to work an example with multiple source files. Basically, you need to append the existing Staging table or you have to create additional staging tables. Then, your ETL package will have additional connection managers that point to the additional data source files. These connection managers will be used in separate Data Flows within SSIS to move the data to the additional staging tables or UPDATE an existing staging table.
@lazykitten43565 жыл бұрын
@@clintondaniel5285 Thank you so much. This is very helpful. As for the additional staging tables (for multiple sources of data at multiple times), do we delete them after we uploaded the data to the data mart? Another question: if we append the staging table, instead of creating more staging tables, how to upload the appended data to the data mart?
@clintondaniel52855 жыл бұрын
@@lazykitten4356 Usually, in a production environment the staging tables are not useful after your data mart has been loaded with the most recent or initial data. I usually recommend two additional columns in each dimension that record a timestamp in one column and a status of the update in another column. For instance, status = I if the record was INSERT or status = U if the record was UPDATE. In terms of the appended data, take a look at various methods of "UPSERT" (INSERT vs. UPDATE example - using the MERGE command). Since the staging table would be empty (TRUNCATE) from the previous LOAD, then you have new data in the staging table. Then, your SSIS package should run an "UPSERT" (example - MERGE command) on the existing dimension. If there is a new dimension record, then execute an INSERT. If the record already exists with no change, then ignore. If the record exists but with updated changes, then UPDATE. If any record is UPDATE or INSERT, also UPDATE the "status" column and the "timestamp" column in your dimensions. In terms of FACT tables, different opinions from different developers. You can also take the "UPSERT" approach, but you can also just add a new record. If you just add an additional record to your FACT table and you have an updated FACT with 2 or more rows, then you will need to develop additional SQL logic on your reporting side that either MERGEs the 2 records or shows the timeline where the FACTs have changes. It all depends on your use case and requirements. If historical change is important in the FACTs and you don't want multiple records, then just create a separate FACT table that allows all changes. So for instance the normal FACT table may only have 1000 rows while the "historical" FACT table may have > 1000 rows.
@lazykitten43565 жыл бұрын
@@clintondaniel5285 Thank you so much. You at least give me ideas of what to look for :) I really appreciate it.
@lazykitten43565 жыл бұрын
@@clintondaniel5285 Another question, where did you learned all these techniques about staging tables and DW? Is there a book that you recommend? Thank you very much