17. Slowly Changing Dimension (SCD) Type 2 Using Mapping Data Flow in Azure Data Factory

  57,554 views

WafaStudies

Comments: 69
@amitpundir3983 · 3 years ago
Great work Maheer. A couple of observations:
1. A Type 2 dimension also needs EffectiveStartDate / EffectiveEndDate columns. If we add them, updating all history rows on every run keeps resetting those dates, which breaks the Type 2 idea. It is also bad for performance, because we always update every history row, even if there are millions.
2. On the first execution we should verify that, although the source has an entry for EmpId = 1001, it has actually changed; only then does it make sense to insert a new row and expire the old one, otherwise we are simply duplicating rows with no changes.
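A minimal T-SQL sketch of the shape this observation points to; the table, column, and value names are illustrative assumptions, not taken from the video:

```sql
-- Hypothetical Type 2 dimension: surrogate key, business key, effective dates, current flag.
CREATE TABLE dbo.DimEmployee (
    EmpKey             INT IDENTITY(1,1) PRIMARY KEY, -- surrogate key
    EmpId              INT           NOT NULL,        -- business key from the source
    EmpName            VARCHAR(100)  NULL,
    EmpCity            VARCHAR(100)  NULL,
    EffectiveStartDate DATETIME2     NOT NULL,
    EffectiveEndDate   DATETIME2     NULL,            -- NULL while the row is current
    IsActive           BIT           NOT NULL
);

-- Expire only the currently active row for the changed business key,
-- rather than re-updating every history row on every run.
UPDATE dbo.DimEmployee
SET    IsActive = 0,
       EffectiveEndDate = SYSUTCDATETIME()
WHERE  EmpId = 1001       -- the changed key (illustrative value)
  AND  IsActive = 1;
```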
@shaksrini · 3 years ago
Best way to implement SCD Type 2 😀👍 very well explained
@WafaStudies · 3 years ago
Thank you 😊
@dvsrikanth22 · 3 years ago
Great explanation... explained in a very easy way to understand the concept.
@WafaStudies · 3 years ago
Thank you 🙂
@bhosdiwalechacha9702 · 3 years ago
Good explanation. But I think you forgot to add a check for whether any of the columns coming from the source file has actually changed, because you should only update the row in the target when the source and destination values differ.
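One illustrative way to add that check is to compare a hash of the tracked columns between the incoming row and the currently active dimension row. In the data flow this would typically be a derived column plus an Exists or Filter transformation; the same idea expressed in T-SQL, with assumed table and column names:

```sql
-- Keep only source rows whose tracked attributes differ from the active dimension row.
SELECT s.EmpId, s.EmpName, s.EmpCity
FROM   stg.Employee     AS s
JOIN   dbo.DimEmployee  AS d
       ON d.EmpId = s.EmpId
      AND d.IsActive = 1
WHERE  HASHBYTES('SHA2_256', CONCAT(s.EmpName, '|', s.EmpCity))
    <> HASHBYTES('SHA2_256', CONCAT(d.EmpName, '|', d.EmpCity));
```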
@kromerm · 3 years ago
Nice technique, great job! One small nitpick ... I'd prefer if you used true() instead of 1==1 for your Alter Row Update policy :)
@WafaStudies · 3 years ago
Yup, we are good to use the true() function as well. The idea is that the condition should always return true, so that the update policy is applied to all rows 😊
@AmehaAddis · 2 years ago
@@WafaStudies CC
@MohammedKhan-np7dn · 3 years ago
Nice and superb explanation. Thanks a lot Maheer.
@WafaStudies · 3 years ago
Thank you 🙂
@pankajmandania1785 · 1 year ago
Good explanation. What if I have duplicate rows in the source file? How do I filter them out?
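If the source file can contain duplicates, one common approach is to keep a single row per business key before applying the SCD logic; in a data flow that would be an Aggregate or Window transformation. The equivalent idea in T-SQL, with assumed table and column names:

```sql
-- Keep one row per business key, preferring the most recent record.
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY EmpId
               ORDER BY ModifiedDate DESC   -- hypothetical "last changed" column
           ) AS rn
    FROM stg.Employee
)
SELECT EmpId, EmpName, EmpCity
FROM   ranked
WHERE  rn = 1;
```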
@pankajchandel1000 · 2 months ago
Can anyone explain what he means by using Alter Row instead of Select at 21:30?
@ramubuddi8396 · 2 years ago
I have implemented this as per your explanation, but I am facing an issue that the key column does not exist in the sink... here is the screenshot.
@g.suresh430 · 6 months ago
Hi Maheer, can we use an Inner Join instead of the lookup and filter?
@imkind1976 · 2 years ago
May I know how the surrogate key is generated in the dim table?
@Ravi-gu5ww · 3 years ago
Good one Maheer. Along with this, could you cover duplicate records from the source, making some columns SCD Type 1 and some SCD Type 2 for the same table, and also incremental load, as a new session?
@martand89 · 3 years ago
Could you please tell me how your pipeline behaves if you do not change anything? In my case it inserts a new row with isrecent = 1 and changes the previous value to isrecent = 0, but since I am not changing anything, it should not be inserted again.
@hendrik-stack22 · 3 years ago
I have exactly the same question. If nothing changes, it is not supposed to be added again. How can we fix this?
@mnoTyrAngel · 1 year ago
Hi! Were you able to fix this issue? I need help with this.
@dipanjanmukherjee4338 · 3 years ago
In the update branch, instead of the lookup and filter, we could replace them with an inner join.
@curious_mind7575 · 2 years ago
We did not check MD5 values for the attributes of employee IDs that are already present in both source and target…
@mohanwarudkar6161 · 1 year ago
Hi Maheer sir, in the SCD Type 2 case we can use an inner join transformation, since it will take only the rows common to the CSV file and the SQL table based on the primary key column, so there is no need to apply a filter transformation. I mean to say that instead of a lookup transformation followed by a filter transformation, we can directly use an inner join transformation to simplify the flow. Am I right? Can we do so?
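For what it's worth, the lookup-then-filter-out-nulls pattern and an inner join are logically equivalent here: both keep only the business keys that exist on both sides. The same result expressed in T-SQL, with illustrative names:

```sql
-- Candidate rows for "expire the old version and insert a new one":
-- business keys present in the source file and currently active in the dimension.
SELECT s.EmpId, s.EmpName, s.EmpCity, d.EmpKey
FROM   stg.Employee        AS s
INNER JOIN dbo.DimEmployee AS d
       ON d.EmpId = s.EmpId
      AND d.IsActive = 1;
```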
@himanshutrivedi4956 · 3 years ago
This is a really good and helpful video. Just one suggestion: could you add record_create_date and record_expire_date columns and upload an updated version? It would be great.
@RobertLenior · 1 year ago
In SSIS this is very easy to accomplish; why is it still so cumbersome in ADF?
@VinodKumar-lg3bu · 1 year ago
Good one dude, thanks for explaining.
@WafaStudies · 1 year ago
Thank you ☺️
@ilovecute7826 · 7 months ago
@@WafaStudies Great explanation. How can I redirect the history (isActive = 0) to a different table if I don't want to keep history in the same table? Do you have a video for this, or can you create one? Thank you!
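One way to keep history out of the main dimension is a second sink (or a follow-up stored procedure) that moves expired rows into a separate history table. A T-SQL sketch of that move, with assumed table names:

```sql
-- Move expired rows into a history table so the main dimension
-- keeps only the currently active version of each employee.
DELETE FROM dbo.DimEmployee
OUTPUT DELETED.EmpId, DELETED.EmpName, DELETED.EmpCity
INTO   dbo.DimEmployeeHistory (EmpId, EmpName, EmpCity)
WHERE  IsActive = 0;
```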
@theroy068 · 3 years ago
SCD Type 2 was explained properly, but one scenario was not covered: suppose we receive the same record from the source that is already present in the target. In that case this logic will still create a new record and mark the old record as inactive.
@mnoTyrAngel · 1 year ago
May I ask how to resolve this then? Thank you.
@islammatkarimov2353 · 1 year ago
+++ With this data flow, ADF cannot recognise the old data; it literally sets isactive = 1 for every row. It would be better with a staging step, I think.
@vaishnosharma3248 · 1 year ago
Can we use SCD Type 2 on real-time data?
@likhybadri789 · 1 year ago
I was trying SCD Type 2 using data flows and making it dynamic, but on the first run it fails because I haven't chosen the inspect-schema option that would let it work for any Delta table. Any workaround for this? Ideally it should at least be able to read the header even though the Delta table is empty, but I am getting an error on the source side when the table is empty.
@nagoorpashashaik8400 · 1 year ago
Can we do SCD Type 2 on a Delta file using mapping data flow?
@vinnyakhil · 1 year ago
I am getting this error: "Cannot insert explicit value for identity column in table when IDENTITY_INSERT is set to OFF." Can anyone help with this?
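That error usually means the sink mapping is pushing a value into the identity (surrogate key) column; the simplest fix is to remove that column from the sink mapping and let SQL generate it. If you really do need to load explicit key values, the T-SQL escape hatch looks like this (table and column names are illustrative):

```sql
-- Only needed when you must supply your own surrogate key values;
-- otherwise just stop mapping the identity column in the sink.
SET IDENTITY_INSERT dbo.DimEmployee ON;

INSERT INTO dbo.DimEmployee (EmpKey, EmpId, EmpName, EmpCity, EffectiveStartDate, IsActive)
VALUES (1001, 1, 'Alice', 'Hyderabad', SYSUTCDATETIME(), 1);

SET IDENTITY_INSERT dbo.DimEmployee OFF;
```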
@raghavendrareddy4765 · 2 years ago
Great work Maheer. How do we load a Parquet file from on-premises to an Azure SQL database using Azure Data Factory?
@islammatkarimov2353 · 1 year ago
Sir, it is not working: the values still remain 1 for all rows, and it does not recognise the old data, it literally inserts all the data again.
@rajeshmanepalli7367 · 1 month ago
Excellent, Maheer. Can you please do a video on SCD Type 3? 😊
@kenpoken1 · 3 years ago
Nice job. Please keep them coming. How about a video on SCD Type 4 implementations?
@rajpes1833 · 2 years ago
Can you make a video that includes a start date and an end date, with the dates getting updated dynamically for Type 2 SCD? I see that as a necessity, and many people face this issue.
@TNuno · 1 year ago
Did you learn that? If so, where?
@vinayagamoorthyboobalan6268 · 3 years ago
Nice explanation, WafaStudies. I have a doubt: how do we handle the rows that have no updates in the source? With this example, even the unchanged data will be updated in the destination unnecessarily. Looking forward to your reply, and thanks in advance.
@mdsd465 · 3 years ago
+1, I have the same question.
@odudabdul4152 · 3 years ago
Yes, same doubt here. Could you please respond, @WafaStudies? Many people have this doubt.
@harikrishnaupputuri3255 · 1 year ago
+1
@popcornandpremiers · 9 months ago
@WafaStudies, what if we get the same column values from the source in an incremental load? In that case the isActive = 0 and isActive = 1 rows will be true duplicates of each other.
@imkind1976 · 2 years ago
I see the surrogate key is initially present for the target record, but the source record has no surrogate key. Can you explain how the surrogate key is assigned for the newly inserted records?
@islammatkarimov2353 · 1 year ago
You just don't insert it: the sink will add surrogate keys automatically because the column is an identity. If you map the surrogate key in the mapping, it will fail.
@suprobhosantra · 2 years ago
@WafaStudies I am facing a problem implementing SCD 2 using an Exists transformation instead of the lookup you used here, but I think the problem is the same for both implementations. We need to make sure the update inside the table finishes first. If the new records are accidentally inserted into the table first, the lookup will also fetch the newly inserted rows as matches, and therefore all of the rows get marked as inactive. But the order of execution of the parallel streams is not in our hands. How do we solve this? Any idea?
@lingay3850 · 3 years ago
Hello. How about doing it in SQL Server and not in the query editor? Like doing the mapping in Azure Data Factory, but the result or the output will be seen in SQL Server. 😊
@thesujata_m · 3 years ago
For SCD Type 1, instead of an update we just have to delete that row, and the further activities are not required in sink2, right?
@WafaStudies · 3 years ago
Please check the below link for SCD Type 1: kzbin.info/www/bejne/g6uriI1rlLCAj9E
@mayank5644 · 3 years ago
Create a branch from the source, use Alter Row to update the records in the sink that are present in the source, and in the branch just use insert.
@rajeshmanepalli7367 · 1 year ago
Good explanation.
@Imrannaseem818 · 3 years ago
Thanks Maheer.
@WafaStudies · 3 years ago
Welcome 🤗
@rahulwankhade8654 · 3 years ago
Hello Sir, I want to switch my career from SQL Server developer. Should I go through the playlists below? 1. Azure Basics 2. Azure Functions 3. Azure Data Factory. Please suggest the steps.
@WafaStudies · 3 years ago
Azure Functions is not required. Azure basics and ADF.
@rahulwankhade8654 · 3 years ago
Thank you..🙏🙏
@rahulwankhade8654 · 3 years ago
Can I practice ADF on a free Azure subscription?
@WafaStudies · 3 years ago
Ya
@zikoraswrld · 3 years ago
Hi, say I am archiving rows in a database. During the copy process from one database to another, I want to delete whatever I am archiving from the source. Is there a place where I could write a query that does that, instead of using Alter Row etc.? The expression builder is just not what I need.
@zikoraswrld · 3 years ago
Using Azure Data Factory.
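If the source is a SQL database, one option is to run the delete as plain SQL from a Script or Stored Procedure activity after the Copy activity succeeds, instead of modelling it with Alter Row in a data flow. A sketch with hypothetical table, column, and parameter names:

```sql
-- After the copy activity has succeeded, remove the archived rows from the source.
-- @CutoffDate is a hypothetical watermark parameter passed in from the pipeline.
DELETE FROM dbo.Orders
OUTPUT DELETED.OrderId, DELETED.OrderDate
INTO   dbo.Orders_DeleteLog (OrderId, OrderDate)   -- optional local audit of what was removed
WHERE  OrderDate < @CutoffDate;
```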
@bhawnabedi9627 · 3 years ago
Nice info
@WafaStudies · 3 years ago
Thank you 🙂
@Cheyenne9663 · 2 years ago
Good video, but all the noise from the kids in the background was very distracting and loud.
@WafaStudies · 2 years ago
Thank you, and sorry for that trouble. In older recordings that might have happened. I am trying to avoid any background noise in all recent videos 🙂