Thanks for the video. I was trying to use GroupBy and the rest of the columns in a stored procedure. Your video made my job easy.
@WafaStudies 3 years ago
Welcome 😁
@susmitapandit8785 2 years ago
In the output file, why is EmpID not in sorted order even though we used the Sort transformation?
@vishaljhaveri7565 1 year ago
In the aggregation step, choose column pattern and write name != 'columnnameonwhichyougroupedby'. Basically this will filter out all the columns mentioned in the GroupBy step and perform the aggregation on all the remaining columns. Write $$ if you don't want to change the name of the column, and write first($$) or last($$) as per your requirement.
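As a rough data flow script sketch of that setup (assuming the incoming stream is called source1 and the group-by column is EmpID; both names are just placeholders):

    source1 aggregate(groupBy(EmpID),
        each(match(name != 'EmpID'), $$ = first($$))) ~> RemoveDuplicateRows

Here groupBy(EmpID) groups the duplicates, match(name != 'EmpID') picks up every remaining column, $$ on the left keeps each column's original name, and first($$) keeps the value from the first row of each group (swap in last($$) to keep the last one instead).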
@mankev9255 3 years ago
Good concise tutorial with clear explanations. Thank you.
@WafaStudies 3 years ago
Thank you ☺️
@gsunita123 4 years ago
The data in the output consolidated CSV is not sorted on EmployeeID. We did use Sort before the Sink, so why is the data not sorted?
@Anonymous-cj4gy 3 years ago
Yes, it is not sorted. The same thing happened to me.
@Aeditya 2 years ago
Yeah, it's not sorted.
@maheshpalla2954 3 years ago
How do you know which function to use, since we are not sure about the duplicate rows if we have millions of records in the source?
@Anonymous-cj4gy 3 years ago
In the output file the data is still not sorted, if you look at it. The same thing happened to me as well: even after using Sort, the data is still unsorted.
@lehlohonolomakoti7828 2 years ago
Amazing video, super helpful. It allowed me to remove duplicates from a REST API source and create a reference table inside my DB.
@WafaStudies 2 years ago
Thank you 😊
@swapnilghorpadewce 2 years ago
Hi, I am trying to bulk load multiple JSON files into Cosmos DB. Each JSON file contains a JSON array of 5000 objects, and the total data size is around 120 GB. I have used Copy Data with a ForEach iterator. It throws an error for the problematic file but still inserts some records from it. I am not able to skip incompatible rows, and I am also not able to log the skipped rows. I have tried all available options. Can you please help?
@rohitkumar-it5qd 2 years ago
How do I update the records in the same destination, writing both the updated records and the new records without having any duplicates on ID? Please suggest.
@karthike1715 2 years ago
Hi, I have to check all the columns for duplicates. How do I handle that in the Aggregate transformation? Please help me.
@rajkiranboggala9722 3 years ago
Well explained!! Thank you. If I have only one CSV file and I want to delete the duplicate rows, I guess I can do the same by self-unioning the file. I'm not sure if there's a simpler method.
@WafaStudies 3 years ago
Thank you 🙂
@AkshayKumar-ou8in 2 years ago
Thank you for the video, very good explanation.
@WafaStudies 2 years ago
Welcome 😊
@ACsakvith 1 year ago
Thank you for the nice explanation
@battulasuresh9306 2 years ago
What if we want to remove both duplicates? Point 2: what if you specifically want the row in the middle, say based on a latest-modified-date column, something like that?
@nareshpotla2588 2 years ago
Thank you Maheer. If we have 2 records with the same EmpID, we use last($$)/first($$) to get either one of them. But if we have 3 records, like 1,abc 2,xyz 3,pqr, then first($$) will give us 1,abc and last($$) will give 3,pqr. How do we get the middle one (2,xyz)?
@marcusrb1048 2 years ago
Great video, it's clear. But what happens with new records? If you use a union table and only upsert, it checks only the duplicate rows, doesn't it? I tried the same as yours, but the new ones were removed in the final step. I ran into an issue handling the INSERT, UPDATE and DELETE in three separate steps; how could I achieve that? Thanks.
@MigmaStreet 1 year ago
Thank you for this tutorial!
@arifkhan-qe4td 2 years ago
The Aggregate is not allowing me to add $$ as an expression. Any suggestions please?
@pachinkosv 1 year ago
I don't want to import a few of the columns into the data table. How is that done?
@vishvesbhesania7767 3 years ago
Why is the data not sorted in the output file, even after using the Sort transformation in the data flow?
@kumarpolisetty3048 4 years ago
Suppose we have more than 2 records for one EmpID and I want to take the Nth record. How can I do that?
@luislacadena9689 3 years ago
Excellent video. Do you think it is possible to eliminate the duplicates while keeping, for example, the one that has the higher department id/number? I've seen that you kept the first record by using first($$), but I'm curious whether you can remove duplicates in the RemoveDuplicateRows step based on other criteria. Is it possible to keep only the duplicate with the higher department id?
@PhaniChakravarthi 4 years ago
Hi, thank you for the sessions, they are wonderful. Just one query: can you make a video on identifying the delta changes between two data sources and capturing only the mismatched records within ADF?
@WafaStudies 4 years ago
Sure. I will plan a video on this.
@benediktbuchert9002 2 years ago
You could use a window function to mark all the duplicates, and then use a filter to filter them out.
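A rough data flow script sketch of that idea, which also covers the earlier questions about keeping the Nth record (assuming a key column EmpID and an ordering column like ModifiedDate; the stream and column names are just placeholders):

    source1 window(over(EmpID),
        asc(ModifiedDate, true),
        rowNum = rowNumber()) ~> MarkDuplicates
    MarkDuplicates filter(rowNum == 1) ~> RemoveDuplicates

rowNumber() numbers the rows within each EmpID group in ModifiedDate order, and filter(rowNum == 1) keeps only the first row per group; changing the condition to rowNum == 2 would keep the middle record of three.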
@MrSuryagitam 3 years ago
If we have multiple files in ADF, then how do we remove the duplicates from all of them in ADF in one go?
@krishj8011 2 years ago
Great tutorial...
@WafaStudies 2 years ago
Thank you ☺️
@EmmaSelma 4 years ago
Hello Wafa, thank you so much for this tutorial, it's very helpful. New subscriber here. Thinking of scenarios to use this, I have a question please: is it correct to use this to get the latest data from ODS to DWH in the case of a full load (only insertions occurring in ODS and no truncate), just like a row partition by? Thank you upfront.
@DataForgeAcademy 2 years ago
Why didn't you use the Sort transformation with the remove-duplicates option?
@paulhepple99 4 years ago
Great vid, thanks!
@varun8952 1 year ago
super
@isanayang6338 4 years ago
Can you speak slowly and clearly?
@WafaStudies 4 years ago
Sure. Thanks for the feedback. I will try to improve on it.
@isanayang6338 4 years ago
Your strong accent makes it so difficult to understand you.
@kajalchopra695 3 years ago
How can we optimize the cluster start-up time? It is taking 4 min 48 sec to start a cluster, so how can I reduce that?