Won't extracting data from an operational database affect that database's performance? How can we extract data without affecting it?
@ITkFunde 7 months ago
Good question. That's why, in the old days, ETL pipelines used to run overnight, when the operational systems were not under heavy use. Today there are replication tools such as Attunity that can replicate the data from the source by reading its logs.
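To make the log-reading idea in the reply above concrete, here is a minimal sketch in plain Python. It assumes the source system exposes an append-only change log; `ChangeLog`, `Replica`, and `catch_up` are hypothetical names used only for illustration and are not Attunity's API or any real database's log format. The point is that the replica reads only the entries added since its last offset, so it never runs queries against the operational tables.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Append-only list standing in for a database transaction log (assumption)."""
    entries: list = field(default_factory=list)

    def append(self, op, key, value=None):
        self.entries.append((op, key, value))


@dataclass
class Replica:
    """Copy of the source kept up to date by replaying the log."""
    rows: dict = field(default_factory=dict)
    offset: int = 0  # position of the next unread log entry

    def catch_up(self, log):
        """Apply only the entries written since the previous call."""
        new_entries = log.entries[self.offset:]
        for op, key, value in new_entries:
            if op == "delete":
                self.rows.pop(key, None)
            else:  # "insert" or "update"
                self.rows[key] = value
        self.offset += len(new_entries)
        return len(new_entries)


log = ChangeLog()
log.append("insert", 1, "alice")
log.append("insert", 2, "bob")

replica = Replica()
replica.catch_up(log)          # first sync reads two entries

log.append("update", 1, "alicia")
log.append("delete", 2)
applied = replica.catch_up(log)  # second sync reads only the two new entries
```

Because the replica tracks an offset, repeated syncs cost only as much as the changes made since the last one, which is why this style of replication puts far less load on the source than re-querying whole tables.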
@patmclaughlin107 1 year ago
Love this, man! Our engineers made it so hard to understand what DAG was. I thought I was not smart enough, but now I know they were either deliberately making it hard, or maybe they didn’t understand it themselves.
@nikhilgurram6569 2 years ago
Thanks!
@ITkFunde 1 year ago
thanks
@bajicdusko 2 years ago
It always amazes me how we can have knowledge like this one click away! Fantastic content, keep up the good work.
@sumit12072007 1 year ago
I was thinking the same while going through this video.
@jeganj2009 11 days ago
Excellent explanation of all the design patterns, in a simple way.
@daves4026 3 years ago
Perfect. Full respect to your kindness and sharing of your knowledge
@andygarnet7191 1 year ago
Thanks man! Your explanations are so clear and straightforward. For years I spoke to so many engineers who would over-complicate pattern concepts or straight-up lowball the documentation to cover themselves when the pipelines blow up and impact the business. Keep up these great videos!
@AamirAzizYouTube 2 years ago
Thanks so much, sir! This topic was a nightmare for me, you made it so simple to grasp. Keep up the good work!
@rahuldey1182 2 years ago
In my project, we are using the CDC + EtLT design patterns for our data pipeline. All the design patterns of data pipelines are covered here. Very well presented, good job, keep going.
@ITkFunde 2 years ago
Thanks Rahul ♥️
@ashishmaharana5865 3 months ago
This is a godsend. I was just trying to implement CDC in my organization. This was most helpful.
@deeptihazari3233 2 years ago
How was this amazing channel hidden until now? This is what quality content delivery looks like 👍
@sultanqureshi2766 3 years ago
Though it's not exactly related to my current profile, it makes me happy to learn more about the whole software industry from the core, and you are the best at making this understandable by keeping it simple. Understood the 4 data pipelines (ETL, ELT, ETLT, CDC) at once. The video was not long at all. Thanks
@ITkFunde 3 years ago
Thanks Diwakar for your support as always 🙏☺️
@ASHighlights668 2 years ago
Very helpful, sir. Your videos convert my nervousness into confidence!!
@metaocloudstudio2221 2 years ago
Good point. Another advantage of ELT over ETL is that you can create normalized tables and real-time materialized views.
@SheetalKumari-lk1vv 1 month ago
Thank you.. it was amazing..
@benoyeremita1359 3 months ago
Really good explanation man. Hats off to you
@mzeeshan 2 years ago
Loved the details, mate!
@ITkFunde 2 years ago
Thanks Zeeshan☺️
@francis191 4 months ago
Fantastic tutorial
@swaragupta7932 2 years ago
Easy Explanation, Detailed video
@ITkFunde 2 years ago
Thanks Swara
@harishb8790 1 year ago
Amazing explanation. 👏
@ITkFunde 1 year ago
Glad you liked it
@sunnyj1967 1 year ago
It's a terrific presentation.
@ITkFunde 1 year ago
thanks sunny
@vigneshbaskaran7931 1 year ago
Love this content, Thank you so much for all the efforts.
@ITkFunde 1 year ago
Thanks
@AnandhabalanRadhakrishnan 1 year ago
Well explained, keep sharing valuable information like this.
@mangesh4231 1 year ago
Very detailed explanation, helpful. Thanks a lot for all the work and effort.
@HassanYoussoufFossi 1 year ago
I studied Spark and read about DAGs many times, but I only understand them now that I'm watching your tutorial. Thanks
@raghurajsawant24 3 years ago
You are doing a fantastic job. Love your videos.
@Meowlah32 2 years ago
Not exactly a backend developer or data engineer, but this video is very informative on the various data pipeline designs!
@ITkFunde 2 years ago
thanks
@mayurarun 2 years ago
This video is such a gem. This would help me so much. Great work.
@rohithsai5265 2 years ago
Great content 💯
@vaidyanathashankar7441 1 year ago
Fantastic explanation, thanks for the wonderful session.
@Poornima_life 2 years ago
Absolutely… I liked the video, the content, and your valuable efforts… thanks
@JJ-ki2mw 1 year ago
Thank you so much. The way you described it is so easy to understand.
@mohit.srivastava 2 years ago
Both this and the previous connected video explained the concept really well. Thanks!!
@Liubov_110 1 year ago
Thank you so much for this detailed video 👍
@greenshadowooo 1 year ago
Thanks for sharing! 😀😀😀
@almamun8291 2 years ago
Thank you very much, I got a clear understanding of data pipelines.
@jagss3472 1 year ago
Lovely explanation and very insightful details.
@ITkFunde 1 year ago
Glad it was helpful Jaga!
@connect_vikas 2 years ago
Love you, brother, for explaining this so beautifully.
@ITkFunde 2 years ago
Thanks Vikas
@the.abhisheksinha 2 years ago
Nicely explained!
@guidodichiara2243 2 years ago
Great job. Keep going!
@PiyushSharma-jq8rr 1 year ago
This was really good :-)
@hsiaoshuang 1 year ago
Very informative!
@federicogonzalez7673 3 years ago
I'm glad that I found your video in my feed, nice one
@javierruizdiaz8656 2 years ago
Thank you, excellent Video.
@itneka 1 year ago
Thanks for the information
@TheAfroKingPlay 3 years ago
Very nice video, man. Thanks, I need this class. Take my like.
@emmanuelaolaiya 9 months ago
Great job and thanks
@augugninfin1034 2 years ago
Thank You!
@ITkFunde 2 years ago
♥️♥️♥️
@francksgenlecroyant 3 years ago
perfect video about Data Pipelines 👌, thanks!
@AmanGupta-yf1hj 1 year ago
Wonderful content
@kristhomas8295 3 years ago
Thank you so much for this!
@jayanth1376 2 years ago
👌👌👌
@mohammadateef3339 2 years ago
Your entry is awesome, sir
@ashokrajur09 2 years ago
nice one, very informative
@SeanRomberg 2 years ago
Thanks for the share - you have helped me better understand the pipeline automation software that delivers orchestration, ingestion, transformation, and activation all in one. This makes sense now.
@wendypark3848 3 years ago
I learned a lot about networking and data pipelines from you. It's really hard to learn these from a book. Thanks a lot!
@chinuamareashwar8146 2 years ago
nice explanation brother
@aditiaditi3302 7 months ago
Thanks for sharing this video :)
@altamashjawad6691 3 years ago
Thank you so much, very nice and comprehensive video!
@DepressedMonkeyGaming 3 years ago
Great Video, Simple and detailed explanation
@rdprasad2225 1 month ago
thank you very much
@jay21510002 8 days ago
Thank you...
@UTUBDZ 3 years ago
Great content, thank you very much sir !
@victoraf4274 1 year ago
Such an amazing video! Not bored at all (I'm not joking) hehe
@arond.g1120 1 year ago
Feel like I am learning in my own language. ❤❤❤
@ITkFunde 1 year ago
Thanks Aron ♥️♥️🙏
@ashisharora9649 1 year ago
AMAZING
@Vikas.007 3 years ago
Awesome content 👍👍 Please share the Datamart video link in the description 🙏
@ajaykiranchundi9979 3 years ago
Thank you so much! BTW it was certainly not at all a long video.
@ITkFunde 3 years ago
Thanks Ajay ☺️❤️
@davidcamiloespitiamanrique9 2 years ago
Good one! Perhaps you could talk about AWS DMS and AWS Glue.
@GabrielJambert 1 year ago
Thank you
@prashantprashant1291 3 years ago
Your videos are full of knowledge. Are you a Data Solution Architect?
@masterh6868 3 years ago
Hey, your video is, as usual, full of information, with crystal-clear concepts. Thanks for posting such useful videos on industry trends. Can you make a video on data pipelines that do not follow the DAG pattern, like an ML pipeline maybe (not sure)?
@ITkFunde 3 years ago
Thanks buddy for your feedback and suggestion ☺️
@Lebrao09 3 years ago
great video!
@StaceyJ1908 1 year ago
First, your videos are amazing... I have learned so much! I am looking at our current GCP implementation and trying to identify key risks across each step in the pipeline to determine whether we have the correct controls in place or gaps. What are the key risks to address at each stage of the data pipeline?
@lwhieldon1 2 years ago
The DAG concept is talked about a lot in data science. Can you talk about how this concept in data science correlates with the DAG design?
@VlasTrunov 2 years ago
It's good that you focus on DAGs. But for those new to the subject it might be too abstract, I guess. What I would do is show how things flow in Airflow, for example, for those who take in information visually. This way you would spread the information (butter on the bread) uniformly through your video, helping people grasp it in one pass, if you know what I mean. It's just a suggestion. But to me personally, the level of detail you give is perfect.
@ITkFunde 2 years ago
Thanks, Vlas. Such useful feedback helps me improve my content. I will definitely take your thoughts on board and do something better next time. 😊
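For readers who, like Vlas, find DAGs abstract, a small sketch may help. It uses Python's standard-library `graphlib` rather than Airflow, and the task names are made up for illustration. It shows the two properties that make a DAG useful for pipelines: every task runs only after the tasks it depends on, and a cycle is rejected because no valid run order exists.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
# Task names are hypothetical, chosen only to illustrate the idea.
pipeline = {
    "extract_orders": set(),
    "extract_customers": set(),
    "join_orders_customers": {"extract_orders", "extract_customers"},
    "load_warehouse": {"join_orders_customers"},
}


def run_order(dag):
    """Return one valid execution order for the tasks in the DAG."""
    return list(TopologicalSorter(dag).static_order())


def is_valid_dag(dag):
    """A graph with a cycle has no valid order, so it is not a DAG."""
    try:
        run_order(dag)
        return True
    except CycleError:
        return False


order = run_order(pipeline)
cyclic = {"a": {"b"}, "b": {"a"}}  # a waits for b, b waits for a
```

Here `order` always places both extract tasks before the join and the load last, while `is_valid_dag(cyclic)` is `False`. Orchestrators such as Airflow apply the same ordering idea, with scheduling, retries, and monitoring added on top.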
@ganeshsrinivasannv4296 2 years ago
Thanks, it's great work. Can you share some content on data received as XML messages and advise on how to store it?
@sreechalasani9268 2 months ago
Thanks for the video! In the CDC model, rather than capturing all the prior versions when a new version of the data comes in, why can’t the previous versions be invalidated? Wouldn’t that be more efficient?
@upendrakumar-ok3tr 2 years ago
Can you please make a video on bare metal and hypervisors?
@TheyCalledMeT 2 years ago
Would you put data cleansing/preparation in the t of the EtLT pattern, or in the T?
@brookster7772 1 year ago
Great video! Can you tell me where a vector database fits into this model? Isn't it the case that at some point all data must be converted to embeddings/vectors and stored in a massive vector store to be used for AI similarity searches?
@arijitsinha2955 2 years ago
Can you make a video on Databricks with an example, please?
@deepsy4786 3 years ago
I would like to discuss whether CDC should be considered a data pipeline design pattern. My understanding is that CDC is more of a data modelling concept. You would have to build an ELT or ETL pipeline anyway. CDC relates more to a load or transformation technique than to an individual pipeline. However, all the insights shared were helpful and did help me relate my work to some of these concepts.
@ambarishhazarnis9531 3 years ago
Here CDC refers to storing the delta in a separate table. This way we don’t need to read the source table again to extract the change.
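As a rough illustration of the delta-table idea in the reply above, the sketch below uses an in-memory SQLite database as a stand-in for the target store. The table name `customer_delta`, its columns, and the `I`/`U`/`D` operation codes are assumptions made for this example, not something taken from the video. Every change is appended as a new version, and the current state is rebuilt by picking the highest version per key and dropping keys whose latest change is a delete.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_delta (id INTEGER, city TEXT, op TEXT, version INTEGER)"
)

# Each row is one captured change: I = insert, U = update, D = delete.
conn.executemany(
    "INSERT INTO customer_delta VALUES (?, ?, ?, ?)",
    [
        (1, "Pune", "I", 1),
        (2, "Delhi", "I", 1),
        (1, "Mumbai", "U", 2),
        (2, None, "D", 3),
    ],
)

# Latest version per id, excluding ids whose most recent change is a delete.
LATEST_STATE = """
SELECT d.id, d.city
FROM customer_delta AS d
JOIN (
    SELECT id, MAX(version) AS version
    FROM customer_delta
    GROUP BY id
) AS m ON d.id = m.id AND d.version = m.version
WHERE d.op <> 'D'
ORDER BY d.id
"""

current = conn.execute(LATEST_STATE).fetchall()
print(current)  # [(1, 'Mumbai')]
```

Keeping the older versions costs storage, but it lets you rebuild the state as of any earlier version by adding a `WHERE version <= ?` filter to the inner query, which an overwrite-in-place design cannot do.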
@arundhutinayak8221 3 years ago
Now I can put technical terms to my current task. Can you do something on APIs?
@lastboomer6164 2 years ago
Hello, I very much appreciate the training. Would you consider a whiteboard exercise in which the ETL jobs and transformations use a metadata-driven ETL design? I learned that this is good practice, but one downside is that this design cannot feed a data catalog's lineage.
@anilmantri2139 1 year ago
Hi, how do we pull the source data into the EL DAG in the CDC pattern? I mean, what tech stack should be used?
@imnischaygowda 1 year ago
What is the purpose of the sink? Can't we store data directly in the data warehouse?
@debrajpradhan5500 3 years ago
Sir, I am interested in AWS analytics. Can you please tell me which AWS data services to read about first and second?
@lcsxwtian 3 years ago
In the CDC section, you said that max() would get the latest snapshot of the data. I am assuming max() would get the maximum count of the data, correct? If that were the case, what if the last change was to DELETE some data? Then I don't think max() would be right.
@AnishBhola 1 year ago
Yes, you're right! Timestamp-based CDC is generally not a good option for processing deletes. There are other types of CDC, such as log-based (the most optimal), which you can use for such situations. This video primarily talks about implementing difference-based CDC (where 2 snapshots of target systems are compared).
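The difference-based approach mentioned in the reply above can be sketched in a few lines of Python. This is an illustration under simple assumptions: each snapshot fits in memory as a dict keyed by primary key, and `diff_snapshots` is a hypothetical helper name. Unlike a timestamp filter, comparing two snapshots surfaces deletes, because a key that is present in the older snapshot and missing from the newer one shows up explicitly.

```python
def diff_snapshots(previous, current):
    """Compare two {primary_key: row} snapshots and classify the changes."""
    inserts = {k: current[k] for k in current.keys() - previous.keys()}
    deletes = {k: previous[k] for k in previous.keys() - current.keys()}
    updates = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return {"insert": inserts, "update": updates, "delete": deletes}


snapshot_n = {1: "Pune", 2: "Delhi", 3: "Chennai"}    # snapshot n
snapshot_n1 = {1: "Mumbai", 3: "Chennai", 4: "Kochi"}  # snapshot n+1

changes = diff_snapshots(snapshot_n, snapshot_n1)
print(changes)
# {'insert': {4: 'Kochi'}, 'update': {1: 'Mumbai'}, 'delete': {2: 'Delhi'}}
```

The trade-off is that both snapshots must be read in full on every run, which is the cost that log-based CDC avoids.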
@chinnap8987 1 month ago
Is coding required to learn this?
@SamS-oi5pz 3 years ago
Hi, how do we identify changed data from the source?
@adamjapal7370 3 years ago
Do you have a reference or PDF book on the data pipeline concept? If you do, could you share the link? Thank you.
@veeek8 2 years ago
'Hope you're not bored', never 😁
@pm4306 1 year ago
Please give some concrete business examples instead of 'n' and 'n+1'; an example would help to clarify and walk through things in a better way. I think you should give concrete real-life business examples for all the cases that you discuss. You are missing actual business examples in your videos.
@antonfernando8409 3 years ago
Never heard of most of the terms mentioned (ETL, ELT, CDC). I guess these are specific to cloud computing. Still, in terms of data pipelines, it's useful to learn, I think. Thanks
@egor.okhterov 3 years ago
No, it’s not about cloud computing. It’s about data analytics in general. When you want to build web dashboards that draw graphs of some business processes or want to analyse customer behavior, you build this data pipeline. TLDR: you cannot run SQL on your logs. You need to push your logs into MySQL in order to be able to query your logs.
@ITkFunde 3 years ago
Hi Anton, these terms are quite old but have become more prominent with new-age data management. Maybe you are not from a data or business intelligence background, but it's good to learn these.
@ernesto8738 3 years ago
and here I am with a cyclic graph problem {{{(>_
@dylanmccullough2679 3 years ago
Question regarding the ELT pattern. You said that you should use SQL at the (T) transformation step. Could you use Spark instead of SQL at this point? For example, Data Factory Data Flows, instead of putting compute pressure on the EDW with SQL queries?
@chandrakanthotkar7262 3 years ago
Whenever we say ELT, we basically mean doing the transformation after the data has landed in the DWH or database, such as BigQuery (GCP). The Spark engine, by contrast, is basically used for transformation during the flow.
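To show what "transform after the data has landed" can look like, here is a small sketch that uses an in-memory SQLite database as a stand-in for the warehouse. The table names `raw_orders` and `orders_by_country` and the sample rows are invented for illustration. The raw data is loaded untouched, with amounts still stored as text and inconsistent country codes, and the cleanup and aggregation then run as SQL inside the store itself.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")

# E + L: land the raw rows as-is, with no cleanup on the way in.
warehouse.execute(
    "CREATE TABLE raw_orders (order_id TEXT, amount TEXT, country TEXT)"
)
raw_rows = [("o1", "10.50", "in"), ("o2", "4.25", "IN"), ("o3", "7.00", "us")]
warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# T: cast, normalize, and aggregate with SQL, using the warehouse's own compute.
warehouse.execute(
    """
    CREATE TABLE orders_by_country AS
    SELECT UPPER(country) AS country,
           SUM(CAST(amount AS REAL)) AS total_amount
    FROM raw_orders
    GROUP BY UPPER(country)
    """
)

result = warehouse.execute(
    "SELECT country, total_amount FROM orders_by_country ORDER BY country"
).fetchall()
print(result)  # [('IN', 14.75), ('US', 7.0)]
```

Running the same cast and aggregation in an engine such as Spark before loading would move that compute out of the warehouse, which is the trade-off raised in the question above.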