Designing a Data Pipeline | What is Data Pipeline | Big Data | Data Engineering | SCALER

  35,772 views

SCALER


Watch Shashank Mishra (Data Engineer III, Expedia) explain how to create a data pipeline in this exclusive tutorial video. Check out our FREE masterclasses by leading industry experts now: www.scaler.com/events?...
When designing a data pipeline, there are several elements to consider, and early decisions have significant consequences for later performance.
Before looking at the design process, though, we first need to understand what a data pipeline is.
A data pipeline is a series of components that automatically gather, organise, transfer, convert, and process data from one place to another, so that the data arrives in a usable condition and the enterprise can build a data-driven culture.
Put more simply:
A data pipeline is a series of activities that ingest raw data from many sources and transport it to a destination for storage and analysis.
The most important thing is to have a flexible, scalable data pipeline that can adapt to new use cases and keep up as your data volume grows.
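As a rough illustration of that definition, here is a minimal Python sketch of the ingest, transform, and load steps. Everything in it is an assumption made for the example and does not come from the video: the two in-memory sources, the record fields, and the function names are all hypothetical.

```python
from typing import Iterable, Iterator

# Hypothetical raw records from two sources (illustrative only).
ORDERS_DB = [{"id": 1, "amount": "19.90"}, {"id": 2, "amount": "5.00"}]
ORDERS_API = [{"id": 3, "amount": "7.50"}, {"id": 4, "amount": None}]


def ingest(*sources: Iterable[dict]) -> Iterator[dict]:
    """Gather raw records from every source into one stream."""
    for source in sources:
        yield from source


def transform(records: Iterable[dict]) -> Iterator[dict]:
    """Drop rows without an amount and convert amounts to floats."""
    for record in records:
        if record["amount"] is None:
            continue
        yield {"id": record["id"], "amount": float(record["amount"])}


def load(records: Iterable[dict], destination: list) -> int:
    """Append records to the destination and return how many were written."""
    before = len(destination)
    destination.extend(records)
    return len(destination) - before


def run_pipeline(destination: list) -> int:
    """Chain the three steps: ingest -> transform -> load."""
    return load(transform(ingest(ORDERS_DB, ORDERS_API)), destination)


warehouse: list = []  # stands in for a real storage layer
rows_written = run_pipeline(warehouse)
```

Keeping each step as a separate function means a new source or a new transformation can be added without touching the rest, which is one simple way to get the flexibility described above.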
A data engineer acts as a gatekeeper and facilitator of data transit and storage. Data engineers create data reservoirs and are critical to managing them.
Data engineering is quickly gaining ground as a profession in the data science ecosystem, and because of the strong demand for data engineers, even FAANG companies are willing to pay a premium for qualified people.
The following topics are covered in this video on designing a data pipeline:
00:00 - Introduction
00:57 - Understanding of data domains
02:57 - Choosing data sources
04:43 - Determine the data ingestion strategy
08:37 - Design the data processing plan
11:11 - Set up storage for the pipeline output
13:19 - Plan the data workflow
14:42 - Monitoring and governance tools
16:22 - Designing a demo data pipeline
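The ingestion-strategy chapter (04:43) is summarised in the comments as a choice between a full load and an incremental load. The sketch below shows one common way to tell the two apart, using a "last updated" watermark. The table contents, the field names, and the watermark approach itself are assumptions for illustration, not the method shown in the video.

```python
from datetime import datetime

# Hypothetical source table (illustrative only).
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2023, 1, 1)},
    {"id": 2, "updated_at": datetime(2023, 1, 2)},
    {"id": 3, "updated_at": datetime(2023, 1, 3)},
]


def full_load(source):
    """Full load: copy every row on every run."""
    return list(source)


def incremental_load(source, watermark):
    """Incremental load: copy only rows changed after the watermark.

    Returns the new rows and the watermark to use on the next run.
    """
    new_rows = [row for row in source if row["updated_at"] > watermark]
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=watermark
    )
    return new_rows, new_watermark


all_rows = full_load(SOURCE_ROWS)
new_rows, watermark = incremental_load(SOURCE_ROWS, datetime(2023, 1, 1))
```

A full load is simpler and self-correcting but re-reads everything; an incremental load moves less data but depends on the source exposing a reliable change marker.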
--------------------------------------- About SCALER -------------------------------------------------
A transformative tech school, creating talent with impeccable skills. Upskill and Create Impact.
Learn more about Scaler: bit.ly/3PqyUyS
#datapipeline #dataengineering #bigdata #SCALER

Comments: 55
@SCALER 1 year ago
Check out our FREE masterclasses by leading industry experts now: bit.ly/3Apojjv
@ankitKumar-js1ow 1 year ago
I think Scaler should have a separate Data Engineering course, with DSA, system design, and industry-level content, as most people are working in data engineering rather than data science. Waiting for a quality course like that to help me move into a product-based company.
@sandeepdash5652 3 months ago
@ankitKumar-js1ow So far they do not have a plan or module for Data Engineering. They are simply not interested. And what they do have for DE is just not digestible.
@akhilcoder 1 year ago
Regular content. Can easily be found on the internet.
@coding3438 1 year ago
Haha
@sandeepdash5652 4 months ago
Paid content is terrible.
@AkashKumar-kx9vj 1 year ago
Shashank just makes everything so easy to understand
@ArunSingh-rk7mm 1 year ago
Thank you for talking about a demo pipeline, this could come in handy in interviews.
@daniyaqureshi6201 1 year ago
Thank you for the brilliant video
@asishjoshi5774 1 year ago
very nice.. thanks a ton!
@Rk-mv8sz 1 year ago
Good content. Thank you🙏
@shaistaqureshi8408 1 year ago
I just wanna say thank you for this video
@panktikhurana8906 1 year ago
Awesome content 🙂
@abhisekchowdhury8584 1 year ago
Awesome Video
@krishnasaksena2364 1 year ago
Thanks scaler! 🔥
@saniyasharif9861 1 year ago
Brilliant video again
@healthificteam8465 1 year ago
Can't wait!
@shrutiikarla1055 1 year ago
Thank you scaler
@obiradaniel7391 1 year ago
Thank you.
@saibabatelagamsetty2538 1 year ago
Really good Content
@arunsundar3739 2 months ago
helps to see the big picture, thank you very much :)
@tamannamam3563 1 year ago
I easily understood this video
@umakantyadav9972 1 year ago
Thanks, Shashank, for explaining in a very understandable manner. But I have one question: you did not discuss the staging area?
@cutipy433 1 year ago
Very nice content
@marksun6420 10 months ago
Thanks
@divyanshtayal5077 1 year ago
Make more videos, Gurudev. Thank you very much.
@saniyapoetry8386 1 year ago
Very nice 🙂
@endpermia 9 months ago
Thank you! This was really helpful and well-explained.
@SCALER 9 months ago
Happy to hear that! 🙌🏼
@NehaSingh-wp4mf 4 months ago
Very well explained, and all the important topics were covered. Thank you for your efforts. Very helpful.
@SCALER 4 months ago
Thanks! Glad this was helpful! 😃
@avshekraj 11 months ago
Thank you for the nice explanation
@SCALER 10 months ago
Happy to hear that! 🙌🏼
@AmitSharma-xv6sh 7 months ago
This is a really detailed and great explanation of end-to-end data pipeline architecture. Hats off to your hard work in putting this video out there for us, brother. It will definitely clear up doubts and give a clear picture of how pipelines work for data migration/ingestion/integration projects. Thanks a lot. 🙏
@SCALER 7 months ago
Thanks! Glad this was helpful! 😃
@parisreview4651 1 year ago
You guys did a great job.
@Sameerkhan-kt5jj 1 year ago
More Data engineering related content please
@prachiipandeyy 1 year ago
🔥🔥🔥
@ruthmk 3 months ago
Double like 👍🏽 Thank you
@ramangupta6159 1 year ago
Grafana is a really good monitoring tool
@justdataengineer3138 1 year ago
When will a complete Data Engineering course be launched by Scaler?
@nandlaljaiswal7217 1 year ago
Need a full course for Data Engineers
@StartDataLate 1 month ago
Here is a summary:
00:57 - Understanding of data domains (example: finance data terminology, relationships, primary keys, foreign keys; give the business side a clear picture of what data engineers can provide)
02:57 - Choosing data sources (examples: SQL databases, distributed file systems, APIs, sensor data, data generated by web applications)
04:43 - Determine the data ingestion strategy (full load or incremental load)
08:37 - Design the data processing plan (real-time or batch processing)
11:11 - Set up storage for the pipeline output (Amazon S3 or HDFS for a data lake; AWS Redshift or Hive for a data warehouse; or write back to transactional databases)
13:19 - Plan the data workflow (schedulers: Apache Airflow, Apache NiFi, Azkaban)
14:42 - Monitoring and governance tools (alerts for pipeline failures; tools: Kibana, Grafana, DataDog, PagerDuty)
@TheSoumyakole 7 months ago
How can NoSQL (specifically Cassandra, MongoDB) be good for ad-hoc analytical queries, as mentioned at 12:05?
@FaizanKhan-ct7pc 1 year ago
As a data engineer, should you know all of these technologies before getting a job, or are they acquired on the job?
@Watson22j 1 year ago
You can easily get an entry-level job in data engineering if you know SQL well, plus basic Python, basic cloud, and Hadoop architecture.
@shanayakhan839 1 year ago
Redshift is already set up on the cloud; what about Hive?
@it3374 1 year ago
Please show one pipeline being built hands-on... There is not a single video on KZbin that actually builds a big data pipeline...
@piyushjain419 1 year ago
Scaler knows what we students are searching for on Google before an exam lol
@PankajKumar-vv5db 1 year ago
Here the data source is MySQL. What if data were coming in from multiple sources?
@bangalibangalore2404 1 year ago
The data modelling part was missed, I guess
@ashutoshrai5342 1 year ago
Bumb explanation. What he is explaining is based on his experience. It's not at all generic. He himself needs to improve.
@nemodbuniversity 1 year ago
Half-baked, incomplete knowledge
@sheenagupta896 1 year ago
Thank you for talking about a demo pipeline, this could come in handy in interviews.
@fazaila2047 1 year ago
Grafana is a really good monitoring tool