Creating an ETL Data Pipeline on Google Cloud with Cloud Data Fusion & Airflow - Part 1

33,155 views

TechTrapture


1 day ago

Part 2 - • Creating an ETL Data P...
Source Code - github.com/vis...
Creating an ETL Data Pipeline on Google Cloud with Cloud Data Fusion & Airflow
This tutorial walks through building an ETL pipeline on Google Cloud: extracting data, transforming it, loading it into BigQuery, and visualizing it in Looker Studio.
Step 1: Generate dummy employee data with the Python Faker library and store it in a Google Cloud Storage (GCS) bucket.
Step 2: Create a Cloud Data Fusion instance, which hosts the pipeline.
Step 3: Build a Data Fusion pipeline that transforms the data, masks sensitive fields, and loads the result into BigQuery.
Step 4: Visualize the loaded data in Looker Studio.
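Step 1 can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the column names, the helper function names, and the bucket and object names in the usage note are assumptions for illustration. It needs the third-party `faker` and `google-cloud-storage` packages, plus application credentials for the upload.

```python
import csv
import io

# Assumed column set for the dummy employee file (not taken from the video).
FIELDS = [
    "first_name", "last_name", "job_title", "email", "phone_number",
    "ssn", "date_of_birth", "salary", "password",
]


def make_employee(fake):
    """Build one dummy employee record from a Faker instance."""
    return {
        "first_name": fake.first_name(),
        "last_name": fake.last_name(),
        "job_title": fake.job(),
        "email": fake.email(),
        "phone_number": fake.phone_number(),
        "ssn": fake.ssn(),
        "date_of_birth": fake.date_of_birth(minimum_age=18, maximum_age=65).isoformat(),
        "salary": fake.random_int(min=30000, max=150000),
        "password": fake.password(),
    }


def generate_employees(count):
    """Return `count` dummy employee records."""
    from faker import Faker  # third-party: pip install faker

    fake = Faker()
    return [make_employee(fake) for _ in range(count)]


def to_csv(rows):
    """Serialize records to CSV text; values containing commas are quoted."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def upload_to_gcs(bucket_name, blob_name, text):
    """Upload CSV text to an existing GCS bucket."""
    from google.cloud import storage  # third-party: pip install google-cloud-storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(blob_name)
    blob.upload_from_string(text, content_type="text/csv")
```

With a hypothetical bucket, usage would look like `upload_to_gcs("my-employee-bucket", "employee_data.csv", to_csv(generate_employees(100)))`. Writing through the `csv` module means generated values that contain commas, such as some job titles, are quoted instead of spilling into extra columns.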
Join me to see each stage of an ETL pipeline, from orchestration to visualization, in the Google Cloud ecosystem.
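To make the masking in Step 3 concrete, here is a plain-Python illustration of one common scheme: hide every digit except the last four and keep the separators. It only shows the idea. It is not how Data Fusion implements masking, and the function name and defaults are assumptions.

```python
def mask_keep_last(value: str, visible: int = 4, mask_char: str = "x") -> str:
    """Mask all digits except the last `visible` ones, keeping separators."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - visible, 0)
    masked = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            masked.append(mask_char)  # hide a leading digit
            to_mask -= 1
        else:
            masked.append(ch)  # keep separators and the trailing digits
    return "".join(masked)


print(mask_keep_last("123-45-6789"))  # xxx-xx-6789
```

Values with four or fewer digits pass through unchanged, so a stricter rule may be needed for short identifiers.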
Looking to get in touch?
Drop me a line at vishal.bulbule@gmail.com, or schedule a meeting using the provided link topmate.io/vis...
Playlists
Associate Cloud Engineer - Complete Free Course
• Associate Cloud Engine...
Google Cloud Data Engineer Certification Course
• Google Cloud Data Engi...
Google Cloud Platform (GCP) Tutorials
• Google Cloud Platform(...
Generative AI
• Generative AI
Getting Started with Duet AI
• Getting started with D...
Google Cloud Projects
• Google Cloud Projects
Python For GCP
• Python for GCP
Terraform Tutorials
• Terraform Associate C...
LinkedIn
/ vishal-bulbule
Medium Blog
/ vishalbulbule
GitHub
Source Code
github.com/vis...
#googlecloud #gcp #airflow #dataengineeringessentials #dataengineering #bigquery #dataengineeringprojects

Comments: 39
@AR-by2lk 7 months ago
Thank You Vishal for doing this. It will be definitely a great help! Kudos to you!
@rajeshiyer4999 3 months ago
Thanks Vishal for the detailed pipeline design and development video. Great job.
@royal_dsz 2 months ago
Thanks Vishal, this was very helpful
@LMGaming0 3 months ago
Very simple and well explained, thanks!
@zikoalexis2751 8 months ago
Thank you for the help
@asifshaharia2756 3 months ago
I'm facing a problem. In my Cloud Data Fusion pipeline, some of the values in the phone_number and ssn fields are missing, and the date of birth and password columns are completely empty. Could you please help me troubleshoot it?
@selvaarul8258 6 months ago
Awesome video. Can you create a complete Composer/Airflow video for this one?
@techtrapture 6 months ago
There is a separate playlist for Composer. Cloud Composer - Airflow on GCP: kzbin.info/aero/PLLrA_pU9-Gz22Zml5mxcszG4A9ecqWtd4
@vikascbr 2 months ago
Thanks very helpful
@renvils 5 months ago
Great video as always! Can you add timestamps for this video?
@basavrajningadali4919 3 months ago
I'm not getting the mask data option in Wrangler.
@abhisheknaidu8877 6 months ago
I am getting more environment errors while connecting Data Fusion, and the Python code has an error.
@anonymous8038-c4m 4 months ago
Data Fusion is not parsing the salary and many other fields, although they are in the CSV.
@lmarwarl 5 months ago
Amazing video. Unfortunately I have problems creating my Cloud Composer environment, maybe because I am on a free trial. I get this error after creating the environment: CREATE operation on this environment failed 49 minutes ago with the following error message: Some of the GKE pods failed to become healthy. Please check the GKE logs for details, and retry the operation.
@Abracadanz00 5 months ago
I'm having the same issue, any idea how to resolve it?
@lmarwarl 5 months ago
@@Abracadanz00 Nothing yet, but after a lot of searching I read a post from Google that says you have to activate your billing account in GCP before creating the Cloud Composer environment.
@paranoya733 4 months ago
@@Abracadanz00 If you want a shorter, free pipeline, at 14:57 drop Cloud Composer, Cloud Storage, Cloud Data Fusion, and BigQuery and replace them with a short free pipeline: Google Sheets (data) -> Looker Studio. If you extract API data, add the "API Connector" extension in Google Sheets and configure it (search on YouTube) -> Looker Studio.
@basavrajningadali4919 3 months ago
I am not able to create a Composer environment.
@adijos92 8 months ago
The Cloud Composer environment shows an error, and the image version is not shown when creating the environment manually. Is there any update?
@adijos92 8 months ago
Please reply to that.
@abdulfasith7905 5 months ago
Nice video. Can you create a pipeline using server / serverless Dataproc?
@yishanzhan6066 7 months ago
I got these errors "Cannot load filesystem: java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.web.HftpFileSystem not found. Can not load the default value of `spark.yarn.isHadoopProvided` from `org/apache/spark/deploy/yarn/config.properties` with error, java.lang.NullPointerException. Using `false` as a default value." Any clues on how to fix it?
@figh761 7 months ago
Did you fix this?
@akshaymantena6699 4 months ago
I'm also getting the same error. Did you fix it?
@Daswinian 15 days ago
I think it's a permission issue. Try adding the following roles to the compute service account your Data Fusion instance uses: Dataproc Service Agent, Dataproc Worker, Editor, Service Account User.
@TeekawinKirdsaeng 7 months ago
How do I use gcloud in VS Code? Error: gcloud : The term 'gcloud' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again
@techtrapture 7 months ago
Install the Google Cloud SDK on your system using this link: cloud.google.com/sdk/docs/install#windows
@Alfred_vinci 8 months ago
In place of Airflow I want to use Mage AI.
@promitdutta3029 8 months ago
Composer shows "This environment has errors"
@fatallny 7 months ago
thank you!!
@punk77777 8 months ago
Kindly make this kind of ETL pipeline video with {GCS-->(COMPOSER---DATAFLOW)--->BIGQUERY}.
@techtrapture 8 months ago
It's already there: kzbin.info/www/bejne/i4mtqZaNnN2qq5I
@VthePeople4156 8 months ago
Please explain the whole project in 3-5 sentences for interview purposes: what the flow of the project is, which GCP services are used, and how you developed the different modules using those services.
@Rajdeep6452 8 months ago
@@VthePeople4156 Cant you see and tell? Does he have to spoon feed you now? your parents still wash your ass?
@VthePeople4156 8 months ago
@@Rajdeep6452 yes
@Rajdeep6452 8 months ago
@@VthePeople4156 idiot xD
@flosrv3194 5 months ago
It says gcloud is not an executable, so your login steps don't work for everyone, and you did things beforehand without showing them in the video. Next time, please show everything from scratch, actually doing it rather than just saying it.
@techtrapture 5 months ago
Apologies if I missed that. You need to install gcloud (the Cloud SDK) first to run the command.