Build your first pipeline DAG | Apache airflow for beginners

52,545 views

MaxcoTec Learning

Comments: 35
@MrMegabeat 1 year ago
My best 10-minute investment of the morning! 🎉
@demohub 1 year ago
This video has definitely given me a better understanding of Airflow, and I now have some ideas on how to use it more effectively in my projects.
@muhammedalbayati 3 years ago
Thanks a lot. Very good tutorials. Thumbs up, and subscribed.
@maxcoteclearning 3 years ago
Thank you :)
@downeytang7006 2 years ago
A quick question: if two CSV files use different date formats, is there a way to unify the format after concatenating them? For example, '2021-01-13' and '01-20-2022'.
@maxcoteclearning 2 years ago
Thanks for watching. Have a look at dateutil.parser. Check out more answers here: stackoverflow.com/a/40800072/5167801
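A minimal sketch of the dateutil.parser suggestion above, assuming pandas is used for the concat as in the question. The column name booking_date and the helper unify_dates are hypothetical and chosen only for illustration.

```python
import pandas as pd
from dateutil import parser


def unify_dates(series, dayfirst=False):
    """Parse each date string and return it as an ISO 8601 (YYYY-MM-DD) string."""
    return series.apply(
        lambda value: parser.parse(value, dayfirst=dayfirst).date().isoformat()
    )


# Two frames standing in for the two CSV files (hypothetical sample data).
first = pd.DataFrame({"booking_date": ["2021-01-13"]})
second = pd.DataFrame({"booking_date": ["01-20-2022"]})

combined = pd.concat([first, second], ignore_index=True)
combined["booking_date"] = unify_dates(combined["booking_date"])
print(combined)
```

dateutil reads an ambiguous value such as '01-02-2022' month-first unless dayfirst=True is passed, so sources that mix day-first and month-first dates are safer to normalize separately before the concat.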
@najmuddin7506 2 years ago
Thanks for the tutorial! However, could you explain the difference between running this workflow as an Airflow DAG and simply running the program and calling both functions sequentially?
@maxcoteclearning 2 years ago
A single, simple workflow like this one is easy to manage, so you may not need Airflow. But things become really hard to manage and maintain once you have 100+ complex workflows. Watch the first video of this series, where I explain what problem Airflow solves (kzbin.info/www/bejne/a2eqdX6rp9abedE).
@victoriwuoha3081 2 years ago
@MaxcoTec Do you have any resources on how I can read data from an API, perform some similar processing, and finally write to a destination SQL Server? I'd be grateful for any advice.
@maxcoteclearning 2 years ago
You can use the popular Python library requests (docs.python-requests.org/en/latest/) to fetch data from any API. It has good examples under the Quickstart section. Hope that helps :)
@victoriwuoha3081 2 years ago
@maxcoteclearning Thank you, I'll try that out.
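As a rough sketch of the requests suggestion above: fetch JSON from an API and flatten it into a pandas DataFrame for further processing. The function names, the optional session parameter, and the idea that the API returns a JSON list of records are all assumptions made for illustration; a real API may need authentication or pagination.

```python
import pandas as pd
import requests


def fetch_records(url, session=None, timeout=10):
    """GET a JSON payload from `url` and return the decoded Python object.

    `session` is an optional requests.Session (or any object with a
    compatible .get method); it defaults to the requests module itself.
    """
    http = session or requests
    response = http.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def records_to_frame(records):
    """Flatten a list of (possibly nested) JSON records into a DataFrame."""
    return pd.json_normalize(records)
```

The resulting DataFrame can be processed like the CSV data in the video and then written out with DataFrame.to_sql.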
@parikshitchavan2211 2 years ago
Hello, thanks for such a great tutorial; you made everything smooth like butter. Just one question: whenever we create a new DAG, do we have to add docker-compose-CeleryExecutor, docker-compose-LocalExecutor, and a config for that particular DAG?
@maxcoteclearning 2 years ago
Thanks, Parikshit. Only one executor can be used at a time. You can add multiple DAGs while keeping a single executor with the same config file.
@ammadkhan4687 8 months ago
How can I access the Airflow container and add more DAGs when I am hosting it on another server?
@maxcoteclearning 8 months ago
Hi Ammad, could you explain what 'when I am hosting to another server' means?
@ammadkhan4687 8 months ago
@maxcoteclearning Suppose I have a Docker hosting server that I connect to remotely. How can we, as a team, create more DAGs to run on this hosted server? For example, hosting the Airflow Docker container in Azure cloud or on an on-premises Docker hosting server.
@nghianguyen9439 3 years ago
Thanks for the very good videos. Could you give me some instructions on an example data pipeline in MongoDB?
@maxcoteclearning 3 years ago
You're welcome. Sure, can you explain more about your pipeline? What's the data flow (source/destination)? Are you persisting data into MongoDB or extracting it? Have you looked at this: github.com/airflow-plugins/mongo_plugin/blob/master/operators/s3_to_mongo_operator.py
@nghianguyen9439 3 years ago
@maxcoteclearning I am trying to sync data between two separate MongoDB instances, or simply read a CSV file and import it into MongoDB.
@hsk7715 4 months ago
Hey brother, the DAG is not showing in my UI.
@maxcoteclearning 4 months ago
Hi hsk, can you elaborate on the exact issue you're facing?
@hsk7715 4 months ago
@maxcoteclearning Yeah, it was a small issue: I had given a start date in the past, so I switched to the current date and now the DAG is showing. But I'm facing another error: whenever I click the trigger button, the run always fails and reports that the file was not found, even though I already uploaded the file to the right directory. Can you help me, please?
@FedericoLov 2 years ago
Good video, but it seems that the actual transformations are done in pandas, while Airflow only provides a layer of logging and task scheduling.
@maxcoteclearning 2 years ago
That's true. Airflow is a workflow management tool. I've just used a simple ETL operation to show how it can be deployed and managed using Airflow.
@muhammedalbayati 3 years ago
Please, how can I save this CSV data to an MS SQL Server database?
@maxcoteclearning 3 years ago
It will be similar to the way we load data into the SQLite database: pandas_df.to_sql("table_name", engine)
@muhammedalbayati 3 years ago
@maxcoteclearning Thanks
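A small sketch of the to_sql call mentioned above. To keep it runnable without a database server, it writes to an in-memory SQLite database; the table name, column names, and the helper load_frame are hypothetical. For SQL Server, the connection would instead be a SQLAlchemy engine, as noted in the comments (driver name and credentials there are placeholders, not values from the video).

```python
import sqlite3

import pandas as pd


def load_frame(frame, table_name, connection):
    """Append the rows of `frame` to `table_name`, creating the table if needed."""
    frame.to_sql(table_name, connection, if_exists="append", index=False)


# In-memory SQLite stands in for the destination database in this sketch.
conn = sqlite3.connect(":memory:")

# For SQL Server, pass a SQLAlchemy engine instead (assumption: pyodbc and a
# suitable ODBC driver are installed; every value in the URL is a placeholder):
#   from sqlalchemy import create_engine
#   engine = create_engine(
#       "mssql+pyodbc://user:password@host/database"
#       "?driver=ODBC+Driver+17+for+SQL+Server"
#   )

bookings = pd.DataFrame({"booking_id": [1, 2], "cost": [10.5, 20.0]})
load_frame(bookings, "bookings", conn)
print(pd.read_sql("SELECT * FROM bookings", conn))
```

if_exists="append" keeps existing rows on repeated runs; "replace" would drop and recreate the table each time, which may or may not suit a scheduled pipeline.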
@riyasingh2515 3 years ago
My tasks are failing in the Airflow UI. Can you tell me why this is happening? I copied all your code properly.
@maxcoteclearning 3 years ago
Hi Riya, may I know what exact errors you are getting?
@diptimanraichaudhuri6477 2 years ago
I was also getting a failed DAG initially with the GitHub code sample. It turns out there is a variable "file_date_path" in the transform_data method, which is constructed from the op_args passed to the DAG. So unless the file is kept in the same folder hierarchy, reading the booking file will fail. Keep your booking .csv in the hierarchy "raw_data//" and it will start working. Otherwise, you can modify the code where it reads from that folder and simply read and write without dates in the folder names. It is rare to find such a well-laid-out series. Kudos, MaxcoTec!
@orick92 2 years ago
You should remove "beginners" from the title...
@TheVickramsharma 2 years ago
Hi @MaxcoTec, I tried running this example and am getting this error: FileNotFoundError: [Errno 2] No such file or directory: '/opt/airflow/raw_data/2022-12-13/5/booking.csv'. Could you please help me with this?
@maxcoteclearning 2 years ago
Are you sure you're running the code from the right branch? The code for this video is not in the main branch. Check this commit: github.com/maxcotec/Apache-Airflow/tree/1787097721a8cec8999bdaee4c04a9f4bc0e1f71/DAG_ingestion_pipeline
@prafulsoni9378 1 year ago
@maxcoteclearning I'm also facing the same issue. I cloned the branch and ran `docker-compose up` in `DAG_ingestion_pipeline`.
@prafulsoni9378 1 year ago
@maxcoteclearning I am using Windows!
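The FileNotFoundError reported in the comments above points at a dated input path. Judging only from that error message, the layout appears to be raw_data/<date>/<hour>/booking.csv under /opt/airflow; that reading is an assumption, and the repository code may build the path differently. The helpers below are hypothetical and sketch how such a path could be built and checked before reading, so that a missing file produces a clearer message.

```python
from pathlib import Path


def build_input_path(base_dir, run_date, hour, filename="booking.csv"):
    """Build <base_dir>/raw_data/<run_date>/<hour>/<filename> (assumed layout)."""
    return Path(base_dir) / "raw_data" / run_date / str(hour) / filename


def require_file(path):
    """Return `path` if it is an existing file, else raise a descriptive error."""
    if not path.is_file():
        raise FileNotFoundError(f"Expected input file is missing: {path}")
    return path


# Path components taken from the error message quoted in the comments.
expected = build_input_path("/opt/airflow", "2022-12-13", 5)
print(expected.as_posix())
```

When Airflow runs in Docker, the path is resolved inside the container, so the file has to be present in a directory that is mounted into the container, not only on the host.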