Hello, great video and great channel! I learned a lot about Kafka from your content and have started working with it. One question: how does this DAG handle multiple requests from multiple files arriving in S3? And if it handles them poorly, what is the best way to orchestrate it? Thanks for your attention.
@KnowledgeAmplifier1 1 year ago
Thank you for watching the video and for your question, Imagine Zat! I'm glad to hear that you find the content helpful.

Regarding your question about how the DAG handles multiple requests from multiple files arriving in S3, the answer is the _SUCCESS file. The _SUCCESS file is a marker file that Apache Spark generates automatically when it finishes writing data to a destination such as a file system (e.g., HDFS, S3). Its purpose is to indicate the successful completion of a write operation. So instead of triggering the Lambda code for every file write, trigger the Lambda only when the _SUCCESS file is written to S3. That way the Lambda runs only once, and this ensures the Airflow DAG runs only after the source system has written the complete dataset. 😊 For details, you can refer to this video: kzbin.info/www/bejne/paXQaIGYotusaac

I hope this clarifies your question. If you have any further inquiries, please feel free to ask.
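To make the idea concrete, here is a minimal sketch of a Lambda handler that ignores ordinary part files and triggers an Airflow DAG run only when the Spark _SUCCESS marker lands in S3, using Airflow's stable REST API (POST /api/v1/dags/{dag_id}/dagRuns). The Airflow URL, credentials, and DAG id below are hypothetical placeholders and not from the video.

```python
import json
import base64
import urllib.request

# Hypothetical placeholders -- replace with your Airflow webserver URL, credentials, and DAG id.
AIRFLOW_DAGRUNS_URL = "http://your-airflow-host:8080/api/v1/dags/s3_ingest_dag/dagRuns"
AIRFLOW_USER = "airflow"
AIRFLOW_PASSWORD = "airflow"


def lambda_handler(event, context):
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]

        # Skip ordinary part files; act only when the Spark _SUCCESS marker is written.
        if not key.endswith("_SUCCESS"):
            continue

        # Trigger exactly one DAG run for the completed write, passing the key as DAG run conf.
        payload = json.dumps({"conf": {"s3_key": key}}).encode("utf-8")
        request = urllib.request.Request(AIRFLOW_DAGRUNS_URL, data=payload, method="POST")
        request.add_header("Content-Type", "application/json")
        token = base64.b64encode(f"{AIRFLOW_USER}:{AIRFLOW_PASSWORD}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")

        with urllib.request.urlopen(request) as response:
            return {"statusCode": response.status, "body": response.read().decode()}

    return {"statusCode": 200, "body": "No _SUCCESS file in this event; nothing to trigger."}
```

Alternatively, you can put the filtering in the S3 event notification itself by configuring a suffix filter of "_SUCCESS", so the Lambda is only invoked for the marker object and the handler does not need the key check at all.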
@SanjayKumar-d4t2q 9 months ago
Let's say Lambda triggered multiple REST API requests to Airflow. How will Airflow handle that scenario? Will it create multiple DAG runs and execute them concurrently, or will there be one DAG run with multiple concurrent tasks, one for each request?