Great series. Please include a session on how to handle skew in a join, and bucket join vs. shuffle hash join.
@dataengineeringforeveryone 1 year ago
That's the next video. It will get published this week. Please share with your friends!
@tusharhatwar 1 year ago
Thank You bro for the video
@dataengineeringforeveryone 1 year ago
My pleasure
@passions9730 1 year ago
Thank you Nilendra for the video, hope to see more videos regarding this topic on your channel. One small suggestion: please don't add background music in upcoming videos..😊
@dataengineeringforeveryone 1 year ago
Suggestion noted. Next video uploaded too :)
@sriadityab4794 1 year ago
Thanks 👍
@dataengineeringforeveryone 1 year ago
Welcome 👍
@Sonuyadav-um9fj 1 year ago
Thanks sir 🙏
@dataengineeringforeveryone 1 year ago
Hope you like the video, and please share!
@TechnoSparkBigData 1 year ago
Hi Sir, I would request you to please remove the background music 😊. That said, your content is awesome.
@dataengineeringforeveryone 1 year ago
Point noted. But to me it seems that with a voice-only video, viewers might get bored. Haha
@TechnoSparkBigData 1 year ago
@@dataengineeringforeveryone if content is good then nobody will get bored. Content is key.
@TechnoSparkBigData 1 year ago
This is my video on installing PySpark. I am also a data engineer: kzbin.info/www/bejne/pIC2ial8mZmbpbs
@aaroncode2634 1 year ago
@@dataengineeringforeveryone Let the background music be, but reduce its volume so it does not dominate your voice. Great content btw 😊
@nandlalsharma521 1 year ago
Can you provide a link to this document?
@dataengineeringforeveryone 1 year ago
Hey, unfortunately my OneDrive is full and not shareable at the moment. I need to upgrade it.
@karthikr4185 1 year ago
Hi Nilendra, thank you for sharing... I have one famous question: how many cores and executors are required for 100 GB of data? Could you please help us understand this... Thanks in advance
@dataengineeringforeveryone 1 year ago
The number of cores and executors needed to process 100 GB of data depends on several factors, such as the complexity of the processing logic, the type of data, and the resources available on the Spark cluster. Here is a rough estimate based on some general assumptions: a single executor can process about 10 GB of data in a reasonable amount of time (again, this depends on core capacity), and a single executor typically runs with about 2-3 cores, depending on the processing logic and the amount of memory required. Based on these assumptions, processing 100 GB of data would require 10 executors * 2-3 cores per executor = 20-30 cores. However, this is just a rough estimate and your actual requirements may vary. It's always best to conduct performance testing with representative data to determine the actual resources required for your specific use case.
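The arithmetic above can be sketched as a small helper. Note that the 10 GB-per-executor and 2-3 cores-per-executor figures are the rough assumptions stated in the reply, not fixed Spark rules; tune them from your own performance tests.

```python
def estimate_resources(data_gb, gb_per_executor=10, cores_per_executor=(2, 3)):
    """Rough cluster sizing under the stated assumptions.

    Returns (executors, min_total_cores, max_total_cores).
    """
    # Ceiling division: enough executors to cover the data volume.
    executors = -(-data_gb // gb_per_executor)
    low, high = cores_per_executor
    return executors, executors * low, executors * high

executors, min_cores, max_cores = estimate_resources(100)
# For 100 GB: 10 executors, i.e. 20-30 total cores under these assumptions.
```

In practice you would then pass the chosen numbers to Spark, e.g. via `--num-executors` and `--executor-cores` on `spark-submit` (for YARN) or the equivalent `spark.executor.*` configuration properties.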
@karthikr4185 1 year ago
@@dataengineeringforeveryone Great, thanks for taking the time to reply... Your videos helped us a lot... Thanks again...