Great series. Please include a session on how to handle skew in a join, and bucket join vs. shuffle hash join.
@dataengineeringforeveryone 1 year ago
That's the next video. It will get published this week. Please share with your friends!
@tusharhatwar 1 year ago
Thank You bro for the video
@dataengineeringforeveryone 1 year ago
My pleasure
@passions9730 1 year ago
Thank you Nilendra for the video, hope to see more videos regarding this topic on your channel. One small suggestion: please don't add background music in upcoming videos..😊
@dataengineeringforeveryone 1 year ago
Suggestion noted. Next video uploaded too :)
@sriadityab4794 1 year ago
Thanks 👍
@dataengineeringforeveryone 1 year ago
Welcome 👍
@Sonuyadav-um9fj 1 year ago
Thanks sir 🙏
@dataengineeringforeveryone 1 year ago
Hope you like the video, and please share!
@TechnoSparkBigData 1 year ago
Hi Sir, I would request you to please remove the background music 😊. That said, your content is awesome.
@dataengineeringforeveryone 1 year ago
Point noted. But to me it seems that with a voice-only video, viewers might get bored. Haha
@TechnoSparkBigData 1 year ago
@@dataengineeringforeveryone if content is good then nobody will get bored. Content is key.
@TechnoSparkBigData 1 year ago
This is my video on installing PySpark. I am also a data engineer: kzbin.info/www/bejne/pIC2ial8mZmbpbs
@aaroncode2634 1 year ago
@@dataengineeringforeveryone Let the background music be, but reduce its volume so it does not dominate your voice. Great content btw 😊
@nandlalsharma521 1 year ago
Can you provide a link to this document?
@dataengineeringforeveryone 1 year ago
Hey, unfortunately my OneDrive is full and not shareable at the moment. I need to upgrade it.
@karthikr4185 1 year ago
Hi Nilendra, thank you for sharing... I have one famous question: how many cores and executors are required for 100 GB of data? Could you please help us understand this... Thanks in advance
@dataengineeringforeveryone 1 year ago
The number of cores and executors needed to process 100 GB of data depends on several factors, such as the complexity of the processing logic, the type of data, and the resources available on the Spark cluster. Here is a rough estimate based on some general assumptions: a single executor can process about 10 GB of data in a reasonable amount of time (again, this depends on core capacity), and a single executor typically runs with about 2-3 cores, depending on the processing logic and the amount of memory required. Based on these assumptions, processing 100 GB of data would require 10 executors * 2-3 cores per executor = 20-30 cores. However, this is just a rough estimate and your actual requirements may vary. It's always best to conduct performance testing with representative data to determine the actual resources required for your specific use case.
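The arithmetic above can be sketched as a small helper. Note that the 10 GB-per-executor and 2-3 cores-per-executor figures are the rough assumptions stated in the reply, not fixed Spark rules; tune them from your own performance tests.

```python
def estimate_resources(data_gb, gb_per_executor=10, cores_per_executor=(2, 3)):
    """Rough cluster sizing under the stated assumptions.

    Returns (executors, min_total_cores, max_total_cores).
    """
    # Ceiling division: enough executors to cover the data volume.
    executors = -(-data_gb // gb_per_executor)
    low, high = cores_per_executor
    return executors, executors * low, executors * high

executors, min_cores, max_cores = estimate_resources(100)
# For 100 GB: 10 executors, i.e. 20-30 total cores under these assumptions.
```

In practice you would then pass the chosen numbers to Spark, e.g. via `--num-executors` and `--executor-cores` on `spark-submit` (for YARN) or the equivalent `spark.executor.*` configuration properties.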
@karthikr4185 1 year ago
@@dataengineeringforeveryone Great, thanks for taking the time to reply... Your videos helped us a lot... Thanks again...