Apache Spark Executor Tuning | Executor Cores & Memory

6,749 views

Days ago

Welcome back to our comprehensive series on Apache Spark Performance Tuning & Optimisation! In this guide, we dive deep into the art of executor tuning in Apache Spark to ensure your data engineering tasks run efficiently.
🔹 What is inside:
Learn how to properly allocate CPU and memory resources to your Spark executors and the number of executors to create to achieve optimal performance. Whether you're new to Apache Spark or an experienced data engineer looking to refine your Spark jobs, this video provides valuable insights into configuring the number of executors, memory, and cores for peak performance. I’ve covered everything from understanding the basic structure of Spark executors within a cluster, to advanced strategies for sizing executors optimally, including detailed examples and calculations.
📘 Resources:
📄 Complete Code on GitHub: github.com/afaqueahmad7117/sp...
🎥 Full Spark Performance Tuning Playlist: • Apache Spark Performan...
🔗 LinkedIn: / afaque-ahmad-5a5847129
Chapters:
0:00 - Introduction to Executor Tuning in Apache Spark
0:37 - Understanding Executors in a Spark Cluster
3:30 - Example: Sizing Executors in a Cluster
4:58 - Example: Sizing a Fat Executor
9:34 - Example: Sizing a Thin Executor
12:50 - Advantages and Disadvantages of Fat Executor
18:25 - Advantages and Disadvantages of Thin Executor
22:12 - Rules for sizing an Optimal Executor
26:30 - Example 1: Sizing an Optimal Executor
38:15 - Example 2: Sizing an Optimal Executor
43:50 - Key Takeaways
#ApacheSparkTutorial #SparkPerformanceTuning #ApacheSparkPython #LearnApacheSpark #SparkInterviewQuestions #ApacheSparkCourse #PerformanceTuningInPySpark #ApacheSparkPerformanceOptimization #ApacheSpark #DataEngineering #SparkTuning #PythonSpark #ExecutorTuning #SparkOptimization #DataProcessing #pyspark #databricks

Comments: 84

@dudechany 3 days ago

Every time I come here before an interview, I try to give this video a like, but end up realising that I already did it earlier. Best video on this topic on the whole internet.

@bijjigirisupraja8021 20 days ago

Bro, please do videos on Spark regularly; it will be very helpful. Thank you

@BabaiChakraborty-ss8pt 3 months ago

Man, your tutorials are the best. I have been following you for Spark tuning related videos. Thanks

@afaqueahmad7117 2 months ago

Thank you @BabaiChakraborty-ss8pt, really appreciate it, means a lot to me :)

@mohitupadhayay1439 18 days ago

Really waiting to see if you can add some real world use cases to your videos to strengthen our understanding. It will be appreciated a lot man!

@SandeepPatel-wt7ye 21 days ago

This is awesome stuff. The executor tuning concept is explained at a very granular level.

@afaqueahmad7117 20 days ago

Appreciate it @SandeepPatel-wt7ye, thank you!

@harshshah8884 11 days ago

@afaqueahmad7117 Quick question: let's say I have limited RAM available, like 50 GB, and want to process 1 TB of data, and no additional capacity can be added to the cluster. How should we apply your video to choose the optimal number of executors, memory per executor, and cores per executor?

@AshishStudyDE 1 month ago

Great work, going good. I hope you cover two more topics, driver OOM and executor OOM: why they happen and how we can tackle them.

@saineelkiranch9790 3 months ago

Excellent. Very Well Explained

@afaqueahmad7117 3 months ago

Thank you @saineelkiranch9790, really appreciate it :)

@mayapareek2844 2 months ago

Wow !! Great Content !! I am preparing for interviews and found this super helpful. Thanks a Ton !!

@afaqueahmad7117 2 months ago

Glad you're finding it helpful @mayapareek2844, heartfelt thanks :)

@seenu0104 3 months ago

Thank you very much for this amazing content with super easy explanation 👏👏

@afaqueahmad7117 3 months ago

Thank you @seenu0104, really appreciate it :)

@iamexplorer6052 3 months ago

Thanks for this. I'm currently working on job optimization, and it is very useful to me.

@afaqueahmad7117 3 months ago

Thank you, really appreciate it :)

@adtempgupta 3 months ago

Thank you so much for the wonderful content. Please start a PySpark session.

@ComedyXRoad 2 months ago

thanks for the content and your efforts

@afaqueahmad7117 2 months ago

Thank you @ComedyXRoad, appreciate the kind words :)

@leilaturgarayeva105 3 months ago

Thank you for the useful content! IRL an analyst / engineer would have access to a huge cluster which is shared between many people / teams. It would be very interesting to watch a video where you calculate the amount of resources that should be requested based on the task at hand (particular dataset, task and output). And again - thanks for helping to understand these somewhat hard to grasp concepts :-)

@asokanramasamy2087 3 months ago

Great! If possible, please make a video on Spark Streaming as well!

@sankarshkadambari2742 3 months ago

Amazing is the word. You never disappoint us. Very grateful and indebted to you for this excellent content you are creating. God bless you!

@afaqueahmad7117 2 months ago

Thank you @sankarshkadambari2742, really appreciate it, means a lot to me :)

@chitransh847 23 days ago

Sir, can you please bring a Python and SQL series for interview prep, including the basics? The rest of the content is just great!

@afaqueahmad7117 20 days ago

Thank you, appreciate it @chitransh847, Python coming soon :)

@purnimasharma9734 2 months ago

Hello Afaque, your tutorials are excellent and I learnt so much about optimization techniques. I am wondering if you can add some real world use cases to your videos to strengthen our understanding. It will be appreciated a lot.

@iamkiri_ 3 months ago

Awesome :)

@afaqueahmad7117 2 months ago

Thank you @iamkiri_, really appreciate it :)

@yashwantdhole7645 1 month ago

Hi Afaque, it was a really nice video. Never got such a detailed understanding anywhere. Do you also provide 1:1 sessions? If yes, I am highly interested.

@afaqueahmad7117 29 days ago

Hey @yashwantdhole7645, appreciate the kind words, means a lot. At this moment, I do not take 1:1 sessions, but if you have any questions feel free to shoot an email or comment here in this thread :)

@wreckergta5470 3 months ago

Thanks

@afaqueahmad7117 3 months ago

Appreciate it, @wreckergta5470 :)

@yatinchadha1803 2 months ago

Thanks Afaque for this great tutorial. This will really help while working on Spark optimization. It would be of great help if you could tell us how you deal with this type of question: Spark cluster size: 200 cores and 100 GB RAM; data to be processed: 100 GB. Give the calculation for driver memory, driver cores, executor memory, overhead memory, and number of executors.

@afaqueahmad7117 1 month ago

Hey @yatinchadha1803, thanks for the kind words, really appreciate it. Regarding the question - after watching the video, it should be a cakewalk :)

@yatinchadha1803 1 month ago

@afaqueahmad7117 Can you please guide me on how to calculate the driver memory and driver cores?

@Amarjeet-fb3lk 2 months ago

Thanks for these videos. I have been watching your videos for quite a while. You explain things in a very easy and simple manner. But I think in practice we would be processing very large amounts of data, so it would be great if you could make a video on processing large amounts of data with all the optimisation techniques we can use. Thanks in advance.

@afaqueahmad7117 2 months ago

Hey @Amarjeet-fb3lk, Thank you so much for the kind words; they truly mean a lot! I'm delighted to hear that you find the explanations easy and simple to understand. While production/large-scale projects are in the future plans, I would like to emphasize that the fundamental concepts and optimization techniques remain the same. My goal is to help you build a rock solid understanding of these concepts so you can confidently apply them in any scenario.

@remedyiq8034 3 months ago

Hi, can you please make a video on understanding the Spark UI or the Databricks Spark UI? There are a lot of tabs there; it's tough to understand.

@afaqueahmad7117 3 months ago

Hey @remedyiq8034, could you share which tabs are troubling you? I've discussed the most important ones; links below:
1. Storage tab: Caching video (kzbin.info/www/bejne/fKbNqIWPoKeqg5Y)
2. SQL tab: Master Reading Spark Query Plans video (kzbin.info/www/bejne/gZ-4ia2qgNqYg7c)
3. Jobs/Stages/SQL: Unlock Performance With Spark DAG Mastery video (kzbin.info/www/bejne/hZCXZq13r5aCfc0)

@ashutoshpatkar4891 3 days ago

Hey man, learnt a lot from the video. Please help me out with this doubt. For example 2, you said total executors = 44/4 = 11. But shouldn't we think machine by machine? Here each machine can have 15/4, i.e. 3 executors with 4 cores each, giving a total of 3 * 3 nodes = 9. In your workout, it seems there will be an executor that uses some cores from one node and some from another. Am I wrong somewhere in my thought process?
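
The two ways of counting in the question above can be compared with a small sketch. The figures (3 nodes, 15 usable cores per node, 44 usable cores in total, 4 cores per executor) are taken from the comment; the function names are hypothetical. An executor is a single JVM process, so it runs on exactly one node, which is why the node-by-node count is the one that tells you how many whole executors actually fit.

```python
# Figures from the comment above: 3 nodes, 15 usable cores per node,
# 44 usable cores cluster-wide, 4 cores per executor.

def executors_cluster_wide(total_usable_cores: int, cores_per_executor: int) -> int:
    """Divide the pooled core count by the executor size."""
    return total_usable_cores // cores_per_executor


def executors_node_by_node(nodes: int, usable_cores_per_node: int,
                           cores_per_executor: int) -> int:
    """Fit whole executors onto each node, then sum across nodes."""
    return nodes * (usable_cores_per_node // cores_per_executor)


if __name__ == "__main__":
    print(executors_cluster_wide(44, 4))     # 11
    print(executors_node_by_node(3, 15, 4))  # 9
```

With these numbers the pooled count overstates what fits by two executors; an executor size that divides the per-node core count evenly (3 or 5 cores for 15 usable cores) would close that gap.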

@dataterre 3 months ago

Thanks Afaque, this is an excellent video to start my Saturday morning. It has been on my to-do list for the whole week. A couple of questions for you / the community, since this is very relevant to my current work.
1) Considering we are "exhausting" the cluster resources, could you explain where the driver node comes into the picture in this pool of resources (e.g. --driver-memory)? I presume a sizeable amount of driver memory is required, since we tend to collect data on the driver node in a count(), etc.
2) I understand the concept of optimal executor sizing here. Suppose my application is meant to run several Spark sessions in parallel; this optimal tuning would then mean I can only run 1 spark-submit job in the entire cluster, right?
Excellent video, again

@afaqueahmad7117 3 months ago

Hi @dataterre, thank you for the kind words, means a lot to me :) On the questions:
1. Indeed, a reasonable amount of cores and memory is required for the driver, because it coordinates the lifecycle of the application, manages communication, and creates and schedules the tasks to be executed on executors. However, since this video focuses specifically on "executor" tuning, driver resource allocation is skipped. It's important to note (as you rightly pointed out) that the driver needs resources for its own functioning and responsibilities, plus collecting data as a result of actions (count(), show() etc.). I would subtract an appropriate number of driver cores and memory from the total cluster cores/memory and then do the executor sizing discussed in the video.
2. Yes, this example assumes you're taking up the whole cluster for best utilization. However, if you want to run multiple Spark sessions in parallel, you could do the following:
a. Enable dynamic allocation (by setting `spark.dynamicAllocation.enabled` to `true`) to allow each session to use resources.
b. Define a reasonable minimum and maximum number of executors per application (using `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors`).
c. Adjust `spark.executor.cores` and `spark.executor.memory` using the principles/rules discussed in the video, to ensure that each application gets enough resources to perform efficiently but not so much that it monopolizes cluster resources.
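
Steps a to c can be sketched as a small helper that assembles the settings and renders them as `spark-submit` arguments. The configuration keys are the ones named in the reply; the helper names and the values (2 to 10 executors, 5 cores, 20g) are illustrative assumptions, not recommendations from the video.

```python
def dynamic_allocation_conf(min_executors: int, max_executors: int,
                            executor_cores: int, executor_memory: str) -> dict:
    """Collect the Spark settings from steps a-c into one dict."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",                    # step a
        "spark.dynamicAllocation.minExecutors": str(min_executors),   # step b
        "spark.dynamicAllocation.maxExecutors": str(max_executors),   # step b
        "spark.executor.cores": str(executor_cores),                  # step c
        "spark.executor.memory": executor_memory,                     # step c
    }


def to_submit_args(conf: dict) -> list:
    """Render the dict as `--conf key=value` pairs for spark-submit."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Hypothetical per-application limits on a shared cluster.
    conf = dynamic_allocation_conf(2, 10, 5, "20g")
    print(" ".join(["spark-submit"] + to_submit_args(conf) + ["my_job.py"]))
```

The same key/value pairs can also be set when building a session, so the dict form keeps the sizing decision in one place regardless of how the job is launched.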

@remedyiq8034 3 months ago

At 35:10 @afaqueahmad7117, I want to add one point. You said that execution happens in execution memory, that is, 60 percent, and 40 percent is user memory. So 60 percent of 20 GB is 12 GB. Of that, 50 percent is for execution and 50 percent for storage. Let's assume 50 percent is given to execution (static allocation): out of 12 GB, only 6 GB is for execution. As we have 5 cores per executor, 6/5 = approximately 1.2 GB of memory per core. So the maximum partition size that can be accommodated is about 1.2 GB. Is my thought process correct?

@iamkiri_ 3 months ago

Looks Like this is a valid question bro!

@afaqueahmad7117 3 months ago

Hi @remedyiq8034, this is a very valid point and thanks for highlighting this. You're absolutely right about ~1.2GB memory per core. My mind was referring to execution memory but I really appreciate your attention to the breakdown of the `--executor-memory` into its various components, which I should have explained more clearly before doing the memory per core calculation. I'll look into adding an info card to make this clear in the video. Thanks again for your sharp observation!
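
The ~1.2 GB figure confirmed above can be reproduced with a short sketch. It follows the commenter's simplification: 20 GB of executor memory, 60 percent of it unified (the default of `spark.memory.fraction`), half of that for execution (the default of `spark.memory.storageFraction` is 0.5), and a fixed split between execution and storage. The function name is hypothetical.

```python
def execution_memory_per_core_gb(heap_gb: float, cores: int,
                                 memory_fraction: float = 0.6,
                                 storage_fraction: float = 0.5) -> float:
    """Execution memory available to each core, assuming a fixed
    execution/storage split as in the comment thread."""
    unified_gb = heap_gb * memory_fraction            # 60% of the heap
    execution_gb = unified_gb * (1 - storage_fraction)  # half of unified
    return execution_gb / cores


if __name__ == "__main__":
    print(round(execution_memory_per_core_gb(20, 5), 2))  # 1.2
```

In Spark's unified memory model execution can borrow unused storage memory, so this number is better read as a conservative planning figure than as a hard limit on partition size.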

@remedyiq8034 3 months ago

@afaqueahmad7117 Thanks, I learned a lot from you. Watched all your videos. Keep doing great work for the community. Better than paid courses on Udemy!!

@Amarjeet-fb3lk 2 months ago

Hi @Afaque, I watched this video previously, and I am still watching many more videos that cover Spark memory management, and reading articles on Spark memory and partitions. Here are some points that I have learnt: 1. Memory for each core should be 4 times 128 MB. 2. The total number of partitions should be 4 * the number of cores. But how should we decide the number of partitions, each partition's size, and the memory for each core? These things will change according to our data. So, can you answer these 3 questions? Thanks.

@atifiu 3 months ago

Thanks Afaque for this video. I have a question regarding task-level and executor-level parallelism. As per my understanding, 1 partition = 1 task = 1 core/thread, so how is task-level parallelism achieved, given that 1 task will be assigned to only one core? That would mean that within an executor the remaining 46 cores will not be utilized if there are, say, only 5 tasks.

@maheshmahadev9918 3 months ago

Great Explanation, thanks !! I have a question: Can you explain the basis for choosing these numbers? Is it based on the incoming data that needs to be processed? In that case, for the calculations in this video, what is the data size considered. Thanks again

@afaqueahmad7117 3 months ago

Hey @maheshmahadev9918, the numbers for the cluster (X nodes, Y cores, Z RAM) are for illustration and independent of the incoming data size. As discussed at 34:06, the reason I'm not talking about incoming data sizes is that they should be tailored based on the "memory per core". The most granular unit of data is going to be a "partition", and as long as the core has enough memory to process that partition, things will run fine. I would suggest re-watching from 34:06 if unclear :)

@ajaydhanwani4571 3 days ago

Sorry if I am asking a very basic question: can we set executors per Spark job or per Spark cluster? Also, how do we set this up, with coding examples?

@satheeshkumar2149 3 months ago

How much memory or how many cores should we set aside for the internal stuff if we have a standalone cluster instead of YARN?

@ShubhamWakshe-e4c 27 days ago

You talked about the YARN application master. It is the driver that contains the application master container, right? Meaning we are assigning 1 GB as driver memory, right?

@naveenreddybedadala 1 month ago

Will that final actual executor memory again be split into user, reserved, unified, and overhead memory?

@rohitdeshmukh7274 1 month ago

Very informative video. I have one question. I have a Databricks cluster with auto scaling enabled. Will the calculations change in that case?

@adusumillisudheer2772 26 days ago

Same question from me. When autoscaling is enabled, how will it tune the workers and the executors inside them?

@ShubhamWakshe-e4c 27 days ago

If we are already allotting 1 core and 1 GB RAM for YARN/OS daemons, then why do we need to allot a separate 1 core and 1 GB, or one executor, for the YARN resource manager?

@Wonderscope1 1 month ago

I really enjoy your videos. Thanks for sharing your knowledge. I have a question about how you create these videos. It is an amazing way to create tutorial videos. Do you mind sharing what tools you use to make these videos? Thanks

@afaqueahmad7117 1 month ago

Thank you @Wonderscope1, really appreciate it. I use Notion and Miro :)

@Wonderscope1 1 month ago

@afaqueahmad7117 I am familiar with Notion as a project management tool; I didn't know it could help with video production. I need to look into that. Thanks 😊

@afaqueahmad7117 1 month ago

Sorry I meant Notion for the code snippets. I use Ecamm Live for video production :)

@Wonderscope1 1 month ago

@afaqueahmad7117 Perfect, that's what I was looking for. Thanks :)

@roshankumargupta46 3 months ago

Hi Afaque! Can you tell me if I'm wrong here? Do thin executors promote more parallelism than fat executors? In the case of thin executors, the number of executors will be higher, resulting in more individual cores, which will eventually promote parallelism. Whereas with a fat executor, all cores will be consumed by one executor, which may lead to wastage of resources.

@maheshh1695 3 months ago

Hi, thanks for sharing the information. In the fat executor case, since we have 5 nodes and each node has only one executor, the number of cores should be 5 * 11, i.e. 55 cores, right?

@afaqueahmad7117 1 month ago

Hey @maheshh1695, total cores will be 55 while cores per node is 11

@suresh.suthar.24 2 months ago

Wonderful explanation, Ahmad. I have one doubt: as in your example, 23 GB of memory will be assigned to each executor, and then 10% will be excluded for overhead memory, so we are left with 20 GB of memory per executor. So this 20 GB is on-heap memory, and it will be divided into reserved memory, storage memory, and execution memory. Am I right or wrong? Please reply; I have asked my seniors this question but they don't have an answer. Thank you in advance!

@afaqueahmad7117 2 months ago

Hey @SS1251, You're correct! The 20GB of memory is indeed on-heap memory and it will be divided respectively into reserved, storage, and execution memory. The memory defined through `--executor-memory` or `spark.executor.memory` is the one allocated to on-heap. You can refer to this video to get a better understanding: kzbin.info/www/bejne/qYmvYqSdp7Wsqck :)
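
The division of the 20 GB heap mentioned above can be sketched as follows. The sketch assumes Spark's default unified-memory settings (300 MB of reserved memory, `spark.memory.fraction` = 0.6, `spark.memory.storageFraction` = 0.5) and also shows the user-memory remainder, which the reply does not list; the function name is hypothetical.

```python
RESERVED_MB = 300  # fixed reserved memory in Spark's unified memory manager


def heap_breakdown_mb(heap_mb: float, memory_fraction: float = 0.6,
                      storage_fraction: float = 0.5) -> dict:
    """Split an executor heap into reserved, user, storage and execution memory."""
    usable = heap_mb - RESERVED_MB
    unified = usable * memory_fraction       # shared by storage and execution
    storage = unified * storage_fraction     # initial storage share
    return {
        "reserved": RESERVED_MB,
        "user": usable - unified,
        "storage": storage,
        "execution": unified - storage,
    }


if __name__ == "__main__":
    for region, size in heap_breakdown_mb(20 * 1024).items():
        print(f"{region:>9}: {round(size)} MB")
```

For a 20 GB heap this gives roughly 6 GB each for storage and execution and about 8 GB of user memory; the boundary between storage and execution is soft, so either side can grow when the other is idle.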

@vikastangudu712 3 months ago

Great video, thanks for the explanation. But how would a fat executor improve data locality? A node can be broken into 11 executors or 1 executor; the HDFS storage, or some other storage within the node, is still the same for all the executors inside the node. Data locality is about storage, not memory. Thus fat vs. thin has no effect on data locality.

@rambabuposa5082 3 months ago

Because a fat executor has more memory, it can store more partitions of your dataset, so not much shuffling of data is required, and it also increases data locality (i.e. most of its required partitions are stored within that fat executor)

@afaqueahmad7117 3 months ago

Hey @vikastangudu712, you're correct in saying that data locality talks about "storage". However, what I'm referring to is that the interplay with "memory" becomes important once data is loaded in memory, in the sense of "how much" data can be processed without having to go through the overhead of loading data from disk again. Several operations are going to benefit from this "memory" locality. In Spark, the best form of locality is `PROCESS_LOCAL`, which means that the data required for a task is present in the memory of the same JVM. Therefore, fat executors occupying most of the memory of the node would benefit in this case, given that the chances of data being present in the same JVM increase. Hope this clarifies :)

@swapnilpatil18 2 months ago

Hi, in the case of the fat executor, we assigned all of the remaining 47 GB to the executor (1 GB for Hadoop/YARN ops). In this case, where will the executor overhead memory come from?

@afaqueahmad7117 2 months ago

Hey @swapnilpatil18, Good question. In the initial parts of the video (before explaining the 4 rules to size an optimal executor), the goal of explaining fat executors was only to point out that they take up a large portion of the memory on a node, and that was the rationale for not separating out the respective parts, i.e. overhead memory and AM memory. However, your understanding is absolutely correct. The ideal calculation should involve subtracting Max(384MB, 10% of 47GB) = Max(384MB, 4.7GB) = 4.7GB per executor before calculating the `--executor-memory`
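
The subtraction described above can be written out as a short sketch. It follows the arithmetic in the reply, taking 10 percent of the 47 GB available on the node with a 384 MB floor; the function names are hypothetical. As a caveat, Spark itself applies the overhead factor to the value of `spark.executor.memory` rather than to the node total, so treat the result as an approximation.

```python
def memory_overhead_mb(base_mb: float, factor: float = 0.10,
                       minimum_mb: float = 384) -> float:
    """Overhead = max(384 MB, 10% of the base memory)."""
    return max(minimum_mb, factor * base_mb)


def executor_memory_after_overhead_mb(available_mb: float) -> float:
    """Memory left for --executor-memory once overhead is set aside."""
    return available_mb - memory_overhead_mb(available_mb)


if __name__ == "__main__":
    available_mb = 47 * 1024  # fat executor example from the thread
    print(round(memory_overhead_mb(available_mb) / 1024, 1))                 # 4.7
    print(round(executor_memory_after_overhead_mb(available_mb) / 1024, 1))  # 42.3
```

The 384 MB floor only matters for small executors: for a 1 GB executor, 10 percent would be about 102 MB, so the floor applies instead.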

@Amarjeet-fb3lk 2 months ago

Hi, I watched this video till the end. Very good explanation. But I have the doubts below. If the number of cores is 5 per executor, then at shuffle time Spark creates 200 partitions by default. How will those 200 partitions be created if there are fewer cores, given that 1 partition is handled by 1 core? Suppose my config is 2 executors with 5 cores each. How will it create 200 partitions if I do a group by operation? There are 10 cores, and 200 partitions need to be handled, right? How is that possible?

@afaqueahmad7117 2 months ago

Hi @Amarjeet-fb3lk, thanks again for the kind words. Regarding your question, you're right in stating that 1 partition will be processed by 1 core. Given the configuration you shared has 2 * 5 = 10 cores in total, it is not necessary for the number of cores to match the number of partitions exactly at any given moment. Spark will create 200 partitions during shuffle by default and it will manage the execution of those 200 partitions by scheduling the tasks in chunks based on resource availability: first 10 partitions are assigned to the 10 cores, and once those 10 cores are freed, the next 10, and so on, until all 200 partitions are processed.
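
The chunked scheduling described above amounts to a ceiling division. The sketch below uses the figures from the question (200 shuffle partitions, 2 executors with 5 cores each); the function name is hypothetical, and the lock-step "waves" are a simplification, since Spark hands out the next task whenever any core becomes free.

```python
import math


def task_waves(num_partitions: int, executors: int, cores_per_executor: int) -> int:
    """Number of rounds needed if every core processes one partition at a time."""
    slots = executors * cores_per_executor  # tasks that can run concurrently
    return math.ceil(num_partitions / slots)


if __name__ == "__main__":
    print(task_waves(200, 2, 5))  # 20
```

So 10 cores work through 200 partitions in about 20 rounds; the partition count affects how long the stage runs, not whether it can run.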

@Amarjeet-fb3lk 2 months ago

@afaqueahmad7117 Thanks for your response, Afaque. Learning and going deep into the topics is bringing me lots of doubts and questions. Thanks for the answer, highly appreciate it.

@rambabuposa5082 3 months ago

Hi @afaqueahmad7117, at 35:30 you were discussing "memory per core", which is 4 GB per core. If we have partitions of 128 MB or 256 MB with this 4 GB per core configuration, does that mean inefficient utilisation of resources (memory)? One core can process up to 4 GB, but the partition size is much smaller. Do we need to reduce the "memory per core" to get better performance and efficient utilisation of resources? Many thanks

@afaqueahmad7117 3 months ago

Hey @rambabuposa5082, Good question! 4GB per core was for an example. If the partition sizes are 128MB or 256MB, then this would indeed be underutilising the cluster. You could reduce the memory per core giving some room for overhead (maybe 400MB per core for a 256MB partition), however, it's important to keep the 4 rules of the game as discussed in mind (e.g. keeping number of cores

@remedyiq8034 3 months ago

@afaqueahmad7117 I want to add one point. You said that execution happens in execution memory, that is, 60 percent, and 40 percent is user memory. So 60 percent of 20 GB is 12 GB. Of that, 50 percent is for execution and 50 percent for storage. Let's assume 50 percent is given to execution (static allocation): out of 12 GB, only 6 GB is for execution. As we have 5 cores per executor, 6/5 = approximately 1.2 GB of memory per core. So the maximum partition size that can be accommodated is about 1.2 GB. Is my thought process correct?

@afaqueahmad7117 3 months ago

Copying the same answer as in the previous comment for the community :) """ Hi @remedyiq8034, this is a very valid point and thanks for highlighting this. You're absolutely right about ~1.2GB memory per core. My mind was referring to execution memory but I really appreciate your attention to the breakdown of the `--executor-memory` into its various components, which I should have explained more clearly before doing the memory per core calculation. I'll look into adding an info card to make this clear in the video. Thanks again for your sharp observation! """

@tushibhaque863 1 month ago

Thanks, and please provide contact details. Also, do you take classes?

@afaqueahmad7117 29 days ago

Hey @tushibhaque863, appreciate the kind words. At this moment, I do not take classes, but if you have any questions feel free to shoot an email or comment here in this thread :)