Processing 25GB of data in Spark | How many Executors and how much Memory per Executor is required.

30,971 views

Clever Studies

A day ago

Comments
@shafimahmed7711 3 months ago
Your explanation of the Spark cluster and memory configurations was excellent. I really appreciate it!
@anubhavsingh2290 9 months ago
Simple explanation. Great, sir! 🙌
@cleverstudies 9 months ago
Thank you
@shivamchandan50 9 months ago
Please make a video on PySpark unit testing.
@anirudh2704 4 months ago
Good explanation. Spark is all about good resource allocation and optimization.
@kusumal190 3 months ago
Amazing explanation 👍🏻👏🏻👏🏻👏🏻
@sravankumar1767 5 months ago
Superb explanation 👌 👏 👍
@yadi4diamond 9 months ago
You are simply superb.
@cleverstudies 9 months ago
Thank you 🙏
@sureshpujari2510 5 months ago
Awesome explanation
@arindamnath1233 8 months ago
Wonderful explanation.
@aditya9c 9 months ago
If the number of partitions is 200, and that is also the number of cores required, then each core handles 128 MB, right? So how, in the third block, does the per-core memory become 512 MB, making each executor 4 * 512 MB?
@PravinUser 7 months ago
Each core's memory should be roughly a minimum of 4 times the data it is going to process (128 MB), so it should have a minimum of 512 MB of memory.
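The arithmetic discussed in this thread can be written out as a quick back-of-the-envelope calculation. This is a sketch following the rule of thumb used in the video and the replies above (4x the partition size per core, 4 cores per executor); these are heuristics, not official Spark defaults:

```python
# Back-of-the-envelope Spark sizing for 25 GB of input data.
# Assumptions (from the discussion, not hard Spark rules):
#   - 128 MB block/partition size
#   - 4 cores per executor (the typical 2-5 range)
#   - each core needs ~4x the partition size in memory

data_mb = 25 * 1024          # 25 GB of input data, in MB
partition_mb = 128           # default block/partition size
cores_per_executor = 4       # middle of the recommended 2-5 range
memory_multiplier = 4        # rule of thumb: 4x data size per core

num_partitions = data_mb // partition_mb          # 200 partitions -> 200 tasks
num_cores = num_partitions                        # 200 cores for full parallelism
num_executors = num_cores // cores_per_executor   # 50 executors
memory_per_core_mb = partition_mb * memory_multiplier             # 512 MB
memory_per_executor_mb = memory_per_core_mb * cores_per_executor  # 2048 MB = 2 GB

print(num_partitions, num_executors, memory_per_executor_mb)  # 200 50 2048
```

This reproduces the numbers in the thread: 200 partitions, 50 executors, and 2 GB per executor.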
@snagendra5415 20 hours ago
How many nodes does it require? Can anyone help me?
@tanushreenagar3116 7 months ago
Perfect video, sir.
@VikasChavan-v1c 9 months ago
For example, if you assign 25 executors instead of 50, then each executor will have 8 cores and 25 * 8 tasks will run in parallel. In that case the job should also take only 5 minutes to complete, so how is it 10 minutes? Can you please explain this point once again?
@vamshi878
@vamshi878 9 ай бұрын
For each executor 2-5 cores should be there, so he is saying he is going to take 4 this number is fixed, if the data size increased or increased
@kingoyster3246 8 months ago
What if we have limited resources? What configuration would you recommend to process 25 GB with 16 cores and 32 GB of memory?
@paulinaadamski8233 7 months ago
You would have to choose between an increased partition size, or lowered parallelism with an increased number of partitions.
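One way to reason about the 16-core / 32 GB question above is to accept that the 200 tasks cannot all run at once and will execute in waves. This is a sketch using the same rule-of-thumb arithmetic; the headroom figure and the resulting numbers are illustrative, not a definitive recommendation:

```python
# Limited cluster from the question: 16 cores, 32 GB RAM total.
# With only 16 cores, 200 tasks run in sequential "waves" of 16.

total_cores = 16
total_memory_gb = 32
cores_per_executor = 4                               # keep the 2-5 core guideline
num_executors = total_cores // cores_per_executor    # 4 executors
num_partitions = 200                                 # 25 GB / 128 MB partitions

# ceil(200 / 16) waves of parallel tasks
waves = -(-num_partitions // total_cores)            # 13 waves

# Memory per executor, reserving ~4 GB for OS and overhead (assumed figure)
memory_per_executor_gb = (total_memory_gb - 4) // num_executors  # 7 GB each

print(num_executors, waves, memory_per_executor_gb)  # 4 13 7
```

The job still completes with the same partition count; it just takes roughly 13 waves instead of one, which is the runtime trade-off the replies in this thread are describing.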
@Rakesh-q7m8r 9 months ago
Hi, does the same approach apply if we are working in Databricks?
@bhanuprakashtadepalli7248 8 months ago
Yes, it's the same logic.
@Fresh-sh2gc 9 months ago
In my company, the CPU count per executor is 5 minimum and 8 maximum.
@cleverstudies 9 months ago
It depends on the use case and resource availability.
@Fresh-sh2gc 9 months ago
@cleverstudies It depends on the cluster. We have a state-of-the-art data center worth over $1B that can support high CPU counts per executor.
@shibhamalik1274 9 months ago
Is it that each core takes 4 * partition size of memory?
@anirudh2704 4 months ago
The best configuration is 1 executor = 4 cores, 512 MB. There is also the concept of fat and thin executors.
@shibhamalik1274 9 months ago
There are 200 cores in total. Each core processes one partition at a time, so it uses 128 MB. Each executor has 4 cores, so each executor requires 4 * 128 MB = 512 MB. Where does the extra 4x multiplier come from? 😊
@bhanuprakashtadepalli7248 8 months ago
By default, to process a file in one core, we need memory equal to 4 times the file size.
@anirudh2704 4 months ago
Spark does in-memory processing, so it requires a minimum of 512 MB of memory to handle cache, persist, shuffle, and overhead tasks. One core handles one block of data.
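Putting the thread's numbers together, the submit-time configuration would look roughly like the following. This is a sketch: the option names are standard Spark configuration keys, but the values simply restate the 50-executor / 4-core / 2 GB arithmetic from this discussion and should be tuned per workload:

```python
# Spark configuration implied by the thread's sizing (illustrative values).
# Option names are real Spark config keys; values come from the 25 GB example.

conf = {
    "spark.executor.instances": "50",       # 200 cores / 4 cores per executor
    "spark.executor.cores": "4",            # 2-5 cores per executor is typical
    "spark.executor.memory": "2g",          # 4 cores * 512 MB per core
    "spark.sql.shuffle.partitions": "200",  # match the input partition count
}

# Equivalent spark-submit invocation, built as a string for illustration:
cmd = "spark-submit " + " ".join(f"--conf {k}={v}" for k, v in conf.items())
print(cmd)
```

Note that `spark.executor.memory` covers execution and storage memory inside the executor; off-heap overhead is allocated on top of it, which is part of why the 4x headroom rule of thumb exists.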
@kamatchiprabu 9 months ago
Sir, I want to join the Job Ready Program. How do I join? The link is not enabled. Please help.
@cleverstudies 9 months ago
Sorry, we are not conducting CSJRP sessions at present. Please check our website www.cleverstudies.in for more details.
@dineshughade6741 8 months ago
Super!
@Amarjeet-fb3lk 7 months ago
What is the use of giving each core 512 MB if the block size is 128 MB? Each block is processed on a single core, so if each block is 128 MB, why should we give 512 MB to each core? Isn't that a waste of memory? Am I right? Please explain this. Thanks.
@debayanmitter 5 months ago
The memory is for processing, not for storage.
@anirudh2704 4 months ago
The minimum requirement per executor is 4-5 cores and 512 MB of memory. One core can handle one block of data, and since Spark does in-memory processing, it requires memory space for cache, persist, shuffle, etc.