6.1 Repartition and Coalesce | Spark Interview questions

Views: 12,289

Data Savvy

A day ago

Comments: 21

@SunilKottha 5 years ago

You are rocking, boss. It's helping me a lot.

@ajinkyahatolkar294 1 year ago

Is it possible to make a switch into big data without prior/relevant experience?

@nehabansal677 5 years ago

Finally got the concepts cleared

@DataSavvy 5 years ago

Thanks Neha

@DataSavvy 5 years ago

Thanks

@pandurangbhadange25 7 months ago

repartition: 1. Used to increase or decrease the number of RDD/DataFrame partitions 2. More shuffle. coalesce: 1. Reduces the number of partitions 2. No shuffle (by default) 3. Less expensive

@mohans3143 5 years ago

Well explained, but it would be better explained with some use cases. We can get definitions on Google. Nowadays everything needs to be explained in practical terms.

@aparnashrivastava4882 4 years ago

In which use cases should repartition and coalesce be used?

@DataSavvy 4 years ago

Repartition does a full shuffle to create roughly equal-sized partitions. Coalesce tries to combine existing partitions to reduce the number of partitions. Coalesce is used for decreasing the number of partitions; repartition can be used to either decrease or increase it.
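The behaviour described in this reply can be sketched in a few lines of Scala. This is only an illustration, assuming a local Spark session; the object name `RepartitionVsCoalesce`, the helper `partitionSizes`, and the sample data are made up for the example and do not come from the video.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object RepartitionVsCoalesce {
  // Hypothetical helper: number of records in each partition, in partition order.
  def partitionSizes[T](rdd: RDD[T]): Array[Int] =
    rdd.mapPartitions(iter => Iterator(iter.size)).collect()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session with 4 threads, purely for illustration.
    val spark = SparkSession.builder()
      .appName("repartition-vs-coalesce")
      .master("local[4]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Sample data spread over 8 partitions.
    val rdd = sc.parallelize(1 to 1000, numSlices = 8)

    // coalesce merges existing partitions; no shuffle by default.
    val fewer = rdd.coalesce(2)
    // repartition always shuffles and can go up or down.
    val more = rdd.repartition(16)
    // Without shuffle = true, coalesce cannot increase the partition count.
    val unchanged = rdd.coalesce(16)

    println(s"original:      ${rdd.getNumPartitions} partitions")
    println(s"coalesce(2):   ${fewer.getNumPartitions} partitions, sizes ${partitionSizes(fewer).mkString(", ")}")
    println(s"repartition:   ${more.getNumPartitions} partitions, sizes ${partitionSizes(more).mkString(", ")}")
    println(s"coalesce(16):  ${unchanged.getNumPartitions} partitions")

    spark.stop()
  }
}
```

A rough rule of thumb that follows from this: coalesce fits cases such as writing fewer output files after a filter, while repartition fits cases where the data is skewed or more parallelism is needed, at the cost of a shuffle.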

@Kassadhy 5 years ago

Well explained!!!!!

@gauravpathak7017 5 years ago

Harjeet, on what basis does this partitioning happen?

@surenderraja1304 4 years ago

Does coalesce() happen on the map side or the reduce side? Does repartition() happen on the map side or the reduce side?

@Nikita-fy7js 3 years ago

There is no MapReduce in Spark... everything happens in memory, so there is no concept of map side and reduce side here.

@vkd9442 4 years ago

Dude, the audio is too low. Can you please fix it?

@DataSavvy 4 years ago

I tried changing it... Somehow KZbin is not allowing me to do so... This is improved in the new videos.

@ampolusantosh5350 6 years ago

How can we know that one partition has a lot of data and another partition has little?

@DataSavvy 6 years ago

The following will give you a new RDD that helps you get the size of each partition in terms of records: rdd.mapPartitions(iter => Array(iter.size).iterator, true)
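To make the one-liner above easier to try, here is a minimal sketch that also reports the partition index next to each count, using `mapPartitionsWithIndex`. The object name `PartitionSizes`, the helper `sizesByPartition`, and the deliberately skewed sample data are assumptions for illustration only.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object PartitionSizes {
  // Hypothetical helper: (partition index, record count) for every partition.
  def sizesByPartition[T](rdd: RDD[T]): Array[(Int, Int)] =
    rdd.mapPartitionsWithIndex((idx, iter) => Iterator((idx, iter.size))).collect()

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, purely for illustration.
    val spark = SparkSession.builder()
      .appName("partition-sizes")
      .master("local[4]")
      .getOrCreate()
    val sc = spark.sparkContext

    // A filter keeps the partition layout, so filtering a range that was
    // split into 4 slices leaves the later partitions nearly or fully empty.
    val skewed = sc.parallelize(1 to 100, numSlices = 4).filter(_ <= 30)

    sizesByPartition(skewed).foreach { case (idx, n) =>
      println(s"partition $idx holds $n records")
    }

    spark.stop()
  }
}
```

Only the small array of counts is collected to the driver, not the data itself, so this kind of check stays cheap even on a large RDD.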

@ampolusantosh5350 6 years ago

Explain the difference between lineage and DAG.

@DataSavvy 6 years ago

Here is your video my friend... kzbin.info/www/bejne/hHiydWqAg5uUsK8

@ampolusantosh5350 6 years ago

In wide transformations we can also give the number of partitions. So what is the difference between groupByKey(8) and repartition(8)?

@DataSavvy 6 years ago

Will create a video on this
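Until that video exists, the groupByKey(8) vs repartition(8) question can be explored with a small sketch. It is not the channel author's answer, just an illustration assuming a local Spark session; the object name `GroupByKeyVsRepartition` and the sample pairs are made up. Both calls shuffle into 8 partitions, but groupByKey places records by key hash and gathers the values for each key, while repartition only redistributes records and leaves them as they were.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object GroupByKeyVsRepartition {
  // Hypothetical sample data: a small pair RDD with repeated keys.
  def samplePairs(spark: SparkSession): RDD[(String, Int)] =
    spark.sparkContext.parallelize(
      Seq(("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6)), 3)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, purely for illustration.
    val spark = SparkSession.builder()
      .appName("groupByKey-vs-repartition")
      .master("local[4]")
      .getOrCreate()

    val pairs = samplePairs(spark)

    // One output record per key; all values of a key land together.
    val grouped: RDD[(String, Iterable[Int])] = pairs.groupByKey(8)
    // Same records as before, only spread over 8 partitions.
    val reshuffled: RDD[(String, Int)] = pairs.repartition(8)

    println(s"groupByKey(8):  ${grouped.getNumPartitions} partitions, " +
      s"${grouped.count()} records, partitioner = ${grouped.partitioner}")
    println(s"repartition(8): ${reshuffled.getNumPartitions} partitions, " +
      s"${reshuffled.count()} records, partitioner = ${reshuffled.partitioner}")

    spark.stop()
  }
}
```

The grouped RDD carries a hash partitioner, so a later key-based operation with the same partitioning can avoid another shuffle; the repartitioned RDD carries none.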