Job, Stage and Task in Apache Spark | PySpark interview questions

1,954 views

The Big Data Show


Comments: 9
@Simrankotiya10 3 months ago
Great explanation
@ChetanSharma-oy4ge 6 months ago
What if the count function is used along with a variable and a transformation?
@TheBigDataShow 6 months ago
count() is a tricky action, and most data engineers get confused by it. Ideally, count() is an action and should create a brand-new job. But Apache Spark is a smart computing engine that uses source-level optimizations such as predicate pushdown and pruning: if the source stores the row count in its metadata, Spark will fetch the value directly from there instead of creating a brand-new job.
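To make the metadata idea in the reply above concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's API: `ToyFile`, `ToySource`, `write_file`, `count_by_scan` and `count_from_metadata` are all hypothetical names. The only real-world fact it leans on is that formats such as Parquet record row counts in file footers. Whether a given Spark version and source actually answer count() from metadata, and whether a job still appears in the Spark UI, is something to verify on your own cluster.

```python
from dataclasses import dataclass


@dataclass
class ToyFile:
    """Hypothetical data file: the rows plus a row count stored as metadata."""
    rows: list
    row_count: int  # written once, when the file is created


def write_file(rows):
    """Create a toy file and record its row count in the metadata."""
    rows = list(rows)
    return ToyFile(rows=rows, row_count=len(rows))


class ToySource:
    """Hypothetical source made of several files; tracks how many rows were read."""

    def __init__(self, files):
        self.files = list(files)
        self.rows_scanned = 0

    def count_by_scan(self):
        # Expensive path: touch every row of every file.
        total = 0
        for f in self.files:
            for _ in f.rows:
                self.rows_scanned += 1
                total += 1
        return total

    def count_from_metadata(self):
        # Cheap path: add up the stored counts without reading any row.
        return sum(f.row_count for f in self.files)


source = ToySource([write_file(range(1000)), write_file(range(250))])

metadata_count = source.count_from_metadata()
scanned_before = source.rows_scanned  # no rows were read for the metadata count

scan_count = source.count_by_scan()
scanned_after = source.rows_scanned   # every row was read for the scan
```

Both paths return the same number; the difference is only in how much data has to be read to get it.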
@ChetanSharma-oy4ge 6 months ago
@TheBigDataShow Great, thanks for answering. Do you have some other examples as well, or resources where I can learn these concepts?
@siddheshchavan2069 6 months ago
Can you make end-to-end data engineering projects?
@TheBigDataShow 6 months ago
I have already created one. Please check the channel. There is no prerequisite for this 3-hour-long video and project; you just need to know the basics of PySpark. Please check the link: kzbin.info/www/bejne/eJ26hGecpLNsmbssi=qL0ZSXBELEEKe2L2
@siddheshchavan2069 6 months ago
@TheBigDataShow great, thanks!
@debabratabar2008 6 months ago
Is the below correct?
df_count = example_df.count() ----> transformation
example_df.count() ---> job?
@NiteeshKumarPinjala 4 months ago
No, count() itself is an action. The first line itself will create a job.
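The point in this reply can be sketched with a small plain-Python toy. It is not PySpark: `ToyFrame` and its `jobs_run` counter are hypothetical and only mimic the lazy-transformation versus eager-action split. It shows that assigning the result of count() to a variable does not make the call lazy; the work happens on that line either way.

```python
class ToyFrame:
    """Toy stand-in for a DataFrame: transformations are lazy, actions run a job."""

    jobs_run = 0  # hypothetical counter of "jobs" triggered by actions

    def __init__(self, rows, plan=()):
        self._rows = list(rows)
        self._plan = tuple(plan)  # recorded, not yet executed, filter steps

    def filter(self, predicate):
        # Transformation: only extends the plan and returns a new frame.
        return ToyFrame(self._rows, self._plan + (predicate,))

    def count(self):
        # Action: executes the whole plan now and returns a plain int.
        ToyFrame.jobs_run += 1
        rows = self._rows
        for predicate in self._plan:
            rows = [r for r in rows if predicate(r)]
        return len(rows)


example_df = ToyFrame(range(10))

evens = example_df.filter(lambda x: x % 2 == 0)  # transformation, nothing runs
jobs_after_filter = ToyFrame.jobs_run

df_count = example_df.count()                    # action, runs immediately
jobs_after_assignment = ToyFrame.jobs_run

example_df.count()                               # same action again, runs again
jobs_after_second_call = ToyFrame.jobs_run
```

In this toy, `df_count` holds an ordinary integer rather than a deferred computation, which matches the reply: the first line already creates the job, and the second line creates another one.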