6. What Is Data Skew in PySpark | PySpark Interview Questions & Answers | Databricks Interview Q & A

4,366 views

SS UNITECH

Comments: 17
@motivational_thoughts6760 7 days ago
00:05 Data skew occurs when data is unevenly partitioned
00:53 Data skew causes uneven processing in distributed systems
01:46 Data skew in large data processing
02:38 Identifying and dealing with data skew in PySpark
03:28 Data skew can lead to inefficient processing in PySpark
04:26 Understanding partitions and row distribution in PySpark DataFrames
05:19 Partitioning a DataFrame for efficient processing
06:11 Managing data skew in PySpark
@amritasingh1769 7 months ago
One more very informative video. Keep uploading videos like this.
@ssunitech6890 7 months ago
Thanks 🙏
@surajpatil4940 6 months ago
Well-explained question, along with a real-time example.
@ssunitech6890 6 months ago
Thanks. Please share with others.
@tejasgangurde1998 6 months ago
Very informative video
@ssunitech6890 6 months ago
Thanks. Please share with others.
@BalaMurugan-kb8ri 2 months ago
Sir, I am getting the error below:
----> 1 df1=df.select(spark_partition_id().alias('partid')).groupBy('partid').count()
NameError: name 'spark_partition_id' is not defined
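A NameError like the one quoted in this comment usually means that `spark_partition_id` was never imported; it lives in `pyspark.sql.functions`. The sketch below is one possible way to wrap the same query, not the code from the video. The helper names `rows_per_partition` and `skew_ratio` are hypothetical, and the import is placed inside the function only so that the pure-Python helper can be tried without PySpark installed.

```python
def rows_per_partition(df):
    """Return {partition_id: row_count} for a PySpark DataFrame."""
    # spark_partition_id must be imported before it is used; Python raises
    # NameError when this import is missing.
    # It is imported inside the function only so that skew_ratio below
    # stays usable on a machine without PySpark (an illustrative choice).
    from pyspark.sql.functions import spark_partition_id

    counts = (
        df.select(spark_partition_id().alias("partid"))
        .groupBy("partid")
        .count()
        .collect()
    )
    return {row["partid"]: row["count"] for row in counts}


def skew_ratio(counts):
    """Largest partition size divided by the mean partition size.

    Hypothetical helper: a value near 1.0 suggests evenly sized partitions,
    while a large value suggests that a few partitions hold most of the rows.
    """
    if not counts:
        raise ValueError("no partitions to compare")
    sizes = list(counts.values())
    return max(sizes) / (sum(sizes) / len(sizes))
```

In a notebook, the simplest fix for the original line is to run `from pyspark.sql.functions import spark_partition_id` in an earlier cell. With the helpers above, `skew_ratio(rows_per_partition(df))` gives a rough single-number view of how uneven the partitions are.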
@MsMohanj 5 months ago
Sir, how do I find out how many cores are allocated to my project? I can only decide how many partitions to create based on that, right?
@ssunitech6890 5 months ago
You can check that in your cluster's configuration.
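Besides the cluster configuration page, some of this can be read from a running session. The sketch below assumes an existing `SparkSession` named `spark` and a DataFrame `df`; the function name `describe_parallelism` is hypothetical. `defaultParallelism` typically reflects the cores available to the application, though the exact value depends on the cluster manager and its settings.

```python
def describe_parallelism(spark, df):
    """Collect a few settings that bear on how many partitions to use."""
    return {
        # Usually tied to the number of cores available to the application.
        "default_parallelism": spark.sparkContext.defaultParallelism,
        # Number of partitions used after shuffles; kept as the raw string
        # because some platforms allow non-numeric values here.
        "shuffle_partitions": spark.conf.get("spark.sql.shuffle.partitions"),
        # Current number of partitions of this particular DataFrame.
        "df_partitions": df.rdd.getNumPartitions(),
    }
```

Comparing `df_partitions` with `default_parallelism` gives a first hint as to whether the DataFrame has too few partitions to keep all cores busy.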
@goluSingh-su1xs 7 months ago
Superb explanation bro ❤
@ssunitech6890 7 months ago
Thanks
@vaddenata6735 7 months ago
Thank you sir
@ssunitech6890 7 months ago
Thanks. Please share with others.
@shivamchandan50 7 months ago
Please share the dataset too.
@ssunitech6890 7 months ago
Will update tomorrow
@motivational_thoughts6760 7 days ago
Please share the code.