6. what is data skew in pyspark | pyspark interview questions & answers | databricks interview q & a

Views: 4,366

SS UNITECH

Days ago

Comments: 17

@motivational_thoughts6760 7 days ago

00:05 Data skew occurs when data is unevenly partitioned
00:53 Data skew causes uneven processing in distributed systems.
01:46 Data skew in large data processing
02:38 Identifying and dealing with data skew in PySpark.
03:28 Data skew can lead to inefficient processing in PySpark
04:26 Understanding partitions and rows distribution in PySpark dataframes
05:19 Partitioning data frame for efficient processing
06:11 Managing data skew in PySpark
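The difference between evenly and unevenly partitioned data named in the chapter list above can be illustrated without a cluster. What follows is a minimal plain-Python sketch, not Spark code: the country keys, the row counts, and the helpers `partition_of` and `partition_sizes` are hypothetical, and `zlib.crc32` merely stands in for a hash partitioner (Spark's own hash function differs).

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 4  # assumed partition count, for illustration only


def partition_of(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a deterministic hash (stand-in for a hash partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(assignments) -> list:
    """Return the row count of every partition, including empty ones."""
    counts = Counter(assignments)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Hypothetical skewed dataset: one "hot" key accounts for 90% of the rows.
keys = ["IN"] * 900 + ["US"] * 40 + ["UK"] * 30 + ["DE"] * 30

# Partitioning by key: every row with the same key lands in the same partition.
by_key = partition_sizes(partition_of(k) for k in keys)

# Round-robin assignment ignores the key and spreads rows evenly.
round_robin = partition_sizes(i % NUM_PARTITIONS for i in range(len(keys)))

print("by key:     ", by_key)
print("round robin:", round_robin)
```

Whichever partition receives the hot key holds at least 900 of the 1,000 rows, while round-robin assignment gives each partition 250 rows. Even placement of this kind helps only when later steps do not need rows with the same key to sit together.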

@amritasingh1769 7 months ago

One more very informative video. Keep uploading videos like this.

@ssunitech6890 7 months ago

Thanks 🙏

@surajpatil4940 6 months ago

Well-explained question, along with a real-time example.

@ssunitech6890 6 months ago

Thanks. Please share with others.

@tejasgangurde1998 6 months ago

Very informative video

@ssunitech6890 6 months ago

Thanks. Please share with others.

@BalaMurugan-kb8ri 2 months ago

Sir, I am getting the following error:

----> 1 df1=df.select(spark_partition_id().alias('partid')).groupBy('partid').count()
NameError: name 'spark_partition_id' is not defined

@MsMohanj 5 months ago

Sir, how do I find out how many cores are allocated to my project? I can only decide how many partitions to create based on that, right?

@ssunitech6890 5 months ago

You can check that in your cluster's configuration.

@goluSingh-su1xs 7 months ago

Superb explanation bro ❤

@ssunitech6890 7 months ago

Thanks

@vaddenata6735 7 months ago

Thank you sir

@ssunitech6890 7 months ago

Thanks. Please share with others.

@shivamchandan50 7 months ago

Please share the dataset as well.

@ssunitech6890 7 months ago

Will update tomorrow

@motivational_thoughts6760 7 days ago

Please share the code.
