Data validation between source and target table | PySpark Interview Question

2,117 views

GeekCoders

3 months ago
Hello Everyone,
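# `spark` is predefined in notebooks; elsewhere a session has to be created first.
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

# Source table: five rows of (id, name)
source_data = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')]
source_schema = ['id', 'name']
source_df = spark.createDataFrame(source_data, source_schema)
source_df.show()

# Target table: ids 3 and 4 have different names, id 5 is missing, id 6 is extra
target_data = [(1, 'A'), (2, 'B'), (3, 'X'), (4, 'F'), (6, 'G')]
target_schema = ['id', 'name']
target_df = spark.createDataFrame(target_data, target_schema)
target_df.show()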
This series is for beginner and intermediate-level candidates who want to crack PySpark interviews.
Here is the link to the course: www.geekcoders.co.in/courses/...
#pyspark #interviewquestions #interview #pysparkinterview #dataengineer #aws #databricks #python

Comments: 8
@jhonsen9842 3 months ago
The main problem I found in learning PySpark is brackets; every time, they give me some error.
@GeekCoders 3 months ago
Yes
@nishirajnikku969 2 months ago
I request you to please create a playlist for PySpark unit testing.
@gudiatoka 3 months ago
exceptAll can be useful too, or an anti join.
@GeekCoders 3 months ago
exceptAll may sometimes miss null values.
@shivamchandan50 2 months ago
Please make a video on PySpark unit testing.
@rishabhkesarwani-br2rx 2 months ago
I follow the steps below to compare source vs. target tables:
1) The row count should match in the source and target tables.
2) The schema should match in the source and target tables.
3) Use except to check whether any records are present in the source but not in the target, or vice versa.
4) Use a left anti join to find the records that do not match.
5) Try to debug why there is a record mismatch.
@GeekCoders 2 months ago
Nice
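
The comparison steps listed by @rishabhkesarwani-br2rx can be sketched in plain Python on the sample rows from the description. This is only an illustration of the logic, not the method shown in the video: `validate` and the report keys are hypothetical names, `id` is assumed to be a unique key, and step 2 (schema comparison) is left out because bare tuples carry no schema. In PySpark, the corresponding calls would be `DataFrame.exceptAll` for step 3 and a join with `how="left_anti"` for step 4.

```python
from collections import Counter

# Sample rows from the video description: (id, name)
source_rows = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')]
target_rows = [(1, 'A'), (2, 'B'), (3, 'X'), (4, 'F'), (6, 'G')]


def validate(source, target):
    """Compare two lists of (id, name) rows; 'id' is assumed to be a unique key."""
    report = {}

    # Step 1: row counts should match.
    report["count_match"] = len(source) == len(target)

    # Step 3: multiset difference in both directions
    # (the role DataFrame.exceptAll plays in PySpark).
    report["only_in_source"] = sorted((Counter(source) - Counter(target)).elements())
    report["only_in_target"] = sorted((Counter(target) - Counter(source)).elements())

    # Step 4: keys present on one side only
    # (the role of a left anti join on 'id' in PySpark).
    src_by_id = dict(source)
    tgt_by_id = dict(target)
    report["missing_in_target"] = sorted(src_by_id.keys() - tgt_by_id.keys())
    report["missing_in_source"] = sorted(tgt_by_id.keys() - src_by_id.keys())

    # Step 5: keys present on both sides whose 'name' differs,
    # as a starting point for debugging the mismatch.
    report["changed"] = sorted(
        (key, src_by_id[key], tgt_by_id[key])
        for key in src_by_id.keys() & tgt_by_id.keys()
        if src_by_id[key] != tgt_by_id[key]
    )
    return report


report = validate(source_rows, target_rows)
for check, result in report.items():
    print(f"{check}: {result}")
```

On this sample the counts match (five rows each) even though the contents differ, which is why the count check alone is not enough and the row-level steps are still needed.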