4 Recently asked Pyspark Coding Questions | Apache Spark Interview

35,533 views

Sumit Mittal

1 day ago

Comments: 38
@sopankardile2603 11 months ago
One of the best interview series. Thank you, Sumit sir.
@sumitmittal07 11 months ago
Glad to know that you liked it.
@adityatomar9820 11 months ago
One of the great explanations so far on YouTube. I wish I could afford your course :(
@souradeep.official 5 months ago
Need more PySpark interview solutions like this 😊
@ritikadamani2008 3 months ago
Best selection of questions and very good explanation.
@abhyaravya421 3 months ago
Thanks a lot, Sumit! I am a senior data engineer with 5 years of experience, but since we mostly don't work with DataFrames or PySpark, I am not able to do these simple things.
@praptijoshi9102 9 months ago
You are doing a great job posting these ❤
@veerugandhad3437 11 months ago
Very useful, informative video that gives more confidence to big data aspirants. Thanks, Sumit.
@MamtaChoudhary-c4i 19 days ago
Thank you, sir, for the best explanation. Can you please come up with more examples?
@singhjirajeev 9 months ago
00:03 Recently asked Pyspark Coding Questions
02:37 Writing and executing Pyspark pseudo code
05:21 Creating a Spark dataframe from input and performing group by aggregation
08:04 Using aggregation functions and collect list in Pyspark
11:15 Spark SQL solution for creating DataFrame and running queries
14:18 Understanding the data frame reader API for reading JSON and the usage of explode function
17:11 Creating a Spark dataframe and performing operations on it
19:44 Converting string to date and performing group by in Pyspark DataFrame
22:32 Finding the average stock value using PySpark
25:38 Practice more on data frames for interviews
28:15 Practice more to gain confidence in writing correct syntax for Pyspark coding
@gudiatoka 11 months ago
Sir, we need more of these. Please continue this playlist.
@naveenkumar-oq6zi 15 days ago
Hi Sumit, regarding the last question about aggregation and the max average of the stock: there should also be a time along with the date, because in reality the price of a stock changes at different times of the day. We would then need to convert the timestamp to yyyy-MM-dd format to get the prices for each day, compute their average, and then take the max of those averages. Just thought of sharing. The overall implementation would still be the same :) Cheers
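The steps described in this comment can be sketched in plain Python (not PySpark itself), for anyone who wants to check the logic without a Spark session. The sample ticks, the `max_daily_average` name, and the choice to take the max per symbol are all assumptions made for illustration; the video's actual data is not shown here. In PySpark the same three steps would typically map to `to_date`, a `groupBy` with `avg`, and a second `groupBy` with `max`.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical intraday ticks: (symbol, timestamp string, price).
ticks = [
    ("ABC", "2024-01-01 09:30:00", 100.0),
    ("ABC", "2024-01-01 15:30:00", 110.0),
    ("ABC", "2024-01-02 09:30:00", 120.0),
    ("ABC", "2024-01-02 15:30:00", 140.0),
    ("XYZ", "2024-01-01 10:00:00", 50.0),
    ("XYZ", "2024-01-02 10:00:00", 40.0),
]


def max_daily_average(ticks):
    """Return, per symbol, the highest of its daily average prices."""
    # Step 1: truncate each timestamp to its calendar day (yyyy-MM-dd)
    # and bucket the prices by (symbol, day).
    prices = defaultdict(list)
    for symbol, ts, price in ticks:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date().isoformat()
        prices[(symbol, day)].append(price)

    # Step 2: average the prices within each (symbol, day) bucket.
    daily_avg = {key: sum(vals) / len(vals) for key, vals in prices.items()}

    # Step 3: keep the largest daily average seen for each symbol.
    best = {}
    for (symbol, _day), avg in daily_avg.items():
        best[symbol] = max(avg, best.get(symbol, avg))
    return best


result = max_daily_average(ticks)
print(result)  # {'ABC': 130.0, 'XYZ': 50.0}
```

As the comment says, the only change from a date-only input is step 1; the aggregation itself stays the same.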
@venugopal-nc3nz 11 months ago
It would be great if you put the questions in a comment, so others can try them without looking at the solution first.
@SusheelGajbinkar 5 months ago
Thank you sir😄
@satishutnal 11 months ago
Best explanation, sir. Thanks.
@sumitmittal07 11 months ago
I am happy to hear this
@rohit-ll3rj 9 months ago
I guess we can also apply distinct() to avoid duplicate values in the df.
@sravankumar1767 10 months ago
Superb
@NextGen_Tech_Hindi 11 months ago
Thanks, Sumit. Make more videos like this.
@sumitmittal07 11 months ago
definitely
@2412_Sujoy_Das 11 months ago
Much needed sir.....!!!
@sumitmittal07 11 months ago
Sujoy, I am sure you will enjoy watching this.
@anjibabumakkena 11 months ago
Nice explanation sir, kindly post scenario-based questions
@sumitmittal07 11 months ago
yes for sure
@shashankgupta2776 8 months ago
Thank you, sir, greatly explained. It would be good if you could also post the data/schemas in the description box for us to query and practice hands-on. Thanks! :)
@prasoonvijay5775 11 months ago
Hi Sumit, could you please create a video explaining end-to-end pipelines on AWS Databricks, along with their orchestration?
@NextGen_Tech_Hindi 11 months ago
What about the remaining 10 PySpark questions? You said they would be covered in the next video, but it has not been uploaded to YouTube yet. When will you upload it? We are waiting for the remaining 10 questions. Thank you ❤
@mdasif2411 11 months ago
Hi sir, can we not write Spark SQL in the interview, since there is no difference in performance?
@Nikhil-qi4oz 11 months ago
Amazing sir
@sumitmittal07 11 months ago
Nikhil, I am sure you will find it useful.
@TheUMESH34 11 months ago
This is great!
@sumitmittal07 11 months ago
thank you Umesh
@sharankarchella2688 11 months ago
Nice video
@sumitmittal07 11 months ago
thank you
@rudrakasha-t1v 10 months ago
In question number 2, do we not need to remove duplicates at the end? Can you please clarify this for me?
@VinodKumarChouhan-o8c 11 months ago
Hello sir, how can I run PySpark code online? Are you also using an online utility to run the PySpark code shown in this video? Could you please share the source? It would be very helpful.
@sonurohini6764 7 months ago
Sir, please create a coding interview playlist.
@RAHULKUMAR-px8em 1 month ago
Q2.
from pyspark.sql.functions import col, collect_set

# Assumes an existing SparkSession named `spark`.
# display() is a Databricks notebook helper; use show() elsewhere.
data = [('a', 'aa', 1), ('a', 'aa', 2), ('b', 'bb', 5), ('b', 'bb', 3), ('b', 'bb', 4)]
data_schema = "col1 string, col2 string, col3 int"
df_data = spark.createDataFrame(data=data, schema=data_schema)
df_data.display()

# Group by col1 and col2 and collect the distinct col3 values of each group.
result = df_data.groupBy(col('col1'), col('col2')).agg(collect_set(col('col3')))
result.display()
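On the duplicate question raised in the thread: a plain-Python sketch (not PySpark itself) of how collecting a list per group differs from collecting a set, which is why `collect_set` in the answer above needs no separate de-duplication step. The rows reuse the comment's data plus one repeated row, added here as an assumption so that the two results differ; `group_values` is a hypothetical helper name.

```python
from collections import defaultdict

# Rows from the comment above, plus one repeated row ("a", "aa", 2)
# added for illustration so the list and set results differ.
rows = [
    ("a", "aa", 1), ("a", "aa", 2), ("a", "aa", 2),
    ("b", "bb", 5), ("b", "bb", 3), ("b", "bb", 4),
]


def group_values(rows, distinct):
    """Group col3 values by (col1, col2), optionally dropping repeats."""
    groups = defaultdict(list)
    for col1, col2, col3 in rows:
        values = groups[(col1, col2)]
        if distinct and col3 in values:
            continue  # set-like behaviour: skip a value already collected
        values.append(col3)
    return dict(groups)


as_list = group_values(rows, distinct=False)  # like collect_list
as_set = group_values(rows, distinct=True)    # like collect_set

print(as_list[("a", "aa")])  # [1, 2, 2]
print(as_set[("a", "aa")])   # [1, 2]
```

One difference from this sketch: Spark does not guarantee the order of elements returned by `collect_set` or `collect_list`, so an explicit sort is needed when the order matters.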