Azure Cloud Data Engineer Mock Interview | Important Questions asked in Big Data Interviews| Pyspark

4,551 views

Sumit Mittal

2 months ago

To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&su... for curated courses developed by me.
I have trained more than 20,000 professionals in the field of Data Engineering in the last 5 years.
๐–๐š๐ง๐ญ ๐ญ๐จ ๐Œ๐š๐ฌ๐ญ๐ž๐ซ ๐’๐๐‹? ๐‹๐ž๐š๐ซ๐ง ๐’๐๐‹ ๐ญ๐ก๐ž ๐ซ๐ข๐ ๐ก๐ญ ๐ฐ๐š๐ฒ ๐ญ๐ก๐ซ๐จ๐ฎ๐ ๐ก ๐ญ๐ก๐ž ๐ฆ๐จ๐ฌ๐ญ ๐ฌ๐จ๐ฎ๐ ๐ก๐ญ ๐š๐Ÿ๐ญ๐ž๐ซ ๐œ๐จ๐ฎ๐ซ๐ฌ๐ž - ๐’๐๐‹ ๐‚๐ก๐š๐ฆ๐ฉ๐ข๐จ๐ง๐ฌ ๐๐ซ๐จ๐ ๐ซ๐š๐ฆ!
"๐€ 8 ๐ฐ๐ž๐ž๐ค ๐๐ซ๐จ๐ ๐ซ๐š๐ฆ ๐๐ž๐ฌ๐ข๐ ๐ง๐ž๐ ๐ญ๐จ ๐ก๐ž๐ฅ๐ฉ ๐ฒ๐จ๐ฎ ๐œ๐ซ๐š๐œ๐ค ๐ญ๐ก๐ž ๐ข๐ง๐ญ๐ž๐ซ๐ฏ๐ข๐ž๐ฐ๐ฌ ๐จ๐Ÿ ๐ญ๐จ๐ฉ ๐ฉ๐ซ๐จ๐๐ฎ๐œ๐ญ ๐›๐š๐ฌ๐ž๐ ๐œ๐จ๐ฆ๐ฉ๐š๐ง๐ข๐ž๐ฌ ๐›๐ฒ ๐๐ž๐ฏ๐ž๐ฅ๐จ๐ฉ๐ข๐ง๐  ๐š ๐ญ๐ก๐จ๐ฎ๐ ๐ก๐ญ ๐ฉ๐ซ๐จ๐œ๐ž๐ฌ๐ฌ ๐š๐ง๐ ๐š๐ง ๐š๐ฉ๐ฉ๐ซ๐จ๐š๐œ๐ก ๐ญ๐จ ๐ฌ๐จ๐ฅ๐ฏ๐ž ๐š๐ง ๐ฎ๐ง๐ฌ๐ž๐ž๐ง ๐๐ซ๐จ๐›๐ฅ๐ž๐ฆ."
๐‡๐ž๐ซ๐ž ๐ข๐ฌ ๐ก๐จ๐ฐ ๐ฒ๐จ๐ฎ ๐œ๐š๐ง ๐ซ๐ž๐ ๐ข๐ฌ๐ญ๐ž๐ซ ๐Ÿ๐จ๐ซ ๐ญ๐ก๐ž ๐๐ซ๐จ๐ ๐ซ๐š๐ฆ -
๐‘๐ž๐ ๐ข๐ฌ๐ญ๐ซ๐š๐ญ๐ข๐จ๐ง ๐‹๐ข๐ง๐ค (๐‚๐จ๐ฎ๐ซ๐ฌ๐ž ๐€๐œ๐œ๐ž๐ฌ๐ฌ ๐Ÿ๐ซ๐จ๐ฆ ๐ˆ๐ง๐๐ข๐š) : rzp.io/l/SQLINR
๐‘๐ž๐ ๐ข๐ฌ๐ญ๐ซ๐š๐ญ๐ข๐จ๐ง ๐‹๐ข๐ง๐ค (๐‚๐จ๐ฎ๐ซ๐ฌ๐ž ๐€๐œ๐œ๐ž๐ฌ๐ฌ ๐Ÿ๐ซ๐จ๐ฆ ๐จ๐ฎ๐ญ๐ฌ๐ข๐๐ž ๐ˆ๐ง๐๐ข๐š) : rzp.io/l/SQLUSD
BIG DATA INTERVIEW SERIES
This mock interview series is a community initiative launched under the Data Engineers Club to support the community's growth and development.
Our highly experienced guest interviewer, Umesh Kumar Roy ( / umesh-kumar-roy ), shares invaluable insights and practical guidance drawn from his extensive expertise in the Big Data domain.
Our expert guest interviewee, Satyam Meena ( / satyam-meena-0a1b46138 ), has an interesting approach to answering the interview questions on Apache Spark, SQL and Azure Cloud Services.
Links to the free SQL and Python series developed by me are given below:
SQL Playlist - SQL tutorial for every...
Python Playlist - Complete Python By Sum...
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links :
LinkedIn - / bigdatabysumit
Twitter - / bigdatasumit
Instagram - / bigdatabysumit
Student Testimonials - trendytech.in/#testimonials
TIMESTAMPS : Questions Discussed
00:50 Introduction
02:10 What sources do you use for data ingestion?
02:25 What connectors do you use for data ingestion?
02:45 How do you store and transform data after ingestion?
03:58 How are you preprocessing the data?
04:41 How do you eliminate duplicate records?
05:12 How do you ensure the correct records when handling duplicates?
05:50 How is your storage layer designed? Do you use mounting techniques?
06:04 Do you use delta files? Why?
07:00 What optimization techniques have you implemented?
08:05 Do you use partitions?
08:24 What factors do you consider when partitioning?
09:11 Do you use bucketing?
09:36 What are the use cases for partitioning and bucketing?
10:33 Besides broadcast joins, what other joins do you use?
10:52 Which join is the most efficient?
11:50 What is the difference between narrow and wide transformations?
12:26 What is your understanding about Spark and Databricks?
13:22 How do you consume data from the gold layer?
14:42 How do you connect Power BI to Azure Synapse?
15:46 Can you outline Spark architecture?
17:07 What is a DAG?
18:15 What is the difference between client mode and cluster mode?
19:29 Have you faced any challenges with cluster mode?
20:50 Why do DataFrames and Datasets exist?
22:17 What do you understand by normalization?
22:51 What other optimization techniques do you use?
23:33 SQL query
Music track: Retro by Chill Pulse
Source: freetouse.com/music
Background Music for Video (Free)
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs

Comments: 4
@gudiatoka 2 months ago
When someone says they are optimizing code in Databricks... they are all faking 😂😂. Spark itself optimizes your code using the Catalyst optimizer (the Spark SQL engine), and since Spark 3.0 introduced Adaptive Query Execution (AQE), it also optimizes joins at runtime. We can alter the broadcast threshold, which is set by the admin team during Databricks cluster creation. The only things not affected by these two are things stored in user-defined memory, like UDFs, and low-level RDD operations, which nobody does in Databricks nowadays. The last one is manual caching.
@SrihariSrinivasDhanakshirur 2 months ago
Not necessarily. There are a lot of other optimizations we can do at the resource level, plus partitioning, bucketing, etc.
@LearnifyTvKannada-ue6op 1 month ago
@SrihariSrinivasDhanakshirur Exactly, there are a lot of other optimisations.
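To make the partitioning and bucketing point in this thread concrete, here is a minimal pure-Python sketch of the two ideas. It is not Spark code: the table contents, the bucket count, and the helper names (`partition_by`, `bucket_by`, `bucketed_join`) are hypothetical, and `zlib.crc32` is only a stand-in for the hash function Spark uses internally. It shows that partitioning groups rows by a column's value, while bucketing groups them by a hash of the key, so two tables bucketed the same way can be joined bucket by bucket.

```python
import zlib
from collections import defaultdict

NUM_BUCKETS = 4  # hypothetical bucket count, chosen only for illustration


def bucket_id(key, num_buckets=NUM_BUCKETS):
    # Stand-in hash (CRC32). Spark uses its own hash function, not this one.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def partition_by(rows, column):
    # Partitioning: one group per distinct value of a low-cardinality column.
    parts = defaultdict(list)
    for row in rows:
        parts[row[column]].append(row)
    return dict(parts)


def bucket_by(rows, column, num_buckets=NUM_BUCKETS):
    # Bucketing: a fixed number of groups, chosen by hashing the key.
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_id(row[column], num_buckets)].append(row)
    return dict(buckets)


def bucketed_join(left_buckets, right_buckets, column):
    # Equal keys hash to the same bucket id, so each left bucket only
    # needs to be compared with the right bucket that has the same id.
    joined = []
    for b, left_rows in left_buckets.items():
        for left in left_rows:
            for right in right_buckets.get(b, []):
                if left[column] == right[column]:
                    joined.append({**left, **right})
    return joined


# Hypothetical sample data.
orders = [
    {"order_id": 1, "customer_id": "c1", "country": "IN"},
    {"order_id": 2, "customer_id": "c2", "country": "US"},
    {"order_id": 3, "customer_id": "c1", "country": "IN"},
    {"order_id": 4, "customer_id": "c3", "country": "US"},
]
customers = [
    {"customer_id": "c1", "name": "Asha"},
    {"customer_id": "c2", "name": "Ben"},
    {"customer_id": "c3", "name": "Chen"},
]

orders_by_country = partition_by(orders, "country")
order_buckets = bucket_by(orders, "customer_id")
customer_buckets = bucket_by(customers, "customer_id")
joined = bucketed_join(order_buckets, customer_buckets, "customer_id")
```

Roughly, partitioning suits columns that queries filter on and that have few distinct values, while bucketing suits high-cardinality join keys. Actual behaviour in Spark depends on the table format and configuration.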
@hdr-tech4350 26 days ago
Source type, project discussion, handling duplicates, Delta Lake feature, Spark vs Databricks, Power BI connection to Synapse, Spark architecture, DAG, client mode vs cluster mode, DataFrame vs Dataset, normalisation, 2nd highest salary in department
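The last item in the summary above, the second highest salary in each department, is a common SQL interview question. The exact query discussed in the video is not reproduced on this page, so the following is only one common approach, sketched with Python's built-in sqlite3 module. The `employees` table, its columns, and the sample rows are hypothetical, and the window function requires SQLite 3.25 or later.

```python
import sqlite3

# In-memory database with a hypothetical employees table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Engineering", 120),
        ("Ben", "Engineering", 100),
        ("Chen", "Engineering", 100),
        ("Dev", "Engineering", 90),
        ("Eva", "Sales", 80),
        ("Faiz", "Sales", 70),
    ],
)

# DENSE_RANK gives tied salaries the same rank and leaves no gaps,
# so rank 2 is the second highest distinct salary in each department.
QUERY = """
SELECT department, name, salary
FROM (
    SELECT department, name, salary,
           DENSE_RANK() OVER (
               PARTITION BY department
               ORDER BY salary DESC
           ) AS salary_rank
    FROM employees
) AS ranked
WHERE salary_rank = 2
ORDER BY department, name
"""

second_highest = conn.execute(QUERY).fetchall()
for row in second_highest:
    print(row)
```

DENSE_RANK is used here rather than ROW_NUMBER so that employees tied on the second highest salary are all returned. Whether ties should be kept is worth clarifying with the interviewer.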