data engineer interview questions

27,957 views

MANISH KUMAR

A day ago

In this video I have talked about salting in Spark.
Directly connect with me on:- topmate.io/man...
Discord channel:- / discord
Project details for resume :-
Successfully led a data engineering project in a retail environment using technologies such as Apache Spark, Python, SQL, and Amazon S3 to optimize data processing.
Implemented structured data models, including dimension and fact tables, to provide valuable context for point-of-sale data analysis.
Designed and executed an incentive program based on sales performance, enhancing motivation among sales teams by rewarding top performers.
Managed extensive daily data volumes of approximately 100GB, demonstrating the ability to handle large-scale data pipelines.
Employed Spark optimization techniques like caching and broadcast joins to improve data processing speed and efficiency.
Utilized Azure CI/CD pipelines for code deployment, and orchestrated workflows using Airflow and CRON jobs.
Detailed writeup to explain more during interview:-
As a Data Engineer on a project for a prominent offline grocery and kitchen supplies retailer, I applied my expertise in data engineering to drive critical improvements in their data processing and analysis operations.
The project primarily focused on processing and analyzing point-of-sale data, which was structured into dimension and fact tables to provide meaningful context for sales analysis. To further enhance employee motivation and performance, we designed and implemented an incentive program that rewarded salespeople with the highest sales volumes in each store.
Handling a substantial daily data volume of approximately 100GB, we leveraged Apache Spark and applied optimization techniques like data caching and broadcast joins to significantly accelerate data processing. This not only improved the speed of our data pipelines but also increased the efficiency of our data analysis.
We seamlessly integrated the code deployment process into the Azure CI/CD pipeline. As part of workflow automation, we orchestrated task scheduling using Airflow and CRON jobs.
One of the project's major achievements was the implementation of a customer engagement strategy that identified infrequent buyers and provided incentives in the form of coupons. This initiative not only boosted customer retention but also had a positive impact on the overall business growth.
For more queries, reach out to me on the social media handles below.
Follow me on LinkedIn:- / manish-kumar-373b86176
Follow Me On Instagram:- / competitive_gyan1
Follow me on Facebook:- / manish12340
My Second Channel -- / @competitivegyan1
Interview series Playlist:- • Interview Questions an...

Comments: 116
@aprao8014 10 months ago
Brother, this video made me your fan. "People don't have experience, and companies want experience" 🔥 🔥
@payalbhatia6927 1 month ago
@Manish Kumar. All of your videos are more than gems, if such a thing exists. I have 4-5 YOE and never got to learn Spark in such depth, with such clarity and such concise answers and questions. It is useful for someone with 10 YOE as well, I can vouch for it. I have ADHD, but your videos are so engaging that I can sit with them for a long time. I have become interested in learning. You must be an extraordinary guy. Having knowledge is one thing; presenting it in such a simple manner is what sets you apart. It is very difficult to be simple. Thanks once again.
@hritikapal683 10 months ago
What gem content, sir 🥺 Thank you so much for the in-depth video!
@DEwithDhairy 7 months ago
DSA interview series for Data Engineer kzbin.info/aero/PLqGLh1jt697wQTamFvXx_Odlm-Wg3zbxq&si=suGxMRqt-uoYkprY
@rameshjadhav4963 6 months ago
Hey Manish, I'm extremely thankful to you for all of your playlists. This video especially is a super problem solver! No one teaches in as much depth as you do. Thanks for taking the time to teach us!! 😇🤗
@ritusrivastava8391 13 days ago
A wholesome Data Engineer playlist. Great job, Manish 🎉
@WolfmaninKannada 10 months ago
Sir, you're amazing. No one has created content on this until now. I wish to see more of this type of content. As freshers, we need a clear idea of how the project works, and we should know how to explain the project to the interviewer.
@shadabahmed8817 10 months ago
Thanks a lot, Manish bhaiya, you listen to even individual requests. Big thanks to you. Loving your content.
@MCAMadeEasy 6 months ago
Brother, salute to you for speaking so directly, clearly, and truthfully. I will definitely connect with you on Topmate after getting my Data Engineering job, to thank you! Hopefully I won't need to connect before that.
@shadabahmed8817 10 months ago
Eagerly waiting for the 2nd part, related to the last project. Please upload that next time.
@manish_kumar_1 10 months ago
I did not get you. Everything I explained was related to the project I walked you through. Perhaps you have not watched the whole video, or I did not understand the question.
@mnshrm378 5 months ago
Hey Manish! I am following all your playlists and content. I have given more than 30 interviews but have not been selected yet because of schedulers. If you could cover one of them, or a pipeline that includes Airflow or any other scheduler, it would be very helpful. Without scheduler knowledge the preparation is incomplete, because they ask about it in each and every interview. You explain very well, so I would like an explanation with your depth of knowledge. Thanks.
@user-gm7fn8ri8i 10 months ago
Great, Manish. You don't say "fake experience" many times in your videos. You're doing a great job.
@preetidalabanjan103 11 days ago
You are doing a great job.
@rishav144 10 months ago
Thanks for the amazing content. The Spark playlist is amazing.
@user-yd1fu7hl6j 10 months ago
Wow... Manish bhai, I really loved this content. I encourage you to make more videos like this.
@kunalk3830 9 months ago
Q.) Data skew is one example for which you do Spark optimization; apart from data skew, what else have you performed optimization for? Q.) What kind of issues have you faced in your project while working? I mean, I wanted a proper systematic approach to these questions. I have an idea of the topics, but when I think it through the points seem scattered.
@vivekpuurkayastha1580 10 months ago
Great video, Manish. What problems have you faced while doing your projects, and how did you resolve them? Please answer this question as an experienced person.
@swapnalikudale2458 8 months ago
This whole playlist helped a lot. 💡
@parameshwarbhange9857 10 months ago
What problems have you faced while doing your projects, and how did you resolve them? Please answer this question as an experienced person.
@SantoshKumar-yr2md 5 months ago
You explained a universal truth of the industry.
@Oxygen123_abc 1 month ago
Thank you
@user-tb8ry2jl7s 10 months ago
Sir, I got placed as an Azure data engineer. It's all because of you, thank you so much for everything 🥹🥹 I would like to talk with you.
@manish_kumar_1 10 months ago
Congratulations, brother. Please ping me on LinkedIn or Instagram. You will find the social media links in the description.
@likhithurs8597 8 months ago
Hi, hearty congratulations on your success 🙌
@simizcodding4487 8 months ago
Hey, I'd like to contact you... please drop your LinkedIn ID.
@surajpoojari5182 5 months ago
Congratulations, bro
@chinnasaiprathapmeesala8977 4 months ago
Bro, can you share your interview preparation questions for Azure Data Engineer?
@varunmehta591 10 months ago
Brother 🙌... ultimate video ❤❤
@ajaypatil1881 9 months ago
Thank you so much, bhaiya, for the amazing content 💝
@shubhamdeshmukh6339 3 months ago
Thanks, I just completed the playlist.
@ManaviVideos 10 months ago
Thanks for the session!!
@prashantmehta2832 3 months ago
Thank you so much, sir, for the great explanation. It was the best series I have found in my life. I just have one request: can you please make a video on the cluster manager YARN?
@manish_kumar_1 3 months ago
Noted
@prashantmehta2832 3 months ago
@@manish_kumar_1 Thanks sir..
@ChandikaRohini 6 months ago
Please make a video on coding questions and scenario questions (e.g. what if the repartition size increases, how to handle out-of-memory issues, and other questions that are likely to come up).
@user-rh1hr5cc1r 3 months ago
Bhaiya, I couldn't find this spark-submit config anywhere in the cloud when creating a cluster in Databricks. Is spark-submit only for those who work on-prem?
@rh334 10 months ago
How do you do an on-premises to cloud migration?
@abhayc8015 10 months ago
Thank you, Manish bhaiya
@ajaypatil1881 9 months ago
Most exciting video
@nipun384 10 months ago
Thanks bro, love you, I cleared the interview!
@Wandering_words_of_INFJ 9 months ago
Hello sir, firstly, thank you for this amazing content. Truly grateful. I request you to please make a video of real Azure Data Engineer project questions, combined with Databricks, to prepare for interviews. Please.
@manish_kumar_1 9 months ago
I am not familiar with Azure services.
@Wandering_words_of_INFJ 9 months ago
@@manish_kumar_1 Okay sir. By the way, I don't see any positions for just a PySpark Developer. Will you make a video in the future on skill sets, covering which skills to mention on a resume and what the relevant positions in the industry are?
@abhishekchaturvedi9855 8 months ago
Manish, going through all of your videos I realized that almost all of the optimization is based on the number of rows. Is there any optimization for when the data grows in terms of columns?
@dakait0867 10 months ago
Brother, please make a practical, detailed video on CI/CD using Azure DevOps/Databricks. That would be a great help.
@sonjoysaha5454 7 months ago
Great work. Informative video. Love it. I have a question about the data you receive. Do you receive 100 GB of new data every day?
@manish_kumar_1 7 months ago
Not in every project, but in my last project I had the opportunity.
@bobbytheman7535 10 months ago
Why do we need layers in a data warehouse? Can we put a for-each loop inside another for-each loop?
@poojajoshi871 10 months ago
Hi Manish, I got a call where they are asking for experience in AWS Glue and PySpark. Please tell me how to use Glue with PySpark.
@poojajoshi871 10 months ago
I know Spark; Glue is an ETL tool. So how do I use Spark with Glue?
@user-yd1fu7hl6j 10 months ago
Brother, how many columns and rows does the table have, and of what types? For example cust_id, refund: please tell us how many columns there can be and what their types are.
@bhavyamalviya8364 8 months ago
😂😂 I thoroughly learnt from and enjoyed this video
@ig2947 5 months ago
Amazing...!!!
@vaibhavpoul1067 10 months ago
What content, sir ❤ Most needed
@widelens_world 10 months ago
How do we analyse the source data in our project to find where we need to perform cleaning operations?
@shubhamalsunde3230 10 months ago
Nice content, sir
@anweshkumarsahoo3927 8 months ago
Should I add a personal project section along with the work experience section in my resume, with 2 YOE in DE?
@koeld830 1 month ago
What is delta cache?
@vikastiwari9415 10 months ago
Amazing content..
@shaikhrizwan9907 10 months ago
Manish, awesome videos. Can you make some videos on AWS Glue jobs?
@manish_kumar_1 10 months ago
I don't know Glue.
@loveyourselffirst6565 5 months ago
Bhaiya, please make a video on AWS MapReduce...
@roshan_off1955 10 months ago
Bro, if we haven't used Airflow for scheduling jobs and they ask questions about it, what do we do?
@abhijeetsugam 9 months ago
Very good video
@tanushreenagar3116 8 months ago
Thanks so much
@sanooosai 4 months ago
Thank you, sir
@eagleeyetradingacademy 6 months ago
Can we use this project for 3-4 years of experience?
@cretive549 10 months ago
Sir, how much maths is required for a data engineer profile? Please reply.
@manish_kumar_1 10 months ago
It's not required.
@user-yd1fu7hl6j 10 months ago
Brother, please share 2-3 challenges related to clusters and incoming data.
@greendaywithtrading7408 10 months ago
Why did you stop uploading videos??? Eagerly waiting for a new video.
@manish_kumar_1 10 months ago
I was out of station due to job requirements.
@princyanghan8734 10 months ago
Why did you stop uploading videos, sir? Please keep sharing.
@manish_kumar_1 9 months ago
Started again
@anweshkumarsahoo376 9 months ago
Manish bhaiya, are the keywords like BWAC and MHCDM that you used in your resume your roles?
@manish_kumar_1 9 months ago
No, they are project names.
@raghavendrakulkarni3920 9 months ago
Platform metric used?
@sameersuryawanshi145 10 months ago
I'm getting an error while using Spark, please help. The error is like 'remote rpc client issue', due to an executor lost failure / heartbeat issue.
@rahulrai4686 8 months ago
Do we need to know DSA... or should we focus on something else?
@anketsonawane6651 10 months ago
Hey Manish, can you make a video on an end-to-end data engineering project? It would be very helpful for understanding a data engineering pipeline.
@manish_kumar_1 10 months ago
You neither watched the whole video nor checked the link added in the i button. I have already done a project walkthrough, and the link was provided too.
@anketsonawane6651 10 months ago
@@manish_kumar_1 Sure, Manish... I regret the wrong comment. I will surely check it out, and thanks for this amazing content ❤️
@desmond7182 5 months ago
Please make a video for freshers (0 years of experience).
@dhavalkacha2481 10 months ago
As a fresher, if I mention a project in my resume, can I say I completed it in 1 or 2 months?
@jhonsen9842 3 months ago
One Like and One Comment.
@SATISHKUMAR-qk2wq 10 months ago
I was asked these kinds of questions in an interview.
@Soccerfan_17 10 months ago
How do you analyse your source data before you start cleaning?
@rawat7203 10 months ago
1. First, remove the non-CSV files.
2. Read the correct files into a DataFrame using Spark.
3. Check whether these files have the mandatory columns; if not, remove those files.
4. If some of these files have extra columns, add a column called "extra column" and put all of those columns there.
5. Now we have a DataFrame with all the correct data. Join the dimension table DataFrame to it to create a final DF.
6. Do the Spark processing on this final DF to get the desired calculation.
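The validation steps in the comment above can be sketched in a few lines. This is only an illustration of the logic, not the commenter's actual pipeline: it uses Python's standard `csv` module as a stand-in for Spark, and the mandatory column names (`cust_id`, `amount`), the sample files, and the helper names `clean_files` and `join_dimension` are all hypothetical.

```python
import csv
import io

MANDATORY = ["cust_id", "amount"]  # hypothetical mandatory columns


def clean_files(files, mandatory=MANDATORY):
    """files maps file name -> text content. Returns a list of row dicts."""
    rows = []
    for name, text in sorted(files.items()):
        # Drop anything that is not a CSV file.
        if not name.lower().endswith(".csv"):
            continue
        reader = csv.DictReader(io.StringIO(text))
        header = reader.fieldnames or []
        # Drop files that lack a mandatory column.
        if any(col not in header for col in mandatory):
            continue
        extras = [col for col in header if col not in mandatory]
        for record in reader:
            row = {col: record[col] for col in mandatory}
            # Fold every unexpected column into one "extra_column" field.
            row["extra_column"] = {col: record[col] for col in extras}
            rows.append(row)
    return rows


def join_dimension(rows, dim, key="cust_id"):
    """Left-join a dimension lookup (key -> attributes) onto the rows."""
    return [{**row, **dim.get(row[key], {})} for row in rows]


# Hypothetical input: two valid files, one missing a column, one non-CSV.
files = {
    "sales.csv": "cust_id,amount\n1,10\n",
    "promo.csv": "cust_id,amount,coupon\n2,5,X1\n",
    "broken.csv": "cust_id\n3\n",
    "notes.txt": "ignore me",
}
final = join_dimension(clean_files(files), {"1": {"city": "Pune"}})
```

In a real Spark job the same checks would typically run against file listings and DataFrame schemas rather than in-memory strings.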
@ishwarkoki1119 9 months ago
Manish bhai, the spelling of "related" is wrong in the thumbnail!
@manish_kumar_1 9 months ago
Oh, thanks for pointing it out
@sathyak3285 9 months ago
Please talk in English so that everyone can understand. And please give answers to the questions.
@rahulrai4686 8 months ago
Sir, how can we talk to you personally?
@avinash7003 10 months ago
Are there still calls for big data / AWS roles?
@prabhatgupta6415 10 months ago
Are you not getting any??
@avinash7003 10 months ago
@@prabhatgupta6415 What is the current market like for AWS?
@prabhatgupta6415 10 months ago
@@avinash7003 I'm an Azure guy, sir.
@sanketraut8462 10 months ago
Can we say our source and sink are the same, like Hadoop HDFS?
@manish_kumar_1 10 months ago
Yes
@rh334 10 months ago
Can you make content about Kafka?
@rajaprasad-vv2rf 3 months ago
How many nodes do we use in our project?
@manish_kumar_1 3 months ago
Nodes belong to the cluster. When a job is scheduled we don't specify the number of nodes; rather, we specify the number of executors, and more than one executor can start on the same node.
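To make the executor-versus-node point concrete, here is a rough arithmetic sketch. All numbers are hypothetical, and the calculation is deliberately simplified: it ignores memory overhead and any cores or memory reserved for the operating system and cluster daemons.

```python
def executors_per_node(node_cores, node_mem_gb, exec_cores, exec_mem_gb):
    """How many executors of the given size fit on one node (simplified)."""
    return min(node_cores // exec_cores, node_mem_gb // exec_mem_gb)


def nodes_needed(num_executors, per_node):
    """Minimum number of nodes that can host the requested executors."""
    return -(-num_executors // per_node)  # ceiling division


# Hypothetical cluster: 16-core, 64 GB nodes; executors of 5 cores and 20 GB.
per_node = executors_per_node(node_cores=16, node_mem_gb=64,
                              exec_cores=5, exec_mem_gb=20)
print(per_node, nodes_needed(10, per_node))  # prints "3 4"
```

With these assumed sizes, three executors share each node, so a request for ten executors spreads over at least four nodes without the job ever naming a node count.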
@user-iz5hj1ep8s 10 months ago
Sir, please also cover Python and Spark coding questions now...
@manish_kumar_1 10 months ago
I have already covered company-specific ones in one of the playlists.
@raajnghani 10 months ago
How do you unbroadcast the DataFrame?
@manish_kumar_1 10 months ago
Set the broadcast threshold configuration to -1.
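The reply most likely refers to Spark's `spark.sql.autoBroadcastJoinThreshold` setting, where a value of -1 disables automatic broadcast joins. A minimal sketch follows, assuming an existing `SparkSession` is passed in; the helper name `disable_auto_broadcast` is hypothetical.

```python
def disable_auto_broadcast(spark):
    """Turn off size-based automatic broadcast joins for this session."""
    # -1 disables automatic broadcasting of small tables.
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")
    return spark


# Typical use with a real session (requires pyspark, so left as comments):
# from pyspark.sql import SparkSession
# spark = disable_auto_broadcast(SparkSession.builder.getOrCreate())
```

This only stops Spark from choosing a broadcast join on its own based on table size. A join that carries an explicit broadcast hint is still broadcast, so the hint has to be removed from the code as well.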
@amangurjar9714 10 months ago
Can a fresher become a data engineer?
@shubhamchavan9438 10 months ago
If you keep asking this question all year, you won't become one; but if you ask once and practice all year, you will.
@amangurjar9714 10 months ago
Should I buy a course for data engineering, or should I prepare from KZbin only and build an online project??
@rakeshverma6867 10 months ago
@@shubhamchavan9438
@rahulrai4686 8 months ago
Sir, how much Python do we need to know?
@ruchim3448 9 months ago
Is it a complete playlist?
@manish_kumar_1 9 months ago
Yes
@ruchim3448 9 months ago
@@manish_kumar_1 Thank you.
@BigDataWithSky 2 months ago
Why don't you talk in English 😢 for non-Hindi speakers 😊