Hello Manish, I am presently working as a Data Engineer at Hyland, and it is your theory and practical playlists that I've been following to get up and running with Spark. Your content is very useful not just to me but to many professionals out there. Please do keep up the good work. Although some may be casually watching, there are plenty of us who are seriously following through in the long term. Your work is making a real difference, and I really value your commitment. Thank you for all that you do, and please continue the great work! Best regards, Gaurav.
@abc_9873 ай бұрын
Give this guy some medal for Quality
@Shivam-Tiwari1110 ай бұрын
I am literally watching this at 3 am and my eyes are not blinking. That is how interesting your way of teaching is. I hope you will not stop creating content because of demotivation of any kind, say a low number of subscribers (I can feel it's disheartening sometimes when you put in that much effort). You just increased my craving for learning. Heartfelt thanks, brother ❤
@tusharkantighosh46417 ай бұрын
Same happens with me, bro. This content is a diamond.
@ranjithbohara37754 ай бұрын
@@tusharkantighosh4641 Same with me. I was about to quit Spark, but after this video I became much more interested in it.
@renukasrivastava11672 ай бұрын
Hi Sir, I am very thankful that I came across your playlist. Trust us, many of us are implementing whatever you are explaining, so don't be demotivated. We understand the amount of extra work you are doing for us, and we won't let your effort go in vain. Your playlist is a hidden gem.
@panyamaravind8522 ай бұрын
Because of you I’m confidently attending interviews. Thank you sooo much
@Samar009-m4 күн бұрын
I just have one word for this ..awesome
@khurshidhasankhan4700 Жыл бұрын
This is benefiting me a lot for interviews ❤
@apoorvkansal92666 ай бұрын
Hello Sir, you are doing a great job, and there are very few humble and selfless Data Engineers like you who dedicate their valuable time to sharing their knowledge and experience. Please continue creating these videos on PySpark. I am preparing for Data Engineering interviews for the 3rd time, i.e. for my 3rd switch of companies. Interviews are getting scheduled, and I am looking to this series for cracking them. Thank you in advance for creating such informative and detailed videos. Jai Mata Di! 🙏
@ARGTalks-gx6xs6 ай бұрын
Please continue this series brother, I just started watching this and found it to be fantastic resource. Keep up the good work.
@prashantmehta28326 ай бұрын
I am preparing for the interview, and I can say you are playing a main part. I make notes and follow all the steps. Thanks for the support.
@RaviKumar-xo5bz8 ай бұрын
Honestly, you know this really well ... one place for all IT courses
@harshitmittal143420 күн бұрын
Manish, you are doing great work, giving this to society for free. Keep it up.
@sanooosai8 ай бұрын
Sir ji, don't lose motivation; we are following with full dedication. Sometimes even people with all the implementation knowledge watch your videos for revision or to brush up on the basics. Thank you for all the hard work and for giving us this great content in Hindi.
@ranjithbohara37754 ай бұрын
To be honest, the way you explain the fundamentals is amazing. For a beginner, the course is really worth watching instead of buying some random expensive course. Thank you, brother, for such amazing and useful content. 👍👍👍
@ghvendrashra49128 ай бұрын
Bhai..you are the best teacher for data Engineer... concepts are very clear.....thanks
@manvika Жыл бұрын
Your series is really good,. I started watching it today.. Please keep making it.. really helpful.
@udittiwari84209 ай бұрын
very helpful series sir thank you for giving us your imp time ❣
@ashwinichoure32032 күн бұрын
Hi Sir, I am very thankful that I came across your playlist . very nice , please keep it up
@ranidalvi10649 ай бұрын
Most informative and needful video. thank u so much.
@shravanshenoy38737 ай бұрын
Fantastic explanation, understood the concepts very clearly. Thanks for making such amazing content.
@abhishekkumar-gupta Жыл бұрын
Very informative video. Must watch for someone beginner or intermediate in pyspark.
@alokmishra536710 ай бұрын
No no sir .. I m honestly following your course.
@harmanpreetkaur804918 сағат бұрын
This is the best series related to Spark that I have seen so far. Tried so many other videos and couldn't complete those. You make it so easy to understand. Hands down the best teacher! Thanks for all your efforts. Subscribed :)
@deeksha65148 ай бұрын
I do not have words to praise this masterpiece. Keep creating this awesome content.
@lifeisfun9 Жыл бұрын
You are a gentle man seriously :)
@PraveenkumarVinukonda4 ай бұрын
I am watching and following from past 1 week
@RiyaBiswas-r1p8 ай бұрын
The way you explained everything made it so simple to understand. Earlier it used to be difficult for me to understand the architecture concept, and I would forget it after some time, but now I feel I won't forget it, as this video felt like a story and everything was explained in detail and in a very simple way. It's such great content.
@vaibhavmore7936 Жыл бұрын
We got the file and used it as you taught. Didn't feel need to comment this also, after you are giving this much for free. Thanks for awesome content.
@Azure-Mahesh11 ай бұрын
Your inputs are really helping in understanding complicated things easily, thank you
@mmohammedsadiq2483 Жыл бұрын
I mean to say in my last comment, you have explained very well with simple examples
@Varunsharma-sg2nt5 ай бұрын
Sir, Awesome explanation Thank you so much
@shyammohan36118 ай бұрын
Sir, I got your link in the last video. I am following both of your playlists, practical and theory. Please don't stop making videos.
@jigsparikh79619 ай бұрын
You are awesome. You make it simple to understand. Thanks.
@lazycool36118 ай бұрын
So far, everything you have explained is going really well; we will score 100 marks now.
@dilipkuncha5728 Жыл бұрын
hi , im learning well through your course . I literally realised after watching this video , about the csv file , because i had to literally make a new one using the data. Thanks for motivation and Please be Motivated !!😊
@vishwajeetkushwaha649 ай бұрын
Most informative @ Thanks Bro
@vilaspatil-r3q Жыл бұрын
Manish, you are simply awesome. I'm glad that I have found the best videos on KZbin to learn Spark.
@bforbhakti111 ай бұрын
Your teaching is really good and helpful. Thanks a lot.
@sarpeshmishra673914 күн бұрын
very nice , please keep it up
@rakeshjadhav_O7 ай бұрын
Great explanation Manish. Thank you very much for such informative videos
@manojkaransingh5848 Жыл бұрын
Outstanding tutorial... really enjoyed it, sir. @manish
@khurshidhasankhan4700 Жыл бұрын
Sir, your lectures are clearing up a lot of my doubts.
@omkarm7865 Жыл бұрын
Very very helpful content.. Please keep it up
@SANJAYYADAV-hm2bs9 ай бұрын
Sir, you are doing a really awesome job. Please keep making your videos, as they will help others in the future as well. I am working as an ETL tester, and I still keep watching your videos to get some insight into the data engineer roadmap.
@adityaabhinav4171 Жыл бұрын
Most underrated Teacher on KZbin I wish you all the best and one day you will be famous because the way you teach is awesome.
@manish_kumar_1 Жыл бұрын
You know me, and that is enough 😂😂
@bobbygupta8302 ай бұрын
Thank you manish bro,
@praveenkumarrai101 Жыл бұрын
Your "kadva sach" (bitter truth) is very, very true.
@ayeshaagrawal4987 Жыл бұрын
Best teacher ever met , thank you sir
@pushkarratnaparkhi22059 ай бұрын
Thank you, Manish bhai. 🙏
@divyanshusingh39662 ай бұрын
Bro you are doing a great job keep going..
@ranjithbohara37754 ай бұрын
I was about to quit Spark, but after this video I became much more interested in it.
@automationwithwasi5 ай бұрын
Manish bhaiya your videos are very valuable. I learn new things every time.
@ujjalroy1442 Жыл бұрын
Phenomenal lectures loved it✌️👍
@divyanshusingh39662 ай бұрын
Bro your content is very good. We need tutors like you. Hats off to you.
@rohit-ll3rj7 ай бұрын
Really appreciate your efforts Manish! You teach really well. I work as a Data Engineer in one of the Big 4 and go through your videos whenever i need understanding of topics in depth.
@pritiiBisht11 ай бұрын
Very Informative.
@SanjayKumar-rw2gj6 ай бұрын
Wonderful explanation bro. Read couple of articles about Narrow and Wide transformation but could not grasp it completely but now after watching your video things got clear.
@younevanoАй бұрын
What resources can a beginner educate themselves from?
@ashishkumar95389 ай бұрын
Thank you for sharing your knowledge among us and I am really enjoying learning the concepts. I am sure these informative videos are helping a lot of people. Keep on teaching us!!
@MrFirstScientist Жыл бұрын
Appreciate your hard work.
@hubspotvalley580 Жыл бұрын
You are doing great job. Your lectures are awesome.
@mohdrizwanahmed55372 ай бұрын
I am understanding it really well, brother.
@muizzrehan1433 Жыл бұрын
Amazing work. Thanks for lectures...👏
@KumarKumar-en5xq3 ай бұрын
Hi Manish, Thanks for your efforts. It's really good.
@t1mt0m97 Жыл бұрын
Manish Ji....excellent hands-on course.. Keep adding and let your channel grow !
@Podcast-Bites-Hub28 күн бұрын
Really great content ✨
@samirdeshmukh9886 Жыл бұрын
Very helpful sir.. Thank you..
@pankajsolunke3714 Жыл бұрын
This series is phenomenal.❤
@RishikaJain-f4m4 ай бұрын
Thanks a lot for your course
@adityasaini8437 Жыл бұрын
Manish bhai is real 💎!!
@lucky4-vj10 ай бұрын
Amazing explanation Sir 🤩
@abidsyman Жыл бұрын
good morning sir, ur lectures are great
@sureshydv724 Жыл бұрын
You are outstanding. I love the way you are teaching. Thanks Manish bhai explaining along with flowchart👏
You make each and every topic very simple to understand. Thank you so much and keep up the work. 👍 If possible, please make some videos on Kafka, Kubernetes, Azure, Databricks. Your videos are really very helpful!
@AbhisekLipun Жыл бұрын
That's true, sir.
@DpIndia Жыл бұрын
Nice Video, got lots of learning
@souravdas-kt7gg9 ай бұрын
content is very good
@aasthajain481411 ай бұрын
Appreciate it truly
@karansinghrajpurohit3500 Жыл бұрын
Please continue this series
@manish_kumar_1 Жыл бұрын
Sure
@mantukumar-qn9pv Жыл бұрын
Truly, I now promise myself that I will practice.
@shivakrishna1743 Жыл бұрын
I want to tell you that I am following your videos and also implementing them. Whenever possible, I used the links from your videos to get the doc or data file. Once, I couldn't find the data, so I asked you :). Please don't lose motivation for making these videos!!
@Icelander00 Жыл бұрын
Manish I am implementing all side by side
@nilavnayan45212 ай бұрын
Sir one question. At 14:03, when you showed output of question 1 (list of people with age less than 18) Here, as we can see the output contains redundant rows, so in this case once both the executors have returned their outputs which are kind of appended on top of each other in the output you showed - what happens after this step to remove the redundancy? Is a ‘distinct’ run by spark?
@younevanoАй бұрын
Only if you write `DISTINCT` in your code (for example, by calling `distinct()` on the DataFrame)! Spark then performs a wide-dependency transformation, which involves a shuffle, I believe.
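To make the answer above concrete, here is a minimal sketch in plain Python. It is a toy model, not the Spark API: the partition contents, names, and helper functions are all made up for illustration. It shows that a per-partition filter keeps duplicate rows when the results are combined, and that removing them needs a separate step that routes equal rows to the same place, which is what makes `distinct` a wide transformation.

```python
# Toy model in plain Python (not the Spark API). Each inner list stands for
# one partition handled by one executor; the rows are hypothetical sample data.
partitions = [
    [("Ram", 17), ("Sita", 25), ("Aman", 12)],
    [("Ram", 17), ("Neha", 30), ("Aman", 12)],
]


def filter_partition(rows):
    """Narrow step: each partition is filtered on its own, with no data movement."""
    return [row for row in rows if row[1] < 18]


def distinct(parts, num_output_partitions=2):
    """Wide step: route every row by its hash so equal rows meet in one bucket."""
    buckets = [set() for _ in range(num_output_partitions)]
    for part in parts:
        for row in part:
            buckets[hash(row) % num_output_partitions].add(row)
    return [sorted(bucket) for bucket in buckets]


filtered = [filter_partition(part) for part in partitions]

# Combining the per-partition results keeps the duplicates.
collected = [row for part in filtered for row in part]

# Duplicates disappear only after the explicit distinct step.
deduplicated = sorted(row for bucket in distinct(filtered) for row in bucket)

print(collected)
print(deduplicated)
```

In PySpark, the corresponding calls would be a `filter` on the DataFrame followed by `distinct()`; without the second call, the duplicate rows stay in the output.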
@MsSubhrajeet9 ай бұрын
Awesome 👏
@swarupsarangi7347 ай бұрын
you should make another video in which you address whether the spark.read methods are transformation or action.
@younevanoАй бұрын
`spark.read` is neither a transformation nor an action in the strict Spark sense. It's more of a data source entry point used to define a DataFrame by reading data from an external source (like a file or a table) into Spark's DataFrame abstraction.
Here's a breakdown of Spark operations for context:
Transformations (like `map()`, `filter()`) define a new dataset based on the current one but are lazily evaluated, meaning they don't execute until an action is called.
Actions (like `count()`, `collect()`) trigger the execution of the transformations defined on the DataFrame.
In short, `spark.read` sets up the DataFrame for subsequent transformations or actions. One caveat: some read options, such as schema inference on a CSV file, can make Spark launch a job to scan the data at read time.
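The lazy-evaluation idea in this answer can be sketched in plain Python. This is a toy stand-in, not the Spark API: `LazyFrame`, `read_source`, and the sample ages are hypothetical names invented for the example. Transformations only record a plan; the source is scanned when an action such as `collect()` or `count()` runs.

```python
class LazyFrame:
    """Toy stand-in for a DataFrame: stores a plan and runs nothing until an action."""

    def __init__(self, source, plan=()):
        self._source = source  # callable that returns the rows when invoked
        self._plan = plan      # tuple of (kind, function) steps recorded so far

    def filter(self, predicate):
        # Transformation: returns a new frame with one more recorded step.
        return LazyFrame(self._source, self._plan + (("filter", predicate),))

    def map(self, func):
        # Transformation: also only recorded, not executed.
        return LazyFrame(self._source, self._plan + (("map", func),))

    def _execute(self):
        rows = self._source()
        for kind, func in self._plan:
            if kind == "filter":
                rows = [row for row in rows if func(row)]
            else:
                rows = [func(row) for row in rows]
        return rows

    def collect(self):
        # Action: runs the whole recorded plan.
        return self._execute()

    def count(self):
        # Action: runs the whole recorded plan and counts the result.
        return len(self._execute())


scans = []


def read_source():
    """Hypothetical data source that records every time it is scanned."""
    scans.append("scan")
    return [5, 12, 17, 25, 40]


frame = LazyFrame(read_source)
minors = frame.filter(lambda age: age < 18).map(lambda age: age + 1)

scans_before_action = len(scans)  # nothing has been read yet
result = minors.collect()         # the action triggers the scan and the plan
scans_after_action = len(scans)

print(scans_before_action, result, scans_after_action)
```

The design choice mirrors the description above: defining the frame and chaining transformations is cheap, and the cost is paid once per action.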
@dattak-gb7ez6 ай бұрын
Sir, it would be great if you also did some sessions on Azure, because your teaching technique is very good.
@Finoboost Жыл бұрын
Hi Manish bhai, I downloaded the CSV file by following the steps you described. You are doing great work; I am following and applying all the concepts you have taught.
@manish_kumar_1 Жыл бұрын
Good
@RajvirKumar-n1p6 ай бұрын
Great, brother, you are teaching the concepts very well. Please bring a series on Azure Databricks too.
@jbb6906 Жыл бұрын
Awesome
@CctnsHelpdesk8 ай бұрын
well explained
@tnmyk_9 ай бұрын
Very well explained! I have a doubt though. Suppose one partition can handle only 100 records, but the original input data consists of 150 records that all have the same ID. How will the `groupBy()` function work in that case, since it won't be possible to bring all the data into the same partition even after shuffling?
@EVSprakash9 ай бұрын
What is the answer?
@younevanoАй бұрын
When using `groupBy()` in Spark, the goal is to bring all records with the same key (in this case, the same ID) to the same partition to perform the grouping operation. However, as you've pointed out, if the data for a single key is larger than a single partition's capacity (here, 100 records per partition), Spark can encounter memory issues. Here's how Spark addresses this:

### 1. **Spill to Disk**:
Spark can spill data to disk when the in-memory size limit is reached. If all records for a single ID cannot fit in memory within one partition, Spark writes some of the data to disk temporarily. This allows Spark to handle larger datasets without running out of memory, though it may be slower than purely in-memory processing.

### 2. **Tungsten Execution Engine Optimization**:
Spark's Tungsten execution engine is optimized to handle large data processing tasks. It uses techniques like off-heap memory management and binary processing to efficiently manage memory. This can alleviate some memory pressure by keeping only essential data in memory and managing the rest on disk.

### 3. **Alternative Aggregations with Approximate Solutions**:
If exact grouping isn't strictly necessary, using approximate aggregations like `approxQuantile()` or `countApproxDistinct()` may help reduce memory usage and avoid data skew issues.

### 4. **Custom Partitioning Strategies**:
For cases where specific keys have significantly larger data sizes, a custom partitioning strategy can sometimes help by pre-processing the data to manage key distribution more effectively.

In practice, large shuffles with skewed keys can lead to inefficiencies. Optimizing partition sizes, monitoring skew, and using aggregations designed for big data can often prevent these issues from becoming bottlenecks.
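One common form of the custom partitioning idea in point 4 is key salting with a two-stage aggregation. The sketch below is plain Python, not the Spark API, and every name in it (`NUM_SALTS`, the `id_1` and `id_2` keys, the record counts) is an assumption chosen to match the 150-record example in the question. It only works for aggregations that can be combined from partial results, such as counts and sums. Real pipelines usually pick the salt at random; a deterministic salt is used here so the output is reproducible.

```python
from collections import defaultdict

NUM_SALTS = 3  # hypothetical number of sub-groups each key is split into

# Hypothetical skewed input: 150 records share one ID, 10 share another.
records = [("id_1", 1)] * 150 + [("id_2", 1)] * 10

# Stage 1: aggregate on (key, salt), so no single group holds all 150 rows.
partial = defaultdict(int)
for index, (key, value) in enumerate(records):
    salt = index % NUM_SALTS  # deterministic here; usually random in practice
    partial[(key, salt)] += value

# Stage 2: drop the salt and combine the much smaller partial results.
totals = defaultdict(int)
for (key, _salt), subtotal in partial.items():
    totals[key] += subtotal

largest_group_for_id_1 = max(
    subtotal for (key, _salt), subtotal in partial.items() if key == "id_1"
)

print(dict(totals))
print(largest_group_for_id_1)
```

With three salts, the largest intermediate group for the hot key holds 50 records instead of 150, while the final totals are unchanged.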
@radheshyama448 Жыл бұрын
❤ thank you
@hubspotvalley580 Жыл бұрын
Manish bhai, I would like to request that you please make a real-time, end-to-end project, so that it can help us during interviews.
@danishthev-log2264 Жыл бұрын
nice content sir
@anirbanadhikary7997 Жыл бұрын
I can see a similarity between your tutorial and the book "Spark: The Definitive Guide" from O'Reilly. Please correct me if I am wrong.
@manish_kumar_1 Жыл бұрын
Yes I do refer that book.
@shontypoplukabhai15Ай бұрын
Hello @manish, just wanted to ask: should I watch one theory video and then one practical video, or should I complete the theory first and then move on to the practicals?