The best Data Engineering course on YouTube. Thanks a lot, bro, for your effort, and that too free of cost. Really proud of you!
@rajasdataengineering7585 2 months ago
You are most welcome
@reach2puneeths 3 years ago
Very informative. Please come up with end-to-end projects using Databricks.
@shivachaitanyachinna9819 5 months ago
Thanks for providing in-depth knowledge about these topics. Amazing.
@rajasdataengineering7585 5 months ago
Glad you like them! My pleasure!
@amanpathak7507 1 year ago
Hi, could you please provide the slides and notebooks? That would be really helpful for quick revision before interviews.
@learningruchi 7 months ago
Thank you for providing such detailed videos.
@rajasdataengineering7585 7 months ago
Glad you like them! Keep watching
@bharatpogul6014 3 months ago
The concepts are very nicely explained.
@rajasdataengineering7585 3 months ago
Glad you liked it! Thanks
@ranjansrivastava9256 11 months ago
As per your slide on the differences among RDD, DataFrame and Dataset, you mentioned that the supported languages for DataFrame are Java, Scala, Python and R. What about SQL? Could you please clarify this if possible, Raja?
@rajasdataengineering7585 11 months ago
Hi Ranjan, yes, Spark SQL is also supported by the DataFrame API.
@ourmind8677 1 year ago
A doubt: as you said, Spark ultimately converts DataFrames into RDDs while processing. Then how do benefits like avoiding the GC process come into play when using DataFrames instead of RDDs? I'm fairly new to this area. And thanks for this playlist.
@rajasdataengineering7585 1 year ago
GC is related to on-heap memory; it is not related to DataFrame or RDD.
@pavanjavvadi9902 11 months ago
So does it mean DataFrames don't run in heap memory?
@dineshdeshpande6197 10 months ago
Hi Raja Sir, the contents of this video and playlist are very good. But I am not able to understand the sequence to follow, as some numbers are missing from the serial numbers you have given. Also, the playlist has 65 videos but the serial numbers go above 100. Can you please help with the sequencing of the videos in the playlist?
@maruthiraoyarapathineni2012 1 year ago
Great work. 👍👏👏
@rajasdataengineering7585 1 year ago
Thank you! Cheers!
@Ustaad_Phani 3 months ago
Very informative
@rajasdataengineering7585 3 months ago
Glad it was helpful!
@Abdullahkbc 1 year ago
Hi, could you please activate the subtitles for this and other videos? These are really great sources, and I don't want to miss anything.
@rajasdataengineering7585 1 year ago
Hi Abdul, sure, I will activate the subtitles.
@Christy-du9jw 11 months ago
@rajasdataengineering7585 I would also appreciate the subtitles so I don't miss information.
@rkjunnu7224 7 months ago
May I know the first video of the series?
@Sandani_Aduri_Group 2 years ago
Hi Raja, your videos are very informative. In terms of RDD/DataFrame/Dataset, if someone asks which one is faster in execution, what would be your answer?
@rajasdataengineering7585 2 years ago
Hi Sandani, good question. RDD is the native API for Spark, so whether we use Dataset or DataFrame, it is internally converted to RDD. But RDD is quite outdated for programming nowadays. DataFrame is widely used across projects due to developer convenience, so I would recommend going with DataFrame. Dataset has limitations with programming languages. For detailed information, please refer to this video: kzbin.info/www/bejne/nWW3Y2iVaa16g5I
@ramangangwani9203 8 months ago
Sir, can you please explain what serialization is?
@rajasdataengineering7585 8 months ago
Sure, I will create a video on this.
@Abdullahkbc 2 years ago
Hi Raja, could you please fix the order of the playlist? Thanks in advance.
@rajasdataengineering7585 2 years ago
Hi Abdullah, sure, I will do it.
@gulsahtanay2341 9 months ago
Thank you
@rajasdataengineering7585 9 months ago
You're welcome
@harithad1757 7 months ago
amazing
@rajasdataengineering7585 7 months ago
Thank you! Cheers!
@kanstantsinhulevich4313 1 year ago
Dataset also has Catalyst optimizations, but in the slide it is just "optimization".
@rajasdataengineering7585 1 year ago
Yes, Dataset and Spark SQL also use the Catalyst optimizer. "Optimization" here means the Catalyst optimizer. The previous slide mentioned that Dataset consolidates the best features of both RDD and DataFrame.
@labib8aug 2 years ago
Could you make a repo for all your videos? Otherwise it is hard to follow along. Thanks a lot, Raja.
@meghagavade8672 1 year ago
Best One
@rajasdataengineering7585 1 year ago
Thanks!
@sorathiyasmit8602 5 months ago
Your content is very good. Can you provide a PDF of the PPT?
@velaatechsolutions9738 3 years ago
Super
@premsaikarampudi3944 1 year ago
RDD is not type-safe, right? It doesn't enforce data types, which means the type of the data in an RDD can change at runtime. This can lead to errors if the data is not properly checked.
@rajasdataengineering7585 1 year ago
Please check the official Spark documentation instead of ChatGPT to know the truth.
@gunar4831 1 year ago
So PySpark uses DataFrame and not Dataset, right?
@rajasdataengineering7585 1 year ago
Yes, Dataset is only available in Scala and Java, while DataFrame is available with PySpark, R, Scala and SQL.
@krishnamohan5950 2 years ago
Can you please provide sequence numbers for your videos?
@rajasdataengineering7585 2 years ago
Sure Krishna, I will arrange the videos and create a proper playlist. Please allow me some time for that.
@krishnamohan5950 2 years ago
Are you providing real-time training, Raja ji?
@krishnamohan5950 2 years ago
@rajasdataengineering7585 Sent an email.
@rajasdataengineering7585 2 years ago
Thanks, I will respond as soon as possible.
@NikhilGosavi-go7be 3 months ago
done
@navjotsingh-hl1jg 6 months ago
Sir, can you share the PDF?
@aravind5310 1 year ago
DataFrames are strongly type-safe and RDDs are not, right? I think you need to modify the slide.
@rajasdataengineering7585 1 year ago
No, DataFrames are weakly type-safe, whereas RDDs and Datasets are strongly type-safe. For the Spark engine, a DataFrame is a collection of rows (not individual columns), so it can't validate column data types at compile time. So it is not strongly type-safe. Hope you understand. Please refer to the Spark documentation to know more about type safety.
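To make the compile-time versus runtime point above a bit more concrete, here is a rough plain-Python analogy. It is not Spark code: the generic dict row, the `Person` record, the sample values and the `describe_error` helper are all made up for illustration. The dict stands in loosely for a generic DataFrame row, and the typed record stands in loosely for the element type of a Dataset or RDD.

```python
from typing import NamedTuple


class Person(NamedTuple):
    # Typed record: field names and types are declared up front,
    # loosely like the element type of a Dataset or RDD.
    name: str
    age: int


# Untyped row: a generic container, loosely like a DataFrame row.
row = {"name": "Asha", "age": 30}
person = Person(name="Asha", age=30)


def describe_error(action):
    """Run action() and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


# A misspelled key in the generic row is invisible to a static checker
# and only fails when the line actually runs.
row_error = describe_error(lambda: row["agee"])

# A misspelled field on the typed record also fails at runtime in Python,
# but a static checker such as mypy can reject it before the program runs.
typed_error = describe_error(lambda: person.agee)

print(row_error, typed_error)
```

Python has no compile step of its own, so the "compile time" half of the analogy relies on running a separate static type checker. In Scala the compiler plays that role for Datasets.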
@akash4517 2 years ago
DataFrames are mutable.
@rajasdataengineering7585 2 years ago
No, a DataFrame is immutable.
@akash4517 2 years ago
In PySpark we can do df = df.select(...) or any other transformation, which will change its state? Or am I understanding mutability wrong?
@rajasdataengineering7585 2 years ago
Yes, you can do df = df.select(...), but it does not mean that the DataFrame is mutable. What happens internally is that the previous DataFrame is dropped and a new DataFrame is created based on lazy evaluation; the previous DataFrame is not modified. A DataFrame is always immutable.
@akash4517 2 years ago
OK, thank you Raja for helping out. Got it.
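The difference between rebinding the name `df` and mutating a DataFrame, discussed in this exchange, can be sketched in plain Python without Spark. This is only a toy model under stated assumptions: `ToyFrame`, its `select` and `collect` methods, and the sample rows are hypothetical stand-ins, not Spark's implementation. Each transformation returns a new object carrying a longer list of pending steps, and nothing runs until `collect` is called.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a DataFrame: fixed rows plus a lazy plan."""
    rows: Tuple[dict, ...]
    plan: Tuple[Callable[[dict], dict], ...] = ()

    def select(self, *cols: str) -> "ToyFrame":
        # Return a NEW ToyFrame with one more pending step; self is untouched.
        def step(row: dict) -> dict:
            return {c: row[c] for c in cols}
        return ToyFrame(self.rows, self.plan + (step,))

    def collect(self) -> list:
        # The "action": only here are the pending steps actually applied.
        out = []
        for row in self.rows:
            for step in self.plan:
                row = step(row)  # builds a new dict, source rows stay intact
            out.append(row)
        return out


df = ToyFrame(({"id": 1, "name": "a"}, {"id": 2, "name": "b"}))
original = df              # keep a second reference to the first object
df = df.select("id")       # rebinds the name df; does not modify the object

print(original.collect())  # the first object still has both columns
print(df.collect())        # the new object yields only the selected column
print(df is original)      # the two names now point at distinct objects
```

In this sketch the first object only becomes unreachable once no name refers to it any more, which is why `original` still works after `df` has been reassigned.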
@akash4517 2 years ago
Raja, I am confused between two topics, optimize write and auto compact. I saw you had made a video on optimize, but I am still confused.
@GovardhanaReddy-kp6jt 1 year ago
Raja bro, could you please provide your email ID? I need to learn this course.