23. Databricks | Spark | Cache vs Persist | Interview Question | Performance Tuning

27,807 views

Raja's Data Engineering

1 day ago

Comments: 60
@omprakashreddy4230 2 years ago
Only a few people have the ability to teach in a way that even a novice can understand. Hats off to you. Keep going!
@rajasdataengineering7585 2 years ago
Thank you for your encouraging words
@Poori1810 1 year ago
Cannot agree more
@gulsahtanay2341 7 months ago
Thank you for sharing your knowledge with us!
@rajasdataengineering7585 7 months ago
My pleasure! Thank you
@stepup2me1 2 years ago
You have a very good way of explaining the concepts. Thank you!
@rajasdataengineering7585 2 years ago
Thank you Chetan
@pavithraeshwar8881 1 month ago
This is the explanation. Thank you for sharing the knowledge, sir 👏
@rajasdataengineering7585 1 month ago
Thanks and welcome
@joyo2122 2 years ago
Your videos are the best
@tanushreenagar3116 7 months ago
Nice content, sir
@rajasdataengineering7585 7 months ago
Thanks!
@rockykefunday2707 2 years ago
You are the real raja, bro. Super
@rajasdataengineering7585 2 years ago
Thank you bro
@kamalbhallachd 3 years ago
Good 👍
@rajasdataengineering7585 3 years ago
Thank you! Cheers!
@abinaya7704 1 year ago
Your videos are working wonders!!
@rajasdataengineering7585 1 year ago
Thank you
@rahulpandit9082 2 years ago
I found many videos on KZbin about cache and persist, but nobody explains it the way you did...
@rajasdataengineering7585 2 years ago
Thank you Rahul
@turanfair9364 2 years ago
Best teacher!!! Thank you sir 🙏🏻
@rajasdataengineering7585 2 years ago
Thank you Turan
@vutv5742 6 months ago
Great explanation 🎉
@rajasdataengineering7585 6 months ago
Glad it was helpful! Keep watching
@vlogsofsiriii 5 months ago
Hi Raja. I have one doubt. Cache stores the data in memory; does that mean on-heap memory? Persist stores the data in both on-heap and off-heap memory? Is that correct?
@rajasdataengineering7585 5 months ago
Yes, that's correct. Cache always uses the default storage level (memory only for RDDs, memory and disk for DataFrames), while persist gives you the flexibility to choose memory, disk, or both.
@vlogsofsiriii 5 months ago
@rajasdataengineering7585 Memory here means on-heap, right, and disk means off-heap?
@rajasdataengineering7585 5 months ago
No, on-heap and off-heap are both memory; disk is different. I have already posted a video on on-heap vs off-heap. Please watch that video.
@vlogsofsiriii 5 months ago
@rajasdataengineering7585 Thank you 😊
@ranjithajit4717 1 year ago
Can you add examples of using persist in the description?
@justvenkyy...3423 1 year ago
This is too good. Please keep doing it. Can you post a video on the small file problem in Spark?
@rajasdataengineering7585 1 year ago
Thanks 👍🏻 Sure, I will post a video on the small file problem
@shakthimaan007 2 months ago
But where and how do we define these? Can you please add a short demo?
@kamalbhallachd 3 years ago
Knowledge session
@rajasdataengineering7585 3 years ago
Thanks Kamal
@coolraviraj24 3 months ago
You explained it so simply... I hope I will be able to explain it to the interviewer the same way you did 😅
@rajasdataengineering7585 3 months ago
Thank you! All the best!
@sanjayr3597 1 year ago
This is a very good playlist. Could you please provide a practical example? In some other videos on this topic, I noticed that after df.cache() the default storage level shown is MEMORY_AND_DISK SER. It was never plain MEMORY_AND_DISK; it was always serialized. I would like to know the reason for this.
@iamkiri_ 11 months ago
Raja, I really appreciate your explanation :)
@rajasdataengineering7585 11 months ago
Glad to hear that! Thanks for your comment
@sravanthiyethapu9970 1 year ago
Hi Raja, you said that persist will use both memory and disk. Does memory here mean both on-heap and off-heap memory?
@rajasdataengineering7585 1 year ago
By default, data is cached in on-heap memory. But if off-heap memory is enabled and JVM (on-heap) memory is full, off-heap memory is used for caching the remaining partitions
@pankajchikhalwale8769 6 months ago
I guess you have at least M.Tech. and M.Ed. degrees. Expert in Spark and an amazing teacher. Sir, Tussi Grett Ho!
@rajasdataengineering7585 6 months ago
Thank you Pankaj! Hope you like the tutorial
@pankajchikhalwale8769 6 months ago
@rajasdataengineering7585 So far I have watched 9 of the 22 videos in the "Databricks Performance Optimization" playlist. It is very detailed. I like it.
@rajasdataengineering7585 6 months ago
Glad you like it!
@aayushdesai532 2 years ago
Great video, sir! One question: is disk the same as off-heap memory?
@rajasdataengineering7585 2 years ago
No, off-heap and disk are different. Off-heap memory is part of RAM. On-heap memory is managed by the JVM, while off-heap memory is managed by the OS itself
@RamaiahChenna 2 months ago
Hi sir, we would like a video on the performance issues that come up while developing a notebook, and their solutions
@premsaikarampudi3944 1 year ago
Hi, I was asked to prepare for Spark for my next role at the company where I work. Is this learning series enough?
@rajasdataengineering7585 1 year ago
Hi, yes, this is more than enough if you complete all these videos
@swathi6472 2 months ago
Please make a video on salting for performance optimization
@rajasdataengineering7585 2 months ago
Sure, I will create a video on the salting technique
@suresh.suthar.24 1 year ago
Best explanation. But I have one question: is cache() a transformation or an action?
@rajasdataengineering7585 1 year ago
Cache is an action
@tunyestark2633 6 months ago
@rajasdataengineering7585 No, cache is not an action. It is a transformation; please do try it out.
@Uda_dunga 11 months ago
Try to make videos under 10 minutes, sir
@rajasdataengineering7585 11 months ago
Sure, will do
@MrPerikala 11 months ago
How do we avoid duplicate rows while joining large datasets?
@rajasdataengineering7585 11 months ago
drop_duplicates or distinct can be used to remove duplicates