It's very tough to make these videos after a full day job. Hats off to your determination. Really inspirational 😇
@TheBigDataShow · 9 months ago
Thank you Amrita 🥳🎉🎊
@gadgetswisdom9384 · 2 years ago
Nice video, keep it up.
@princeyjaiswal45 · 2 years ago
Great👍
@DURGESHKUMAR-gd5in · 2 years ago
Hi Ankur, Durgesh this side 🙂
@shubhambadaya · 2 years ago
thank you
@shreyakatti5070 · 9 months ago
Amazing Video.
@TheBigDataShow · 9 months ago
Thank you, Shreya, for your kind words.
@mranaljadhav8259 · 1 year ago
Thanks a lot, sir, for making such an awesome video... keep uploading more videos, waiting for more content like this.
@nupoornawathey100 · 8 months ago
Only video on YT that explains these parameters well, thanks Ankur!! I had a query. For example, say we have 10 partitions, lowerBound=0, upperBound=10000, and we provide fetchSize as 1000. Will fetchSize be used as LIMIT 1000 here? And if one partition's SQL returns more rows than fetchSize, what happens?
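For reference, here is roughly what that read looks like in PySpark; the connection details, table and column names are placeholders. Per the Spark JDBC documentation, fetchsize only controls how many rows are pulled per round trip within each partition's query; it is not applied as a LIMIT.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-partitioned-read").getOrCreate()

# Hypothetical MySQL source; URL, credentials, table and column names are placeholders.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://db-host:3306/sales")
    .option("dbtable", "orders")
    .option("user", "reader")
    .option("password", "secret")
    .option("partitionColumn", "order_id")  # must be a numeric/date/timestamp column
    .option("lowerBound", "0")
    .option("upperBound", "10000")
    .option("numPartitions", "10")          # 10 range queries of ~1000 ids each
    .option("fetchsize", "1000")            # rows pulled per round trip inside each query;
                                            # a cursor hint, not a LIMIT, so a partition with
                                            # more rows simply needs more round trips
    .load()
)
```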
@dataenthusiast_ · 1 year ago
Great explanation, Ankur. So in a production scenario, we ideally have to calculate the min/max bounds at runtime, right? We cannot hardcode the lowerBound and upperBound.
@TheBigDataShow · 1 year ago
Yes, correct. Most developers write code to dynamically determine these lower and upper bounds.
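A minimal sketch of that pattern, assuming a numeric order_id partition column and placeholder connection details: one lightweight query fetches both bounds in a single hit, and the partitioned read reuses them.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-dynamic-bounds").getOrCreate()

jdbc_url = "jdbc:mysql://db-host:3306/sales"   # placeholder connection details
conn = {"user": "reader", "password": "secret"}

# Single query returning both bounds (one extra hit on the source per table).
row = (
    spark.read.format("jdbc")
    .options(**conn)
    .option("url", jdbc_url)
    .option("query", "SELECT MIN(order_id) AS lo, MAX(order_id) AS hi FROM orders")
    .load()
    .first()
)

# Partitioned read using the bounds computed at runtime.
orders = (
    spark.read.format("jdbc")
    .options(**conn)
    .option("url", jdbc_url)
    .option("dbtable", "orders")
    .option("partitionColumn", "order_id")
    .option("lowerBound", str(row["lo"]))
    .option("upperBound", str(row["hi"]))
    .option("numPartitions", "10")
    .load()
)
```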
@shivanshudhawan7714 · 1 year ago
@TheBigDataShow I actually did the same, reading from MySQL (1053 tables: some really big, some medium, and some small) and writing them to the Databricks raw layer. I was programmatically getting the lower and upper bound for each table and then using them to read the data in parallel, but in that case my total hits to the source DB are doubled. Any advice you can provide on this?
@kamalnayan9157 · 2 years ago
Great!
@RohanKumar-mh3pt · 1 year ago
Hello sir, this is very helpful. Can you please make a video on what kinds of questions are asked in the data pipeline design round and the possible ways to handle them?
@TheBigDataShow · 9 months ago
Please check the Data Engineering Mock Interview playlist. We have recorded more than 25 Data Engineering mock interviews.
@dpatel9 · 1 year ago
This is a very useful example. Is there any way to optimize writing/inserting into SQL tables when we have millions of rows in a DataFrame?
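Not from the video, just a hedged sketch of the usual knobs for bulk JDBC writes: control the number of parallel connections with repartition and raise Spark's batchsize option (rows per batched INSERT, default 1000). rewriteBatchedStatements is a MySQL-driver-specific URL flag; all names and numbers below are placeholders to tune for your database.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-bulk-write").getOrCreate()

# Stand-in for a real DataFrame with millions of rows.
df = spark.range(5_000_000).withColumnRenamed("id", "order_id")

(
    df.repartition(8)                      # 8 concurrent writer connections
      .write.format("jdbc")
      .option("url", "jdbc:mysql://db-host:3306/sales?rewriteBatchedStatements=true")
      .option("dbtable", "orders_copy")
      .option("user", "writer")
      .option("password", "secret")
      .option("batchsize", "10000")        # rows per JDBC batch insert (Spark default is 1000)
      .mode("append")
      .save()
)
```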
@kalpeshswain8207 · 2 years ago
I have a doubt here: when we deal with tables from databases, we can use lowerBound and upperBound... but when we read flat files like CSV, can we use lowerBound and upperBound?
@TheBigDataShow · 2 years ago
If it is a file source, then use columnar file formats like Parquet or ORC, or a row-based file format like Avro. Columnar formats help with predicate pushdown and let you fetch your columns more quickly. CSV is a very simple row-based format, and it is not recommended for storing big data.
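A tiny, self-contained illustration of that point (local paths and column names are made up): the same toy data written as CSV and as Parquet, where the Parquet scan can push the filter down to file statistics and read only the selected column.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-vs-csv").getOrCreate()

# Toy dataset standing in for "big data".
sales = spark.range(1_000_000).selectExpr(
    "id AS order_id", "id % 100 AS store_id", "rand() AS amount"
)

sales.write.mode("overwrite").csv("/tmp/sales_csv", header=True)
sales.write.mode("overwrite").parquet("/tmp/sales_parquet")

fast_scan = (
    spark.read.parquet("/tmp/sales_parquet")
    .where("store_id = 42")   # candidate for predicate pushdown via Parquet statistics
    .select("amount")         # column pruning: only this column is read from disk
)
fast_scan.explain()           # the physical plan should show PushedFilters for the Parquet scan
```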
@TheBigDataShow · 2 years ago
Check out my article to understand Parquet, ORC, and Avro: www.linkedin.com/feed/update/urn:li:activity:6972381746185064448?