Awesome information... thank you Harjeet for the great insights.
@DataSavvy 6 years ago
Thanks Gaurav
@diptiranjannayak5168 6 years ago
How can we dynamically restrict the bucket size to match the block size? If I specify 4 buckets, what happens when one bucket gradually grows to 1 GB or more? How do I keep things optimized?
@DataSavvy 6 years ago
You have to check your historical data and analyse what number makes sense for you. If you expect a similar volume of data in future, just extrapolate from those numbers.
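A rough way to turn that advice into a number, as a minimal sketch: take the current data size, scale it by the growth you expect, and divide by the bucket size you are aiming for. The function name `estimate_bucket_count`, the growth factor, and the 256 MB target are illustrative assumptions, not values from the video, and the estimate assumes rows spread evenly across buckets.

```python
import math


def estimate_bucket_count(current_bytes: int,
                          growth_factor: float,
                          target_bucket_bytes: int) -> int:
    """Estimate how many buckets keep each bucket near the target size.

    current_bytes       -- size of the data today (from historical data)
    growth_factor       -- expected growth multiplier (assumption, e.g. 2.0)
    target_bucket_bytes -- desired size of one bucket file (assumption)
    """
    if current_bytes <= 0 or growth_factor <= 0 or target_bucket_bytes <= 0:
        raise ValueError("all inputs must be positive")
    # Extrapolate the historical size to the expected future size.
    projected_bytes = math.ceil(current_bytes * growth_factor)
    # Ceiling division, so a partial bucket still counts as one bucket.
    return max(1, -(-projected_bytes // target_bucket_bytes))


GIB = 1024 ** 3
MIB = 1024 ** 2

# 20 GiB today, expected to double, aiming for 256 MiB per bucket.
buckets = estimate_bucket_count(20 * GIB, 2.0, 256 * MIB)
print(buckets)
```

If the key is skewed, individual buckets can still end up much larger than the target, so the result is only a starting point.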
@r.kishorekumar1388 2 years ago
Is it possible to alter a bucketed table to change the number of buckets?
@jimitshah7636 2 years ago
No. 1) Create another table with the new number of buckets, 2) insert the data from the old table, 3) drop the old table, 4) rename the new table. You may also need to stop the jobs running on this table until you have completed this process.
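The four steps above can be sketched as a small helper that builds the HiveQL statements in order. This is only an illustration: the function name `rebucket_statements`, the `_rebucketed` suffix, and the example table `sales` are hypothetical, and the statements should be reviewed before running them against a real table.

```python
def rebucket_statements(table: str,
                        columns_ddl: str,
                        bucket_column: str,
                        new_bucket_count: int) -> list[str]:
    """Build HiveQL statements that move a table to a new bucket count.

    All names passed in are assumed to be trusted identifiers; this
    sketch does no quoting or validation beyond the bucket count.
    """
    if new_bucket_count < 1:
        raise ValueError("new_bucket_count must be at least 1")
    new_table = f"{table}_rebucketed"  # hypothetical temporary name
    return [
        # 1) Create another table with the changed bucket number.
        f"CREATE TABLE {new_table} ({columns_ddl}) "
        f"CLUSTERED BY ({bucket_column}) INTO {new_bucket_count} BUCKETS",
        # 2) Insert data from the old table.
        f"INSERT OVERWRITE TABLE {new_table} SELECT * FROM {table}",
        # 3) Drop the old table.
        f"DROP TABLE {table}",
        # 4) Rename the new table to the old name.
        f"ALTER TABLE {new_table} RENAME TO {table}",
    ]


statements = rebucket_statements("sales", "id INT, amount DOUBLE", "id", 8)
for statement in statements:
    print(statement + ";")
```

Keeping the statements as an ordered list makes it easy to stop after step 2 and verify row counts before the old table is dropped.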
@simplecooking1323 5 years ago
How can we decide whether to use partitioning or bucketing?
@soumyakantarath5078 6 years ago
Thank you so much 😊
@routhmahesh9525 4 years ago
Can we create buckets on top of partitioning? Can you please explain this?
@rahulsamyal6159 3 years ago
I got confused: the block size is 128 MB and our memory can be 4 GB. In this case, should a bucket be 128 MB or 4 GB?
@projjalchakraborty1806 6 years ago
Can you please explain what HCatalog is and what it is used for?
@DataSavvy 6 years ago
Sure, I will create a video on this.
@rajareddy47444 6 years ago
Please explain Oozie: how to schedule jobs, and the workflow concepts. Thank you.
@DataSavvy 6 years ago
Oozie is no longer considered a good scheduler. It had a lot of limitations, and most companies are moving away from it.
@rajareddy47444 6 years ago
Data Savvy, okay. Can you please explain a scheduler that most companies use, so that we can talk about it and explain it? Thank you.
@akashputti 5 years ago
@DataSavvy but questions about this are asked in interviews.
@naresh5273 6 years ago
Thank you
@DataSavvy 6 years ago
Please subscribe and share it
@lokeshmvs 6 years ago
I have a question: buckets are created by writing CLUSTERED BY. How can we implicitly give the number of buckets?
@DataSavvy 6 years ago
Excuse me, I could not understand your question. We give the number of buckets while creating the table. Are you asking how to find out the number of buckets automatically?
@vipulx1 5 years ago
Thank you :)
@divendughati6114 4 years ago
Can you please explain how we can optimize if the number of buckets grows far too large, to about 1 million?
@DataSavvy 4 years ago
Hi Divendu, can you please elaborate on this case? For example, is bucketing creating a lot of small files here, and what is your use case? If bucketing is creating small files, then you should decrease the number of buckets when creating the table.
@AnkitaMishra-di9ub 4 years ago
Can you please explain how to decide the number of partitions?
@DataSavvy 4 years ago
In Hive, one partition is created per unique value of the partition column. In Spark, it depends on the number of blocks in a file.
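As a rough illustration of the two rules in that reply, the sketch below counts distinct partition-column values for Hive and approximates Spark's input partitions as file size divided by block size. Both helper names and the 128 MB default block size are assumptions made for the example, and Spark's actual split count also depends on the file format and its configuration.

```python
def hive_partition_count(partition_values: list[str]) -> int:
    """Hive creates one partition per distinct partition-column value."""
    return len(set(partition_values))


def approx_spark_input_partitions(file_bytes: int,
                                  block_bytes: int = 128 * 1024 ** 2) -> int:
    """Approximate input partitions as the number of blocks in the file.

    block_bytes defaults to 128 MiB, an assumed block size.
    """
    if file_bytes < 0 or block_bytes <= 0:
        raise ValueError("file_bytes must be >= 0 and block_bytes > 0")
    # Ceiling division: a trailing partial block is still one block.
    return max(1, -(-file_bytes // block_bytes))


countries = ["IN", "US", "IN", "UK", "US"]
print(hive_partition_count(countries))
print(approx_spark_input_partitions(1024 ** 3))
```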
@gauravpathak7017 5 years ago
Harjeet, what is the default number of buckets and partitions?