Implementing Vertical Sharding

  11,265 views

Arpit Bhayani

Comments: 53
@d4devotion 2 years ago
I have banged my head so many times trying to understand sharding, but could never quite get it. This guy never fails to explain things in such an easy way. I am lucky that I found this channel on YT.
@AsliEngineering 2 years ago
🙌🙌
@adianimesh 2 years ago
Binge-watching after a break! So much quality content over the last week. Thanks a lot.
@Aditya-us5gj 2 years ago
Design could not be any more interesting or easy than it is in your videos. Just keep those videos coming every day!! I've already set aside a slot in my day to watch your videos.
@AsliEngineering 2 years ago
Thank you so much :)
@mukeshmahadev7419 2 years ago
Arpit bhai, you just rocked it: absolutely top-level content with no clutter for even a second. This video filled me with confidence that I can handle a database in production. I've started binge-watching your channel. Keep making content, sir. One thought that hit me while watching this video: this type of content will catalyse India's transition from being an IT services hub to an IT manufacturing hub 😄
@AsliEngineering 2 years ago
Exactly my vision. Glad you resonated ✨
@homestaysandcafes 1 year ago
Really grateful to God that I found this gem of a channel in time ♥️ Never worry about views; some gems among music videos stay hidden too, while junk gets 1B views.
@vasusharma1192 2 years ago
Maybe a dumb question, but here I go: if the table renaming step (table to table.bak) is done after firing the ZooKeeper update, wouldn't that help reduce the small database downtime (assuming ZooKeeper updates happen immediately without consistency issues)? I ask because, if we do this, the second DB server is up anyway and will take requests, and the renaming can happen later. This would also ensure that the replication is completely done.
@AsliEngineering 2 years ago
If we update the config and then rename the table, the tables will diverge, i.e. the new table will get some writes and the old one will also get some writes. This would lead to an unresolvable conflict. For example: the old table has rows up to ID 100, and the new table also has rows up to ID 100. Now you update the config, and it takes 100 ms to be reflected on all servers, but one of the API servers got the change in 1 ms. So for the remaining 99 ms, both tables are accepting writes, each from a subset of the API servers. This leads to a divergence/conflict. Consider an auto-increment ID column: there will come a time when the two tables hold two different rows with the same ID, because two different API servers wrote to tables in different databases. This is why, to keep consistency and avoid conflicts, we take a minuscule downtime, cut off the traffic, and then send the update. Hope that helps.
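The ID collision described in this reply can be reproduced with a toy model. This is only a sketch under assumptions: `Table`, `TableNotFound`, and the two scenario functions are hypothetical names, the tables are in-memory dicts rather than real databases, and the two API servers are reduced to two consecutive inserts.

```python
from dataclasses import dataclass, field


class TableNotFound(Exception):
    """Raised when a write hits a table that has been renamed away."""


@dataclass
class Table:
    """Toy table with an auto-increment primary key (hypothetical model)."""
    rows: dict = field(default_factory=dict)
    next_id: int = 1
    renamed: bool = False

    def insert(self, value):
        if self.renamed:
            raise TableNotFound("table was renamed to table.bak")
        row_id = self.next_id
        self.rows[row_id] = value
        self.next_id += 1
        return row_id


def caught_up_pair(n=100):
    """Old table and its fully replicated copy, both holding IDs 1..n."""
    old, new = Table(), Table()
    for i in range(1, n + 1):
        old.insert(f"row-{i}")
        new.insert(f"row-{i}")
    return old, new


def config_first():
    """Config update is sent while the old table still accepts writes."""
    old, new = caught_up_pair()
    id_a = new.insert("from server A")  # A already received the new config
    id_b = old.insert("from server B")  # B is still on the old config
    return id_a, id_b, new.rows[id_a], old.rows[id_b]


def rename_first():
    """Old table is renamed before the config update is sent."""
    old, new = caught_up_pair()
    old.renamed = True
    try:
        old.insert("from server B")  # fails fast instead of diverging
        rejected = False
    except TableNotFound:
        rejected = True
    id_a = new.insert("from server A")
    return rejected, id_a, len(old.rows)
```

In `config_first` both servers are handed ID 101 for different rows, which is the conflict the reply calls unresolvable. In `rename_first` the late write is rejected and the old table stays frozen at 100 rows.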
@vasusharma1192 2 years ago
@AsliEngineering thanks a lot for the quick reply, that clears everything up. Amazing content btw, no one covers such practical aspects so well. Hats off ✌🏻
@6vikas 2 years ago
One of the best pieces of content on YT on vertical sharding; looking forward to the horizontal sharding video :) One question about joins between tables in 2 databases: do we need to use a host-level join in that case?
@AsliEngineering 2 years ago
We would not join across databases. Joins would happen locally. Also, thank you so much for the kind words 🙌
@arunrahullakkapragada2304 1 year ago
One doubt. While copying the binlog to shard 2, we record the last timestamp or ID up to which we copied, right? After that copy is done, we start replication, right? CDC or the replication service then catches shard 2 up with the live updates. What about the updates that happen to the DB while we are copying the binlog?
@AsliEngineering 1 year ago
Already answered in the video. But put some more thought into it and you'll get the answer on your own.
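One common way to reason about the question above is snapshot plus replay from a recorded log position. The sketch below illustrates that general technique and is not taken from the video; the `(position, key, value)` event shape and both function names are assumptions made for the example.

```python
def apply_events(state, events):
    """Apply (position, key, value) change events to a dict in log order."""
    for _position, key, value in events:
        state[key] = value
    return state


def copy_with_catch_up(log_at_snapshot, writes_during_copy):
    """Return (source_state, replica_state) after snapshot plus replay."""
    # The snapshot captures the state up to the last position in the log
    # at the moment it is taken; that position is recorded with it.
    snapshot_position = log_at_snapshot[-1][0] if log_at_snapshot else 0
    snapshot = apply_events({}, log_at_snapshot)

    # The source keeps taking writes while the snapshot is being copied;
    # each one is appended to the log at a higher position.
    full_log = log_at_snapshot + writes_during_copy

    # Shard 2 loads the snapshot, then replays only the events recorded
    # after the snapshot position, so nothing written during the copy is lost.
    replica = dict(snapshot)
    apply_events(replica, [e for e in full_log if e[0] > snapshot_position])

    source = apply_events({}, full_log)
    return source, replica
```

Writes made during the copy are never dropped in this model because they sit in the log after the recorded position and are picked up by the replay.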
@cnp6501 1 year ago
How is vertical sharding different from partitioning?
@chiragrajani12 1 year ago
What about the requests that fail once we rename the table, i.e. the `Table Not Found` part? Failed read requests are acceptable, but the failed write requests will be lost. Won't that be a consistency issue?
@ashishtewari2162 1 year ago
Great content, Arpit. Very easy to understand. Small doubt: why rename the table first and then make the ZooKeeper config change? Why not first update the config in ZooKeeper and then back up the table? This would reduce the availability loss.
@jaisamtani303 6 months ago
Bro, assume you have 50 API servers. If you update the ZooKeeper config first, its watchers will start updating the API servers. Assume this config update reaches the 1st API server at 1 ms and the 50th API server at 50 ms; until 50 ms, the 50th API server is still writing to DB1, and we will miss those writes on DB2. That leaves the databases inconsistent. If you rename first instead, DB1 will not take any writes, as all requests fail with a table-not-found error. Once ZooKeeper has updated all API servers after 50 ms, DB1 and DB2 hold the same data, because neither was accepting writes during those 50 ms. That gives you a consistent but less available database.
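The 50-server scenario can be turned into a small worked count. This is a rough sketch under stated assumptions: server i receives the new config i ms after the update is fired, every server issues one write per millisecond, and both function names are hypothetical.

```python
def writes_on_old_config(num_servers=50, writes_per_ms=1):
    """Count writes issued by servers that have not yet seen the new config.

    Assumes server i (1-based) receives the config i ms after the update
    is fired, so it spends i ms still pointing at DB1.
    """
    total = 0
    for server in range(1, num_servers + 1):
        ms_on_old_config = server
        total += ms_on_old_config * writes_per_ms
    return total


def cutover_outcome(rename_before_config, num_servers=50):
    """Classify those in-flight writes for the two possible orderings."""
    in_flight = writes_on_old_config(num_servers)
    if rename_before_config:
        # DB1's table is gone, so every such write fails with an error.
        return {"diverged": 0, "rejected": in_flight}
    # DB1 still accepts them while DB2 is already serving other servers.
    return {"diverged": in_flight, "rejected": 0}
```

Under these assumptions the same 1,275 writes are affected either way; the ordering only decides whether they fail visibly (and can be retried) or silently land on the wrong database.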
@jaisamtani303 6 months ago
For companies in the financial domain, consistency matters most. Assume a credit of 1 lakh was written to DB1 during those 50 ms. DB2 is now taking requests and this transaction is lost. Would you be okay with that? So for financial companies consistency is of the utmost importance; availability is not.
@jaisamtani303 6 months ago
There are actually 2 ways to do vertical sharding: 1) The one you described, which is real time with no downtime. 2) One with downtime, where T2 is dumped onto DB2 during the downtime, avoiding replication and the steps that follow. You mentioned at the end of the video that large tables are not sharded this way because it is difficult for replication to get them in sync; for such tables we can use the downtime approach. What if DB1 has 2 or 3 tables of almost the same size and all of them are hot? You can take a downtime and do the vertical sharding. Also, for financial companies, real-time vertical sharding with its 10-15 ms window is not acceptable; they might be using the downtime approach too!
@shivamsrivastava3076 2 years ago
Just connecting the dots: is this the same way blob storage (S3/Azure) is scaled when a data node in a bucket gets hot? :)
@ramyakrishnan8741 1 year ago
Thanks for an amazing video. May I know the difference between federation and vertical sharding?
@ujjwalsaini5830 11 months ago
Great content. Didn't feel like skipping even a second. Kudos!! Also, one question: how do we go about migrating a huge table from one database server to another? By huge I mean that the table is big and there is also a huge number of writes happening.
@AsliEngineering 11 months ago
Migrating a high-write database from one server to another is done in 6 broad steps:
0. Take a snapshot
1. Load it into a new database
2. Set up replication
3. Let it catch up
4. Stop the writes for a fraction of a second
5. Fail over
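The six steps above can be sketched end to end with an in-memory stand-in. Everything here is an assumption made for illustration: `Database`, `migrate`, the `traffic` iterator that simulates live writes, and the `batch_size` and `max_lag` knobs are hypothetical, and a real migration would rely on the database's own snapshot and replication tooling.

```python
class Database:
    """In-memory stand-in for a database server with an ordered change log."""

    def __init__(self):
        self.state = {}
        self.log = []                 # list index doubles as log position
        self.accepting_writes = True

    def write(self, key, value):
        if not self.accepting_writes:
            raise RuntimeError("writes are paused")
        self.state[key] = value
        self.log.append((key, value))


def migrate(source, traffic, batch_size=3, max_lag=2):
    """Move `source` to a new Database while `traffic` keeps writing to it."""
    # 0. Take a snapshot and remember the log position it corresponds to.
    snapshot, position = dict(source.state), len(source.log)

    # 1. Load the snapshot into a new database.
    target = Database()
    target.state = snapshot

    # 2-3. Replicate in batches and let the target catch up.
    while True:
        for key, value in next(traffic, []):   # live writes keep arriving
            source.write(key, value)
        batch = source.log[position:position + batch_size]
        for key, value in batch:
            target.state[key] = value
        position += len(batch)
        if len(source.log) - position <= max_lag:
            break

    # 4. Stop writes briefly and drain whatever is still unreplicated.
    source.accepting_writes = False
    for key, value in source.log[position:]:
        target.state[key] = value

    # 5. Fail over: callers now use the target.
    return target
```

The write pause in step 4 lasts only as long as it takes to drain the last `max_lag` events, which is why the catch-up loop runs first.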
@ujjwalsaini5830 11 months ago
@AsliEngineering so the same strategy is followed whether the table is big or small? Or are there alternative practices that make the migration more efficient?
@aniruddhkhera510 11 months ago
Arpit, amazing video as always, thanks for sharing. I was actually planning to join your Feb cohort but couldn't enroll before registrations closed. I have some thoughts on this video; maybe I am missing something. I feel that migrating table t1 from one DB server to another with this approach is somewhat over-engineered. I did a migration at my previous company, so let me explain my approach.
1. We don't need to store the metadata about which DB server a table belongs to in ZooKeeper or any service discovery. Generally each app server has a DB configuration file (YAML, XML); we can add and maintain both DB configs in it, and the app server connects to both.
2. The cutover can happen gradually, with dual writes to the table on both DB servers (a simple code change). Historic data can be migrated from a snapshot of the table.
3. The final cutover can be done through a flag in a remote config, basically a WIREON/WIREOFF (WOWO) configuration, i.e. turn off the writes to the table on the previous DB server (example: disable.writes.to.xyz := true).
Let me know your thoughts.
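The dual-write approach proposed in this comment might look roughly like the sketch below. It is illustrative only: `DualWriteStore`, the flag name `disable.writes.to.old_db`, and the use of plain dicts for both databases and for the remote config are assumptions, not the commenter's actual code.

```python
class DualWriteStore:
    """Writes go to both DB servers until a remote flag turns the old one off."""

    OLD_WRITES_OFF = "disable.writes.to.old_db"   # hypothetical flag name

    def __init__(self, old_db, new_db, flags):
        self.old_db = old_db      # dict standing in for the old DB server
        self.new_db = new_db      # dict standing in for the new DB server
        self.flags = flags        # dict standing in for a remote config client

    def write(self, key, value):
        # The new server always receives the write.
        self.new_db[key] = value
        # The old server receives it only until the kill switch is flipped.
        if not self.flags.get(self.OLD_WRITES_OFF, False):
            self.old_db[key] = value

    def backfill(self, snapshot):
        # Historic rows come from a snapshot; rows that dual writes have
        # already put on the new server are newer and must not be overwritten.
        for key, value in snapshot.items():
            self.new_db.setdefault(key, value)
```

One caveat with any dual-write scheme: the two writes are not atomic, so if one of them fails the stores can drift apart unless the application detects and repairs the mismatch.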
@DEEPAKKUMAR-wk5pk 2 years ago
you nailed it, man
@rahulsarkar4206 1 year ago
How does the watch update the config on the API server? Are they connected over a WebSocket? I don't think so, generally. Please explain.
@AsliEngineering 1 year ago
I cover these granular details in my course, so I cannot answer it here.
@notionmakeit2888 1 year ago
How can we get your notes? Please help.
@Polly10189 2 years ago
Thanks Arpit, Allah bless you. Top-notch content. I have one query: suppose I have a large DB/table with indexes on some columns. When I partition my data, do my indexes get partitioned as well, or do I have to index the data partition manually once it is restored on a different DB instance?
@vighneshmahale 2 years ago
Very Informative!
@vikassrivastava7081 2 years ago
In-depth video! 🙏🏼
@vikassrivastava7081 2 years ago
Arpit bro, can you suggest a book on system design for beginners like me, to go alongside your awesome videos?
@sayantankundu4532 1 year ago
Hey Arpit, great video. I have a doubt: you mentioned that the ZooKeeper watch will inform the API server when there is a change, but where does the API server store this config information? If the API server does not store it, then on every request we need to hit ZooKeeper first to get the config, which will surely add latency.
@AsliEngineering 1 year ago
You don't need to make a network call every time. A local copy of the config is held on the server.
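A minimal sketch of that idea, assuming an in-memory `ConfigStore` that stands in for ZooKeeper: the class names, the callback shape, and the table-to-database mapping are all hypothetical, and no real ZooKeeper client API is used here.

```python
class ConfigStore:
    """Stand-in for a coordination service that notifies watchers on change."""

    def __init__(self, value):
        self._value = value
        self._watchers = []
        self.reads = 0                      # counts simulated network fetches

    def watch(self, callback):
        self._watchers.append(callback)
        self.reads += 1                     # initial fetch when the watch is set
        callback(self._value)

    def set(self, value):
        self._value = value
        for callback in self._watchers:     # push the change to every watcher
            callback(value)


class ApiServer:
    """Keeps a local copy of the table-to-database mapping."""

    def __init__(self, store):
        self.table_locations = {}
        store.watch(self._on_config_change)

    def _on_config_change(self, new_config):
        # Replace the local copy; request handling never touches the store.
        self.table_locations = dict(new_config)

    def database_for(self, table):
        return self.table_locations[table]  # pure in-memory lookup
```

Request handling reads only `table_locations`, so the per-request cost is a dictionary lookup; the store is contacted once at startup and again only when the config changes.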
@sayantankundu4532 1 year ago
Thank you, Arpit, for clarifying it.
@pranjalmishra2602 1 year ago
What if a request needs to connect to two tables that live on different DB servers?
@AsliEngineering 1 year ago
What do you mean when you say "connect"?
@pranjalmishra2602 1 year ago
@AsliEngineering I meant: a request comes in that needs some data from a table in DB1 and some other data from a table in DB2. I guess I'm still unclear :(
@admadnesstv 2 months ago
Until DB2 has all the latest data, whether it relates to a read, write, or update, DB1 will not be cut off.
@pranjalmishra2602 2 months ago
@admadnesstv ah I see, thanks!
@GieGarczynski 3 months ago
Appreciate the detailed breakdown! 🧐 I wanted to ask something unrelated: 🤔 I have these words 🤨. (behave today finger ski upon boy assault summer exhaust beauty stereo over). How do I use this? 🤨
@raj_kundalia 1 year ago
Thank you!
@kaustavdas1577 10 months ago
The price increased 1.8 times in 1 year.
@AsliEngineering 10 months ago
In 2 years. Also, the course has changed significantly; it is much more in-depth than what I used to cover.
@deepadeshra7195 2 years ago
Maybe a silly question, but I am confused about one thing in DB sharding. Let's say DB-1 has T1 and T2, with a foreign key relationship between them, and then we move T2 to another database server, DB-2. So T1 is in DB-1 and T2 is in DB-2. In this distributed scenario, how will data integrity be maintained?
@AsliEngineering 2 years ago
You have to drop foreign keys. You cannot have cross-shard foreign keys.
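Once the foreign key is dropped, any integrity check has to live in the application. The sketch below shows one possible shape of such a check, using two in-memory SQLite databases as stand-ins for DB-1 and DB-2; the column names and both helper functions are hypothetical.

```python
import sqlite3


def make_shards():
    """Create two independent databases standing in for DB-1 and DB-2."""
    db1 = sqlite3.connect(":memory:")
    db1.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT)")
    db2 = sqlite3.connect(":memory:")
    # No FOREIGN KEY clause here: db2 has no way to reference t1 in db1.
    db2.execute(
        "CREATE TABLE t2 (id INTEGER PRIMARY KEY, t1_id INTEGER NOT NULL, note TEXT)"
    )
    return db1, db2


def insert_child(db1, db2, t1_id, note):
    """Insert into t2 only if the referenced t1 row exists on the other shard."""
    parent = db1.execute("SELECT 1 FROM t1 WHERE id = ?", (t1_id,)).fetchone()
    if parent is None:
        raise ValueError(f"t1 row {t1_id} does not exist")
    cursor = db2.execute(
        "INSERT INTO t2 (t1_id, note) VALUES (?, ?)", (t1_id, note)
    )
    db2.commit()
    return cursor.lastrowid
```

This check is weaker than a real foreign key: the parent row can be deleted from DB-1 between the lookup and the insert, so orphaned rows remain possible and may need periodic cleanup.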
@deepadeshra7195 2 years ago
@AsliEngineering Thank you :)
@imdsk28 2 years ago
Massive Like ❤
@debmalyapan53 2 years ago
amazing
@AnubhavShrivastava 2 years ago
awesome