Рет қаралды 39,200
Sharding or Data Partitioning:
- Sharding is a technique to break up a big database into many smaller parts
- Horizontal scaling means adding more machines, which is cheaper & more feasible
- Vertical scaling means improving servers
- Partitioning Methods:
- Horizontal partitioning:
- In this we put different rows into different table(db), i.e. rows can be based on location with zip codes, this is also range based sharding
- Problem is if range is not choosen carefully, it'll lead to unbalanced servers.
- Vertical Partitioning:
- In this we divide data to store tables related to a specfic feature to their own servers.i.e. In Instagram, we can have 1 for user profile, 1 for photos, 1 for friend list.
- Problem is in case of additional growth, its necessary to further partition a feature specific DB across servers.
- Directory based Partitioning:
- In this, create a lookup service which knows your current partitioning scheme & get it from DB access code.
- To find out a data entry, we query directory server that holds the mapping between each tuple key to its DB server.
- This is good to perform tasks like adding servers to the DB pool or change our partitioning scheme without impacting application.
- Partitioning Criteria:
- Hash-based partitioning:
- In this we apply a hash function to some key attribute of the entity we're storing, that gives partition number.
- Make sure to ensure uniform allocation of data among servers
- Problem is it effectively fixes total number of DB servers, since adding a new server means changing the hash function, which would require redistribution of data & downtime, the workaround is Consistent Hashing.
- List Partitioning:
- In this each partition (DB server) is assigned a list of values, i.e. APAC, EMEA, US region has respective partition.
- Round-robin Partitioning:
- One by one assign which ensure uniform data distribution
- Composite partitioning:
- Combining any of above partitioning schemas to devise a new scheme. i.e First list partitioning with hash partitioning in each.
- Common problems with Sharding:
- Joins & Denormalisation:
- Performing joins on a database which is running on one server is straightforward, but if DB is partitioned & spread across multiple machines, it's not feasible to perform joins that span database shards.
- These joins won't be performance efficient.
- Workaround is to denormalise the DB so that queries that required joins can be performed on single table.
- Referential integrity:
- Trying to enforce data integrity constraint suh as foreign keys in a sharded DB can be extremely difficult. Most of RDMS do not support this.
- Rebalancing:
- When data distribution is not uniform.
- When there is lot of load on a particular Shard.
In this video, we're going to reveal Sharding used in System Design:
- What is Sharding/ Data Partitioning
- Sharding Methods
- Sharding Criteria
- Sharding Challenges
★☆★ SUBSCRIBE TO ME ON KZbin: ★☆★
kzbin.info
what is sharding in mongodb,
database sharding example
sharding vs partitioning,
sharding mysql,
sharding techniques,
consistent hashing sharding,
sharding crypto,
what are sharding criteria
CHECK OUT CODING SIMPLIFIED
kzbin.info
I started my KZbin channel, Coding Simplified, during Dec of 2015.
Since then, I've published over 200+ videos. My account is Partner Verified and I get my earnings direct deposited into my account every month.
★☆★ VIEW THE BLOG POST: ★☆★
thecodingsimplified.com
★☆★ SEND EMAIL At: ★☆★
Email: thecodingsimplified@gmail.com