Understanding Database Sharding and Partitioning

Once you build a product and your user base grows, your data grows too. Managing and storing that data efficiently becomes crucial, and that's where scaling comes in. You probably didn't start with the biggest database server when you built your small app or SaaS business, but eventually you'll need to scale it up. That's when concepts like sharding and partitioning come into play.

Before we dive into how sharding and partitioning work and how they help with scaling, let's look at what a database server actually is and how it operates.

So, a database server is basically just a process or service running on a computer that uses the machine's disk to store data. Say you're running a MySQL or Postgres process on an AWS EC2 instance: it listens on a port of that machine and exposes that port so your server or API can communicate with it.
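To make that concrete, here's a minimal sketch of an API process talking to a Postgres server over its port. The host, credentials, `users` table, and the `psycopg2` driver are all illustrative assumptions, not anything specific to your setup:

```python
# Minimal sketch: an application process connecting to a Postgres server
# that listens on port 5432 of some machine (e.g. an EC2 instance).
# Host, credentials, and the "users" table are illustrative assumptions.
import psycopg2

conn = psycopg2.connect(
    host="10.0.0.12",   # the machine running the Postgres process
    port=5432,          # the port the database server listens on
    dbname="app_db",
    user="app_user",
    password="secret",
)

with conn, conn.cursor() as cur:
    cur.execute("SELECT id, email FROM users WHERE id = %s", (42,))
    print(cur.fetchone())

conn.close()
```

From the application's point of view, that's all a database server is: a process at some host and port that accepts queries and reads or writes the machine's disk for you.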

Now, when you first put that database into production to serve real users, you probably picked a small server with limited hardware because, honestly, why pay for a big one when you don't have many users yet?

Now, imagine your app is getting tons of traffic, and your database server just can't keep up anymore. Your small server can handle about 120-150 writes per second, but you're getting 200 write requests per second, and you're seeing bad metrics like longer query times. You still need to keep your users happy, so you decide to beef up your current database server with more power: more RAM, more CPU, and more disk space. What you just did is called vertical scaling, and it helps you handle more write requests. But now you're facing another problem: while your writes have leveled off around 200 per second, your reads have increased a lot. Scaling that single beefed-up server even further just to serve reads will keep driving up costs, so what do you do? Well, you can create a read replica: a copy of the existing database that only handles reads.
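Here's a rough sketch of how an application might split traffic between the primary and a read replica. The two hostnames, the simple `write` flag, and the `psycopg2` driver are assumptions for illustration; real setups often put a proxy or the ORM in charge of this routing instead:

```python
# Rough sketch of read/write splitting: writes go to the primary,
# reads go to the replica. Hostnames and tables are illustrative.
import psycopg2

primary = psycopg2.connect(host="db-primary.internal", dbname="app_db",
                           user="app_user", password="secret")
replica = psycopg2.connect(host="db-replica.internal", dbname="app_db",
                           user="app_user", password="secret")

def run(sql, params=(), write=False):
    # INSERT/UPDATE/DELETE must hit the primary; SELECTs can hit the replica.
    conn = primary if write else replica
    with conn, conn.cursor() as cur:
        cur.execute(sql, params)
        return None if write else cur.fetchall()

# Usage
run("INSERT INTO users (email) VALUES (%s)", ("a@example.com",), write=True)
rows = run("SELECT id, email FROM users WHERE email = %s", ("a@example.com",))
```

The primary keeps doing all the writes and replicates them to the copy, while the replica absorbs the growing read traffic.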

For now, that solution seems to solve the issue.

So, let's say your app is getting even more traffic, and you're hitting 1000 writes per second. Just like before, you think about scaling up your main database server to handle it, and bam, it works! Your database can now handle 1000 writes per second, and everything's running smoothly. But then, requests go up again, and now you have to handle 1250 writes per second. You'd think about scaling up your main database server again, right? But nope, your cloud provider says you've hit the limit and can't add more power. So, what do you do now? How are you going to keep serving your users?

This is when it's time to switch to horizontal scaling, because vertical scaling has its limits: you can only upgrade one server so much. So, what's the plan? Split your write requests between two database servers, each capable of handling around 700 writes per second, and spread the data across these two nodes. Together, the two database servers can now comfortably handle 1250 writes per second.

Now, by adding one more server, we've split the load across two servers, so our system runs more efficiently.

So, here's the deal with sharding—when a database is sharded, it means the data is split up across a bunch of machines (often called computes, nodes, or servers). Each shard holds a chunk (a subset) of the entire data. So, if you've got 2 shards, you've got at least 2 different computes (physical or virtual) holding those shards.

In the example above, we added a new data node/server, turned each node into a shard, and split the data evenly across the two shards, 50/50.
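Here's a minimal sketch of what that routing can look like at the application level, assuming each shard is its own Postgres instance and we route rows by a hash of the user id. The hostnames, the `orders` table, and the plain modulo rule are illustrative assumptions:

```python
# Minimal sketch of application-level sharding: each write is routed to
# one of two database servers based on the user id, so each shard holds
# roughly half of the data. Hostnames and tables are illustrative.
import psycopg2

SHARDS = [
    psycopg2.connect(host="shard-0.internal", dbname="app_db",
                     user="app_user", password="secret"),
    psycopg2.connect(host="shard-1.internal", dbname="app_db",
                     user="app_user", password="secret"),
]

def shard_for(user_id: int):
    # Simple modulo routing: even ids land on shard 0, odd ids on shard 1.
    return SHARDS[user_id % len(SHARDS)]

def save_order(user_id: int, amount: int):
    conn = shard_for(user_id)
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO orders (user_id, amount) VALUES (%s, %s)",
            (user_id, amount),
        )

save_order(42, 999)   # goes to shard 0
save_order(43, 1299)  # goes to shard 1
```

In practice, teams usually reach for something more robust than plain modulo (consistent hashing, or a directory service that maps keys to shards), because adding a third shard later would otherwise force a lot of existing data to move.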

Partitioning = breaking data into parts within a single database system. Unlike sharding, the data stays on one server; it's just split into smaller, more manageable pieces (partitions) that the database can store and query more efficiently.
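For contrast with the sharding sketch above, here's a hedged sketch of what partitioning can look like inside a single Postgres server, using declarative range partitioning (Postgres 10+). The `orders` table and the date ranges are made up for illustration:

```python
# Sketch: range partitioning inside ONE Postgres server. All partitions
# live on the same machine; table names and date ranges are illustrative.
import psycopg2

conn = psycopg2.connect(host="localhost", dbname="app_db",
                        user="app_user", password="secret")

with conn, conn.cursor() as cur:
    # Parent table, partitioned by order date.
    cur.execute("""
        CREATE TABLE orders (
            user_id    bigint  NOT NULL,
            amount     integer NOT NULL,
            created_at date    NOT NULL
        ) PARTITION BY RANGE (created_at);
    """)
    # One partition per year; rows are placed automatically by created_at.
    cur.execute("""
        CREATE TABLE orders_2024 PARTITION OF orders
            FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
    """)
    cur.execute("""
        CREATE TABLE orders_2025 PARTITION OF orders
            FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
    """)
    # You still read and write the parent table as if it were one table.
    cur.execute(
        "INSERT INTO orders (user_id, amount, created_at) VALUES (%s, %s, %s)",
        (42, 999, "2025-03-10"),
    )
```

The key difference: with partitioning everything still sits on one server (so it doesn't raise your write ceiling the way sharding does), but queries that filter on the partition key only have to scan the relevant partition.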
