Design mechanism for horizontal database sharding
The factor we expect to limit database performance most in the future is write-sequencing at the primary. When writing large numbers of rows to multiple tables (e.g., a new order with 100 names creates 201 rows across 3 different tables in a single transaction), we can run into performance bottlenecks.
So we want to start thinking about horizontally sharding our database across multiple database clusters, each with a write primary and multiple read replicas.
We expect that the best path forward here will be sharding based on (a hash of) the RegID / account ID of the requesting account. This means that all of a single client's account metadata, orders, authorizations, and certs will be held in a single shard. We do not expect to run into "a single user is bigger than a single shard" problems, as we currently fit everyone into just one "shard".
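As a rough sketch of the hash-based assignment described above: hash the account's RegID and take it modulo the shard count. The hash function (FNV-1a) and the shard count here are placeholder assumptions, not settled design choices.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a registration/account ID to one of numShards shards.
// FNV-1a is an illustrative choice; any stable hash with good
// distribution would do. Note that a plain modulo scheme reshuffles
// most accounts if numShards ever changes, which is one reason the
// actual mechanism still needs design work.
func shardFor(regID int64, numShards uint32) uint32 {
	h := fnv.New32a()
	fmt.Fprintf(h, "%d", regID)
	return h.Sum32() % numShards
}

func main() {
	for _, id := range []int64{1, 42, 1000000} {
		fmt.Printf("regID %d -> shard %d\n", id, shardFor(id, 4))
	}
}
```

Because the shard is a pure function of the RegID, every service that knows the RegID can compute the same answer, which keeps all of one account's rows co-located.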
However, we have some decisions to make:
- How/when/where do we determine which shard to talk to?
- When making a request to an SA?
- When an SA makes a request to a DB?
- How to handle non-SA services that talk to the DB, like cert-checker and expiration-mailer?
- Should they switch to using the SA as an intermediary, with long streaming queries?
- Should we run multiple instances, one per shard?
- Should they learn how to navigate sharding directly themselves?
- Probably lots of other unknowns
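To make the "navigate sharding directly" option for non-SA services concrete, here is a minimal sketch of a tool like cert-checker fanning a job out across every shard. The function name, DSN strings, and callback shape are all hypothetical; the point is only that the tool would need a per-shard connection list rather than a single DSN.

```go
package main

import "fmt"

// scanAllShards runs checkShard against every shard's DSN in turn.
// This is a hypothetical helper illustrating the "one tool iterates
// all shards" option; a real implementation might instead run one
// instance per shard, or go through the SA.
func scanAllShards(dsns []string, checkShard func(dsn string) error) error {
	for _, dsn := range dsns {
		if err := checkShard(dsn); err != nil {
			return fmt.Errorf("shard %s: %w", dsn, err)
		}
	}
	return nil
}

func main() {
	// Placeholder DSNs for two hypothetical shard primaries.
	dsns := []string{"primary-0.db.internal", "primary-1.db.internal"}
	_ = scanAllShards(dsns, func(dsn string) error {
		fmt.Println("checking", dsn)
		return nil
	})
}
```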
This bug exists to track that design work and to serve as an umbrella for future pieces of implementation work.