nft.storage
Postgres Partitioning, Partial of #782
Included in this PR:
- Partition the `content`, `pin`, and `upload` tables by `cid` into 100 partitions each, using hash partitioning (a DDL sketch follows this list).
  - This should radically speed up reads, writes, VACUUM, and ANALYZE.
  - Luckily, the way hashing of the CID text works out, this gives an extremely even distribution across partitions. A query to check the distribution is at the top of the SQL file if you want to test it yourself; a generic version is sketched below.
  - This should future-proof the DB up to about 1B records, at which point we will need to add additional partitions - and probably look into sharding the DB.
  - This code is very ugly - better syntax for this is supposed to land in PG 15, but for now we have to be very verbose. See the notes at the top of the partition.sql file for more info.
- Improve the docker-compose for the database by specifying CPU and RAM constraints.
- Update the postgres Dockerfile to use the latest PG (13) to match what we run in production/staging.
  - Also add extra logging, which is useful when working with the DB in local dev.
- Improve the DB configuration (a settings sketch follows this list).
  - Many settings can't be configured this way (Heroku doesn't expose most flags) - those are noted in #782 - so I did what I could within the constraints of what we can change from inside the DB.
- Improve the "backup" system: instead of a separate table with a many-to-one relationship, add a new `backup_urls` key to each upload (sketched below). This removes a redundant large table (12M uploads => 12M backup rows) and speeds up `create_upload` by removing extra inserts.
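For reference, the reason the partition DDL is so verbose in PG 13 is that every partition has to be spelled out individually. A minimal sketch of the pattern - the column definitions here are illustrative, the real DDL lives in partition.sql:

```sql
-- Parent table, hash-partitioned on the CID text. The column list
-- is illustrative; note the partition key (cid) must be part of
-- any primary key on a partitioned table.
CREATE TABLE upload (
    cid         text        NOT NULL,
    user_id     bigint      NOT NULL,
    inserted_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (cid, user_id)
) PARTITION BY HASH (cid);

-- Pre-PG-15 there is no shorthand: each of the 100 partitions needs
-- its own modulus/remainder declaration.
CREATE TABLE upload_0 PARTITION OF upload
    FOR VALUES WITH (MODULUS 100, REMAINDER 0);
CREATE TABLE upload_1 PARTITION OF upload
    FOR VALUES WITH (MODULUS 100, REMAINDER 1);
-- ...and so on up to upload_99, or generated in a loop:
DO $$
BEGIN
    FOR i IN 2..99 LOOP
        EXECUTE format(
            'CREATE TABLE upload_%s PARTITION OF upload FOR VALUES WITH (MODULUS 100, REMAINDER %s)',
            i, i);
    END LOOP;
END $$;
```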
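The exact distribution-check query is at the top of the SQL file; a generic way to eyeball it (assuming the `upload` table sketched above) is to group by `tableoid`, which resolves to the child table each row physically lives in:

```sql
-- Row count per partition; an even spread confirms the hash of the
-- CID text distributes well across the 100 partitions.
SELECT tableoid::regclass AS partition,
       count(*)           AS row_count
FROM upload
GROUP BY 1
ORDER BY 2 DESC;
```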
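On configuration: in the local Docker image we have superuser access, so logging can be turned up with `ALTER SYSTEM` (or `-c` flags in the compose file); on Heroku only user-settable, database-level settings are available, which is why most of the flags noted in #782 are out of reach. A sketch of the pattern - the specific settings and values below are illustrative assumptions, not this PR's exact choices:

```sql
-- Local dev only (requires superuser): verbose logging for debugging.
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_min_duration_statement = 0;
SELECT pg_reload_conf();

-- Heroku (no superuser): only user-settable GUCs at the database
-- level. The database name and values here are hypothetical.
ALTER DATABASE nft_storage SET work_mem = '32MB';
ALTER DATABASE nft_storage SET random_page_cost = 1.1;
```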
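The backup change in sketch form: drop the many-to-one backup table in favor of an array column on `upload`. Only the `backup_urls` name comes from this PR; the column type and example values are assumptions:

```sql
-- Backups now live on the upload row itself.
ALTER TABLE upload
    ADD COLUMN backup_urls text[] NOT NULL DEFAULT '{}';

-- create_upload then writes one row instead of one upload row plus
-- N backup rows (values here are placeholders).
INSERT INTO upload (cid, user_id, backup_urls)
VALUES ('bafyexamplecid', 1,
        ARRAY['https://backups.example.com/bafyexamplecid.car']);
```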
TODO:
- Add the benchmark system to the current main branch and to this branch, collect measurements, and compare.
- Write the migration for staging/production. This will be a fairly large migration, since we need to move all current records into temp tables, create the new tables/partitions, then move everything into the new tables. We'll also need to migrate all old backup rows into the new upload schema field. A rough sketch follows below.
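To make the migration plan concrete, the rough shape might look like the following. The old-schema names (`id`, `backup.upload_id`, `backup.url`) are assumptions for illustration; the real script still needs to be written and benchmarked:

```sql
BEGIN;

-- 1. Move the current table aside.
ALTER TABLE upload RENAME TO upload_old;

-- 2. Create the new hash-partitioned upload table and its 100
--    partitions (see the DDL sketch above).

-- 3. Copy rows across, folding the old backup table into the new
--    backup_urls array column as we go.
INSERT INTO upload (cid, user_id, inserted_at, backup_urls)
SELECT u.cid, u.user_id, u.inserted_at,
       coalesce(array_agg(b.url) FILTER (WHERE b.url IS NOT NULL), '{}')
FROM upload_old u
LEFT JOIN backup b ON b.upload_id = u.id
GROUP BY u.cid, u.user_id, u.inserted_at;

-- 4. Once verified, drop the old tables to reclaim the temporarily
--    duplicated storage.
-- DROP TABLE upload_old;
-- DROP TABLE backup;

COMMIT;
```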
@hugomrdias I wanted a thumbs up on the general approach before proceeding with the benchmarks and migration, which will be the most time-consuming parts. Now that you've had eyes on it, I'll rebase, finish up by implementing those, and re-request your approval when done. I'll also post before/after benchmark results in this thread.
@alanshaw Yep, that's the next step (after finishing some basic benchmarks). The DB isn't massive, so it shouldn't be a terribly long migration: I've done a 1B+ row partition migration before that took a few hours, so I'd anticipate this one taking a few minutes given that we're operating against a few ~12M row tables. We may want to temporarily upsize the instance to speed things up, and I'll need to check that we're below 50% storage utilization before running the migration, since it will temporarily duplicate those tables.
After I write the migration script I'd like to make a clone of production and test against it locally. Then I'd like to clone the production DB onto staging, deploy this code, run the migration there, and let it run for a few days to make sure all is well.
There is one outlier that needs to be resolved before any deployment/migration: the version of PostgREST running in production is detached from the codebase. It was deployed manually via the Heroku UI and is out of date, so it doesn't support partitioned tables; it needs to be brought up to the latest version. I don't know who should own that task.
Also, on migrations for the backups table: if desired, I can submit the backup changes as a separate PR (with a separate migration) so we can roll that out first, then deal with the partitioning separately instead of bunching it all into one big change. Let me know if you want me to do that.
yes please
ping me for the Heroku upgrade