go-spacemesh icon indicating copy to clipboard operation
go-spacemesh copied to clipboard

sql: add copy-based migrations and VACUUM INTO

Open ivan4th opened this issue 1 year ago • 1 comments


Motivation

Currently, when vacuuming is required after database migration (according to the configured db-vacuum-state value), simple SQL VACUUM command is used. According to the description in the SQLite docs, VACUUM requires as much as twice the size of the original database file in free disk space, as first it makes a copy of the database in the temporary directory, and then it copies the vacuumed database back to the original database utilizing WAL file which grows to the size of the database itself before changes are committed. According to the remark in the SQLite source

** Only 1x temporary space and only 1x writes would be required if
** the copy of step (3) were replaced by deleting the original database
** and renaming the transient database as the original.  But that will
** not work if other processes are attached to the original database.
** And a power loss in between deleting the original and renaming the
** transient would cause the database file to appear to be deleted
** following reboot.

In case of go-spacemesh, we don't need concurrent access to the database from multiple processes, so we can optimize the vacuuming step to use only 1x space of the original database for vacuuming, given that VACUUM INTO makes a vacuumed copy of the database with no space requirements besides the size of the copy. Moreover, we can use the temporary copy of the database to run migrations faster as we can use PRAGMA journal_mode=OFF and PRAGMA synchronous=OFF for the temporary database safely, just dropping it if something goes wrong during migration.

On top of requiring 2x space, normal VACUUM (without INTO) has another problem: very slow last step in which the original database is replaced. Let's compare VACUUM with VACUUM INTO on a Mac M3 Max laptop:

$ ls -lh /tmp/state.sql*
total 164511960
-rw-r--r--  1 ivan4th  staff    78G Jun 28 00:06 state.sql
-rw-r--r--  1 ivan4th  staff    32K Jun 28 01:32 state.sql-shm
-rw-r--r--  1 ivan4th  staff     0B Jun 28 00:06 state.sql-wal

$ time sqlite3 ~/rmme/sm-data/state.sql vacuum
real    67m48.211s
user    57m13.518s
sys     4m41.778s

$ time sqlite3 ~/rmme/sm-data/state.sql "vacuum into '/tmp/state.sql'"
real    2m27.813s
user    0m43.039s
sys     0m56.265s

As it can be seen, VACUUM INTO is about 27 times faster on this particular machine.

Another problem described in #6069 is that on Linux systems, $TMP directory is utilized during normal VACUUM, and in case if it's a RAM disk the vacuuming process may run out of space.

Description

The process below applies to both state and local database.

When no vacuum is required, the migration process is unchanged, with migrations done in-place and each one wrapped in a transaction.

When vacuuming is needed according to db-vacuum-state, the following steps are performed. Marker files which are DBNAME_done are used to indicate that the temporary database is complete and synced and can be used to replace the source database. The temporary database is named DBNAME_migrate.

  1. The database is opened, so that it is locked and any concurrent go-spacemesh processes will fail to open it at the same time.
  2. A temporary copy of the database is created in the go-spacemesh data directory by means of VACUUM INTO. As it can be seen from above "benchmark", VACUUM INTO is relatively fast; also, this way we may hold the lock on the database by simply having it open during the copying. In addition to that, if the source database became bloated somehow due to unused (free) pages, the copy will require less disk space.
  3. The migrations are run on the temporary copy with PRAGMA journal_mode=OFF and PRAGMA synchronous=OFF.
  4. The temporary database is synced to disk after setting PRAGMA journal_mode=WAL and PRAGMA synchronous=NORMAL.
  5. The marker file is created and synced to disk.
  6. The temporary database is copied over the source database by means of VACUUM INTO.
  7. The marker file and the temporary database are deleted.

Additionally, before opening a database no matter what its schema version is, the following steps are performed:

  1. If the temporary database exists along with the "..._done" marker file, this means that a previous migration was interrupted when replacing the source database, or immediately preceding that; in this case, the temporary database is opened, copied over the source database by means of VACUUM INTO, after which the marker file and the temporary database are deleted.
  2. If the temporary database exists but there's no marker file, this means that previous migration has failed and in this case the temporary database is deleted.

In addition to the above, this PR adds logging of parts of migration SQL statements (first lines) when debug logging is set for stateDb in the logging section of the config. This makes it easier to see what parts of a migration are taking long time to execute.

Closes #6069

Test Plan

Tested on a mainnet node.

ivan4th avatar Jun 27 '24 22:06 ivan4th

The PR is ready for review, but #6003 needs to be merged first, after which this PR will be retargeted to the develop branch

ivan4th avatar Jun 28 '24 13:06 ivan4th

bors try

ivan4th avatar Aug 15 '24 23:08 ivan4th

try

Build failed:

spacemesh-bors[bot] avatar Aug 15 '24 23:08 spacemesh-bors[bot]

Codecov Report

Attention: Patch coverage is 61.42857% with 135 lines in your changes missing coverage. Please review.

Project coverage is 81.7%. Comparing base (2af4b65) to head (64f4d38). Report is 5 commits behind head on develop.

Files Patch % Lines
sql/database.go 60.7% 80 Missing and 42 partials :warning:
sql/schema.go 72.2% 5 Missing and 5 partials :warning:
node/node.go 0.0% 2 Missing :warning:
checkpoint/recovery.go 0.0% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff            @@
##           develop   #6085     +/-   ##
=========================================
- Coverage     81.9%   81.7%   -0.3%     
=========================================
  Files          312     312             
  Lines        34327   34620    +293     
=========================================
+ Hits         28147   28312    +165     
- Misses        4386    4472     +86     
- Partials      1794    1836     +42     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Aug 16 '24 06:08 codecov[bot]

bors try

ivan4th avatar Aug 16 '24 08:08 ivan4th

try

Build failed:

spacemesh-bors[bot] avatar Aug 16 '24 08:08 spacemesh-bors[bot]

bors merge

ivan4th avatar Aug 22 '24 11:08 ivan4th

:-1: Rejected by code reviews

spacemesh-bors[bot] avatar Aug 22 '24 11:08 spacemesh-bors[bot]

bors merge

fasmat avatar Aug 22 '24 11:08 fasmat

:-1: Rejected by code reviews

spacemesh-bors[bot] avatar Aug 22 '24 11:08 spacemesh-bors[bot]

I think @acud needs to give approval for bors accepting the merge

fasmat avatar Aug 22 '24 11:08 fasmat

bors merge

ivan4th avatar Aug 22 '24 15:08 ivan4th

Build failed:

  • systest-status

spacemesh-bors[bot] avatar Aug 22 '24 16:08 spacemesh-bors[bot]

Systests timed out, trying again

ivan4th avatar Aug 22 '24 16:08 ivan4th

bors try

ivan4th avatar Aug 22 '24 16:08 ivan4th

bors cancel

ivan4th avatar Aug 22 '24 16:08 ivan4th

try

Build failed:

spacemesh-bors[bot] avatar Aug 22 '24 16:08 spacemesh-bors[bot]

bors merge

ivan4th avatar Aug 22 '24 17:08 ivan4th

Pull request successfully merged into develop.

Build succeeded:

spacemesh-bors[bot] avatar Aug 22 '24 18:08 spacemesh-bors[bot]