go-spacemesh
go-spacemesh copied to clipboard
sql: add copy-based migrations and VACUUM INTO
Motivation
Currently, when vacuuming is required after database migration (according to the configured db-vacuum-state value), simple SQL VACUUM command is used. According to the description in the SQLite docs, VACUUM requires as much as twice the size of the original database file in free disk space, as first it makes a copy of the database in the temporary directory, and then it copies the vacuumed database back to the original database utilizing WAL file which grows to the size of the database itself before changes are committed.
According to the remark in the SQLite source
** Only 1x temporary space and only 1x writes would be required if
** the copy of step (3) were replaced by deleting the original database
** and renaming the transient database as the original. But that will
** not work if other processes are attached to the original database.
** And a power loss in between deleting the original and renaming the
** transient would cause the database file to appear to be deleted
** following reboot.
In case of go-spacemesh, we don't need concurrent access to the database from multiple processes, so we can optimize the vacuuming step to use only 1x space of the original database for vacuuming, given that VACUUM INTO makes a vacuumed copy of the database with no space requirements besides the size of the copy. Moreover, we can use the temporary copy of the database to run migrations faster as we can use PRAGMA journal_mode=OFF and PRAGMA synchronous=OFF for the temporary database safely, just dropping it if something goes wrong during migration.
On top of requiring 2x space, normal VACUUM (without INTO) has another problem: very slow last step in which the original database is replaced. Let's compare VACUUM with VACUUM INTO on a Mac M3 Max laptop:
$ ls -lh /tmp/state.sql*
total 164511960
-rw-r--r-- 1 ivan4th staff 78G Jun 28 00:06 state.sql
-rw-r--r-- 1 ivan4th staff 32K Jun 28 01:32 state.sql-shm
-rw-r--r-- 1 ivan4th staff 0B Jun 28 00:06 state.sql-wal
$ time sqlite3 ~/rmme/sm-data/state.sql vacuum
real 67m48.211s
user 57m13.518s
sys 4m41.778s
$ time sqlite3 ~/rmme/sm-data/state.sql "vacuum into '/tmp/state.sql'"
real 2m27.813s
user 0m43.039s
sys 0m56.265s
As it can be seen, VACUUM INTO is about 27 times faster on this particular machine.
Another problem described in #6069 is that on Linux systems, $TMP directory is utilized during normal VACUUM, and in case if it's a RAM disk the vacuuming process may run out of space.
Description
The process below applies to both state and local database.
When no vacuum is required, the migration process is unchanged, with migrations done in-place and each one wrapped in a transaction.
When vacuuming is needed according to db-vacuum-state, the following steps are performed. Marker files which are DBNAME_done are used to indicate that the temporary database is complete and synced and can be used to replace the source database. The temporary database is named DBNAME_migrate.
- The database is opened, so that it is locked and any concurrent
go-spacemeshprocesses will fail to open it at the same time. - A temporary copy of the database is created in the go-spacemesh data directory by means of
VACUUM INTO. As it can be seen from above "benchmark",VACUUM INTOis relatively fast; also, this way we may hold the lock on the database by simply having it open during the copying. In addition to that, if the source database became bloated somehow due to unused (free) pages, the copy will require less disk space. - The migrations are run on the temporary copy with
PRAGMA journal_mode=OFFandPRAGMA synchronous=OFF. - The temporary database is synced to disk after setting
PRAGMA journal_mode=WALandPRAGMA synchronous=NORMAL. - The marker file is created and synced to disk.
- The temporary database is copied over the source database by means of
VACUUM INTO. - The marker file and the temporary database are deleted.
Additionally, before opening a database no matter what its schema version is, the following steps are performed:
- If the temporary database exists along with the "..._done" marker file, this means that a previous migration was interrupted when replacing the source database, or immediately preceding that; in this case, the temporary database is opened, copied over the source database by means of
VACUUM INTO, after which the marker file and the temporary database are deleted. - If the temporary database exists but there's no marker file, this means that previous migration has failed and in this case the temporary database is deleted.
In addition to the above, this PR adds logging of parts of migration SQL statements (first lines) when debug logging is set for stateDb in the logging section of the config. This makes it easier to see what parts of a migration are taking long time to execute.
Closes #6069
Test Plan
Tested on a mainnet node.
The PR is ready for review, but #6003 needs to be merged first, after which this PR will be retargeted to the develop branch
bors try
Codecov Report
Attention: Patch coverage is 61.42857% with 135 lines in your changes missing coverage. Please review.
Project coverage is 81.7%. Comparing base (
2af4b65) to head (64f4d38). Report is 5 commits behind head on develop.
| Files | Patch % | Lines |
|---|---|---|
| sql/database.go | 60.7% | 80 Missing and 42 partials :warning: |
| sql/schema.go | 72.2% | 5 Missing and 5 partials :warning: |
| node/node.go | 0.0% | 2 Missing :warning: |
| checkpoint/recovery.go | 0.0% | 1 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## develop #6085 +/- ##
=========================================
- Coverage 81.9% 81.7% -0.3%
=========================================
Files 312 312
Lines 34327 34620 +293
=========================================
+ Hits 28147 28312 +165
- Misses 4386 4472 +86
- Partials 1794 1836 +42
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
bors try
bors merge
:-1: Rejected by code reviews
bors merge
:-1: Rejected by code reviews
I think @acud needs to give approval for bors accepting the merge
bors merge
Build failed:
- systest-status
Systests timed out, trying again
bors try
bors cancel
bors merge