spicedb
spicedb copied to clipboard
Proposal: Enhanced migration framework for SpiceDB
Problem: The SpiceDB datastores occasionally require migrations of the schema and data for the underlying services or engines. For example, adding a new index into Postgres, or writing a default transaction into MySQL.
Currently, this is handled by the spicedb migrate command, which executes all migrations up until the specified revision.
However, this design has a number of issues:
- The
IsReady()call on datastores (except MySQL) checks for only the current migration revision: if a revision has moved onward, then it will return false, and if there is a health check based on it (which we do not currently do), it will break- In MySQL, it only checks one revision backwards
- The migrate command mixes both DDL and DML into a single set of migrations, which can be a problem for customers using SpiceDB with migration frameworks that don’t support a mix
- The migrate command encapsulates the DDL changes completely, preventing external migration frameworks from easily being used (they have to run the migration into an empty DB, then copy its schema)
- Each datastore implementation defines its own
IsReady,which leads to some duplication - We currently have no handling of the scenario where different code needs to run as part of a four phase migration
Proposal
Update the migration revision system to handle backwards and forwards compatibility, break out DDL and DML migrations and define a means of testing.
DDL vs DML
- DDL and DML are two independent types of migrations
- DDL revisions are defined as a set of queries to run
- DML revisions are defined as function to run
DDL Revision
const createUniqueIDTable = `CREATE TABLE metadata (
unique_id VARCHAR PRIMARY KEY
);`
func init() {
DatabaseRevisions.RegisterDDL(
"6.8.0",
[]string{createUniqueIDTable},
emptyNonTxMigration,
Compatibility: {AtLeast: "6"}
)
}
DML revision
func init() {
DatabaseRevisions.RegisterDML[PostgresMigrationDriver](
"7.0.0",
addUniqueId,
Compatibility: {Only: "6.8.0"}
)
}
const insertUniqueID = `INSERT INTO metadata (unique_id) VALUES ($1);`
func addUniqueId(driver PostgresMigrationDriver) error {
… execute update here…
}
Tracking Revision
- Two revisions will be tracked: a DDL one and a DML one
- DDL revision will be stored in a column name, which is not the primary key
- DML revision will be stored in a row under the primary key (likely called
dml_revision) - To determine the current “HEAD” revision, both will be checked and compared based on the integer prefix
Updating a DDL revision
Rename the column from its format ID to its current ID
ALTER TABLE current_revision RENAME COLUMN v6_7_0 TO v6_8_0
Updating a DML revision
Place the current revision into the dml_revision column
UPDATE current_revision SET dml_revision='7.0.0'
Checking the current revision
Select the single row from the table, which grabs both the DML revision and the column name containing the DDL revision
SELECT * from current_revision
{
"dml_revision": "7.0.0",
"v6_8_0": 1,
}
latest("6.8.0", "7.0.0") -> 7.0.0
Compatibility
- Compatibility is explicitly defined by each migration, indicating with which revision(s) it is backwards compatible
IsReadywill use the revision compatibility portion of the migration revision to determine whether the current code is compatible with the current revision
Defining compatibility
func init() {
DatabaseRevisions.RegisterDDL(
"6.8.0",
[]string{createUniqueIDTable},
emptyNonTxMigration,
Compatibility: {AtLeast: "6"}
)
}
Encoding compatibility
Compatibility will be encoded into a “migration revision and compatibility” string that will be stored for DDL as a column name and DML as the value in the primary key:
| Compatibility Struct | Description | Encoded | Column Name |
{ AtLeast: “6" }
|
Revision is compatible with any other revision of major version 6, which can continue to run against it | 6.8.0-a-6
|
v6_8_0__a__6
|
{ AtLeast: “6.1" }
|
Revision is compatible with any other revision of 6.1 or later for major revision 6 | 6.8.0-a-6.1
|
v6_8_0__a__6_1
|
{ Only: “6.8.0" }
|
Revision is only compatible with version 6.8.0 and only 6.8.0 (or 7.0.0) can run against it | 7.0.0-o-6.8.0
|
v7_0_0__o__6_8_0
|
Note
Onlyrevisions will also define a gate: a version of SpiceDB that must be run and deployed before the next revision can be run. In the above example,6.8.0would have to be deployed before7.0.0can be deployed, otherwise older instances would go unhealthy.
Checking compatibility
If SpiceDB sees a revision different from its latest, it will parse the encoded form and determine whether it can continue running against that revision
If it cannot, it will return IsReady false and go unhealthy with an informative error
Note If the compatibility string format is not known by the current code (such as if we add a new form of compatibility, it will be treated as a breaking change
Testing
Provide a means for defining test data for each step of a migration, to ensure it can be end-to-end tested in CI
const createUniqueIDTable = `CREATE TABLE metadata (
unique_id VARCHAR PRIMARY KEY
);`
func init() {
DatabaseRevisions.RegisterDDL(
"000007-create-metadata-table",
[]string{createUniqueIDTable},
…
SampleDataForTesting{
Statements: []string{"INSERT INTO …"},
}
)
}
CLI changes
- Add new flags for running only DDL or only DML migrations
- Add a new command for retrieving the statements to execute for a DDL migration
- Add a new command for visualizing the “flow” of migrations and SpiceDB versions for gates
cc @palacerteupgrade @jhalleeupgrade
Great proposal! Thanks for taking the time to write this; it's very thorough.
Based on this, my understanding is that ddl & dml changes aren't separated as of now so it wouldn't be possible to run or output them separately. As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp)
e.g: spicedb migrate --dry-run (code would already detect which revisions need to be applied)
Perhaps I'm missing something and it isn't possible to narrow this the way I'm proposing? Please let me know.
Side question: Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?
As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp)
We don't have that distinction right now, which would mean changing some of the existing migrations; if we are going to do so, that's about half of the work for the proposal anyway, so it would likely be possible to start with that, but with the caveat that properly handling versions would require re-versioning the migrations yet again
Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?
It appears to not support migrations that are purely code based, which is a requirement we have
As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp)
We don't have that distinction right now, which would mean changing some of the existing migrations; if we are going to do so, that's about half of the work for the proposal anyway, so it would likely be possible to start with that, but with the caveat that properly handling versions would require re-versioning the migrations yet again
Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?
It appears to not support migrations that are purely code based, which is a requirement we have
That makes sense. Thanks for the detailed answer.
A few remaining questions:
-
Would the cmd only output the SQL by comparing with the current vision running or it would also be able to output the SQL between 2 versions? e.g:
spicedb migrate 1.0.0 1.0.2 --output-only -
From looking at the cli changes mentioned in the proposal:
retrieving the statements to execute for a DDL migration
It seems that it isn't planned to include DML changes. However, it seems that there are DML changes which are SQL statements. (e.g backfill_xid_add_indices migration). Is excluding this from the proposal on purpose?