spicedb icon indicating copy to clipboard operation
spicedb copied to clipboard

Proposal: Enhanced migration framework for SpiceDB

Open josephschorr opened this issue 3 years ago • 5 comments

Problem: The SpiceDB datastores occasionally require migrations of the schema and data for the underlying services or engines. For example, adding a new index into Postgres, or writing a default transaction into MySQL.

Currently, this is handled by the spicedb migrate command, which executes all migrations up until the specified revision.

However, this design has a number of issues:

  1. The IsReady() call on datastores (except MySQL) checks for only the current migration revision: if a revision has moved onward, then it will return false, and if there is a health check based on it (which we do not currently do), it will break
    1. In MySQL, it only checks one revision backwards
  2. The migrate command mixes both DDL and DML into a single set of migrations, which can be a problem for customers using SpiceDB with migration frameworks that don’t support a mix
  3. The migrate command encapsulates the DDL changes completely, preventing external migration frameworks from easily being used (they have to run the migration into an empty DB, then copy its schema)
  4. Each datastore implementation defines its own IsReady, which leads to some duplication
  5. We currently have no handling of the scenario where different code needs to run as part of a four phase migration

Proposal

Update the migration revision system to handle backwards and forwards compatibility, break out DDL and DML migrations and define a means of testing.

DDL vs DML

  • DDL and DML are two independent types of migrations
  • DDL revisions are defined as a set of queries to run
  • DML revisions are defined as function to run

DDL Revision

const createUniqueIDTable = `CREATE TABLE metadata (
	unique_id VARCHAR PRIMARY KEY
);`

func init() {
  DatabaseRevisions.RegisterDDL(
    "6.8.0",
    []string{createUniqueIDTable},
    emptyNonTxMigration,
    Compatibility: {AtLeast: "6"}
  )
}

DML revision

func init() {
  DatabaseRevisions.RegisterDML[PostgresMigrationDriver](
    "7.0.0",
    addUniqueId,
    Compatibility: {Only: "6.8.0"}
  )
}

const insertUniqueID = `INSERT INTO metadata (unique_id) VALUES ($1);`

func addUniqueId(driver PostgresMigrationDriver) error {
   … execute update here…
}

Tracking Revision

  • Two revisions will be tracked: a DDL one and a DML one
  • DDL revision will be stored in a column name, which is not the primary key
  • DML revision will be stored in a row under the primary key (likely called dml_revision)
  • To determine the current “HEAD” revision, both will be checked and compared based on the integer prefix

Updating a DDL revision

Rename the column from its format ID to its current ID

ALTER TABLE current_revision RENAME COLUMN v6_7_0 TO v6_8_0

Updating a DML revision

Place the current revision into the dml_revision column

UPDATE current_revision SET dml_revision='7.0.0'

Checking the current revision

Select the single row from the table, which grabs both the DML revision and the column name containing the DDL revision

SELECT * from current_revision

{
  "dml_revision": "7.0.0",
  "v6_8_0": 1,
}

latest("6.8.0", "7.0.0") -> 7.0.0

Compatibility

  • Compatibility is explicitly defined by each migration, indicating with which revision(s) it is backwards compatible
  • IsReady will use the revision compatibility portion of the migration revision to determine whether the current code is compatible with the current revision

Defining compatibility

func init() {
  DatabaseRevisions.RegisterDDL(
    "6.8.0",
    []string{createUniqueIDTable},
    emptyNonTxMigration,
    Compatibility: {AtLeast: "6"}
  )
}

Encoding compatibility

Compatibility will be encoded into a “migration revision and compatibility” string that will be stored for DDL as a column name and DML as the value in the primary key:

Compatibility Struct Description Encoded Column Name
{ AtLeast: “6" } Revision is compatible with any other revision of major version 6, which can continue to run against it 6.8.0-a-6 v6_8_0__a__6
{ AtLeast: “6.1" } Revision is compatible with any other revision of 6.1 or later for major revision 6 6.8.0-a-6.1 v6_8_0__a__6_1
{ Only: “6.8.0" } Revision is only compatible with version 6.8.0 and only 6.8.0 (or 7.0.0) can run against it 7.0.0-o-6.8.0 v7_0_0__o__6_8_0

Note Only revisions will also define a gate: a version of SpiceDB that must be run and deployed before the next revision can be run. In the above example, 6.8.0 would have to be deployed before 7.0.0 can be deployed, otherwise older instances would go unhealthy.

Checking compatibility

If SpiceDB sees a revision different from its latest, it will parse the encoded form and determine whether it can continue running against that revision

If it cannot, it will return IsReady false and go unhealthy with an informative error

Note If the compatibility string format is not known by the current code (such as if we add a new form of compatibility, it will be treated as a breaking change

Testing

Provide a means for defining test data for each step of a migration, to ensure it can be end-to-end tested in CI

const createUniqueIDTable = `CREATE TABLE metadata (
	unique_id VARCHAR PRIMARY KEY
);`

func init() {
  DatabaseRevisions.RegisterDDL(
    "000007-create-metadata-table",
    []string{createUniqueIDTable},
    …
    SampleDataForTesting{
      Statements: []string{"INSERT INTO …"},
    }
  )
}

CLI changes

  • Add new flags for running only DDL or only DML migrations
  • Add a new command for retrieving the statements to execute for a DDL migration
  • Add a new command for visualizing the “flow” of migrations and SpiceDB versions for gates

josephschorr avatar Jul 07 '22 17:07 josephschorr

cc @palacerteupgrade @jhalleeupgrade

jzelinskie avatar Jul 07 '22 17:07 jzelinskie

Great proposal! Thanks for taking the time to write this; it's very thorough.

Based on this, my understanding is that ddl & dml changes aren't separated as of now so it wouldn't be possible to run or output them separately. As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp) e.g: spicedb migrate --dry-run (code would already detect which revisions need to be applied)

Perhaps I'm missing something and it isn't possible to narrow this the way I'm proposing? Please let me know.

Side question: Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?

jhalleeupgrade avatar Jul 20 '22 13:07 jhalleeupgrade

As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp)

We don't have that distinction right now, which would mean changing some of the existing migrations; if we are going to do so, that's about half of the work for the proposal anyway, so it would likely be possible to start with that, but with the caveat that properly handling versions would require re-versioning the migrations yet again

Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?

It appears to not support migrations that are purely code based, which is a requirement we have

josephschorr avatar Jul 20 '22 14:07 josephschorr

As a first step, do you think it would be acceptable to only add a CLI command/flag that allows to output the SQL? (ddl + dml) from a certain revision? (mvp)

We don't have that distinction right now, which would mean changing some of the existing migrations; if we are going to do so, that's about half of the work for the proposal anyway, so it would likely be possible to start with that, but with the caveat that properly handling versions would require re-versioning the migrations yet again

Any reason why you guys decided on implementing the migration process yourselves instead of using something like https://github.com/golang-migrate/migrate ?

It appears to not support migrations that are purely code based, which is a requirement we have

That makes sense. Thanks for the detailed answer.

jhalleeupgrade avatar Jul 21 '22 20:07 jhalleeupgrade

A few remaining questions:

  • Would the cmd only output the SQL by comparing with the current vision running or it would also be able to output the SQL between 2 versions? e.g: spicedb migrate 1.0.0 1.0.2 --output-only

  • From looking at the cli changes mentioned in the proposal:

retrieving the statements to execute for a DDL migration

It seems that it isn't planned to include DML changes. However, it seems that there are DML changes which are SQL statements. (e.g backfill_xid_add_indices migration). Is excluding this from the proposal on purpose?

jhalleeupgrade avatar Nov 23 '22 22:11 jhalleeupgrade