pgroll icon indicating copy to clipboard operation
pgroll copied to clipboard

Validate a migration file

Open divyenduz opened this issue 1 year ago • 5 comments

Frontend uses the JSON schema to validate pgroll JSON migrations, however, columns like default should be an SQL expression can only be represented as String in the JSON schema. This makes validating them hard. Maybe pgroll can provide a validate command/API that can be used to see if a JSON migration is correct or not? Frontend can use this to validate before starting the migration and pgroll CLI users too can benefit from it.

As a CLI command, this can be something like: pgroll validate <migrationfile>

divyenduz avatar Jul 11 '24 09:07 divyenduz

Related https://github.com/xataio/pgroll/issues/169

divyenduz avatar Jul 11 '24 10:07 divyenduz

This feels almost like a dry run where we check things, but don't actually make any changes.

I'm also thinking that in order to properly validate the migration we would need to actually connect to the db. The syntax may be correct for example but fail because the tables don't exist. Another reason to connect to the DB is that I think we probably want to avoid doing any kind of syntax parsing in our code but leave that to Postgres.

@andrew-farries WDYT?

ryanslade avatar Oct 15 '24 14:10 ryanslade

I think #169 is a better description of what we want here; the ability to dry-run a migration to ensure that the up and down SQL works against all values in the target database before starting the migration.

I'm not sure there is a useful middle ground between:

  • the validation we do via the Validate method in each operation
  • performing a dry run as described above.

I'm not sure how a dry-run would be implemented though; how would it be possible to ensure that the data migration resulted in values that satisfy any new constraints without actually performing the data migration.

Did you have any thoughts on how dry-run could be implemented @exekias ?

andrew-farries avatar Oct 15 '24 15:10 andrew-farries

I would scope this issue to strictly validating the JSON against the spec. In the future we could improve this by:

  • Checking the migration is valid against the current schema
  • Dry-run

About dry run (again, out of the scope here): In my mind, we can know what are the constraints on a column, so we can apply the up function against current values (without touching the table) and check those values against the constraints (we can leverage postgres to evaluate this).

exekias avatar Oct 16 '24 08:10 exekias

Re-reading the internal discussion that prompted this issue it seems that what we wanted here was a way to run migration validation as a separate step prior to migration execution.

Currently migrations are validated prior to migration start, but that validation can't be run as a separate step without invoking migration start.

Providing a way via CLI and API to validate a migration independently of starting a migration would allow clients to catch a wider range of errors before running a migration than simple JSON schema validation.

andrew-farries avatar Oct 16 '24 08:10 andrew-farries