migrate
[PROPOSAL] Use manifest file to track migration sources
Questions and feedback are most welcome!
Is your feature request related to a problem? Please describe.
Developers need to detect when concurrent changes are made to migrations during the application development and testing life cycles.
This issue has been raised multiple times:
- #179
- #65
We need to handle the following cases:
- 2 developers add migrations at the same time, resulting in conflicting migration versions. e.g. if A is the latest migration and Alice adds migration B on top of A while Bob adds migration C, also on top of A.
Describe the solution you'd like
In a backwards compatible manner, change tooling to detect and alert developers whenever:
- Conflicting changes to migrations are made during development
- Migration sources are corrupted/inconsistent
  - Could be caused by:
    - Corrupt/incomplete migration source upload
    - Bad VCS merges
Changes
Note - These changes are backwards compatible. e.g. If the manifest file doesn't exist, no checks/validations are performed and old CLI versions won't validate migrations against the manifest file.
- Add a manifest file that contains:
  - A list of known migrations and each migration's parent migration
  - A checksum of every migration's contents
  - A global checksum over every migration's name and contents checksum
- Tooling
  - Source drivers
    - Support fetching the manifest file, with `ErrNoManifestFile` and `ErrManifestNotImplemented` errors
      - Will break non-official drivers
  - Library
    - After the source driver is "opened", load the manifest file
    - Validate the integrity of the manifest file
    - Validate the source migrations list against the manifest file
    - Validate the integrity of migration contents against the manifest file before running each migration
- CLI
  - New `migrate manifest` command with the following sub-commands:
    - `update` - updates the manifest file with newly added migrations or updates to existing migrations
    - `init` - initializes the manifest file
      - `--force` - re-initializes the manifest file
    - `check` - checks the integrity of the manifest file and referenced migrations. Any migration with multiple children is a conflict.
    - `reparent` - changes the parent of the specified migration to be the latest migration, to fix migration conflicts
  - Existing sub-commands updated:
    - `create` - adds the newly created migration files to the manifest file
    - `up`, `down`, `goto` - verify all migrations against the manifest file before applying any migrations. Any migration with multiple children is a conflict.
| Command | Add migration to manifest | Remove migration from manifest | Re-compute global checksum | Re-compute migration file checksum | Verify migrations |
|---|---|---|---|---|---|
| `manifest init` | ✅ | | | | |
| `manifest init --force` | ✅ | ✅ | ✅ | ✅ | |
| `manifest update` | ✅ | | ✅ | ✅ | |
| `manifest reparent` | ✅ | ✅ | ✅ | | |
| `manifest check` | | | | | ✅ |
| `create` | ✅ | | ✅ | | |
| `up` | | | | | ✅ |
| `down` | | | | | ✅ |
| `goto` | | | | | ✅ |
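The conflict rule used by `check`, `up`, `down`, and `goto` (any migration with more than one child) could be sketched roughly as below. This is an illustrative sketch, not migrate's actual API; the `Migration` struct and `findConflicts` name are hypothetical:

```go
package main

import "fmt"

// Migration mirrors a [[migrations]] entry in the proposed MANIFEST file.
// The field names are illustrative, not part of migrate's API.
type Migration struct {
	Name   string
	Parent string // empty for the first migration
}

// findConflicts returns migrations that have more than one child, i.e.
// two or more migrations claim them as parent.
func findConflicts(ms []Migration) []string {
	children := make(map[string][]string)
	for _, m := range ms {
		if m.Parent != "" {
			children[m.Parent] = append(children[m.Parent], m.Name)
		}
	}
	var conflicts []string
	for parent, kids := range children {
		if len(kids) > 1 {
			conflicts = append(conflicts, parent)
		}
	}
	return conflicts
}

func main() {
	// Alice and Bob both branched off migration A.
	ms := []Migration{
		{Name: "A"},
		{Name: "B", Parent: "A"}, // Alice's migration
		{Name: "C", Parent: "A"}, // Bob's migration
	}
	fmt.Println(findConflicts(ms)) // prints [A]
}
```

With the manifest's parent links, this check needs no database at all, which is the point of doing it at commit/merge time.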
Migration development workflow:

Create manifest file
- `migrate manifest init`

Creating migrations
- Create your up and down migration files using `migrate create`
- Write your migrations
- Update the manifest file with `migrate manifest update`

Fix conflicting migrations
- Run `migrate manifest reparent` to remove the specified conflicting migration and re-add it as the last migration

If all else fails
- `migrate manifest init --force`
- This is a big-hammer approach and will re-compute the manifest file
Manifest file specs
Manifest file name: `MANIFEST`
Manifest file location: source driver dependent, but will probably be with the migration files
Manifest file format: TOML
Manifest file fields:
version = "SemVer"
checksumAlgo = "algorithm used for computing checksums" # Use SHA-256 for now
checksum = "checksum of all migration file names and checksums"
# In lexicographical order
[[migrations]]
name = "migration base file name"
parent = "parent migration base file name"
upChecksum = "checksum of up migration file contents"
downChecksum = "checksum of down migration file contents"
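As a rough sketch of how the checksums above might be computed (SHA-256 and lexicographical ordering are from the spec; the function names and the exact byte layout fed to the hash are assumptions, since the proposal doesn't pin them down):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// fileChecksum is the per-migration checksum: SHA-256 of the file contents.
func fileChecksum(contents []byte) string {
	sum := sha256.Sum256(contents)
	return hex.EncodeToString(sum[:])
}

// globalChecksum hashes every migration file name together with its
// contents checksum, in lexicographical order of file name.
// The concatenation format here is an assumption.
func globalChecksum(checksumsByName map[string]string) string {
	names := make([]string, 0, len(checksumsByName))
	for name := range checksumsByName {
		names = append(names, name)
	}
	sort.Strings(names) // lexicographical order, per the spec
	h := sha256.New()
	for _, name := range names {
		h.Write([]byte(name))
		h.Write([]byte(checksumsByName[name]))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	up := fileChecksum([]byte("CREATE TABLE users (id int);"))
	fmt.Println(globalChecksum(map[string]string{
		"0001_create_users.up.sql": up,
	}))
}
```

Sorting before hashing makes the global checksum independent of filesystem ordering, so two checkouts of the same migrations always agree.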
In the future, we could support migrating manifest file changes. e.g. change checksum algorithm, manifest file schema, and/or manifest file format
Describe alternatives you've considered
- Track applied migrations in schema versions table
  - Not backwards compatible
  - Would need to be implemented by all DB drivers
  - Would need to manage migrations for the schema versions table
  - May not catch issues if migrations aren't run in the development workflow
Additional context
N/A
@stephenchu would this design solve your problem? I believe a manifest file will work for both timestamp and sequence versioned migrations
A question: will the manifest tooling support the following use case?
Background: One of our core applications is a Django web app which manages the database schema with its own migration tools. Other applications do not interact with the APIs the Django app provides; instead they connect to the database and fetch/update the information they need. To control access, we use migrations that add new users (applications) and grant access to the required tables per application. These user migrations are applied with `migrate`.
In our CI we test that the user migrations can be applied against the latest Django migration state. It does not happen very often, but sometimes we rename or delete a table in Django, which might cause our user migrations to fail in CI because a table that used to exist no longer exists. The failing migration can be an early migration (e.g. version 10) whereas the latest migration can be e.g. version 100.
To solve this in our CI, we have to update the failing migration (e.g. delete it, or comment out all changes so it becomes a no-op but leaves history) and possibly create a new migration granting a user access to new tables.
Will these kinds of changes be supported with the manifest and proposed tooling?
@jabbors
I don't think you were asking this, but to be clear, interoperability with other migration tools/systems/frameworks is not a goal for `migrate`.
For your specific use case, this proposed design shouldn't break your workflow.
You'll need to run `migrate manifest update` after modifying past migrations to update the manifest file. Using your example, after fixing migration version 10, you'll need to run `migrate manifest update` to update the checksums in the manifest file.
@dhui
Thanks for clarifying.
interoperability with other migration tools/systems/frameworks is not a goal for migrate.
You're right, I'm not asking for it and I won't. We happen to use two migration systems and so far it has been a working concept. I just wanted to make sure the new manifest proposal will allow us to continue using `migrate` the way we do.
Really nice to see some progress in this area; this is a much-needed feature!
However, I still think applied migrations should be tracked in the database for the following reasons:
- with a manifest file we cannot track which migration file has been applied to a database and when (lack of full history).
- with a manifest file we cannot implement an optional "out of order" mode (the most opinionated will rage at me here).
Concerning the arguments against it:
Not backwards compatible
This is the feature that v5 needs :). Why bother with direct backward compatibility? Just offer a migration path from v4 to v5.
Would need to be implemented by all DB drivers
Probably the "hardest", but the requirements are fairly low. A single, simple table should be enough and easy to port to any DB. Something like this, maybe?
| Column | Type | Nullable |
|---|---|---|
| version | integer | not null |
| name | character varying(1000) | not null |
| checksum | bytea | not null |
| applied_at | timestamp with time zone | not null |
| execution_time | integer | not null |
| success | boolean | not null |
Would need to manage migrations for schema versions table
Isn't this tool supposed to solve this problem? Joking aside, it should be fairly easy to manage migrations for a single table, and this is something we may not even need for the first version.
May not catch issues if migrations aren't run in development workflow
I think this is a job for the CI.
@dahu33 Do you need to know when a particular migration was applied? With linearized migrations, you can tell which migrations have and have not been applied by seeing if the version number is less than or greater than the last applied migration.
with a manifest file we cannot implement an optional "out of order" mode (the most opinionated will rage at me here).
What's your use case for not running migrations in order? What's prohibiting you from reordering and serializing your migrations?
Why bothering with direct backward compatibility and just offer a migration path from v4 to v5?
How would a multi-node deployment upgrade `migrate` from v4 to v5 without any issues or downtime without backwards compatibility?
@mgdelacroix
we've checked the proposal that was made on #470, but this will not fix our use case, as a file is local to a given system, and this will not support clustered scenarios with multiple servers using the same database.
Can you provide more details as to why the manifest file approach wouldn't work for your use case? Each node in the cluster should validate the manifest file before applying any migrations. The manifest file will force migrations to be linear as they're committed to VCS. Are your migrations managed by VCS?
The idea we have is basically to reuse the existing migrations table without changing it. Instead of tracking only the last version applied, we would have one row per applied migration, and `go-migrate` would check which migrations it should apply based on those. This should make it easy to maintain backwards compatibility for several drivers, as in the case of relational databases like mysql or postgres: migrating from the old system is just a matter of reading the last migration applied and marking all migrations up to that point as applied, creating a row for each.
This is backwards compatible from a data perspective, but it'd still need to be implemented by all DB drivers for feature parity. Also, marking migrations as applied without actually applying them is dangerous. We'd need a way to detect that the schema migrations table hasn't been migrated yet before backfilling the data, e.g. noticing that there's only 1 row in the schema migrations table but more than 1 migration file. However, with this detection approach, there's the edge case of a team that starts out with zero (or one) migrations and starts with a conflict.
Another disadvantage of tracking all applied migrations in the DB to detect conflicts is that it requires a DB to detect migration issues early, which increases the complexity of a developer's toolchain, making a more complicated dev environment. With the manifest file approach, detecting migration conflicts is built in and only requires VCS, which most devs are already using.
Also marking migrations as applied without actually applying them is dangerous.
@dhui this is only dangerous if we do not know beforehand the state of the storage/data, which is not the case in the example that @mgdelacroix was referring to. We see it as a matter of convergence between source files (intention) and data (result).
The flow will be like this: check how many migration files you have in the source (`SF`) and how many migrations you have in the data (`DM`), then:
- If `DM == 1` and `SF > 1`, then we know for sure that we need to no-op apply migrations up to the `DM` version; then we can continue normally by adding new rows for each new migration added.
- Otherwise, we are in the new system (a.k.a. tracking migrations in the db).
Does that make sense?
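The detection rule above can be sketched in a few lines of Go; the function names are made up for illustration, and note that the zero-or-one-migration edge case @dhui raised is exactly where this heuristic is blind:

```go
package main

import "fmt"

// needsBackfill implements the rule described above: if the database's
// migrations table has exactly one row (DM == 1) while there is more than
// one source migration file (SF > 1), we are on the old "last version only"
// scheme and must backfill rows before tracking each migration.
func needsBackfill(dm, sf int) bool {
	return dm == 1 && sf > 1
}

// versionsToBackfill returns the source versions to mark as applied
// (without running them): everything up to and including the last
// version recorded by the old scheme.
func versionsToBackfill(lastApplied int, sourceVersions []int) []int {
	var out []int
	for _, v := range sourceVersions {
		if v <= lastApplied {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	fmt.Println(needsBackfill(1, 5))                         // prints true: old scheme detected
	fmt.Println(versionsToBackfill(3, []int{1, 2, 3, 4, 5})) // prints [1 2 3]
}
```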
Another disadvantage with tracking all applied migrations in the DB to detect conflicts is that it requires a DB to detect migration issues early
Can you please elaborate more on this? We are not suggesting of adding any new components in the toolchain that devs already do not have in their arsenal.
The bird's-eye view of the proposed design is the following:
Source files say I have `N` migrations, the store says I have `M` migrations. Converge `M` to be equal to `N` by applying the difference. This mechanism is used by other migration frameworks in other communities, like ActiveRecord for Rails and Django migrations.
We probably need to consider a migration version to be the full file name (`seq+name` or `timestamp+name`), since we can have scenarios where two people generate migrations at the exact same time but with different names, which we should essentially treat as two distinct valid migrations.
@nronas @mgdelacroix Can you talk about why the manifest file approach does not fit your needs? e.g. what capability is lacking?
Another disadvantage with tracking all applied migrations in the DB to detect conflicts is that it requires a DB to detect migration issues early
Can you please elaborate more on this? We are not suggesting of adding any new components in the toolchain that devs already do not have in their arsenal.
The manifest file approach will detect migration conflicts due to ordering earlier in the development lifecycle, e.g. at commit/merge time. With tracking applied migrations in a DB, ordering conflicts won't be detected until migrations are applied on a shared DB, e.g. after merge and deploy.
we can have scenarios that two people generated migrations at the exact same time but with different names
Good call out! This pushes more towards managing the schema migrations table using `migrate`.
Hey @dhui
Here's an issue from me where I describe this problem and try to propose a solution https://github.com/golang-migrate/migrate/issues/587
I like the idea of having TOML manifest, but here are my thoughts on this:
I'm not sure how hashsums of migration files will help with avoiding file modifications. If someone really wants to do that, I don't see why we should prevent it. Your proposal does not bring full history support, so we can't really do anything with a hashsum check when running the migration. It will only force the developer to update the manifest as well as the migration files. Keeping track of file changes is a responsibility of VCS. I see that having hashsums is only useful together with full migration history, which brings a lot of new challenges.
If we stop thinking about hashsums for a bit, we are left with only one problem: having a dependency chain between migration files and validating it. This could easily be achieved without a manifest if we put this data inside the migration files:
-- migrate:prev 101010_previous_migration
ALTER TABLE ...
I know that this example is pretty naive, as we have `json` migrations and stuff, but I'm pretty sure we can figure out other formats as well.
I'm just throwing ideas around in hope that we can really fix this one day. Either this or the MANIFEST file works for us, but in my opinion overcomplicating a problem makes it too hard to solve.
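The `-- migrate:prev` directive suggested above is hypothetical, not an existing migrate feature, but reading it could be as simple as this sketch (function name is made up; requires Go 1.20+ for `strings.CutPrefix`):

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// prevDirective scans a migration's contents for the proposed
// "-- migrate:prev <name>" comment and returns the named parent
// migration. The directive syntax is hypothetical, taken from the
// suggestion above.
func prevDirective(contents string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(contents))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if rest, ok := strings.CutPrefix(line, "-- migrate:prev"); ok {
			return strings.TrimSpace(rest), true
		}
	}
	return "", false
}

func main() {
	sql := "-- migrate:prev 101010_previous_migration\nALTER TABLE users ADD COLUMN email text;"
	parent, ok := prevDirective(sql)
	fmt.Println(parent, ok) // prints 101010_previous_migration true
}
```

The trade-off @dhui notes below still applies: this requires migrate to parse migration files, which it deliberately avoids today.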
Hi @AnatolyRugalev
Thanks for your input!
The manifest file is storing what you're referring to as the "dependency chain" and the updated tooling will verify this.
Storing migration metadata within the migrations themselves adds complexity and another avenue for introducing errors. Also, `migrate` avoids parsing migration files.
This or MANIFEST file works for us, but in my opinion overcomplicating a problem makes it too hard to solve
The manifest file + updated tooling is backwards compatible and adds a bit more complexity but I'm all for a simpler design. Do you have any suggestions for a simpler approach?
What I meant by overcomplicating is the fact that this proposal attempts to solve multiple problems which makes it hard for all parties to agree. To be clear, I completely support manifest idea, just wanted to provide some alternative ideas :)
Let me try to be helpful and ask some questions:
- What level of control over the manifest will `source.Driver` have?
  - Provide a file path? (seems too limited)
  - Provide a reader/writer?
  - Provide an unmarshaled manifest struct? (this has potential support for other formats, including storing in the database, see last paragraph)
- How to handle manifest writes in remote sources (git/s3)? Should support of this be optional?
- Will manifest modification tools be available as a public package? We don't use the migrate CLI directly, so this is kind of a requirement for our use case
On a side note, I noticed that having all manifest-related infrastructure in Migrate could help us bring full migration history support to it as well. In `database.Driver` we can optionally support storing the latest version of the manifest with relatively simple additions and leave it to `database.Driver` to decide its format.
I can allocate some time to implement a draft, but my vision could be a bit different from the proposal. Let me know if you are OK with that
What I meant by overcomplicating is the fact that this proposal attempts to solve multiple problems which makes it hard for all parties to agree. To be clear, I completely support manifest idea, just wanted to provide some alternative ideas :)
The main goal is to prevent conflicting or concurrent migration development as early in the development lifecycle as possible and provide tooling to detect and resolve the conflicts as easily as possible.
Migration content validation is nice to have but is not necessary.
I'd love to discuss ideas/solutions and how they compare with the manifest file approach with respect to these goals, and how they fit within the project's existing goals and constraints!
Let me try to be helpful and ask some questions:
- What level of control over the manifest will `source.Driver` have?
I was thinking that the source drivers would be simpler and the `migrate` library would manage the manifest file, e.g. create, update, and validate it. The source driver would only be responsible for reading the manifest file.
On a related note, I don't know if it's useful to expose any of the manifest file management methods since I don't know of any use cases where developers are programmatically creating and modifying migrations in Go. I only know of cases where migrations are run in Go. We might be able to avoid exposing any manifest management methods but maybe it'd be useful to expose a manifest validate/check method to allow people to proactively validate/check the manifest file. We'd also need to update the CLI with the proposed commands.
- How to handle manifest writes in remote sources (git/s3)? Should support of this be optional?
`migrate` doesn't manage the deployment of sources, so we shouldn't manage the deployment of the manifest file either. Creating/updating manifest files should be done in the dev environment, e.g. the local filesystem.
I was thinking that the `MANIFEST` file would be in the same directory/folder as the migration files.
- Will manifest modification tools be available as a public package? We don't use migrate CLI directly, so this is kind of requirement for our use-case
Can you describe your use case? e.g. how are migrations created and do you modify existing migrations during development?
On a side note, I noticed that having all manifest-related infrastructure in Migrate could help us to bring full migration history support to it as well. In `database.Driver` we can optionally support storing the latest version of the manifest with relatively simple additions and leave it to `database.Driver` to decide the format of it.
I don't see how the manifest file helps the database driver track history. Tracking all applied migrations in the DB is separate from manifest files.
Also to be clear, the `version` field in the `MANIFEST` file tracks the manifest schema (e.g. TOML format/fields) version. It does not version your migrations.
I can allocate some time to implement a draft, but my vision could be a bit different from the proposal. Let me know if you are OK with that
This is great! We can iterate and iron out the differences between the current proposal and your vision. However, I can't promise a timely review or merge since I'm swamped at the moment.
@dhui thank you for posting this idea. Has this approach been taken to implementation yet? In our project we are facing the problem being addressed by this approach, and we might want to consider our options if the fix isn't available soon. It would help us plan better, thank you.
`go-migrate` doesn't work for us, either as it is now or with this proposal. As a result, I'm probably going to write our own, but I wanted to avoid that, and I definitely want to document why. I'll describe our situation, and then give my theory about linearizing migrations.
Our situation
We currently have a monorepo, though it is a relatively small one at 116Kloc. We have six or seven teams working on the code with independent work streams. The number of developers is hard to estimate because this isn't the only project, but it is somewhere between 30 and 40. We have 15-20 pull requests open at a time, about half of which have DDL migrations. We have a single, monolithic database.
We have compliance requirements that every change needs a PR with review and an approval by a second developer before merge. We enforce this with GitHub branch protection rules.
Our processes have been very manual, which worked great for a long time but now cannot scale, and we are working toward a completely automated CI/CD pipeline -- which is why I'm looking at migration solutions.
Current State Consequences
Using the current migrate library would lose migrations.
If we have 15-20 PRs open with about half of them having database migrations, this means we have 7-10 migrations "in flight". It's probably not true, but for sake of simplicity, let's assume that all of these migrations have as "parent" the most recent migration merged to the trunk branch. Since migrations are kept in separate files and therefore do not cause merge conflicts, it is possible that all 7-10 migrations can be merged in any order after passing our rigorous CI steps, including tests against migrated versions of the database. If we had 7 migrations and our CD pipeline ships each feature on merge, then 5039 of the 5040 possible merge orders would skip at least one migration.
This would roll us back from our intended automated CI/CD pipeline to a manual merge process run by a team that we have affectionately called "the bottleneck team" or "the manual CI team."
MANIFEST State Consequences
Using this MANIFEST proposal forces us to keep our manual merge/resolution process - "the bottleneck team."
Instead of skipping migrations in some of the 7-10 pull requests, all but one of the set of pull requests becomes a merge conflict which must be manually resolved. If this merge conflict is resolved correctly, we can guarantee that the migrations are run correctly.
Aside 1: This prevents us from using unix timestamps as sequence numbers, since the order must be the merge order, which is not the authoring order.
Aside 2: This makes it more difficult to have different environments managed on different branches. While we are trying to get away from this, it's useful to point out that the order things enter our staging/QA environment in our current process is NOT the order they enter our production environment, due to feature priorities, QA failures, etc. This means the conflicts must be resolved differently when merging a feature to staging or to production.
So the consequences here are that, in order to merge a pull request, a person will have to:
- Submit a pull request, request a review
- Get an approval, possibly some time later
- Almost always learn that something has been merged in between making the PR and getting the approval.
- Update the PR, fixing the migration. This dismisses the approval.
- Re-request review. Since this is just re-syncing the branch, hopefully the same approver will glance at it and re-approve.
- Check - and hope - that nothing with migrations has been merged in the interim. If something has, go to step 4.
- Merge.
The manual process that we are trying to get away from also linearizes the git history, and this is similar to our current struggles in this way.
A Note
Five or maybe six years ago now, I worked at Groupon, and the primary Rails app had 300 committers at that time, with about 100 commits per day. The CI pipeline there ran all tests on the branch, and if passed, merged to a locked trunk branch and re-ran all the tests (applying all migrations). If that succeeded, that became the new trunk HEAD and the lock was released.
Note that while this linearizes merges in a way, this is not the same ordering as linearizing migrations. There would be no way to use this tool with MANIFEST, since the CI would almost never be able to automatically merge and resolve the orders (and doing so would essentially be undoing what MANIFEST attempts to do). In this environment, development would come grinding to a very terrible halt.
Theory
In a nutshell, attempting to linearize the application of migrations linearizes the development process as a whole. The MANIFEST idea achieves linearization effectively, but what causes us problems is not an implementation detail of the MANIFEST, but the attempt at linearization itself. We are very purposely trying to delinearize the development process and decouple the teams so that we can alleviate our bottleneck.
But then there's the question, "Don't we need to linearize migrations?" I think the answer here is, "No."
It's clear that applying migrations is not fully commutative -- there are ordering requirements, clearly:
- You cannot add a column to a table before creating it.
- You cannot rename a column before adding it.
But it's also true that we can construct migrations which can succeed in different orders but produce different results:
INSERT INTO b SELECT COUNT(*) FROM a;
and
INSERT INTO a VALUES (42);
This could possibly break some kind of invariant in the database.
I think the key here is that any migration could depend on some prior parent (and perhaps more than one), but not necessarily the immediately prior parent. We clearly have many common use cases where two parallel lines of development go like so:
A1: `CREATE TABLE a ( ... );`
A2: `ALTER TABLE a ADD COLUMN ...;`
B1: `CREATE TABLE b ( ... );`
B2: `ALTER TABLE b ADD COLUMN ...;`
A1 and B1 have no dependencies, while A2 depends on A1 and B2 depends on B1, and no ordering dependencies are required among A2 and B2. I assert this is a preorder and not the total ordering that requires linearization.
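The partial-order argument can be made concrete: model each migration's explicit dependencies and apply any topological order. A minimal sketch (names illustrative; a real implementation would also need cycle detection):

```go
package main

import (
	"fmt"
	"sort"
)

// topoOrder returns an application order in which every migration comes
// after the migrations it depends on. Several such orders can be valid,
// which is exactly what distinguishes this partial order from the total
// order that linearization imposes. Cycle detection is omitted here.
func topoOrder(deps map[string][]string) []string {
	var order []string
	visited := make(map[string]bool)
	var visit func(n string)
	visit = func(n string) {
		if visited[n] {
			return
		}
		visited[n] = true
		for _, d := range deps[n] {
			visit(d) // dependencies first
		}
		order = append(order, n)
	}
	// Visit in sorted key order so the example output is deterministic.
	keys := make([]string, 0, len(deps))
	for k := range deps {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		visit(k)
	}
	return order
}

func main() {
	// A1/B1 are independent; A2 needs A1; B2 needs B1.
	deps := map[string][]string{
		"A1": {}, "A2": {"A1"},
		"B1": {}, "B2": {"B1"},
	}
	fmt.Println(topoOrder(deps)) // prints [A1 A2 B1 B2]
}
```

[A1 B1 A2 B2] and [B1 B2 A1 A2] would satisfy the same dependencies, which is the point of the argument above.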
So how can we model it? Someone here proposed that we put a comment at the top indicating which thing a migration depends on. That works. But using plain timestamps as sequence numbers also works. While we could, if we worked at it, produce counter examples like the COUNT one above, timestamps are actually surprisingly useful.
When we are working, we are implicitly depending on everything that exists in our local repo at the time we make our migration. It's true that when we merge, we unintentionally model more dependencies, and could even bring in one that was created after our branch was started but before our migration was written, but I think the effects of this can be controlled.
I think there are parallels here to merging code and the history of version control solutions. RCS and CVS and even SVN didn't do branches very well, and everything had a mostly linear number. This didn't scale, and that is why BitKeeper was invented and Linus chose to use it (until that relationship fell apart). This was a huge boon which unblocked a lot of asynchronous, divergent development and added merging of histories. Sometimes these actually conflict -- and sometimes they conflict in subtle ways that do not fail to automatically merge! -- but we still use them because they enable our teams to work independently. We use tests - hopefully automated, sometimes not - to ensure that we don't violate any of those subtle merge issues. In the database world, we have a lot more constraint options than just tests, but we have at least tests when it matters.
@dhui first off, thanks for all your hard work. 🙇
It would be difficult to build a one-size-fits-all solution.
I'm personally in favor of the manifest proposal. I think it would solve big issues for the majority of the community.
Is there an experimental branch with this feature?
If not, are you aware of any community-maintained solutions that pair well with `golang-migrate` and use the manifest approach described here?