fix(async-mig): Handle bad upgrade past required async migration better
Problem
There are various problems we run into:
- The UI didn't load if a migration definition wasn't available, because get_parameter_definitions throws.
- is_hidden, introduced in https://github.com/PostHog/posthog/pull/10758, works only as long as we never release it: once we push a version where the migration is no longer hidden, we'd be adding the info to the DB. For the current release this is fine, as we haven't released it yet.
- The migration can get stuck in the Starting state if the worker was down or if someone tried to run the migration when the code wasn't available yet. In both cases, getting out of this state required connecting to Postgres and updating it.
context: https://posthog.slack.com/archives/C01MM7VT7MG/p1658349746616759?thread_ts=1658349390.528819&cid=C01MM7VT7MG
Changes
- Loading the async migrations page works if we have a migration in the DB that doesn't have matching code (rolled back when trying to bypass an async migration).
- Disallow starting a migration not in Starting or NotStarted state.
- Fail the migration at start-up if its definition is not available, and give a better error for invalid versions.
- Migrations in the Starting state can be stopped from the UI. We can trigger a stop and a rollback for the migration, which are the same actions available in the Running state. What happens in the backend: if the migration is already running, it is stopped as if the button had been hit in the Running state; if it is still in the Starting state, we try to atomically move it to RolledBack (note that we now check that no one changed the migration state after the pre-checks).
- Instead of is_hidden, we now just have the flag for ignoring the PostHog version, which can be overridden in instance settings, and we can set an unreleased PostHog version on a migration to make it unavailable. This also means we won't need a special per-migration flag in the future, as we currently have for 0006.
- Split migrations into 3 tabs: actionable, future and completed. For future migrations we'll show versions. This is especially useful when folks have updated too far and need to know which version to update to (instead of checking the logs from the migration pod). Also set some min and max versions to make this look nicer.
- Updated the PostHog version comparison too, so folks on master (or a playground pre-release) won't be told that a new version is available.
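
A rough sketch of the state handling described above (the start guard and the atomic Starting to RolledBack move). This is not the code in this PR: `Status`, `MigrationStore`, `can_start` and `stop_from_starting` are hypothetical names, and the in-memory store with a lock only stands in for a conditional update on the migrations table.

```python
import threading
from enum import Enum


class Status(Enum):
    # Hypothetical subset of async migration states.
    NOT_STARTED = "NotStarted"
    STARTING = "Starting"
    RUNNING = "Running"
    ROLLED_BACK = "RolledBack"


class MigrationStore:
    """In-memory stand-in for the migrations table (assumption, not PostHog code)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def set(self, name, status):
        with self._lock:
            self._status[name] = status

    def get(self, name):
        with self._lock:
            return self._status[name]

    def compare_and_set(self, name, expected, new):
        # Mimics a conditional "UPDATE ... WHERE status = expected":
        # the write only happens if nobody changed the state in between.
        with self._lock:
            if self._status.get(name) != expected:
                return False
            self._status[name] = new
            return True


def can_start(status):
    # Only migrations in NotStarted or Starting may be started.
    return status in (Status.NOT_STARTED, Status.STARTING)


def stop_from_starting(store, name):
    # True only if the migration was still in Starting and is now RolledBack.
    return store.compare_and_set(name, Status.STARTING, Status.ROLLED_BACK)
```

If `stop_from_starting` returns False, the state changed after the pre-checks (for example the migration began running), and the caller would fall back to the regular stop path.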
How did you test this code?
Ran locally: added a migration with id=10 to the DB for which no code exists. The UI loads, and that migration can be started and stopped as expected.
Changed the ignore setting to True and saw this error as expected (in the code after version checks).
Note that the pending migrations check already works as needed, looking only at async migrations that are in the version range.
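
For illustration only, the version-range gating with the ignore override could look roughly like the sketch below. `parse_version` and `is_available` are hypothetical helpers, and versions are assumed to be plain dotted integers; the real comparison in the codebase may differ.

```python
def parse_version(version):
    # "1.38.0" -> (1, 38, 0); assumes plain dotted integers with no suffixes.
    return tuple(int(part) for part in version.split("."))


def is_available(current, min_version, max_version, ignore_version=False):
    # The ignore flag (an instance setting in this PR) bypasses the range check.
    if ignore_version:
        return True
    low, high = parse_version(min_version), parse_version(max_version)
    return low <= parse_version(current) <= high
```

With a minimum version that has not been released yet, such as `is_available("1.38.0", "1.99.0", "1.99.99")`, the migration stays unavailable until the ignore flag is set.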

Added some custom migrations to show what things would look like:

You've committed a junk package.lock file. Please remove.
This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.