synapse icon indicating copy to clipboard operation
synapse copied to clipboard

Bail during start-up if there are old background updates.

Open clokep opened this issue 2 years ago • 6 comments

This is a basic implementation to enforce that some old background updates have run.

My hope is that this could be used to avoid future situations like #16192. This would fix #16047 in a simplistic manner and doesn't attempt to differentiate between different types of background updates.

This still would require us to do thinking about what old updates are now required, but I think it would fix the issue where we have to continually create background jobs due to dependencies.

TODO

  • [x] Enforce that ordering is filled in (non-nullable).
  • [x] ~~Maybe try to fix that BACKGROUND_UPDATES_COMPAT_VERSION < last full schema dump?~~
  • [x] Documentation.

clokep avatar Sep 28 '23 13:09 clokep

Putting up for review as I'd be interested in feedback on this. Is this approach worth pursuing?

clokep avatar Sep 28 '23 13:09 clokep

One thought I had was that we need to think a bit about how this interacts with security releases, i.e. if we put out a high severity fix would this potentially block large servers from upgrading (if they were still churning through a large BG update)?

erikjohnston avatar Sep 29 '23 09:09 erikjohnston

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

(See e.g. the rebuild user directory background update, for example)

DMRobertson avatar Sep 29 '23 09:09 DMRobertson

I think the response to both those is the same -- you want the minimum background update version to trail ages behind your current release. So it really becomes a process thing.

You could still have an issue where a deploy of Synapse is also very far behind and then needs to jump due to security releases though, I don't have a good answer to that.

Similarly: suppose we write a bad background update that takes aaaaaaaaaaages to run. Months later we discover this, after we've bumped the schema version. Do we have the ability to patch the background update itself?

I think the (annoying) solution you do is a point release before you bump the required version. People can then update to that version with the improved performance.

clokep avatar Sep 29 '23 11:09 clokep

Enforce that ordering is filled in (non-nullable).

ordering is NOT NULL, but has a default of 0. 🤷

clokep avatar Oct 02 '23 17:10 clokep

I think this is ready for an initial review for real.

clokep avatar Oct 16 '23 20:10 clokep