operator-controller

Ability to indicate that when upgrading, you can't skip y+1

Open ncdc opened this issue 2 years ago • 14 comments

If an operator performs certain actions when upgrading to the next minor version, such as database schema migrations, we should find a way for the operator author to indicate "when you upgrade from the current version x.y.z, you must go to x.y+1.* before you can go to x.y+2.*". This ensures that the required upgrade tasks are not missed.

For a more concrete example, let's say you have 1.0.0 installed. You can upgrade to any 1.0.z. You can upgrade to any 1.1.z. You cannot upgrade to 1.2.z or newer until you've upgraded to 1.1.z. Subsequently, you can't go to 1.3.z until you've gone to 1.2.z.
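To make the rule concrete, here is a minimal sketch of the check described above. It is only an illustration: the `parse` and `can_upgrade` names are hypothetical and not part of operator-controller. It assumes plain `x.y.z` versions with no pre-release or build metadata, and leaves major-version upgrades out of scope.

```python
def parse(version: str) -> tuple:
    """Split 'x.y.z' into a tuple of integers (hypothetical helper)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def can_upgrade(installed: str, target: str) -> bool:
    """Allow a newer patch of the same minor, or any patch of the next minor."""
    i_major, i_minor, i_patch = parse(installed)
    t_major, t_minor, t_patch = parse(target)
    if t_major != i_major:
        return False  # major upgrades are out of scope for this sketch
    if t_minor == i_minor:
        return t_patch > i_patch  # e.g. 1.0.0 -> any newer 1.0.z
    # e.g. 1.0.0 -> any 1.1.z, but not 1.2.z or newer
    return t_minor == i_minor + 1
```

With 1.0.0 installed, `can_upgrade("1.0.0", "1.1.3")` is allowed and `can_upgrade("1.0.0", "1.2.0")` is rejected. Once 1.1.3 is installed, 1.2.0 becomes reachable.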

This specification needs to be in the bundle/catalog metadata, controlled by the operator author. It isn't something a user would set on an Operator CR spec.

ncdc avatar Nov 14 '23 17:11 ncdc

RFC: Initial upgrade support talks about something similar as a potential future improvement. The idea is to replace the ForceSemverUpgradeConstraints feature gate with new functionality on the catalog side, so that a package author can say "I want my package to use semver" or "I want my package to use Replaces/Skip/SkipRanges semantics".

This will ensure backward compatibility of OLMv1 with OLMv0 catalogs and give package authors control over how their packages are upgraded. For example, if they already have a package where 1.0.x is not compatible with 1.2.x, they could choose to stick with OLM Replaces/Skip/SkipRanges semantics.

Most likely there are packages in the wild where even patch versions have breaking changes.

In my opinion:

  • Authors who have breaking changes between >=x.y.z <x.y+1.0 (patch versions) or more widely between >=x.y.z <x+1.0.0 (minor versions) should not be using semver. Let's enable them to stick with Replaces/Skip/SkipRanges semantics.
  • By adding extra knobs like this on top of semver we risk creating a custom flavour of semver and a confusing UX. I think we should avoid doing this if we can, and in this situation I believe we can.

m1kola avatar Nov 14 '23 18:11 m1kola

I am not a fan of replaces/skips/skipRanges. I much prefer a way to indicate that you have to pass through a specific version/range before going further. Things like database migrations are extremely common in apps that use a backend database. In my head, "make sure this migration happens" is not the same category as "breaking change"; it's an implementation detail. I would not consider this a deviation from semver.

ncdc avatar Nov 14 '23 18:11 ncdc

In my head, "make sure this migration happens" is not the same category as "breaking change"; it's an implementation detail. I would not consider this a deviation from semver.

If a package cannot move from 1.0.0 to 1.2.0 without upgrading to some intermediate version, that means version 1.0.0 is not backward compatible with version 1.2.0. Packages like this are not compliant with semver, because semver literally says to increment the:

  1. MAJOR version when you make incompatible API changes
  2. MINOR version when you add functionality in a backward compatible manner
  3. PATCH version when you make backward compatible bug fixes

And perhaps they should not be using semver for upgrades, because it sets false expectations.

IMO we need to decide - do we want OLM to use semver? Or do we want OLM to use semver-like semantics (but not semver)?

I'm happy to discuss the topic of database migrations specifically in more detail (I have some experience in this area). But the summary is basically what I mentioned in a more general context above: if software at version 1.2.0 cannot work as expected with the schema from version 1.0.0 (or the other way around), that means 1.0.0 is not compatible with 1.2.0.

I am not a fan of replaces/skips/skipRanges. I much prefer a way to indicate that you have to pass through a specific version/range before going further.

I'm also not a fan of replaces/skips/skipRanges, but:

  • They already exist and as far as I know we plan to support them in OLMv1 anyway
  • replaces/skips/skipRanges provide a way to indicate that a user has to pass through a specific version/range before going further
  • My understanding is also that we want semver to become the primary semantic for upgrades.

Given the above, I think adding a second way to deal with this use case is unnecessary and can potentially harm UX (false expectations of semver that is not really semver).

m1kola avatar Nov 15 '23 11:11 m1kola

Best case scenario: app includes all historical migration logic such that you can always upgrade from any version to any other version

Next best case scenario: you have to step through each minor (y) at least once before going to y+1. IMHO for something like a required database migration, this still fulfills the spirit of adding functionality in a backward compatible manner. To the end user (not the person upgrading the app), all functionality is backward compatible. To the person upgrading the app, this is a "gotcha" where they can't skip a y level. I don't think it is unreasonable for us to allow the package author to indicate "you must traverse through each y level at least once."

Worst case scenario 1: when the app has a database migration, it's a new major version

Worst case scenario 2: when the app has a database migration, the app author uses skips/replaces to control upgrade flow.

I'm also not a fan of replaces/skips/skipRanges, but: They already exist and as far as I know we plan to support them in OLMv1 anyway

I would like to deprecate and eventually remove them entirely, and have everyone move to semver.

ncdc avatar Nov 15 '23 14:11 ncdc

Is it common for projects to follow strict semantic versioning? Even very-well-funded projects like Kubernetes don't, and require migrations through versions. Is it realistic to only support strict semver?

stevekuznetsov avatar Nov 15 '23 14:11 stevekuznetsov

Best case scenario: app includes all historical migration logic such that you can always upgrade from any version to any other version

That is a very common approach. In that case there should be no issue going from 1.0.0 to 1.2.0. But downgrades might be problematic (depending on the nature of the migrations). I think we are good here already: our semver upgrade constraints do not allow downgrading at the moment. One has to explicitly set the upgrade policy to Ignore, which implies that the operation was independently verified, i.e. OLM provides no guarantees here.

Next best case scenario: you have to step through each minor (y) at least once before going to y+1. IMHO for something like a required database migration, this still fulfills the spirit of adding functionality in a backward compatible manner.

This means the package in this scenario is not semver compliant. If one cannot skip a minor version without breaking something, there is no compatibility. E.g. if I can go from 1.0.0 to 1.1.0, but can't go straight to 1.2.0, this means that:

  • 1.1.0 is compatible with 1.0.0
  • 1.2.0 is not compatible with 1.0.0

Hence the package is not semver compliant. The whole 1.x line should be backward compatible according to semver.
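For contrast, the strict reading argued here can be sketched as follows. This is only an illustration of that interpretation, not OLM's actual resolver logic. The `strict_semver_allows` name is hypothetical, versions are assumed to be plain `x.y.z`, and the special status of 0.y.z releases in the semver spec is ignored.

```python
def parse(version: str) -> tuple:
    """Split 'x.y.z' into a tuple of integers (hypothetical helper)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def strict_semver_allows(installed: str, target: str) -> bool:
    """Treat every newer version within the same major as a valid upgrade."""
    current, goal = parse(installed), parse(target)
    same_major = current[0] == goal[0]
    # Tuples compare element by element, so (1, 2, 0) > (1, 0, 0).
    return same_major and goal > current
```

Under this reading `strict_semver_allows("1.0.0", "1.2.0")` holds, which is exactly the jump that a package with a mandatory 1.1.z migration cannot support.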

Worst case scenario 1: when the app has a database migration, it's a new major version

When using semver - it depends. My thinking is: if each version includes the whole migration history and one can upgrade from 1.0.0 straight to 1.2.0 without having to go through 1.1.x or some other version, we can say that upgrades are compatible. If each version only includes limited history, then it may or may not be compatible, depending on the nature of the migrations.

But incrementing the major version seems reasonable when there are breaking changes in migrations. It is compliant with semver.

When using replaces/skips/skipRange - author doesn't have to increment major version.

Worst case scenario 2: when the app has a database migration, the app author uses skips/replaces to control upgrade flow.

Also sounds reasonable if the package author wants to have explicit control over the upgrade path or can't follow semver for some reason.

I'm also not a fan of replaces/skips/skipRanges, but: They already exist and as far as I know we plan to support them in OLMv1 anyway

I would like to deprecate and eventually remove them entirely, and have everyone move to semver.

That is a whole new perspective on the question. Should we decide on that first? It seems like it will be easier to decide UX taking this (or some other) approach in mind.


Is it common for projects to follow strict semantic versioning? Even very-well-funded projects like Kubernetes don't, and require migrations through versions. Is it realistic to only support strict semver?

Not everyone follows semver. Even projects who say they follow semver often in reality do not do that. The thing is - we have a second option: replaces/skips/skipRanges. Today it is controlled by a feature gate on operator-controller, but the idea was to enable package authors to choose which one to use on the catalog side.

m1kola avatar Nov 15 '23 15:11 m1kola

I'm not sure that I can change your mind, but I still believe that what I'm proposing is both reasonable and better than replaces/skips/skipRanges as the sole driver for the upgrade graph.

ncdc avatar Nov 15 '23 15:11 ncdc

The thing is - we have a second option: replaces/skips/skipRanges.

We also have many, many years of customer feedback on this system and it is not very positive.

stevekuznetsov avatar Nov 15 '23 15:11 stevekuznetsov

@ncdc I'm not saying it is unreasonable. As I said above - I think we should make a decision on whether we want to deprecate and remove replaces/skips/skipRanges or not. If we want to deprecate it - then that is a different story and different UX.

If we want to keep replaces/skips/skipRanges - I think having two ways to do the same thing is going to be suboptimal and potentially confusing.

What about this?

Should we decide on that first? It seems like it will be easier to decide UX taking this (or some other) approach in mind.


We also have many, many years of customer feedback on this system and it is not very positive.

I think if we can cover 80% of use cases with semver and make the remaining 20% possible (not necessarily easy) - that is a good outcome. It will be additional motivation to stop using replaces/skips/skipRanges for new projects and use semver. But again, that is only if we want to keep replaces/skips/skipRanges.

m1kola avatar Nov 15 '23 15:11 m1kola

We have had quite a few organizations indicate that they need more control than "pure semver", and IMHO some structured latitude is not antithetical to semver.

MINOR version when you add functionality in a backward compatible manner

doesn't mean that one has to be expected to be able to make the leap across multiple minor versions.

For example, the semver template is designed specifically to take a collection of semver-ordered bundles and organize them into a skips/replaces graph with replaces between minor versions so that the user can skip all Z but must hop from Y to Y across the graph.

It puts some guard rails on the overly-complex replaces/skips/skipRange relationships.
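As a rough illustration of the graph shape described above (not the actual output of the semver template), the sketch below takes a list of plain `x.y.z` versions and makes the highest patch of each minor skip its lower patches and replace the head of the previous minor. The `build_graph` function and the dictionary layout are assumptions made for this example only.

```python
from collections import defaultdict


def fmt(version: tuple) -> str:
    """Render a (major, minor, patch) tuple back to 'x.y.z'."""
    return ".".join(str(part) for part in version)


def build_graph(versions: list) -> list:
    """Group versions by minor and link the minor heads with 'replaces'."""
    by_minor = defaultdict(list)
    for raw in versions:
        parsed = tuple(int(part) for part in raw.split("."))
        by_minor[parsed[:2]].append(parsed)

    entries = []
    previous_head = None
    for key in sorted(by_minor):
        patches = sorted(by_minor[key])
        head = patches[-1]
        # Only chain heads that share a major version in this sketch.
        same_major = previous_head is not None and previous_head[0] == head[0]
        entries.append({
            "name": fmt(head),
            "replaces": fmt(previous_head) if same_major else None,
            "skips": [fmt(p) for p in patches[:-1]],  # all lower Z in this Y
        })
        previous_head = head
    return entries
```

For the input `["1.0.0", "1.0.1", "1.1.0", "1.1.2", "1.2.0"]`, the head 1.1.2 skips 1.1.0 and replaces 1.0.1, and 1.2.0 replaces 1.1.2, so a user can jump over any Z but still has to pass through each Y.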

grokspawn avatar Nov 16 '23 15:11 grokspawn

MINOR version when you add functionality in a backward compatible manner

doesn't mean that one has to be expected to be able to make the leap across multiple minor versions.

Maybe I'm interpreting the semver spec incorrectly. In my head whole 1.x is meant to be backward compatible. If that is not the case in reality - then the need for extra knobs seems more prominent.

m1kola avatar Nov 16 '23 15:11 m1kola

I think you understand the semantic versioning spec correctly, it's just that the theoretical ideal is rarely, if ever, adhered to. I mentioned above that even extremely-well-staffed projects like Kubernetes that take backwards compatibility very seriously do require upgrade flows that are more restrictive than pure semver.

stevekuznetsov avatar Nov 16 '23 16:11 stevekuznetsov

I've been thinking about this more and looking into different projects (including revisiting Kubernetes' version skew policy). I'm now more convinced that this is useful (especially if we want to deprecate and remove replaces/etc).

I'm still concerned that it will be a confusing UX for cluster admins (package users). From what I normally see, it is the package user's responsibility to follow the upgrade constraints correctly. E.g. Kubernetes advises:

  • Ensure that components are on the most recent patch version of your current minor version.
  • Upgrade components to the most recent patch version of the target minor version.

If I understand correctly - we want package authors to define constraints like this and OLM to enforce these constraints.
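If such constraints were enforced automatically, the resulting upgrade path might be computed along these lines. This is a minimal sketch under stated assumptions, not OLM behavior: `plan_upgrade` is a hypothetical helper, versions are plain `x.y.z`, only upgrades within one major version are handled, and every intermediate minor is assumed to be mandatory.

```python
def parse(version: str) -> tuple:
    """Split 'x.y.z' into a tuple of integers (hypothetical helper)."""
    return tuple(int(part) for part in version.split("."))


def fmt(version: tuple) -> str:
    """Render a (major, minor, patch) tuple back to 'x.y.z'."""
    return ".".join(str(part) for part in version)


def plan_upgrade(installed: str, target: str, available: list) -> list:
    """Return the versions to install, in order, to reach the target."""
    current, goal = parse(installed), parse(target)
    if goal[0] != current[0] or goal <= current:
        raise ValueError("sketch only handles upgrades within one major version")
    catalog = sorted(parse(v) for v in available)
    path = []
    for minor in range(current[1], goal[1]):
        newer = [v for v in catalog
                 if v[:2] == (current[0], minor) and v > current]
        if newer:
            # Move to the most recent patch of this minor first.
            current = newer[-1]
            path.append(current)
        elif minor != current[1]:
            # An intermediate minor with no release cannot be passed through.
            raise ValueError(f"no version available for minor {minor}")
    path.append(goal)
    return [fmt(v) for v in path]
```

With `1.0.0` installed and `1.0.3`, `1.1.0`, `1.1.4`, `1.2.0`, `1.2.1` available, a request for `1.2.1` yields the path `1.0.3`, `1.1.4`, `1.2.1`, mirroring the two steps Kubernetes recommends at each minor boundary.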

We need to communicate very clearly to cluster admins (via docs and/or some messages) that a package does not follow pure semver, so there are no false expectations.


it's just that the theoretical ideal is rarely, if ever, adhered to.

I agree with you. That is what I meant when I said: "Not everyone follows semver. Even projects who say they follow semver often in reality do not do that".

m1kola avatar Nov 16 '23 16:11 m1kola