How to update CR when an operator is upgraded
How to update CR when an operator is upgraded #1354 This is an existing question, but I'm not able to open the previous issue, so I'm creating a new one. https://github.com/operator-framework/operator-lifecycle-manager/issues/1354
You should structure your controllers to reconcile all CRs in the cluster when they start up. This accounts for:
- Any changes brought in by the new operator version
- Any changes that happened in the cluster while the operator was not running

If you're using a standard operator-sdk/kubebuilder/controller-runtime based operator, you're very likely already getting this for free.
@joelanford For example, I have two versions of an operator:
- Operator V1 - CRD V1 - alpha1
- Operator V2 - CRD V2 - alpha2
My Operator V1 is already deployed and there are 100 CRs at the alpha1 version. Now if I publish a new version V2 of my operator to OLM's catalog, OLM will auto-detect that operator V2 is available and deploy it.
Here are a few questions I have:
- What happens to the 100 CRs with this new version update?
- Is OLM going to handle updating all of these 100 CRs to the new version as part of the operator update?
- If yes, how is the status of these 100 CRs reported?
- Also, after all 100 CRs are updated to the latest version, is the CRD's V1 version going to be deleted?
Can you explain in a little more detail how OLM handles CRD and CR upgrades, or point me to any documentation if available?
Can someone please provide input here? I am looking for answers to the questions below:
- Does OLM provide the capability to update CRs on an operator (CRD) update?
- If yes, please share the reference documentation.
- If not, is the operator's controller solely responsible for doing this?
@joelanford can you please provide your input on this? Can someone else please reply to the question above?
@wslaasya When you publish a newer version of your CRD, there are a couple of safeguards that OLM runs before proceeding with the update. Primarily, it checks whether the existing set of CRs on the cluster can be converted to the new version; if they cannot, the update is stopped. This is why you should generally use versioned CRDs and establish a migration path.
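As a hedged illustration of what a versioned CRD looks like, here is a minimal (hypothetical) manifest that serves both versions while storing new objects at the newer one. The `widgets.example.com` name and the empty schemas are placeholders, not anything from this thread:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
  scope: Namespaced
  versions:
  - name: v1alpha1
    served: true
    storage: false   # still served, so existing CRs remain readable
    schema:
      openAPIV3Schema:
        type: object
  - name: v1alpha2
    served: true
    storage: true    # new and updated objects are persisted at this version
    schema:
      openAPIV3Schema:
        type: object
  conversion:
    strategy: None   # use Webhook when the two schemas differ structurally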
The next thing that's important is that something on the cluster actually needs to update all the existing CRs to the new schema version of the newer CRD. This is in order to avoid data loss problems, and it is described in more detail here
OLM doesn't do this because this is generally seen as a problem to be solved at the API server level. So the kube storage version migrator was developed as part of this KEP: https://github.com/kubernetes/enhancements/pull/2856 - development there is currently stalled.
So as of today, your operator would be responsible for updating all its CRs to the newest CRD version. You should pay attention to the scenario where you remove a CRD version when going from one CRD spec in your package to the next; that's when the problems discussed in the StoredVersion API might kick in. For now, the best thing to do is not to remove CRD versions between operator updates and to have your operator auto-migrate all stored CR versions upon startup.
@wslaasya thanks for bringing this up -- we don't necessarily think that OLM should handle updating the CRs themselves; that's more up to the api-server. @awgreene worked on the storage version migrator proposal that was referenced in the comment above, which would help with this particular issue, but unfortunately it is held up at the moment for technical reasons.