InstallPlan adding "replaces" field to CSV
Bug Report
What did you do? I created a catalog image without any replaces field in the CSV.
{"name":"ibm-truststore-mgr.v1.1.0","version":"1.1.0","replaces":"ibm-truststore-mgr.v1.0.0","skipRange":">1.0.0 <1.1.0","skips":null,"channelName":"1.x","bundlePath":"<registry-path>/ibm-truststore-mgr-operator-bundle@sha256:e17760f6b71f09dbaed848a948e9da485685221e1415b241d19d9a0b1c02ce29"}
{"name":"ibm-truststore-mgr.v1.2.0","version":"1.2.0","replaces":"ibm-truststore-mgr.v1.1.0","skipRange":">1.1.0 <1.2.0","skips":null,"channelName":"1.x","bundlePath":"<registry-path>/ibm-truststore-mgr-operator-bundle@sha256:2cd046c6636f4608ee8cb0335a9d227527e73fdd79571e8fbe05aedc42f25edc"}
{"name":"ibm-truststore-mgr.v1.2.2","version":"1.2.2","replaces":"ibm-truststore-mgr.v1.2.0","skipRange":">=1.2.0 <1.2.2","skips":null,"channelName":"1.x","bundlePath":"<registry-path>/ibm-truststore-mgr-operator-bundle@sha256:f0a8d46d2697e36f650246aff2023a6dfb7211b68a3addc17a2d7d1aadddbf04"}
{"name":"ibm-truststore-mgr.v1.3.0-pre.stable","version":"1.3.0-pre.stable","replaces":null,"skipRange":">=1.0.0 <=99.0.0","skips":null,"channelName":"stable","bundlePath":"<registry-path>/ibm-truststore-mgr-operator-bundle:latest-stable"}
{"name":"ibm-truststore-mgr.v1.3.0-pre.tnoppc","version":"1.3.0-pre.tnoppc","replaces":null,"skipRange":">=1.0.0 <=99.0.0","skips":null,"channelName":"tnoppc","bundlePath":"<registry-path>/ibm-truststore-mgr-operator-bundle:latest-tnoppc"}
tnoppc is the default channel and has no replaces in the CSV.
What did you expect to see? Subscription, CSV and installplan should all work as usual, installing the operator deployment.
What did you see instead? Under which circumstances?
For some reason, installplan status field is showing a replaces field pointing to the same CSV name, causing it to be added to my CSV. It is causing a loop as the CSV cannot replace itself.
This issue is intermittent though. Sometimes it just works and I don't see replaces line in the CSV spec, which is very weird. How can the same catalog/channel show different deployment behaviors?
What did add replaces field in my CSV? It is clearly not there when I look at the installplan config map before approving it.
Environment
- operator-lifecycle-manager version:
b3aabf273e0ac0bd6e84d257332e2eac08f5e6cf
- Kubernetes version information:
Openshift 4.8: Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6+b82a451", GitCommit:"cefce093e4e5bc9a1916eb5a489ed37c7d467f6f", GitTreeState:"clean", BuildDate:"2022-02-05T06:58:30Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
- Kubernetes cluster kind:
Possible Solution
Identify why some OLM component is adding replaces field to my CSV.
Additional context
My install plan shows a replaces field even though I don't have one in the CSV.
{"kind":"ConfigMap","name":"10f8e94f1384dd22e6072068ab641b53552f83704f22f271a7f156d4dd6c397","namespace":"openshift-marketplace","catalogSourceName":"ibm-truststore-mgr-operators","catalogSourceNamespace":"openshift-marketplace","replaces":"ibm-truststore-mgr.v1.3.0-pre.tnoppc","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"truststore-mgr.ibm.com\",\"kind\":\"Truststore\",\"version\":\"v1\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"ibm-truststore-mgr\",\"version\":\"1.3.0-pre.tnoppc\"}}]}"}
A few more details, trying to be as generic as possible with the description. The problem with replaces happens when we create a subscription that resolves to a release from a channel in our catalog that contains only a single release. There is nothing special about the subscription. For example:
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: ibm-truststore-mgr-stable-ibm-truststore-mgr-operators-openshift-marketplace
  namespace: ibm-sls
spec:
  channel: stable
  name: ibm-truststore-mgr
  source: ibm-truststore-mgr-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Automatic
I then end up with this:
% oc get csv
NAME                                   DISPLAY                  VERSION            REPLACES                               PHASE
ibm-truststore-mgr.v1.2.3-pre.stable   IBM Truststore Manager   1.2.3-pre.stable   ibm-truststore-mgr.v1.2.3-pre.stable   Pending
The CSV ends up with a replaces attribute:
provider:
  name: IBM
  url: https://ibm.com
replaces: ibm-truststore-mgr.v1.2.3-pre.stable
version: 1.2.3-pre.stable
The CSV named ibm-truststore-mgr.v1.2.3-pre.stable is in the state Pending. The only status condition is:
% oc get csv ibm-truststore-mgr.v1.2.3-pre.stable -o yaml
status:
  cleanup: {}
  conditions:
  - lastTransitionTime: "2022-04-28T17:28:46Z"
    lastUpdateTime: "2022-04-28T17:28:46Z"
    message: requirements not yet checked
    phase: Pending
    reason: RequirementsUnknown
  lastTransitionTime: "2022-04-28T17:28:46Z"
  lastUpdateTime: "2022-04-28T17:28:46Z"
  message: requirements not yet checked
  phase: Pending
  reason: RequirementsUnknown
Looking at the logs for the olm-operator in the namespace openshift-operator-lifecycle-manager the following error messages are observed:
{"level":"error","ts":1651166928.3758118,"logger":"controllers.operatorcondition","msg":"Error ensuring OperatorCondition Deployment EnvVars","request":"ibm-sls/ibm-truststore-mgr.v1.2.3-pre.stable","error":"Deployment.apps \"ibm-truststore-mgr-controller-manager\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
{"level":"error","ts":1651166928.3759263,"logger":"controller-runtime.manager.controller.operatorcondition","msg":"Reconciler error","reconciler group":"operators.coreos.com","reconciler kind":"OperatorCondition","name":"ibm-truststore-mgr.v1.2.3-pre.stable","namespace":"ibm-sls","error":"Deployment.apps \"ibm-truststore-mgr-controller-manager\" not found","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/build/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/build/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:99"}
time="2022-04-28T17:28:48Z" level=warning msg="Unable to replace previous CSV" csv=ibm-truststore-mgr.v1.2.3-pre.stable error="CSV being replaced is in phase Pending instead of Replacing" id=CKA7e namespace=ibm-sls phase=Pending
time="2022-04-28T17:28:49Z" level=warning msg="Unable to replace previous CSV" csv=ibm-truststore-mgr.v1.2.3-pre.stable error="CSV being replaced is in phase Pending instead of Replacing" id=oUMul namespace=ibm-sls phase=Pending
Hello @terenceq, thanks for submitting this issue and for using OLM.
For some reason, installplan status field is showing a replaces field pointing to the same CSV name, causing it to be added to my CSV. It is causing a loop as the CSV cannot replace itself.... What did add replaces field in my CSV?
This is expected behavior.
When OLM is determining if an upgrade is available for an operator, it will look at the existing CSV and determine if:
- It is explicitly replaced by a newer CSV via the replaces field.
- It is replaced because it exists within a skips or skipRange field of a newer CSV.
If the existing CSV has an upgrade due to the second option, the newer CSV will have its replaces field set to the name of the existing CSV. This allows OLM to use a single process for upgrading CSVs on cluster.
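The two cases above can be sketched roughly as follows. This is only an illustration of the described behavior, not OLM's actual code: the function names, the dictionary layout (borrowed from the catalog JSON in the report) and the example candidate `ibm-truststore-mgr.v1.3.0` are assumptions, and the version comparison is simplified to plain `MAJOR.MINOR.PATCH` numbers without prerelease handling.

```python
import operator

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}


def release(version):
    # Simplification: "1.2.0" -> (1, 2, 0); prerelease suffixes are not handled.
    return tuple(int(part) for part in version.split("."))


def in_skip_range(version, skip_range):
    # Space-separated comparators are treated as a logical AND.
    for comparator in skip_range.split():
        op = comparator[:2] if comparator[:2] in OPS else comparator[:1]
        if not OPS[op](release(version), release(comparator[len(op):])):
            return False
    return True


def resolve_replaces(installed, candidate):
    """Return the CSV name the candidate would replace, or None if it is
    not an upgrade for the installed CSV."""
    # Case 1: explicit replaces edge.
    if candidate.get("replaces") == installed["name"]:
        return installed["name"]
    # Case 2: the installed CSV is listed in skips or falls inside skipRange.
    # The resolved CSV then gets replaces set to the installed CSV's name.
    if installed["name"] in (candidate.get("skips") or []):
        return installed["name"]
    skip_range = candidate.get("skipRange")
    if skip_range and in_skip_range(installed["version"], skip_range):
        return installed["name"]
    return None


installed = {"name": "ibm-truststore-mgr.v1.1.0", "version": "1.1.0"}
# Hypothetical candidate with no replaces of its own, only a skipRange.
candidate = {"name": "ibm-truststore-mgr.v1.3.0", "version": "1.3.0",
             "replaces": None, "skips": None, "skipRange": ">=1.0.0 <1.3.0"}
print(resolve_replaces(installed, candidate))
```

In this sketch the candidate carries no replaces value in the catalog, yet the resolved result names the installed CSV, which matches the field appearing on the cluster without being present in the bundle.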
It is causing a loop as the CSV cannot replace itself.
This is happening because the skipRange you're using is >=1.0.0 <=99.0.0, whose upper bound is greater than the version of the CSV (v1.3.0-xxx), so the range includes the CSV itself. The skipRange must end below the version of the CSV. In this case, it seems like you should set the skipRange to >=1.0.0 <SEMVER, where SEMVER is the version of the CSV.
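A quick way to check a catalog entry for this problem is to test whether its own version falls inside its skipRange. The sketch below assumes ordinary semver precedence, where a prerelease such as 1.3.0-pre.stable sorts just below 1.3.0, and assumes space-separated comparators are ANDed together. The helper names are made up for illustration, build metadata is not handled, and OLM's real range parsing may differ in details.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt,
       "<": operator.lt, "=": operator.eq}


def semver_key(text):
    """Sort key following semver precedence (build metadata not handled)."""
    core, _, pre = text.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    # Numeric prerelease identifiers sort below alphanumeric ones, and a
    # release sorts above every prerelease of the same core version.
    ids = tuple((0, int(p), "") if p.isdigit() else (1, 0, p)
                for p in pre.split(".")) if pre else ()
    return (major, minor, patch, 0 if pre else 1, ids)


def in_range(version, skip_range):
    for comparator in skip_range.split():
        op, bound = re.fullmatch(r"(>=|<=|>|<|=)?(.+)", comparator).groups()
        if not OPS[op or "="](semver_key(version), semver_key(bound)):
            return False
    return True


def skips_itself(entry):
    """True if the entry's skipRange covers the entry's own version."""
    skip_range = entry.get("skipRange")
    return bool(skip_range) and in_range(entry["version"], skip_range)


# Values taken from the stable channel entry in the report.
reported = {"version": "1.3.0-pre.stable", "skipRange": ">=1.0.0 <=99.0.0"}
# Same entry with the upper bound set to the CSV's own version.
adjusted = {"version": "1.3.0-pre.stable", "skipRange": ">=1.0.0 <1.3.0-pre.stable"}

print(skips_itself(reported))  # True: the CSV would be asked to replace itself
print(skips_itself(adjusted))  # False
```

Running a check like this over every entry before publishing a catalog would flag a self-referencing range before a Subscription ever resolves it.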
This issue is intermittent though. Sometimes it just works and I don't see replaces line in the CSV spec, which is very weird. How can the same catalog/channel show different deployment behaviors?
The replaces field is only set during upgrades. I suspect you saw a blank replaces field when installing the operator from scratch rather than upgrading from an existing version. If this is happening at other times, please share the steps to reproduce.
Note: Edited for clarity.