operator-sdk
How to troubleshoot a failure during run bundle-upgrade
Type of question
Best practices
How to implement a specific feature
Question
I'm trying to test the upgrade path for my Helm-based operator using operator-sdk's bundle commands. The upgrade fails because of a missing install plan, but I'm not sure how to get more information about what I'm doing wrong.
What did you do?
# Install current release of the operator
operator-sdk run bundle quay.io/pelorus/pelorus-operator-bundle:v0.0.9 --namespace test-pelorus-operator
# Once installed successfully, attempt the upgrade
operator-sdk run bundle-upgrade quay.io/pelorus/rc-pelorus-operator-bundle:vpr1157-34d9eef --namespace test-pelorus-operator --verbose
What did you expect to see?
I hoped to see the upgrade succeed.
What did you see instead? Under which circumstances?
The upgrade failed after the old registry pod was deleted:
INFO[0018] Generated a valid Upgraded File-Based Catalog
INFO[0020] Created registry pod: quay-io-pelorus-rc-pelorus-operator-bundle-vpr1157-34d9eef
INFO[0020] Updated catalog source pelorus-operator-catalog with address and annotations
INFO[0021] Deleted previous registry pod with name "quay-io-pelorus-pelorus-operator-bundle-v0-0-9"
FATA[0120] Failed to run bundle upgrade: install plan is not available for the subscription pelorus-operator-v0-0-9-sub: context deadline exceeded
I also see the following subscriptions and installplans:
$ oc get subscription -n test-pelorus-operator
NAME                                                            PACKAGE            SOURCE                     CHANNEL
grafana-operator-v4-community-operators-openshift-marketplace   grafana-operator   community-operators        v4
pelorus-operator-v0-0-9-sub                                     pelorus-operator   pelorus-operator-catalog   operator-sdk-run-bundle
prometheus-beta-community-operators-openshift-marketplace       prometheus         community-operators        beta
$ oc get installplan -n test-pelorus-operator
NAME            CSV                       APPROVAL   APPROVED
install-t2fjd   grafana-operator.v4.8.0   Manual     true
NOTE: the prometheus and grafana operators are dependencies of this operator, which is why you see them in this namespace.
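Since no install plan exists for pelorus-operator-v0-0-9-sub, the subscription and its catalog source are the obvious places to look next. The following is only a sketch of commands that might surface more detail, not a confirmed diagnosis. The resource names come from the output above; it assumes the catalog source lives in the same namespace as the subscription and that OLM runs in the default OpenShift namespace openshift-operator-lifecycle-manager. The helper name diag_commands is hypothetical. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Names taken from the output above.
NS="test-pelorus-operator"
SUB="pelorus-operator-v0-0-9-sub"
# Assumption: the catalog source was created in the same namespace.
CATSRC="pelorus-operator-catalog"
# Assumption: default OLM namespace on OpenShift.
OLM_NS="openshift-operator-lifecycle-manager"

# Print the diagnostic commands, one per line, without executing them.
diag_commands() {
  # Subscription status conditions may indicate why no install plan was created.
  echo "oc describe subscription $SUB -n $NS"
  # Catalog source status shows whether OLM can reach the new registry pod.
  echo "oc get catalogsource $CATSRC -n $NS -o yaml"
  # Confirms that the new registry pod is actually running.
  echo "oc get pods -n $NS"
  # Catalog operator logs may contain dependency resolution errors.
  echo "oc logs deployment/catalog-operator -n $OLM_NS"
}

diag_commands
```

To execute the commands against a cluster instead of printing them, the output can be piped to a shell, for example `diag_commands | sh`.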
Environment
Operator type:
/language helm
Kubernetes cluster type:
OpenShift 4.15
$ operator-sdk version
operator-sdk version: "v1.33.0", commit: "542966812906456a8d67cf7284fc6410b104e118", kubernetes version: "1.27.0", go version: "go1.21.5", GOOS: "linux", GOARCH: "amd64"
$ kubectl version
Additional context
This happens with or without the actual operand resource created, so it seems to be some pretty basic issue, maybe with how the bundle is configured.