operator-sdk
Unable to install/uninstall OLM
Bug Report
What did you do?
Install OLM
operator-sdk olm install
Output
INFO[0000] Fetching CRDs for version "latest"
INFO[0000] Fetching resources for resolved version "latest"
I0720 11:14:25.291974 27229 request.go:665] Waited for 1.04716478s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.docker.internal:6443/apis/authorization.k8s.io/v1?timeout=32s
INFO[0012] Creating CRDs and resources
INFO[0012] Creating CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "olmconfigs.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "operators.operators.coreos.com"
INFO[0012] Creating CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0012] Creating Namespace "olm"
INFO[0012] Creating Namespace "operators"
INFO[0012] Creating ServiceAccount "olm/olm-operator-serviceaccount"
INFO[0012] Creating ClusterRole "system:controller:operator-lifecycle-manager"
INFO[0012] Creating ClusterRoleBinding "olm-operator-binding-olm"
INFO[0012] Creating OLMConfig "cluster"
INFO[0014] Creating Deployment "olm/olm-operator"
INFO[0014] Creating Deployment "olm/catalog-operator"
INFO[0014] Creating ClusterRole "aggregate-olm-edit"
INFO[0014] Creating ClusterRole "aggregate-olm-view"
INFO[0014] Creating OperatorGroup "operators/global-operators"
INFO[0014] Creating OperatorGroup "olm/olm-operators"
INFO[0014] Creating ClusterServiceVersion "olm/packageserver"
INFO[0014] Creating CatalogSource "olm/operatorhubio-catalog"
INFO[0014] Waiting for deployment/olm-operator rollout to complete
INFO[0014] Waiting for Deployment "olm/olm-operator" to rollout: 0 of 1 updated replicas are available
FATA[0120] Failed to install OLM version "latest": deployment/olm-operator failed to rollout: timed out waiting for the condition
Tried installing OLM again using operator-sdk olm install
Output
INFO[0000] Fetching CRDs for version "latest"
INFO[0000] Fetching resources for resolved version "latest"
FATA[0002] Failed to install OLM version "latest": detected existing OLM resources: OLM must be completely uninstalled before installation
Hence, tried to uninstall OLM
operator-sdk olm uninstall
Output
INFO[0000] Fetching CRDs for version "v0.21.2"
INFO[0000] Fetching resources for resolved version "v0.21.2"
INFO[0016] Uninstalling resources for version "v0.21.2"
INFO[0016] Deleting CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0016] Deleting CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "olmconfigs.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "operators.operators.coreos.com"
INFO[0017] Deleting CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0018] Deleting Namespace "olm"
FATA[0120] Failed to uninstall OLM: timed out waiting for the condition
Tried other CLI options to see if installation somehow works
operator-sdk olm status
Output
I0720 11:19:00.711017 27283 request.go:665] Waited for 1.047892855s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.docker.internal:6443/apis/batch/v1?timeout=32s
FATA[0001] Failed to get OLM status: error getting installed OLM version (set --version to override the default version): no existing installation found
Tried uninstalling with flags
operator-sdk olm uninstall --version=latest --timeout=5m0s
Output
I0720 11:30:52.838951 27378 request.go:665] Waited for 1.046319563s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.docker.internal:6443/apis/storage.k8s.io/v1beta1?timeout=32s
INFO[0001] Fetching CRDs for version "latest"
INFO[0001] Fetching resources for resolved version "latest"
INFO[0010] Uninstalling resources for version "latest"
INFO[0010] Deleting CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0010] CustomResourceDefinition "catalogsources.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0010] CustomResourceDefinition "clusterserviceversions.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0010] CustomResourceDefinition "installplans.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "olmconfigs.operators.coreos.com"
INFO[0010] CustomResourceDefinition "olmconfigs.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0010] CustomResourceDefinition "operatorconditions.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0010] CustomResourceDefinition "operatorgroups.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "operators.operators.coreos.com"
INFO[0010] CustomResourceDefinition "operators.operators.coreos.com" does not exist
INFO[0010] Deleting CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0010] CustomResourceDefinition "subscriptions.operators.coreos.com" does not exist
INFO[0010] Deleting Namespace "olm"
INFO[0010] Namespace "olm" does not exist
INFO[0010] Deleting Namespace "operators"
INFO[0010] Namespace "operators" does not exist
INFO[0010] Deleting ServiceAccount "olm/olm-operator-serviceaccount"
INFO[0010] ServiceAccount "olm/olm-operator-serviceaccount" does not exist
INFO[0010] Deleting ClusterRole "system:controller:operator-lifecycle-manager"
INFO[0010] ClusterRole "system:controller:operator-lifecycle-manager" does not exist
INFO[0010] Deleting ClusterRoleBinding "olm-operator-binding-olm"
INFO[0010] ClusterRoleBinding "olm-operator-binding-olm" does not exist
INFO[0010] Deleting OLMConfig "cluster"
I0720 11:31:02.999798 27378 request.go:665] Waited for 1.047825429s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.docker.internal:6443/apis/discovery.k8s.io/v1?timeout=32s
FATA[0011] Failed to uninstall OLM: no matches for kind "OLMConfig" in version "operators.coreos.com/v1"
I uninstalled the previous version of Operator SDK and tried to reproduce the issue on the latest version (v1.22.1); the output was the same.
What did you expect to see?
OLM installation successful
What did you see instead? Under which circumstances?
Timeout during OLM installation. OLM cannot be uninstalled completely.
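To narrow down a rollout timeout like the one above, the Deployment and its pods can be inspected directly with kubectl. The following is a minimal sketch, assuming kubectl points at the affected cluster; the function name `diagnose_olm_rollout` is illustrative and not part of operator-sdk. It does not fix anything, it only gathers the usual evidence (pod state, Deployment conditions, events, operator logs).

```shell
# Sketch: collect basic diagnostics for the olm-operator Deployment that
# failed to roll out. The namespace and Deployment name come from the log above.
diagnose_olm_rollout() {
  ns="${1:-olm}"
  deploy="${2:-olm-operator}"

  # Pod phase and restart counts for everything in the namespace.
  kubectl -n "$ns" get pods -o wide

  # Conditions and recent events recorded on the Deployment itself.
  kubectl -n "$ns" describe deployment "$deploy"

  # Namespace events in time order; image pull and scheduling failures show up here.
  kubectl -n "$ns" get events --sort-by=.lastTimestamp

  # Last lines of the operator log, if the container ever started.
  kubectl -n "$ns" logs "deployment/$deploy" --tail=50
}
```

Calling `diagnose_olm_rollout` with no arguments inspects `olm/olm-operator`. If the pods are only slow to start (for example, slow image pulls on a local cluster), a longer wait may be enough; to my knowledge `operator-sdk olm install` accepts the same `--timeout` flag that the uninstall command in this report uses.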
Environment
Kubernetes cluster type:
Local Kubernetes Kind cluster run by Docker Desktop on WSL 2
$ operator-sdk version
operator-sdk version: "v1.21.0", commit: "89d21a133750aee994476736fa9523656c793588", kubernetes version: "1.23", go version: "go1.17.10", GOOS: "linux", GOARCH: "amd64"
# after installing latest version
operator-sdk version: "v1.22.1", commit: "46ab175459a775d2fb9f0454d0b4a8850dd745ed", kubernetes version: "1.24.1", go version: "go1.18.3", GOOS: "linux", GOARCH: "amd64"
$ go version
(if language is Go)
go version go1.17 linux/amd64
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:46:05Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-03T13:38:19Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
Possible Solution
- Please suggest what I can do about the timeout issue so that installation is successful the next time
- How do I completely uninstall OLM?
- Add a section in documentation or provide a script to force uninstall of OLM
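As a starting point for the force-uninstall idea, here is a minimal sketch of a manual cleanup with kubectl. It assumes the quickstart manifests referenced in this report describe what was installed; `OLM_MANIFEST_BASE` and `force_uninstall_olm` are illustrative names, not part of operator-sdk. It is not a guaranteed fix: it only issues the deletes without waiting and then shows what remains.

```shell
# Sketch of a manual OLM cleanup using the upstream quickstart manifests.
# Assumption: these manifests match the resources that were installed.
OLM_MANIFEST_BASE="${OLM_MANIFEST_BASE:-https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart}"

force_uninstall_olm() {
  # Delete namespaces, Deployments, RBAC, and custom resources first, then the CRDs.
  # --ignore-not-found tolerates a half-removed installation;
  # --wait=false returns immediately instead of blocking on finalizers.
  kubectl delete -f "$OLM_MANIFEST_BASE/olm.yaml" --ignore-not-found=true --wait=false
  kubectl delete -f "$OLM_MANIFEST_BASE/crds.yaml" --ignore-not-found=true --wait=false

  # Show what is left, so a namespace stuck in Terminating is visible.
  kubectl get namespace olm operators --ignore-not-found=true
  kubectl get crd -o name | grep 'operators.coreos.com' || echo "no OLM CRDs remain"
}
```

If the CRDs are already gone, kubectl may still report `no matches for kind` for the custom resources in `olm.yaml`; in that state those objects no longer exist, so the message can usually be read as informational.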
For now, I have installed OLM by following OperatorHub's documentation:
kubectl create -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/crds.yaml
kubectl create -f https://raw.githubusercontent.com/operator-framework/operator-lifecycle-manager/master/deploy/upstream/quickstart/olm.yaml
However, I would prefer the ease of use of operator-sdk for OLM commands over kubectl.
I like the idea of a force uninstall script. @varshaprasad96 do you have a preference on how this is fixed? I would like to work on a fix as a contribution
@aemperador We already have the olm install/uninstall/status commands, which download exactly the same manifests that you mentioned here (https://github.com/operator-framework/operator-sdk/issues/5951#issuecomment-1189896347). If those commands are not working, especially in a vanilla Kubernetes cluster, then it is very likely that there is a bug somewhere in the SDK tool, and we would have to investigate it.
@pk-218 I was unable to reproduce this on operator-sdk v1.23.0. Have you tried this version?
No, I haven't tried v1.23; v1.22 is the version I have on my system.
Are you using a single node cluster?
Yes
I was able to install/uninstall using the --version=latest flag on v1.22.
However, I also tried an install on v1.22, then upgraded to v1.23 and tried to uninstall OLM. There I was able to reproduce the same errors you were getting. By reverting to v1.22, I was able to uninstall OLM successfully.
Perhaps you upgraded your operator-sdk version without uninstalling OLM beforehand. I would try reverting to the older version of the SDK with which you installed OLM, uninstalling with the --version=latest flag, and then upgrading to the newest version and installing OLM there.
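The sequence suggested above can be sketched as a small shell function. This is only an illustration under assumptions: both operator-sdk binaries are kept side by side, and the two path arguments (and the function name `reinstall_olm_across_versions`) are hypothetical. The subcommands and the `--version=latest` flag are the ones already used in this thread.

```shell
# Sketch: uninstall OLM with the binary that installed it, then install
# with the newer binary. Both paths are supplied by the caller.
reinstall_olm_across_versions() {
  old_sdk="$1"   # operator-sdk binary that originally installed OLM (e.g. v1.22)
  new_sdk="$2"   # operator-sdk binary to use from now on (e.g. v1.23)

  # Uninstall with the older binary; stop here if that fails, so the newer
  # binary never runs against a half-removed installation.
  "$old_sdk" olm uninstall --version=latest || return 1

  # Install with the newer binary once the cluster is clean.
  "$new_sdk" olm install
}
```

For example: `reinstall_olm_across_versions ./operator-sdk-v1.22 ./operator-sdk-v1.23`, where both file names are placeholders for wherever the binaries were downloaded.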
@jcho02 Yes, I think that caused the issue.
Thank you for looking into it!
Hi @pk-218, it seems this issue reproduces in the latest version (v1.30.0). I was able to use this command to uninstall OLM and check the status of OLM, but there are still errors when I try to install OLM.
Here are the errors I encountered:
INFO[0000] Fetching CRDs for version "latest"
INFO[0000] Fetching resources for resolved version "latest"
INFO[0001] Creating CRDs and resources
INFO[0001] Creating CustomResourceDefinition "catalogsources.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "clusterserviceversions.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "installplans.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "olmconfigs.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "operatorconditions.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "operatorgroups.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "operators.operators.coreos.com"
INFO[0001] Creating CustomResourceDefinition "subscriptions.operators.coreos.com"
INFO[0001] Creating Namespace "olm"
INFO[0001] Creating Namespace "operators"
INFO[0001] Creating ServiceAccount "olm/olm-operator-serviceaccount"
INFO[0001] Creating ClusterRole "system:controller:operator-lifecycle-manager"
INFO[0001] Creating ClusterRoleBinding "olm-operator-binding-olm"
INFO[0001] Creating OLMConfig "cluster"
FATA[0001] Failed to install OLM version "latest": failed to create CRDs and resources: no matches for kind "OLMConfig" in version "operators.coreos.com/v1"
Here is the status:
INFO[0000] Fetching CRDs for version "latest"
INFO[0000] Fetching resources for resolved version "latest"
INFO[0001] Successfully got OLM status for version "latest"
NAME                                          NAMESPACE  KIND                      STATUS
system:controller:operator-lifecycle-manager             ClusterRole               Installed
catalogsources.operators.coreos.com                      CustomResourceDefinition  Installed
operators                                                Namespace                 Installed
olm                                                      Namespace                 Installed
subscriptions.operators.coreos.com                       CustomResourceDefinition  Installed
operators.operators.coreos.com                           CustomResourceDefinition  Installed
operatorgroups.operators.coreos.com                      CustomResourceDefinition  Installed
operatorconditions.operators.coreos.com                  CustomResourceDefinition  Installed
olmconfigs.operators.coreos.com                          CustomResourceDefinition  Installed
installplans.operators.coreos.com                        CustomResourceDefinition  Installed
clusterserviceversions.operators.coreos.com              CustomResourceDefinition  Installed
olm-operator-binding-olm                                 ClusterRoleBinding        Installed
olm-operator-serviceaccount                   olm        ServiceAccount            Installed
cluster                                                  OLMConfig                 olmconfigs.operators.coreos.com "cluster" not found
olm-operator                                  olm        Deployment                deployments.apps "olm-operator" not found
catalog-operator                              olm        Deployment                deployments.apps "catalog-operator" not found
aggregate-olm-edit                                       ClusterRole               clusterroles.rbac.authorization.k8s.io "aggregate-olm-edit" not found
aggregate-olm-view                                       ClusterRole               clusterroles.rbac.authorization.k8s.io "aggregate-olm-view" not found
global-operators                              operators  OperatorGroup             operatorgroups.operators.coreos.com "global-operators" not found
olm-operators                                 olm        OperatorGroup             operatorgroups.operators.coreos.com "olm-operators" not found
packageserver                                 olm        ClusterServiceVersion     clusterserviceversions.operators.coreos.com "packageserver" not found
operatorhubio-catalog                         olm        CatalogSource             catalogsources.operators.coreos.com "operatorhubio-catalog" not found
kubectl version:
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.3", GitCommit:"25b4e43193bcda6c7328a6d147b1fb73a33f1598", GitTreeState:"clean", BuildDate:"2023-06-15T00:38:14Z", GoVersion:"go1.20.5", Compiler:"gc", Platform:"linux/arm64"}
Golang version: go version go1.20.6 darwin/arm64
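For the `no matches for kind "OLMConfig"` error above, one thing that can be checked independently of operator-sdk is whether the API server currently serves that kind. The thread does not establish the cause, so the following is only a diagnostic sketch; the function name `check_olmconfig_api` is illustrative, and kubectl is assumed to point at the affected cluster.

```shell
# Sketch: report whether the OLMConfig resource is discoverable, then print
# the Established condition of its CRD.
check_olmconfig_api() {
  crd="olmconfigs.operators.coreos.com"

  # api-resources lists the resources the API server serves for this group.
  if kubectl api-resources --api-group=operators.coreos.com -o name | grep -qxF "$crd"; then
    echo "served: $crd"
  else
    echo "not served: $crd"
  fi

  # Prints "True" once the API server has accepted the CRD; an error here
  # means the CRD object itself is missing.
  kubectl get crd "$crd" -o jsonpath='{.status.conditions[?(@.type=="Established")].status}{"\n"}'
}
```

If the CRD shows as Established and served when checked by hand, yet the install still fails at `Creating OLMConfig "cluster"`, that would point toward the tool rather than the cluster, which is worth including in a bug report.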
I rolled back to v1.25.4, and it works well now.