starrocks-kubernetes-operator

ClusterRole is missing finalizers declarations

Open sgaragan opened this issue 11 months ago • 3 comments

When deploying to OpenShift, we see the following errors in the operator manager logs:

2024-03-19T03:14:25.923+0800	INFO	StarRocksClusterReconciler	begin to reconcile StarRocksCluster	{"name": "starrockscluster", "namespace": "starrocks"}
2024-03-19T03:14:25.923+0800	INFO	StarRocksClusterReconciler	get StarRocksCluster CR from kubernetes	{"name": "starrockscluster", "namespace": "starrocks"}
2024-03-19T03:14:25.923+0800	INFO	StarRocksClusterReconciler	sub controller sync spec	{"name": "starrockscluster", "namespace": "starrocks", "subController": "feController"}
2024-03-19T03:14:25.923+0800	INFO	StarRocksClusterReconciler.feController	fetch configmap from kubernetes	{"name": "starrockscluster", "namespace": "starrocks", "action": "SyncCluster", "name": "starrockscluster-fe-cm"}
2024-03-19T03:14:25.923+0800	INFO	StarRocksClusterReconciler.feController	create or update statefulset	{"name": "starrockscluster", "namespace": "starrocks", "action": "SyncCluster", "name": "starrockscluster-fe"}
2024-03-19T03:14:25.943+0800	ERROR	StarRocksClusterReconciler.feController	deploy statefulset failed	{"name": "starrockscluster", "namespace": "starrocks", "action": "SyncCluster", "error": "statefulsets.apps \"starrockscluster-fe\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
github.com/StarRocks/starrocks-kubernetes-operator/pkg/subcontrollers/fe.(*FeController).SyncCluster
	/go/src/app/pkg/subcontrollers/fe/fe_controller.go:115
github.com/StarRocks/starrocks-kubernetes-operator/pkg/controllers.(*StarRocksClusterReconciler).Reconcile
	/go/src/app/pkg/controllers/starrockscluster_controller.go:93
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
2024-03-19T03:14:25.944+0800	ERROR	StarRocksClusterReconciler	sub controller reconciles spec failed	{"name": "starrockscluster", "namespace": "starrocks", "subController": "feController", "error": "statefulsets.apps \"starrockscluster-fe\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
github.com/StarRocks/starrocks-kubernetes-operator/pkg/controllers.(*StarRocksClusterReconciler).Reconcile
	/go/src/app/pkg/controllers/starrockscluster_controller.go:94
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
2024-03-19T03:14:25.951+0800	ERROR	Reconciler error	{"controller": "starrockscluster", "controllerGroup": "starrocks.com", "controllerKind": "StarRocksCluster", "StarRocksCluster": {"name":"starrockscluster","namespace":"starrocks"}, "namespace": "starrocks", "name": "starrockscluster", "reconcileID": "bd08514b-9430-4bb0-99b0-e4bc62476dfe", "error": "statefulsets.apps \"starrockscluster-fe\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:326
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/src/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234

The issue is that the ClusterRole is missing the finalizers permissions needed when deploying to OpenShift. We added the following rules to the ClusterRole YAML, which fixed the errors:

- apiGroups:
  - apps
  resources:
  - deployments/finalizers
  - statefulsets/finalizers
  verbs:
  - '*'
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers/finalizers
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - configmaps/finalizers
  - serviceaccounts/finalizers
  - services/finalizers
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - endpoints/finalizers
  - pods/finalizers
  - secrets/finalizers
  verbs:
  - get
  - list
  - watch

• Operator Version: 1.9.3

Thanks, Sean

sgaragan avatar Mar 19 '24 13:03 sgaragan

We recently removed the finalizers configuration because we had not encountered this issue in our environment, so we assumed it was unnecessary and deleted it.

However, from the error message above,

"statefulsets.apps \"starrockscluster-fe\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on,"

I think we should add finalizer permissions for starrocksclusters.
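
For illustration, a rule of this kind might look roughly like the fragment below. This is only a sketch and is not taken from the PR: the API group `starrocks.com` is inferred from the `controllerGroup` field in the log above, and the `update` verb follows the pattern described in the Operator SDK FAQ for this error. The actual fix may differ.

```yaml
# Hypothetical ClusterRole rule (assumption, not copied from the PR).
# Grants the operator permission to set finalizers on the owner resource,
# which the blockOwnerDeletion check requires.
- apiGroups:
  - starrocks.com                  # inferred from "controllerGroup" in the log
  resources:
  - starrocksclusters/finalizers   # finalizers subresource of the owning CR
  verbs:
  - update
```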

yandongxiao avatar Mar 20 '24 02:03 yandongxiao

I submitted a PR to try to fix it, see #484. Could you please verify it?

yandongxiao avatar Mar 20 '24 02:03 yandongxiao

I am not able to check right away but the error seems to point to the statefulset resource not having a finalizer:

2024-03-19T03:14:25.943+0800	ERROR	StarRocksClusterReconciler.feController	deploy statefulset failed	{"name": "starrockscluster", "namespace": "starrocks", "action": "SyncCluster", "error": "statefulsets.apps \"starrockscluster-fe\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}

As mentioned, what fixed it was adding finalizers to each of the ClusterRole resources. When I researched this issue, it was apparently the ClusterRole that needed the finalizers, which is why we added them to those resources (and not to other roles).

The reason for the error is that OpenShift by default enforces owner reference permissions.

https://sdk.operatorframework.io/docs/faqs/#after-deploying-my-operator-why-do-i-see-errors-like-is-forbidden-cannot-set-blockownerdeletion-if-an-ownerreference-refers-to-a-resource-you-cant-set-finalizers-on-
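
To make the check concrete, the StatefulSet the operator tries to create presumably carries an owner reference along these lines. This is a sketch under stated assumptions: the `apiVersion` and the `uid` are placeholders, and `controller: true` is assumed; only the names, the namespace, and the fact that `blockOwnerDeletion` is set come from the log above.

```yaml
# Illustrative metadata only; apiVersion and uid are assumed placeholders.
metadata:
  name: starrockscluster-fe
  namespace: starrocks
  ownerReferences:
  - apiVersion: starrocks.com/v1     # assumed version; the group is from the log
    kind: StarRocksCluster
    name: starrockscluster
    uid: 00000000-0000-0000-0000-000000000000   # placeholder
    controller: true                 # assumed
    blockOwnerDeletion: true         # this field triggers the permission check
```

With owner reference permission enforcement active, setting `blockOwnerDeletion: true` requires that the requester be allowed to update the `finalizers` subresource of the referenced owner.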

sgaragan avatar Mar 20 '24 12:03 sgaragan

> I submitted a PR to try to fix it, see #484. Could you please verify it?

Hello, we were facing the same problem and your PR fixed it! When will this be merged?

anthony974 avatar Aug 05 '24 07:08 anthony974

> > I submitted a PR to try to fix it, see #484. Could you please verify it?

> Hello, we were facing the same problem and your PR fixed it! When will this be merged?

Thank you very much for your feedback. We will evaluate it as soon as possible and merge it.

yandongxiao avatar Aug 05 '24 08:08 yandongxiao