ScyllaDB installation on openshift

Open gautam-borkar opened this issue 4 years ago • 1 comments

Describe the bug I am trying to install a ScyllaDB cluster on OpenShift using Helm. The installation fails with the error shown in the operator logs below.

To Reproduce Steps to reproduce the behavior:

  1. helm repo add scylla https://scylla-operator-charts.storage.googleapis.com/stable
  2. helm repo update
  3. kubectl apply -f examples/common/cert-manager.yaml
  4. kubectl wait -n cert-manager --for=condition=ready pod -l app=cert-manager --timeout=60s
  5. helm install scylla-operator scylla/scylla-operator --values examples/helm/openshift/values.operator.yaml --create-namespace --namespace scylla-operator
  6. kubectl wait -n scylla-operator --for=condition=ready pod -l app.kubernetes.io/name=scylla-operator --timeout=240s
  7. helm install scylla scylla/scylla --values examples/helm/openshift/values.cluster.yaml --create-namespace --namespace scylla

Expected behavior ScyllaDB successfully installed on Openshift.

Logs Operator logs :-

{"L":"ERROR","T":"2021-07-21T15:55:29.960Z","N":"cluster-controller","M":"An error occurred during cluster 
reconciliation","cluster":"scylla/scylla-incident-mgmt","resourceVersion":"1062782","error":"failed to sync headless service: 
error syncing headless service scylla-incident-mgmt-client: services \"scylla-incident-mgmt-client\" is forbidden: cannot set 
blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>","_trace_id":"_3epK41ZSS-
1WYf-RF_mjQ","errorStack":"github.com/scylladb/scylla-operator/pkg/controllers/cluster.
(*ClusterReconciler).sync\n\tgithub.com/scylladb/scylla-
operator/pkg/controllers/cluster/sync.go:55\ngithub.com/scylladb/scylla-operator/pkg/controllers/cluster.
(*ClusterReconciler).Reconcile\n\tgithub.com/scylladb/scylla-
operator/pkg/controllers/cluster/cluster_controller.go:122\nsigs.k8s.io/controller-runtime/pkg/internal/controller.
(*Controller).reconcileHandler\n\tsigs.k8s.io/controller-
[email protected]/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.
(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-
[email protected]/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.
(*Controller).Start.func1.2\n\tsigs.k8s.io/controller-
[email protected]/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\tk8
s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/apimachin
[email protected]/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/[email protected]/pkg/util
/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:133\nk8s.
io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:185\nk8s.io/apimac
hinery/pkg/util/wait.UntilWithContext\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:99\nruntime.goexit\n\truntime/asm
_amd64.s:1371\n"}

Environment:

  • Platform: Openshift
  • Kubernetes version:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.0+2817867", GitCommit:"2817867655bb7b68215b4e77873a8facf82bee06", GitTreeState:"clean", BuildDate:"2021-06-02T22:14:22Z", GoVersion:"go1.15.7", Compiler:"gc", Platform:"linux/amd64"}
  • Scylla version: 4.4.3
  • Openshift version: 4.7.16

Additional context I have granted privileged access to the service account and user using the role bindings in the attached YAML file, roles.txt. Note: please rename the file to use a .yaml extension.

gautam-borkar avatar Jul 21 '21 19:07 gautam-borkar

We don't support OpenShift yet :( although, apart from tuning permissions, it should work. This looks like a permissions issue: the operator needs to be able to set finalizers.

failed to sync headless service: 
error syncing headless service scylla-incident-mgmt-client: services \"scylla-incident-mgmt-client\" is forbidden: cannot set 
blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on
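
For context, this message matches what the Kubernetes OwnerReferencesPermissionEnforcement admission plugin returns when a caller sets blockOwnerDeletion on an ownerReference without having `update` permission on the owner's `finalizers` subresource. Below is a minimal sketch of a rule that might address the headless service case. It assumes the owner is the ScyllaCluster (API group scylla.scylladb.com) and that the operator runs as the scylla-operator service account in the scylla-operator namespace. The ClusterRole and binding names are hypothetical and not part of the chart.

```yaml
# Sketch only: grants update on the finalizers subresource of ScyllaClusters.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: scylla-operator-owner-finalizers   # hypothetical name
rules:
  - apiGroups: ["scylla.scylladb.com"]
    resources: ["scyllaclusters/finalizers"]
    verbs: ["update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: scylla-operator-owner-finalizers   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: scylla-operator-owner-finalizers
subjects:
  - kind: ServiceAccount
    name: scylla-operator        # assumed operator service account
    namespace: scylla-operator   # assumed operator namespace
```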

tnozicka avatar Jul 22 '21 05:07 tnozicka

To run Scylla on OpenShift, I made the following changes:

  • allowed clusterrole to create/update/delete finalizers of the following resources (file: helm/scylla-operator/templates/clusterrole_def.yaml)
    • persistentvolumes
    • secrets
    • services
    • statefulsets
    • scyllaclusters
    • configmaps
    • poddisruptionbudgets
    • daemonsets
    • nodeconfigs
    • serviceaccounts
    • jobs
  • allowed clusterrole to create/update/delete finalizer of configmaps (file: helm/scylla-operator/templates/scyllacluster_member_clusterrole_def.yaml)
  • created security context constraint
allowedCapabilities:
  - SYS_NICE
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostIPC: true
allowHostPID: true
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: scylla-operator-scc
allowHostNetwork: true
allowHostPorts: true
readOnlyRootFilesystem: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
users:
  - system:serviceaccount:scylla-operator:scylla-operator
  - system:serviceaccount:scylla-operator:webhook-server
  - system:serviceaccount:scylla:simple-cluster-member
  - system:serviceaccount:scylla-operator-node-tuning:scylla-node-config
  • Changed NodeConfig Clusterrole to give permission to create daemonset/finalizers (file: pkg/controller/nodeconfig/resource.go)
{
	APIGroups: []string{"apps"},
	Resources: []string{"daemonsets", "daemonsets/finalizers"},
	Verbs:     []string{"create", "delete", "get", "list", "patch", "update", "watch"},
}
  • Added ServiceAccount for the jobs (file: pkg/controller/nodeconfigdaemon/resource.go)
func makePerftuneJobForNode(...) {
... 
      Spec: corev1.PodSpec{
            ServiceAccountName: naming.NodeConfigAppName,
            ...
      }
}
func makePerftuneJobForContainers(...) {
... 
      Spec: corev1.PodSpec{
            ServiceAccountName: naming.NodeConfigAppName,
            ...
      }
}
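
The finalizer permissions listed above for clusterrole_def.yaml could be expressed as ClusterRole rules roughly like this. This is a sketch, not the chart's actual template. The role name is hypothetical, the API groups are assumed to be the standard ones for each resource, and the verbs mirror the create/update/delete wording above.

```yaml
# Sketch only: finalizers subresource permissions grouped by API group.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: scylla-operator-finalizers-sketch   # hypothetical name
rules:
  # Core API group resources
  - apiGroups: [""]
    resources:
      - persistentvolumes/finalizers
      - secrets/finalizers
      - services/finalizers
      - configmaps/finalizers
      - serviceaccounts/finalizers
    verbs: ["create", "update", "delete"]
  # Workload controllers
  - apiGroups: ["apps"]
    resources:
      - statefulsets/finalizers
      - daemonsets/finalizers
    verbs: ["create", "update", "delete"]
  - apiGroups: ["policy"]
    resources:
      - poddisruptionbudgets/finalizers
    verbs: ["create", "update", "delete"]
  - apiGroups: ["batch"]
    resources:
      - jobs/finalizers
    verbs: ["create", "update", "delete"]
  # Scylla custom resources
  - apiGroups: ["scylla.scylladb.com"]
    resources:
      - scyllaclusters/finalizers
      - nodeconfigs/finalizers
    verbs: ["create", "update", "delete"]
```

Granting only `update` on each finalizers subresource may be sufficient for the blockOwnerDeletion check; the broader verb set is kept here to match the description.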

Choraden avatar Nov 08 '22 17:11 Choraden

Nice work! Currently we don't aim to support OCP/OKD. We'll focus on EKS first.

mykaul avatar Nov 13 '22 19:11 mykaul

tracked in https://github.com/scylladb/scylla-operator/issues/424

tnozicka avatar Jun 25 '24 06:06 tnozicka