etcd-cluster-operator Not able to install the operator and provision an instance by following the documentation

Versions of relevant software used OpenShift 4.3, which uses Kubernetes 1.16

What happened I'm trying to get the etcd-cluster-operator running on my cluster and provision EtcdClusters. I followed the Installing instructions and the Contributing documentation separately, and was not able to create an EtcdCluster either way.

What you expected to happen Install etcd-cluster-operator in my cluster and provision etcdclusters.

How to reproduce it (as minimally and precisely as possible):

Install cert manager

Option 1: Follow Installing Instructions

Step 1: Clone this GitHub repo to your local machine.
Step 2: From the root directory of the repo: cd config/default
Step 3: export ECO_VERSION=v0.2.0
Step 4: kustomize edit set image controller=$ECO_VERSION
Step 5: kustomize edit set image proxy=$ECO_VERSION
Step 6: kubectl apply --kustomize .

Output

error: rawResources failed to read Resources: Load from path ../crd failed: '../crd' must be a file (got d='/<path to repo>/etcd-cluster-operator/config/crd')

Option 2: Follow Contributing Instructions

Step 1: export DOCKER_REPO=<registryname>
Step 2: make docker-build
Step 3: make docker-push
Step 4: make deploy

Current status:

oc get crd | grep improbable 
etcdbackups.etcd.improbable.io                              2020-06-18T17:34:59Z
etcdbackupschedules.etcd.improbable.io                      2020-06-18T17:35:00Z
etcdclusters.etcd.improbable.io                             2020-06-18T17:35:00Z
etcdpeers.etcd.improbable.io                                2020-06-18T17:35:00Z
etcdrestores.etcd.improbable.io                             2020-06-18T17:35:01Z

oc get pods 
NAME                                      READY   STATUS             RESTARTS   AGE
eco-controller-manager-796f74db94-jpfp7   1/1     Running            0          119m
eco-proxy-cfdb688bb-pb5rh                 0/1     CrashLoopBackOff   48         119m

oc logs eco-proxy-cfdb688bb-pb5rh -n eco-system 
2020-06-18T20:18:21.298Z	INFO	setup	Starting proxy	{"version": "v0.2.0-23-gf84abc6"}
2020-06-18T20:18:21.299Z	INFO	setup	Listening	{"grpc-address": ":8080"}

Step 5: kubectl apply -f config/samples/etcd_v1alpha1_etcdcluster.yaml

I tried creating this CR in both the hari namespace and the eco-system namespace.

 oc get etcdclusters.etcd.improbable.io 
NAME         AGE
my-cluster   149m

 oc logs etcdclusters.etcd.improbable.io/my-cluster
error: no kind "EtcdCluster" is registered for version "etcd.improbable.io/v1alpha1" in scheme "k8s.io/kubernetes/pkg/kubectl/scheme/scheme.go:28"

Full logs to relevant components: eco-controller-manager.log

Anything else we need to know

Jun 18 '20 20:06 HariNarayananMohan

Thanks for the report.

Will investigate whether the installation instructions are still up to date. Note to self: CI uses standalone kustomize v3, whereas kubectl ships with kustomize v2. make deploy uses kustomize build <config> | kubectl apply -f -.
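The kustomize version mismatch noted above would explain the "'../crd' must be a file" error from Option 1. The fragment below is a minimal sketch of the kind of entry that behaves differently between the two versions. It is a hypothetical kustomization.yaml excerpt, not the repository's actual file; only the ../crd path is taken from the error message.

```yaml
# Hypothetical kustomization.yaml excerpt (illustration only).
# Standalone kustomize v3 accepts a directory under `resources`:
resources:
- ../crd

# The kustomize v2 release embedded in kubectl expects `resources` entries
# to be files and directories to be listed under `bases` instead, roughly:
#
# bases:
# - ../crd
```

Building with standalone kustomize v3 and piping the result to kubectl, as make deploy does, avoids having to change the file.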

As for the cluster deployment error, the manager is logging:

2020-06-18T17:54:30.724Z	ERROR	controller-runtime.controller	Reconciler error	{"controller": "etcdcluster", "request": "hari/my-cluster", "error": "Failed to reconcile: unable to create service: services \"my-cluster\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}

A quick search suggests that OpenShift requires an additional RBAC permission on <resource>/finalizers before an owner reference with blockOwnerDeletion can be set. Our tests and deployments currently run on Kind/GKE, so we would have missed this. Will investigate what is needed for a fix.

https://github.com/jaegertracing/jaeger-operator/issues/461 https://github.com/spotahome/redis-operator/issues/98

Jun 28 '20 20:06 cheahjs

Thank you! I followed it and was able to make it work a few days ago. Thought I'd share that back.

Jun 29 '20 03:06 HariNarayananMohan

Hi! Could you share how? :) It's unclear which resource needs this exactly.

In the meantime I've read the sources to see which ownerReferences were created, and found that this works:

- apiGroups:
  - etcd.improbable.io
  resources:
  - etcdbackupschedules/finalizers
  verbs:
  - update
- apiGroups:
  - etcd.improbable.io
  resources:
  - etcdclusters/finalizers
  verbs:
  - update
- apiGroups:
  - etcd.improbable.io
  resources:
  - etcdpeers/finalizers
  verbs:
  - update
- apiGroups:
  - etcd.improbable.io
  resources:
  - etcdrestores/finalizers
  verbs:
  - update
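For anyone applying the rules above outside the operator's generated manifests, they could be wrapped in a standalone ClusterRole and bound to the manager's service account. This is a sketch under assumptions: the role and binding names are hypothetical, and the service account name is a guess that should be checked against what make deploy creates. Only the eco-system namespace and the four finalizers resources come from this thread.

```yaml
# Sketch only: object names and the service account are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: eco-finalizers-role          # hypothetical name
rules:
- apiGroups:
  - etcd.improbable.io
  resources:                         # same four resources as listed above
  - etcdbackupschedules/finalizers
  - etcdclusters/finalizers
  - etcdpeers/finalizers
  - etcdrestores/finalizers
  verbs:
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eco-finalizers-rolebinding   # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: eco-finalizers-role
subjects:
- kind: ServiceAccount
  name: default                      # assumption: the account the manager pod runs as
  namespace: eco-system
```

Grouping the four resources into one rule is equivalent to the four separate rules, since they share the same API group and verb.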

Sep 08 '21 08:09 jimmy-scott