noobaa-operator
Operator restarts multiple times on the 0518-nsfs build
Environment info
- NooBaa Operator Version: VERSION
- Platform: Kubernetes 1.14.1 | minikube 1.1.1 | OpenShift 4.1 | other: specify
  OCP, oc version:
  Client Version: 4.7.8
  Server Version: 4.7.8
  Kubernetes Version: v1.20.0+7d0a2b2
The NooBaa version is posted below.
Actual behavior
- oc get pods
NAME                                              READY   STATUS        RESTARTS   AGE
noobaa-core-0                                     1/1     Running       0          20h
noobaa-db-0                                       1/1     Running       0          20h
noobaa-default-backing-store-noobaa-pod-0c22b5a3  0/1     Terminating   0          18h
noobaa-endpoint-6bf7d8457d-t82ss                  1/1     Running       0          20h
noobaa-operator-647bbcf485-gncb7                  1/1     Running       22         20h   ---> these restarts are a concern
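The restart count can also be picked out of the listing automatically. This is a minimal sketch, assuming the default `oc get pods` column layout shown above (RESTARTS as the fourth column). The helper name `pods_over_restarts` and the threshold are illustrative and not part of the NooBaa or OpenShift CLI.

```shell
# Hypothetical helper: print "<pod name> <restarts>" for every pod whose
# RESTARTS value is above the given threshold (default 0).
# Reads `oc get pods` output on stdin and skips the header row.
pods_over_restarts() {
  threshold="${1:-0}"
  awk -v t="$threshold" 'NR > 1 && $4 + 0 > t + 0 { print $1, $4 }'
}

# Usage (needs access to the cluster):
#   oc get pods -n noobaa | pods_over_restarts 5
```

Running this periodically shows whether the operator restart count keeps growing while the other pods stay at 0.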
Expected behavior
Steps to reproduce
- Installed the 0518-nsfs on the OCP 4.7.8 cluster
More information - Screenshots / Logs / Other output
noobaa version
INFO[0000] CLI version: 5.8.0
INFO[0000] noobaa-image: noobaa/noobaa-core:master-20210518-nsfs
INFO[0000] operator-image: noobaa/noobaa-operator:5.8.0
noobaa status
INFO[0001] CLI version: 5.8.0
INFO[0001] noobaa-image: noobaa/noobaa-core:master-20210518-nsfs
INFO[0001] operator-image: noobaa/noobaa-operator:master-20210518-nsfs
INFO[0001] noobaa-db-image: centos/mongodb-36-centos7
INFO[0001] Namespace: noobaa
....
Attached: noobaa-operator-restarts-0518nsfsbld.log, collected with "oc logs noobaa-operator-647bbcf485-gncb7".
Let me know if anything further is required.
Hey @rkomandu, can you please add the failing logs, using the --previous flag?
oc logs --previous=true noobaa-operator-647bbcf485-gncb7 >& /tmp/noobaa-operator-restarts-previous-0518nsfsbld.log
ls -lrt /tmp/noobaa-operator-restarts-previous-0518nsfsbld.log
-rw-r--r-- 1 root root 640 May 20 03:23 /tmp/noobaa-operator-restarts-previous-0518nsfsbld.log
cat /tmp/noobaa-operator-restarts-previous-0518nsfsbld.log
time="2021-05-20T09:02:42Z" level=info msg="CLI version: 5.8.0\n"
time="2021-05-20T09:02:42Z" level=info msg="noobaa-image: noobaa/noobaa-core:master-20210518-nsfs\n"
time="2021-05-20T09:02:42Z" level=info msg="operator-image: noobaa/noobaa-operator:5.8.0\n"
I0520 09:02:43.134682 1 request.go:645] Throttling request took 1.008213008s, request: GET:https://172.30.0.1:443/apis/objectbucket.io/v1alpha1?timeout=32s
time="2021-05-20T09:02:49Z" level=fatal msg="Failed to become leader: Get "https://172.30.0.1:443/api/v1/namespaces/noobaa/pods/noobaa-operator-647bbcf485-gncb7": dial tcp 172.30.0.1:443: connect: connection refused"
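In the log of the previous container, the reason for the exit is the `level=fatal` line. A small sketch for pulling only that line out of a saved or piped log is below. `fatal_lines` is a hypothetical helper name, and the `level=fatal` field format is assumed from the output above.

```shell
# Hypothetical helper: keep only the fatal entries of an operator log.
# With no arguments it reads stdin; otherwise it reads the named files.
fatal_lines() {
  grep 'level=fatal' "$@"
}

# Usage (needs access to the cluster):
#   oc logs --previous noobaa-operator-647bbcf485-gncb7 | fatal_lines
#   fatal_lines /tmp/noobaa-operator-restarts-previous-0518nsfsbld.log
```

Collecting this line after each restart makes it easy to compare the failure reasons across restarts.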
I don't think there is any further info in the log; it is very limited in size.
I am uploading the noobaa core log and then the oc describe output for noobaa-operator.
At the current state the noobaa-operator has restarted about 35 times, and I am not running any IO on the cluster.
Updated to the 20210520 build as per discussion with Romy
[root@rkomandu-hpo-inf ~]# oc get pods
NAME                                              READY   STATUS        RESTARTS   AGE
noobaa-core-0                                     1/1     Running       0          125m
noobaa-db-0                                       1/1     Running       0          125m
noobaa-default-backing-store-noobaa-pod-0c8f0a32  0/1     Terminating   0          3s
noobaa-endpoint-877dfcd54-4vwlf                   1/1     Running       0          118m
noobaa-operator-68bb5bff97-hbpzh                  1/1     Running       2          126m   --> this is happening
[root@rkomandu-hpo-inf ~]# oc logs --previous noobaa-operator-68bb5bff97-hbpzh > /tmp/noobaa-operator-20Maybuild.log
[root@rkomandu-hpo-inf ~]# ls -lrt /tmp/noobaa-operator-20Maybuild.log
-rw-r--r-- 1 root root 531 May 21 01:28 /tmp/noobaa-operator-20Maybuild.log
[root@rkomandu-hpo-inf ~]# less /tmp/noobaa-operator-20Maybuild.log
[root@rkomandu-hpo-inf ~]# noobaa version
INFO[0000] CLI version: 5.9.0
INFO[0000] noobaa-image: noobaa/noobaa-core:master-20210520
INFO[0000] operator-image: noobaa/noobaa-operator:5.9.0
[root@rkomandu-hpo-inf ~]# oc version
Client Version: 4.7.8
Server Version: 4.7.8
Kubernetes Version: v1.20.0+7d0a2b2
[root@rkomandu-hpo-inf ~]# noobaa status
INFO[0001] CLI version: 5.9.0
INFO[0001] noobaa-image: noobaa/noobaa-core:master-20210520
INFO[0001] operator-image: noobaa/noobaa-operator:master-20210520
INFO[0001] noobaa-db-image: centos/mongodb-36-centos7
INFO[0001] Namespace: noobaa
cat /tmp/noobaa-operator-20Maybuild.log
time="2021-05-21T07:00:20Z" level=info msg="CLI version: 5.9.0\n"
time="2021-05-21T07:00:20Z" level=info msg="noobaa-image: noobaa/noobaa-core:master-20210520\n"
time="2021-05-21T07:00:20Z" level=info msg="operator-image: noobaa/noobaa-operator:5.9.0\n"
I0521 07:00:21.181798 1 request.go:645] Throttling request took 1.045524803s, request: GET:https://172.30.0.1:443/apis/imageregistry.operator.openshift.io/v1?timeout=32s
time="2021-05-21T07:00:37Z" level=fatal msg="Failed to become leader: etcdserver: request timed out"
@romayalon
I was successful this time with the --previous option.
oc get pods
NAME                                              READY   STATUS        RESTARTS   AGE
noobaa-core-0                                     1/1     Running       0          6h
noobaa-db-0                                       1/1     Running       0          6h
noobaa-default-backing-store-noobaa-pod-0c8f0a32  0/1     Terminating   0          5s
noobaa-endpoint-877dfcd54-4vwlf                   1/1     Running       0          5h52m
noobaa-operator-68bb5bff97-hbpzh                  1/1     Running       4          6h
oc logs --previous=true noobaa-operator-68bb5bff97-hbpzh > /tmp/noobaa-operator-logs-20210520bld
similar to #449
@rkomandu assuming this is no longer relevant?
@nimrod-becker, as we are using the ODF builds for the release, and the d/s builds as well, I suppose this wouldn't be relevant as long as the MG is collecting the data accordingly. WDYT?
agree