ETCD pods going into completed state

Open Madhu-1 opened this issue 6 years ago • 4 comments

Hi, we are using the ETCD operator to deploy an ETCD cluster as backend storage for our service. After some time, all the ETCD pods go into the Completed state.

The YAML files used for the deployment are provided below.

ETCD operator template

---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: etcd-operator
  namespace: gcs
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: etcd-operator
  namespace: gcs
rules:
  - apiGroups:
      - etcd.database.coreos.com
    resources:
      - etcdclusters
      - etcdbackups
      - etcdrestores
    verbs:
      - "*"
  - apiGroups:
      - apiextensions.k8s.io
    resources:
      - customresourcedefinitions
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - pods
      - services
      - endpoints
      - persistentvolumeclaims
      - events
    verbs:
      - "*"
  - apiGroups:
      - apps
    resources:
      - deployments
    verbs:
      - "*"
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: etcd-operator
  namespace: gcs
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: etcd-operator
subjects:
  - kind: ServiceAccount
    name: etcd-operator
    namespace: gcs
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: etcd-operator
  namespace: gcs
  labels:
    app.kubernetes.io/part-of: gcs
    app.kubernetes.io/component: etcd
    app.kubernetes.io/name: etcd-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/part-of: gcs
      app.kubernetes.io/component: etcd
      app.kubernetes.io/name: etcd-operator
  template:
    metadata:
      labels:
        app.kubernetes.io/part-of: gcs
        app.kubernetes.io/component: etcd
        app.kubernetes.io/name: etcd-operator
      namespace: gcs
    spec:
      serviceAccountName: etcd-operator
      containers:
        - name: etcd-operator
          image: quay.io/coreos/etcd-operator:v0.9.2
          command:
            - etcd-operator
          env:
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name

ETCD cluster template

---
kind: EtcdCluster
apiVersion: etcd.database.coreos.com/v1beta2
metadata:
  name: etcd
  namespace: gcs
  labels:
    app.kubernetes.io/part-of: gcs
    app.kubernetes.io/component: etcd
    app.kubernetes.io/name: etcd-cluster
spec:
  size: 3
  version: 3.3.8

After some time, the ETCD pods go into the Completed state:

NAME                                   READY   STATUS      RESTARTS   AGE
etcd-hgkcl46ldg                        0/1     Completed   0          18h
etcd-operator-7cb5bd459b-6jbvr         1/1     Running     0          18h
etcd-pfj48x67nd                        1/1     Running     0          18h
etcd-r7cffdlxqs                        0/1     Completed   0          18h
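
One way to read the listing above is as a quorum problem: etcd needs a majority of its members (size // 2 + 1) to accept writes, so a 3-member cluster with only one Running pod can no longer do so. The following is a minimal sketch of that arithmetic, assuming the listing is pasted in as plain text and that a pod in the Running state counts as a live member. The names `LISTING`, `member_statuses`, and `quorum` are hypothetical and not part of etcd-operator or kubectl.

```python
# Hypothetical helper: summarise etcd member pods from `kubectl get pods` text.
LISTING = """\
NAME                                   READY   STATUS      RESTARTS   AGE
etcd-hgkcl46ldg                        0/1     Completed   0          18h
etcd-operator-7cb5bd459b-6jbvr         1/1     Running     0          18h
etcd-pfj48x67nd                        1/1     Running     0          18h
etcd-r7cffdlxqs                        0/1     Completed   0          18h
"""

CLUSTER_SIZE = 3  # spec.size from the EtcdCluster manifest


def member_statuses(listing: str, operator_prefix: str = "etcd-operator-") -> dict[str, str]:
    """Map each etcd member pod name to its STATUS column, skipping the operator pod."""
    statuses = {}
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        name, status = fields[0], fields[2]
        if name.startswith("etcd-") and not name.startswith(operator_prefix):
            statuses[name] = status
    return statuses


def quorum(size: int) -> int:
    """Majority of members that must be up for the cluster to accept writes."""
    return size // 2 + 1


statuses = member_statuses(LISTING)
# Assumption: a pod in the Running state is treated as a live member.
running = [name for name, status in statuses.items() if status == "Running"]
print(f"running members: {len(running)}/{CLUSTER_SIZE}, quorum needs {quorum(CLUSTER_SIZE)}")
print("quorum lost" if len(running) < quorum(CLUSTER_SIZE) else "quorum held")
```

With the listing above this reports one running member out of three, which is below the majority of two.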

Are we making a mistake in deploying the ETCD operator or the ETCD cluster?

Does the operator support deploying ETCD version 3.3.8?

Madhu-1, Oct 26 '18 16:10

We are also seeing this issue with ETCD version 3.2.x.

Madhu-1, Oct 29 '18 04:10

We are seeing this issue as well. We will not be able to use this operator if the etcd nodes randomly exit.

rahmnathan, Mar 21 '19 17:03

Same here: after a restart, the pods are in the Completed state.

Related: https://github.com/coreos/etcd-operator/issues/1323

brunowego, Sep 15 '19 22:09

@Madhu-1 @rahmnathan take a look at https://github.com/coreos/etcd-operator/pull/2097

brunowego, Sep 15 '19 22:09