
etcdEnv changes in etcdcluster not reflected in updated pod

Open jmcmeek opened this issue 6 years ago • 7 comments

I need to change ETCD_CIPHER_SUITES in an existing cluster running quay.io/coreos/etcd:v3.3.1. I can create a new etcd cluster at v3.3.7 with the correct ETCD_CIPHER_SUITES, but when I modify an existing etcdcluster resource I get a cluster at the new version with no changes to the etcd pod environment.

I created a simple TLS-enabled cluster from examples/tls/example-tls-cluster.yaml, edited to use etcd version 3.3.1 (I also included ETCD_DEBUG in my example).

I then modified the yaml to change the version to 3.3.7 and add ETCD_CIPHER_SUITES, and applied the modified yaml.

After doing this, the etcdcluster resource shows the new version and etcdEnv.

Snippet from kubectl get etcdcluster example -o yaml:

  TLS:
    static:
      member:
        peerSecret: etcd-peer-tls
        serverSecret: etcd-server-tls
      operatorSecret: etcd-client-tls
  pod:
    etcdEnv:
    - name: ETCD_DEBUG
      value: "true"
    - name: ETCD_CIPHER_SUITES
      value: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    resources: {}
  repository: quay.io/coreos/etcd
  size: 3
  version: 3.3.7

But the updated pods pick up only the new version. Their environment spec does not include ETCD_CIPHER_SUITES.

Snippet from kubectl get pod -o yaml:

spec:
  automountServiceAccountToken: false
  containers:
  - command:
    - /usr/local/bin/etcd
    - --data-dir=/var/etcd/data
    - --name=example-8rnxd597sz
    - --initial-advertise-peer-urls=https://example-8rnxd597sz.example.default.svc:2380
    - --listen-peer-urls=https://0.0.0.0:2380
    - --listen-client-urls=https://0.0.0.0:2379
    - --advertise-client-urls=https://example-8rnxd597sz.example.default.svc:2379
    - --initial-cluster=example-8rnxd597sz=https://example-8rnxd597sz.example.default.svc:2380
    - --initial-cluster-state=new
    - --peer-client-cert-auth=true
    - --peer-trusted-ca-file=/etc/etcdtls/member/peer-tls/peer-ca.crt
    - --peer-cert-file=/etc/etcdtls/member/peer-tls/peer.crt
    - --peer-key-file=/etc/etcdtls/member/peer-tls/peer.key
    - --client-cert-auth=true
    - --trusted-ca-file=/etc/etcdtls/member/server-tls/server-ca.crt
    - --cert-file=/etc/etcdtls/member/server-tls/server.crt
    - --key-file=/etc/etcdtls/member/server-tls/server.key
    - --initial-cluster-token=b34bfed2-d2b8-40b0-94af-d71cda41ee9e
    env:
    - name: ETCD_DEBUG
      value: "true"
    image: quay.io/coreos/etcd:v3.3.7
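This kind of mismatch can be checked mechanically by comparing the etcdEnv list from the etcdcluster resource with the env list of a member pod. A minimal sketch, assuming both lists have already been loaded into Python (for example from the `-o json` form of the two kubectl commands above); the function name `env_drift` is hypothetical and not part of etcd-operator, and the cipher suite value is shortened for readability.

```python
def env_drift(desired, actual):
    """Return names of env vars in `desired` that are missing from, or
    have a different value in, `actual`.

    Both arguments are lists of {"name": ..., "value": ...} dicts, the
    shape Kubernetes uses for container env entries.
    """
    actual_by_name = {entry["name"]: entry.get("value") for entry in actual}
    drift = []
    for entry in desired:
        name = entry["name"]
        if name not in actual_by_name or actual_by_name[name] != entry.get("value"):
            drift.append(name)
    return drift


# Values taken from the snippets above (cipher suite list shortened).
cluster_env = [
    {"name": "ETCD_DEBUG", "value": "true"},
    {"name": "ETCD_CIPHER_SUITES", "value": "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},
]
pod_env = [{"name": "ETCD_DEBUG", "value": "true"}]

print(env_drift(cluster_env, pod_env))  # ['ETCD_CIPHER_SUITES']
```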

Jan 11 '19 20:01 jmcmeek

Running on Kubernetes 1.12.

Initial etcdcluster yaml: example-tls-cluster.yaml.txt

Updated etcdcluster yaml: update-tls-cluster.yaml.txt

End of etcd-operator log showing the update: etcd-operator.log

Jan 11 '19 21:01 jmcmeek

@jmcmeek thanks for the report. It looks like this was an update from 3.3.1 to 3.3.7. Could you try a new cluster with 3.3.7, just to confirm whether you get the same result?

Jan 11 '19 21:01 hexfusion

@hexfusion Creating a new cluster works fine.

Jan 12 '19 12:01 jmcmeek

@jmcmeek thanks, that gives me something to go on, unless you are interested in taking a look yourself?

Jan 12 '19 12:01 hexfusion

It looks like the existing upgrade code can only change the version: it patches the image name and version. After trying to edit the environment via kubectl edit pod, I think that may be all it can do:

# * spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
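The restriction in that error message can be expressed as a simple check. A minimal sketch, assuming container specs are plain dicts as they appear in `kubectl get pod -o json`; `immutable_changes` and `MUTABLE_CONTAINER_FIELDS` are hypothetical names for illustration, and the check only models the container-level part of the rule (`image`), not `activeDeadlineSeconds` or `tolerations`.

```python
# Per the error message above, `image` is the only container field that
# may change on an existing pod. Everything else requires a new pod.
MUTABLE_CONTAINER_FIELDS = {"image"}


def immutable_changes(old_container, new_container):
    """Return the sorted names of changed container fields that cannot be
    updated in place on a running pod."""
    keys = set(old_container) | set(new_container)
    return sorted(
        key
        for key in keys
        if key not in MUTABLE_CONTAINER_FIELDS
        and old_container.get(key) != new_container.get(key)
    )


old = {
    "image": "quay.io/coreos/etcd:v3.3.1",
    "env": [{"name": "ETCD_DEBUG", "value": "true"}],
}
new = {
    "image": "quay.io/coreos/etcd:v3.3.7",
    "env": [
        {"name": "ETCD_DEBUG", "value": "true"},
        {"name": "ETCD_CIPHER_SUITES", "value": "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},
    ],
}

# The image bump alone would be patchable; the env change is what blocks it.
print(immutable_changes(old, new))  # ['env']
```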

Updating the environment can be done by deleting the existing pods one by one; the new pods come up with the new environment. That wouldn't work well if you had a cluster of one pod (you would lose the database).
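The one-by-one workaround, including the single-pod caveat, can be sketched as a planning step that only decides which pods need replacing. This is an illustration under assumptions, not etcd-operator behavior: `plan_rolling_replacement` and the pod names are hypothetical, and the actual deletion (and waiting for each replacement member to join before deleting the next) is left to the caller.

```python
def plan_rolling_replacement(pods, desired_env):
    """Return the names of pods whose env lacks the desired settings,
    in the order they should be deleted one at a time.

    pods: list of {"name": str, "env": [{"name": ..., "value": ...}, ...]}
    desired_env: list of {"name": ..., "value": ...}
    Raises ValueError for a single-member cluster, where deleting the
    only pod would lose the database.
    """
    want = {entry["name"]: entry.get("value") for entry in desired_env}
    stale = []
    for pod in pods:
        have = {entry["name"]: entry.get("value") for entry in pod["env"]}
        if any(name not in have or have[name] != value for name, value in want.items()):
            stale.append(pod["name"])
    if stale and len(pods) < 2:
        raise ValueError("single-member cluster: deleting the pod would lose the data")
    return stale


desired = [
    {"name": "ETCD_DEBUG", "value": "true"},
    {"name": "ETCD_CIPHER_SUITES", "value": "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},
]
old_env = [{"name": "ETCD_DEBUG", "value": "true"}]
pods = [
    {"name": "example-a", "env": old_env},   # hypothetical pod names
    {"name": "example-b", "env": desired},   # already replaced
    {"name": "example-c", "env": old_env},
]

print(plan_rolling_replacement(pods, desired))  # ['example-a', 'example-c']
```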

We're looking at alternatives like building our own docker image with default environment in the Dockerfile.

Jan 12 '19 23:01 jmcmeek

@hexfusion Why is etcd-operator designed to not allow modification of etcdEnv? https://github.com/coreos/etcd-operator/blob/master/pkg/apis/etcd/v1beta2/cluster.go#L134

Jan 16 '20 02:01 xigang

I found the reason: container env cannot be changed by patching an existing pod.

Jan 17 '20 01:01 xigang