etcd
Bad Certificates should be explained, not just stated.
What happened?
I previously set up a 3-node cluster using bi-directional TLS, and it was working. Clients calling etcd with a previously working certificate now fail, and the server logs:
{"level":"warn","ts":"2022-08-09T10:28:48.597Z","caller":"embed/config_logging.go:169","msg":"rejected connection","remote-addr":"172.30.214.50:39334","server-name":"etcd-0.etcd","error":"remote error: tls: bad certificate"}
What did you expect to happen?
I expected the logs to explain why the certificate was bad instead of stonewalling me.
How can we reproduce it (as minimally and precisely as possible)?
Create an etcd cluster and connect to it using a bad certificate of any flavor you want. Then pretend to be a clueless user who knows nothing about certificates and is wondering what's wrong with theirs.
Anything else we need to know?
The client and etcd use the same CA Issuer, hence the etcd configuration below.
Etcd Server's working peer certificate
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            ...
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=wmlserving-ca
        Validity
            Not Before: Jun 20 20:00:46 2022 GMT
            Not After : Sep 18 20:00:46 2022 GMT
        Subject: CN=*.etcd
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (4096 bit)
                Modulus:
                    ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier:
                keyid:...
            X509v3 Subject Alternative Name:
                DNS:localhost, DNS:etcd, DNS:*.etcd, DNS:etcd.argo-wo, DNS:*.etcd.argo-wo, DNS:etcd.argo-wo.svc, DNS:*.etcd.argo-wo.svc, DNS:etcd.argo-wo.svc.cluster.local, DNS:*.etcd.argo-wo.svc.cluster.local
    Signature Algorithm: sha256WithRSAEncryption
         ...
(Decoded using the following version of OpenSSL: OpenSSL 1.1.1b 26 Feb 2019)
Client's "bad certificate" information
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            ...
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=wmlserving-ca
        Validity
            Not Before: Jun 28 18:58:16 2022 GMT
            Not After : Sep 26 18:58:16 2022 GMT
        Subject: CN=wml-serving
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                RSA Public-Key: (4096 bit)
                Modulus:
                    ...
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier:
                keyid:...
            X509v3 Subject Alternative Name:
                DNS:localhost, DNS:wml-serving, DNS:wml-serving.argo-wo, DNS:wml-serving.argo-wo.svc, DNS:wml-serving.argo-wo.svc.cluster.local
    Signature Algorithm: sha256WithRSAEncryption
         ...
(Decoded using the following version of OpenSSL: OpenSSL 1.1.1b 26 Feb 2019)
Etcd version (please run commands below)
quay.io/coreos/etcd:v3.5.4
$ etcd --version
etcd Version: 3.5.4
Git SHA: 08407ff76
Go Version: go1.16.15
Go OS/Arch: linux/amd64
$ etcdctl version
etcdctl version: 3.5.4
API version: 3.5
Etcd configuration (command line flags or environment variables)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
spec:
  replicas: 3
  selector:
    matchLabels:
      name: etcd
  serviceName: etcd
  template:
    metadata:
      labels:
        name: etcd
    spec:
      containers:
        - name: app
          image: quay.io/coreos/etcd:v3.5.4
          imagePullPolicy: Always
          volumeMounts:
            - name: data
              mountPath: /var/run/etcd
            - name: etcd-ssl
              mountPath: /etc/etcd/ssl
          command:
            - /bin/sh
            - -c
            - |
              PEERS="etcd-0=https://etcd-0.etcd:2380,etcd-1=https://etcd-1.etcd:2380,etcd-2=https://etcd-2.etcd:2380"
              exec etcd --name ${HOSTNAME} \
                --listen-peer-urls https://0.0.0.0:2380 \
                --listen-client-urls https://0.0.0.0:2379 \
                --advertise-client-urls https://${HOSTNAME}.etcd:2379 \
                --initial-advertise-peer-urls https://${HOSTNAME}:2380 \
                --initial-cluster-token etcd-cluster \
                --initial-cluster ${PEERS} \
                --initial-cluster-state new \
                --trusted-ca-file=/etc/etcd/ssl/ca.crt \
                --cert-file=/etc/etcd/ssl/tls.crt \
                --key-file=/etc/etcd/ssl/tls.key \
                --peer-cert-file=/etc/etcd/ssl/tls.crt \
                --peer-key-file=/etc/etcd/ssl/tls.key \
                --peer-trusted-ca-file=/etc/etcd/ssl/ca.crt \
                --peer-client-cert-auth \
                --client-cert-auth \
                --data-dir /var/run/etcd/default.etcd
      volumes:
        - name: etcd-ssl
          secret:
            secretName: etcd-cert
            items:
              - key: tls.crt
                path: tls.crt
              - key: tls.key
                path: tls.key
              - key: ca.crt
                path: ca.crt
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: {{ .Values.etcdStorageClass }}
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
$ etcdctl member list -w table
{"level":"warn","ts":"2022-08-09T10:36:11.729Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000366a80/127.0.0.1:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded
$ export PEERS=https://etcd-0.etcd:2380,https://etcd-1.etcd:2380,https://etcd-2.etcd:2380
$ etcdctl --endpoints=$PEERS endpoint status -w table
{"level":"warn","ts":"2022-08-09T10:39:32.744Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001a4000/etcd-0.etcd:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate signed by unknown authority\""}
Failed to get the status of endpoint https://etcd-0.etcd:2380 (context deadline exceeded)
{"level":"warn","ts":"2022-08-09T10:39:37.745Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001a4000/etcd-0.etcd:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate signed by unknown authority\""}
Failed to get the status of endpoint https://etcd-1.etcd:2380 (context deadline exceeded)
{"level":"warn","ts":"2022-08-09T10:39:42.746Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0001a4000/etcd-0.etcd:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: x509: certificate signed by unknown authority\""}
Failed to get the status of endpoint https://etcd-2.etcd:2380 (context deadline exceeded)
+----------+----+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------+----+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+----------+----+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
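A side note on the debug commands above: both were run without any TLS flags, and the second one points at the peer port 2380 rather than the client port 2379 from `--listen-client-urls`. That may well account for the "certificate signed by unknown authority" and deadline errors here, independently of the rejected client connections. A minimal sketch of how the same commands could be run with the TLS material from the StatefulSet, assuming it is executed inside an etcd pod where `/etc/etcd/ssl` is mounted and that the server certificate is acceptable as a client certificate (the names `ENDPOINTS` and `etcdctl_tls` are made up for illustration):

```shell
# Client endpoints use port 2379 (the --listen-client-urls port), not the peer port 2380.
ENDPOINTS=https://etcd-0.etcd:2379,https://etcd-1.etcd:2379,https://etcd-2.etcd:2379

# Hypothetical wrapper: passes the CA, certificate, and key mounted in the pod
# to every etcdctl invocation so the client can complete the mutual TLS handshake.
etcdctl_tls() {
    etcdctl --endpoints="$ENDPOINTS" \
        --cacert=/etc/etcd/ssl/ca.crt \
        --cert=/etc/etcd/ssl/tls.crt \
        --key=/etc/etcd/ssl/tls.key \
        "$@"
}

# Usage (inside the pod):
#   etcdctl_tls member list -w table
#   etcdctl_tls endpoint status -w table
```

With the CA expired these calls would presumably still fail, but the error would come from the client's own verification of the server certificate rather than from a missing CA bundle.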
Relevant log output
No response
It turns out that the CA certificate had expired, although that is beside the point of this issue, which is that the logs should have explained it.
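For anyone landing here with the same "bad certificate" message: until the logs say more, an expired CA like this one can be caught by hand with `openssl`. The following is only a sketch. The function names `report_cert` and `verify_against_ca` are invented for illustration, and the paths in the usage comments assume the mounts from the StatefulSet above plus a hypothetical `client.crt` for the client certificate.

```shell
# report_cert FILE
# Prints the subject and validity window of a PEM certificate, then says
# whether it is still within its validity period right now.
report_cert() {
    openssl x509 -noout -subject -dates -in "$1" || return 2
    # -checkend 0 exits non-zero if the certificate has already expired.
    if openssl x509 -noout -checkend 0 -in "$1" >/dev/null; then
        echo "$1: still valid"
    else
        echo "$1: EXPIRED"
        return 1
    fi
}

# verify_against_ca CA_FILE CERT_FILE
# Asks openssl to build and check the chain from CERT_FILE up to CA_FILE;
# on failure openssl prints the reason it rejected the chain.
verify_against_ca() {
    openssl verify -CAfile "$1" "$2"
}

# Usage:
#   report_cert /etc/etcd/ssl/ca.crt
#   report_cert /etc/etcd/ssl/tls.crt
#   verify_against_ca /etc/etcd/ssl/ca.crt client.crt   # client.crt is hypothetical
```

Checking the CA file itself matters here: both leaf certificates in this report were inside their validity windows, so looking only at the client certificate would not have revealed the problem.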
A WIP PR is open: https://github.com/etcd-io/etcd/pull/14617
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.