etcd error when creating a cluster with kk
Which version of KubeKey has the issue?
./kk version >3.1.2
What is your os environment?
Ubuntu 22.04
KubeKey config file
No response
A clear and concise description of what happened.
When creating a cluster with KubeKey, etcd 3.5.13 is used by default. Following the tool's normal workflow, the etcd health check fails and cluster creation cannot continue. Deleting the cluster and retrying does not help. Specifying a different etcd version in the YAML config has no effect: KubeKey always overrides it with 3.5.13. After the override, the single-node etcd fails to start properly and cannot pass the health check.
Relevant log output
downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/node:v3.27.3
13:19:26 CST message: [kubesphere-master-1]
downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pod2daemon-flexvol:v3.27.3
13:19:27 CST success: [kubesphere-node-2]
13:19:27 CST success: [kubesphere-node-1]
13:19:27 CST success: [kubesphere-master-1]
13:19:27 CST [ETCDPreCheckModule] Get etcd status
13:19:27 CST success: [kubesphere-master-1]
13:19:27 CST [CertsModule] Fetch etcd certs
13:19:27 CST success: [kubesphere-master-1]
13:19:27 CST [CertsModule] Generate etcd Certs
[certs] Using existing ca certificate authority
[certs] Using existing admin-kubesphere-master-1 certificate and key on disk
[certs] Using existing member-kubesphere-master-1 certificate and key on disk
[certs] Using existing node-kubesphere-master-1 certificate and key on disk
13:19:27 CST success: [LocalHost]
13:19:27 CST [CertsModule] Synchronize certs file
13:19:29 CST success: [kubesphere-master-1]
13:19:29 CST [CertsModule] Synchronize certs file to master
13:19:29 CST skipped: [kubesphere-master-1]
13:19:29 CST [InstallETCDBinaryModule] Install etcd using binary
13:19:31 CST success: [kubesphere-master-1]
13:19:31 CST [InstallETCDBinaryModule] Generate etcd service
13:19:32 CST success: [kubesphere-master-1]
13:19:32 CST [InstallETCDBinaryModule] Generate access address
13:19:32 CST success: [kubesphere-master-1]
13:19:32 CST [ETCDConfigureModule] Health check on exist etcd
13:19:32 CST skipped: [kubesphere-master-1]
13:19:32 CST [ETCDConfigureModule] Generate etcd.env config on new etcd
13:19:32 CST success: [kubesphere-master-1]
13:19:32 CST [ETCDConfigureModule] Refresh etcd.env config on all etcd
13:19:32 CST success: [kubesphere-master-1]
13:19:32 CST [ETCDConfigureModule] Restart etcd
13:19:37 CST success: [kubesphere-master-1]
13:19:37 CST [ETCDConfigureModule] Health check on all etcd
13:19:37 CST message: [kubesphere-master-1]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.109.100:2379 cluster-health | grep -q 'cluster is healthy'"
Error: client: etcd cluster is unavailable or misconfigured; error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100
error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1
13:19:37 CST retry: [kubesphere-master-1]
13:19:42 CST message: [kubesphere-master-1]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.109.100:2379 cluster-health | grep -q 'cluster is healthy'"
Error: client: etcd cluster is unavailable or misconfigured; error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100
error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1
13:19:42 CST retry: [kubesphere-master-1]
13:19:47 CST message: [kubesphere-master-1]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.109.100:2379 cluster-health | grep -q 'cluster is healthy'"
Error: client: etcd cluster is unavailable or misconfigured; error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100
error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1
13:19:47 CST retry: [kubesphere-master-1]
13:19:52 CST message: [kubesphere-master-1]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.109.100:2379 cluster-health | grep -q 'cluster is healthy'"
Error: client: etcd cluster is unavailable or misconfigured; error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100
error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1
13:19:52 CST retry: [kubesphere-master-1]
13:19:57 CST message: [kubesphere-master-1]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-kubesphere-master-1-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.109.100:2379 cluster-health | grep -q 'cluster is healthy'"
Error: client: etcd cluster is unavailable or misconfigured; error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100
error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1
13:19:57 CST retry: [kubesphere-master-1]
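The x509 error in the log states the mismatch directly: the certificate on disk lists addresses in 192.168.108.x, while the health check targets 192.168.109.100. The following is a minimal sketch for pulling both values out of such a line. It assumes the error text has exactly the format shown in the log; `rejected_ip` and `valid_sans` are hypothetical helper names, not part of KubeKey or etcdctl.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: parse an x509 SAN-mismatch error line like the one in the log.

# Print the endpoint IP that the certificate does NOT cover (the value after ", not ").
rejected_ip() {
  printf '%s\n' "$1" | sed -n 's/.*certificate is valid for .*, not \([0-9.]*\).*/\1/p'
}

# Print the comma-separated list of addresses the certificate IS valid for.
valid_sans() {
  printf '%s\n' "$1" | sed -n 's/.*certificate is valid for \(.*\), not [0-9.]*.*/\1/p'
}

# Sample line copied from the log output in this issue.
line='error #0: tls: failed to verify certificate: x509: certificate is valid for 127.0.0.1, ::1, 192.168.108.100, 192.168.108.101, 192.168.108.102, 192.168.108.103, not 192.168.109.100: Process exited with status 1'

echo "rejected: $(rejected_ip "$line")"
echo "valid:    $(valid_sans "$line")"

# To inspect the certificate itself on the etcd node, something like this may help
# (path taken from the log; requires openssl on the host):
#   openssl x509 -in /etc/ssl/etcd/ssl/member-kubesphere-master-1.pem -noout -text \
#     | grep -A1 'Subject Alternative Name'
```

If the rejected IP is the node's current address and the valid list shows an older subnet, the certificates were most likely generated for a previous deployment and reused.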
Additional information
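Creation fails on every version I tried, downgrading from 3.1.7 to 3.1.1, but so far this does not seem to affect adding or removing nodes on existing older clusters.
Same problem here: adding a new node reports the same error. Is there a solution?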
I got the same issue, any solutions found?
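Hi, I also hit this problem when creating a cluster with 3.1.7. Which version works?
After I deleted the old cluster with kubekey, I was able to create the cluster normally.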
./kk delete cluster -f config-sample.yaml -y
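It looks like kubekey keeps the old cluster's certificates, so it is best to install in a single clean pass. I recommend deleting the old cluster with kubekey before every cluster creation to avoid this kind of problem.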
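The delete-then-create workflow suggested above could be wrapped roughly as follows. This is only a sketch: `recreate_cluster` is a hypothetical function name, the `KK` and `CONFIG` variables are assumptions for illustration, and the config file name is the one used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: remove any previous cluster state before creating a new cluster.
# KK and CONFIG are illustrative variables; override them to match your setup.
KK="${KK:-./kk}"
CONFIG="${CONFIG:-config-sample.yaml}"

recreate_cluster() {
  # Delete the old cluster first; stop if the delete step fails.
  "$KK" delete cluster -f "$CONFIG" -y || return 1
  # Then create the cluster from the same config file.
  "$KK" create cluster -f "$CONFIG"
}

# Usage (not executed here):
#   recreate_cluster
```

Note that deleting a cluster is destructive, so this is only appropriate on hosts where the old cluster is no longer needed.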