kubekey
etcd cannot peer during installation
What version of KubeKey has the issue?
kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.8", GitCommit:"2698dfbc5781a0fdf3ba587797676dd91c9f8274", GitTreeState:"clean", BuildDate:"2023-07-14T01:14:55Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
What is your os environment?
6.1.8-arch1-1
KubeKey config file
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
spec:
  hosts:
  ## You should complete the ssh information of the hosts
  - {name: m01, address: x.x.x.x, internalAddress: 198.18.32.1, privateKeyPath: "~/.ssh/id_rsa_c0"}
  - {name: p01, address: x.x.x.x, internalAddress: 198.18.32.2, privateKeyPath: "~/.ssh/id_rsa_c0"}
  roleGroups:
    etcd:
    - m01
    - p01
    master:
    - m01
    worker:
    - m01
    - p01
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    # internalLoadbalancer: haproxy
    ## If an external loadbalancer is used, 'address' should be set to the loadbalancer's ip.
    domain: lb.kubesphere.local
    address: "198.18.32.1"
    port: 6443
  kubernetes:
    version: v1.24.7
    clusterName: cluster.local
    proxyMode: ipvs
    masqueradeAll: false
    maxPods: 110
    nodeCidrMaskSize: 24
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
  registry:
    privateRegistry: ""
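Worth noting: the run below skips the "configure the ntp server for each node" step because this config sets no NTP servers. KubeKey can manage time synchronization itself; a minimal sketch of the stanza, assuming the v1alpha2 spec.system.ntpServers and timezone fields apply to this KubeKey version (verify against your release):

    spec:
      system:
        # hypothetical values; any reachable NTP server, or a host name from spec.hosts
        ntpServers:
        - ntp.aliyun.com
        - m01
        timezone: Asia/Shanghai

With this set, the "configure the ntp server" step shown below should no longer be skipped.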
A clear and concise description of what happened.
From p01:
● etcd.service - etcd
Loaded: loaded (/etc/systemd/system/etcd.service; disabled; preset: disabled)
Active: activating (start) since Fri 2023-07-14 08:20:18 UTC; 1min 2s ago
Main PID: 7333 (etcd)
Tasks: 8 (limit: 9830)
Memory: 23.3M
CPU: 4.865s
CGroup: /system.slice/etcd.service
└─7333 /usr/local/bin/etcd
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56798" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56796" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56808" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56802" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56822" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56810" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56834" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:21 p01 etcd[7333]: rejected connection from "198.18.32.1:56842" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56858" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56862" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56874" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56882" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56892" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56900" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56930" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56914" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56944" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56932" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56950" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
Jul 14 08:21:22 p01 etcd[7333]: rejected connection from "198.18.32.1:56962" (error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName "")
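To confirm that clock skew (rather than a genuinely bad CA) is at fault, compare the certificate's validity window with each node's clock. A minimal check, assuming KubeKey's default etcd certificate directory /etc/ssl/etcd/ssl (the member-p01 cert name comes from the generation log further down):

    # print the notBefore/notAfter window of the member cert
    # (path is KubeKey's default; adjust if customized)
    openssl x509 -in /etc/ssl/etcd/ssl/member-p01.pem -noout -dates
    # compare with this node's idea of "now"
    date -u
    timedatectl status   # look for "System clock synchronized: yes"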
Relevant log output
root@m01 ~ ❯ ./kk create cluster -f sample.yaml --container-manager containerd
 _   __      _          _   __
| | / /     | |        | | / /
| |/ / _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/
14:48:51 UTC [GreetingsModule] Greetings
14:48:52 UTC message: [p01]
Greetings, KubeKey!
14:48:52 UTC message: [m01]
Greetings, KubeKey!
14:48:52 UTC success: [p01]
14:48:52 UTC success: [m01]
14:48:52 UTC [NodePreCheckModule] A pre-check on nodes
14:48:53 UTC success: [p01]
14:48:53 UTC success: [m01]
14:48:53 UTC [ConfirmModule] Display confirmation form
+-------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| name | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time |
+-------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| m01 | y | y | y | | y | | | y | y | 24.0.2 | v1.6.21 | y | | | UTC 14:48:53 |
| p01 | y | y | y | | y | | | y | | | v1.7.2 | y | | | UTC 08:18:30 |
+-------------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
This is a simple check of your environment.
Before installation, ensure that your machines meet all requirements specified at
https://github.com/kubesphere/kubekey#requirements-and-recommendations
Continue this installation? [yes/no]: yes
14:48:55 UTC success: [LocalHost]
14:48:55 UTC [NodeBinariesModule] Download installation binaries
14:48:55 UTC message: [localhost]
downloading amd64 kubeadm v1.24.7 ...
14:48:55 UTC message: [localhost]
kubeadm is existed
14:48:55 UTC message: [localhost]
downloading amd64 kubelet v1.24.7 ...
14:48:56 UTC message: [localhost]
kubelet is existed
14:48:56 UTC message: [localhost]
downloading amd64 kubectl v1.24.7 ...
14:48:57 UTC message: [localhost]
kubectl is existed
14:48:57 UTC message: [localhost]
downloading amd64 helm v3.9.0 ...
14:48:57 UTC message: [localhost]
helm is existed
14:48:57 UTC message: [localhost]
downloading amd64 kubecni v1.2.0 ...
14:48:58 UTC message: [localhost]
kubecni is existed
14:48:58 UTC message: [localhost]
downloading amd64 crictl v1.24.0 ...
14:48:58 UTC message: [localhost]
crictl is existed
14:48:58 UTC message: [localhost]
downloading amd64 etcd v3.4.13 ...
14:48:58 UTC message: [localhost]
etcd is existed
14:48:58 UTC message: [localhost]
downloading amd64 containerd 1.6.4 ...
14:48:59 UTC message: [localhost]
containerd is existed
14:48:59 UTC message: [localhost]
downloading amd64 runc v1.1.1 ...
14:48:59 UTC message: [localhost]
runc is existed
14:48:59 UTC message: [localhost]
downloading amd64 calicoctl v3.23.2 ...
14:48:59 UTC message: [localhost]
calicoctl is existed
14:48:59 UTC success: [LocalHost]
14:48:59 UTC [ConfigureOSModule] Get OS release
14:48:59 UTC success: [p01]
14:48:59 UTC success: [m01]
14:48:59 UTC [ConfigureOSModule] Prepare to init OS
14:49:00 UTC success: [p01]
14:49:00 UTC success: [m01]
14:49:00 UTC [ConfigureOSModule] Generate init os script
14:49:00 UTC success: [p01]
14:49:00 UTC success: [m01]
14:49:00 UTC [ConfigureOSModule] Exec init os script
14:49:01 UTC stdout: [p01]
modprobe: FATAL: Module ip_vs not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_rr not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_wrr not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_sh not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module nf_conntrack not found in directory /lib/modules/6.1.8-arch1-1
vm.nr_hugepages = 1024
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
net.core.netdev_max_backlog = 65535
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 1048576
net.ipv4.neigh.default.gc_thresh1 = 512
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_max_tw_buckets = 1048576
net.ipv4.tcp_max_orphans = 65535
net.ipv4.udp_rmem_min = 131072
net.ipv4.udp_wmem_min = 131072
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.arp_accept = 1
net.ipv4.conf.default.arp_accept = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_ignore = 1
vm.max_map_count = 262144
vm.swappiness = 0
vm.overcommit_memory = 0
fs.inotify.max_user_instances = 524288
fs.inotify.max_user_watches = 524288
fs.pipe-max-size = 4194304
fs.aio-max-nr = 262144
kernel.pid_max = 65535
kernel.watchdog_thresh = 5
kernel.hung_task_timeout_secs = 5
14:49:01 UTC stdout: [m01]
modprobe: FATAL: Module ip_vs not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_rr not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_wrr not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module ip_vs_sh not found in directory /lib/modules/6.1.8-arch1-1
modprobe: FATAL: Module nf_conntrack not found in directory /lib/modules/6.1.8-arch1-1
vm.nr_hugepages = 1024
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
net.core.netdev_max_backlog = 65535
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 1048576
net.ipv4.neigh.default.gc_thresh1 = 512
net.ipv4.neigh.default.gc_thresh2 = 2048
net.ipv4.neigh.default.gc_thresh3 = 4096
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_max_tw_buckets = 1048576
net.ipv4.tcp_max_orphans = 65535
net.ipv4.udp_rmem_min = 131072
net.ipv4.udp_wmem_min = 131072
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.arp_accept = 1
net.ipv4.conf.default.arp_accept = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.default.arp_ignore = 1
vm.max_map_count = 262144
vm.swappiness = 0
vm.overcommit_memory = 0
fs.inotify.max_user_instances = 524288
fs.inotify.max_user_watches = 524288
fs.pipe-max-size = 4194304
fs.aio-max-nr = 262144
kernel.pid_max = 65535
kernel.watchdog_thresh = 5
kernel.hung_task_timeout_secs = 5
14:49:01 UTC success: [p01]
14:49:01 UTC success: [m01]
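Aside from the certificate issue, the modprobe FATAL lines above mean /lib/modules/6.1.8-arch1-1 no longer exists; on Arch this usually means the kernel package was upgraded and the node was never rebooted, so the running kernel has no module tree on disk. Since the config sets proxyMode: ipvs, kube-proxy will need these modules. A quick check, assuming only standard kmod tooling:

    # compare the running kernel with the module trees actually installed
    uname -r
    ls /lib/modules/
    # after rebooting into the installed kernel, load what ipvs mode needs
    sudo modprobe -a ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack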
14:49:01 UTC [ConfigureOSModule] configure the ntp server for each node
14:49:01 UTC skipped: [p01]
14:49:01 UTC skipped: [m01]
14:49:01 UTC [KubernetesStatusModule] Get kubernetes cluster status
14:49:01 UTC success: [m01]
14:49:01 UTC [InstallContainerModule] Sync containerd binaries
14:49:01 UTC skipped: [m01]
14:49:01 UTC skipped: [p01]
14:49:01 UTC [InstallContainerModule] Sync crictl binaries
14:49:01 UTC skipped: [p01]
14:49:01 UTC skipped: [m01]
14:49:01 UTC [InstallContainerModule] Generate containerd service
14:49:01 UTC skipped: [m01]
14:49:01 UTC skipped: [p01]
14:49:01 UTC [InstallContainerModule] Generate containerd config
14:49:01 UTC skipped: [p01]
14:49:01 UTC skipped: [m01]
14:49:01 UTC [InstallContainerModule] Generate crictl config
14:49:01 UTC skipped: [p01]
14:49:01 UTC skipped: [m01]
14:49:01 UTC [InstallContainerModule] Enable containerd
14:49:01 UTC skipped: [p01]
14:49:01 UTC skipped: [m01]
14:49:01 UTC [PullModule] Start to pull images on all nodes
14:49:01 UTC message: [p01]
downloading image: kubesphere/pause:3.7
14:49:01 UTC message: [m01]
downloading image: kubesphere/pause:3.7
14:49:01 UTC message: [p01]
downloading image: kubesphere/kube-proxy:v1.24.7
14:49:01 UTC message: [m01]
downloading image: kubesphere/kube-apiserver:v1.24.7
14:49:01 UTC message: [p01]
downloading image: coredns/coredns:1.8.6
14:49:01 UTC message: [m01]
downloading image: kubesphere/kube-controller-manager:v1.24.7
14:49:01 UTC message: [m01]
downloading image: kubesphere/kube-scheduler:v1.24.7
14:49:01 UTC message: [p01]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
14:49:02 UTC message: [m01]
downloading image: kubesphere/kube-proxy:v1.24.7
14:49:02 UTC message: [p01]
downloading image: calico/kube-controllers:v3.23.2
14:49:02 UTC message: [m01]
downloading image: coredns/coredns:1.8.6
14:49:02 UTC message: [p01]
downloading image: calico/cni:v3.23.2
14:49:02 UTC message: [m01]
downloading image: kubesphere/k8s-dns-node-cache:1.15.12
14:49:02 UTC message: [p01]
downloading image: calico/node:v3.23.2
14:49:02 UTC message: [m01]
downloading image: calico/kube-controllers:v3.23.2
14:49:02 UTC message: [p01]
downloading image: calico/pod2daemon-flexvol:v3.23.2
14:49:02 UTC message: [m01]
downloading image: calico/cni:v3.23.2
14:49:02 UTC message: [m01]
downloading image: calico/node:v3.23.2
14:49:02 UTC message: [m01]
downloading image: calico/pod2daemon-flexvol:v3.23.2
14:49:02 UTC success: [p01]
14:49:02 UTC success: [m01]
14:49:02 UTC [ETCDPreCheckModule] Get etcd status
14:49:02 UTC success: [m01]
14:49:02 UTC success: [p01]
14:49:02 UTC [CertsModule] Fetch etcd certs
14:49:02 UTC success: [m01]
14:49:02 UTC skipped: [p01]
14:49:02 UTC [CertsModule] Generate etcd Certs
[certs] Using existing ca certificate authority
[certs] Using existing admin-m01 certificate and key on disk
[certs] Using existing member-m01 certificate and key on disk
[certs] Using existing node-m01 certificate and key on disk
[certs] Using existing admin-p01 certificate and key on disk
[certs] Using existing member-p01 certificate and key on disk
14:49:02 UTC success: [LocalHost]
14:49:02 UTC [CertsModule] Synchronize certs file
14:49:03 UTC success: [p01]
14:49:03 UTC success: [m01]
14:49:03 UTC [CertsModule] Synchronize certs file to master
14:49:03 UTC skipped: [m01]
14:49:03 UTC [InstallETCDBinaryModule] Install etcd using binary
14:49:05 UTC success: [p01]
14:49:05 UTC success: [m01]
14:49:05 UTC [InstallETCDBinaryModule] Generate etcd service
14:49:05 UTC success: [p01]
14:49:05 UTC success: [m01]
14:49:05 UTC [InstallETCDBinaryModule] Generate access address
14:49:05 UTC skipped: [p01]
14:49:05 UTC success: [m01]
14:49:05 UTC [ETCDConfigureModule] Health check on exist etcd
14:49:05 UTC skipped: [p01]
14:49:05 UTC skipped: [m01]
14:49:05 UTC [ETCDConfigureModule] Generate etcd.env config on new etcd
14:49:05 UTC success: [m01]
14:49:05 UTC success: [p01]
14:49:05 UTC [ETCDConfigureModule] Refresh etcd.env config on all etcd
14:49:05 UTC success: [m01]
14:49:05 UTC success: [p01]
14:49:05 UTC [ETCDConfigureModule] Restart etcd
Additional information
_No response_
Same error. The etcd service can't start.
Same error. The etcd service can't start.
error "tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid", ServerName ""
@CornWorld @boot-vue This error is usually caused by the time not being synchronized between the nodes. After setting up time synchronization, you can delete the pki directory under the kubekey directory, and kk will regenerate the certificates.
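For anyone else landing here: the pre-check table above already shows the skew (m01 reports UTC 14:48:53 while p01 reports UTC 08:18:30, and p01's chrony column is empty). A minimal sketch of the fix, assuming systemd-timesyncd (or chrony) is available on the nodes and that kk was run from the directory containing its default ./kubekey work dir:

    # on every node: enable NTP sync and confirm it took effect
    sudo timedatectl set-ntp true
    timedatectl status
    # on the machine running kk: remove the stale certs so kk regenerates them
    rm -rf ./kubekey/pki
    ./kk create cluster -f sample.yaml --container-manager containerd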
Alright, I'll try again. It will take quite a while, so please bear with me.