AlmaLinux 9 - etcd health check failed
Which version of KubeKey has the issue?
kk version: &version.Info{Major:"3", Minor:"1", GitVersion:"v3.1.0-alpha.7", GitCommit:"2d29be1482de4cda0df464c351b1d806fbd69b66", GitTreeState:"clean", BuildDate:"2024-01-22T10:46:55Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
What is your OS environment?
AlmaLinux 9
KubeKey config file
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: k8sn3, address: 192.168.122.102, internalAddress: 192.168.122.102, user: root, password: root}
  - {name: k8sn4, address: 192.168.122.103, internalAddress: 192.168.122.103, user: root, password: root}
  - {name: k8sn5, address: 192.168.122.104, internalAddress: 192.168.122.104, user: root, password: root}
  - {name: k8sn6, address: 192.168.122.105, internalAddress: 192.168.122.105, user: root, password: root}
  - {name: k8sn7, address: 192.168.122.106, internalAddress: 192.168.122.106, user: root, password: root}
  - {name: k8sn8, address: 192.168.122.107, internalAddress: 192.168.122.107, user: root, password: root}
  roleGroups:
    etcd:
    - k8sn3
    - k8sn4
    - k8sn5
    control-plane:
    - k8sn3
    - k8sn4
    - k8sn5
    worker:
    - k8sn6
    - k8sn7
    - k8sn8
  controlPlaneEndpoint:
    domain: lb.kubesphere.local
    address: 192.168.122.110 # The VIP address
    port: 6443
  kubernetes:
    version: v1.29.1
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: containerd
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []
---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
  name: ks-installer
  namespace: kubesphere-system
  labels:
    version: v3.4.1
spec:
  persistence:
    storageClass: ""
  authentication:
    jwtSecret: ""
  local_registry: ""
  # dev_tag: ""
  etcd:
    monitoring: false
    endpointIps: localhost
    port: 2379
    tlsEnable: true
  common:
    core:
      console:
        enableMultiLogin: true
        port: 30880
        type: NodePort
    # apiserver:
    #   resources: {}
    # controllerManager:
    #   resources: {}
    redis:
      enabled: false
      enableHA: false
      volumeSize: 2Gi
    openldap:
      enabled: false
      volumeSize: 2Gi
    minio:
      volumeSize: 20Gi
    monitoring:
      # type: external
      endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090
      GPUMonitoring:
        enabled: false
    gpu:
      kinds:
        - resourceName: "nvidia.com/gpu"
          resourceType: "GPU"
          default: true
    es:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      enabled: false
      logMaxAge: 7
      elkPrefix: logstash
      basicAuth:
        enabled: false
        username: ""
        password: ""
      externalElasticsearchHost: ""
      externalElasticsearchPort: ""
    opensearch:
      # master:
      #   volumeSize: 4Gi
      #   replicas: 1
      #   resources: {}
      # data:
      #   volumeSize: 20Gi
      #   replicas: 1
      #   resources: {}
      enabled: true
      logMaxAge: 7
      opensearchPrefix: whizard
      basicAuth:
        enabled: true
        username: "admin"
        password: "admin"
      externalOpensearchHost: ""
      externalOpensearchPort: ""
      dashboard:
        enabled: false
  alerting:
    enabled: false
    # thanosruler:
    #   replicas: 1
    #   resources: {}
  auditing:
    enabled: false
    # operator:
    #   resources: {}
    # webhook:
    #   resources: {}
  devops:
    enabled: false
    jenkinsCpuReq: 0.5
    jenkinsCpuLim: 1
    jenkinsMemoryReq: 4Gi
    jenkinsMemoryLim: 4Gi
    jenkinsVolumeSize: 16Gi
  events:
    enabled: false
    # operator:
    #   resources: {}
    # exporter:
    #   resources: {}
    ruler:
      enabled: true
      replicas: 2
      # resources: {}
  logging:
    enabled: false
    logsidecar:
      enabled: true
      replicas: 2
      # resources: {}
  metrics_server:
    enabled: false
  monitoring:
    storageClass: ""
    node_exporter:
      port: 9100
      # resources: {}
    # kube_rbac_proxy:
    #   resources: {}
    # kube_state_metrics:
    #   resources: {}
    # prometheus:
    #   replicas: 1
    #   volumeSize: 20Gi
    #   resources: {}
    # operator:
    #   resources: {}
    # alertmanager:
    #   replicas: 1
    #   resources: {}
    # notification_manager:
    #   resources: {}
    #   operator:
    #     resources: {}
    #   proxy:
    #     resources: {}
    gpu:
      nvidia_dcgm_exporter:
        enabled: false
        # resources: {}
  multicluster:
    clusterRole: none
  network:
    networkpolicy:
      enabled: false
    ippool:
      type: none
    topology:
      type: none
  openpitrix:
    store:
      enabled: false
  servicemesh:
    enabled: false
    istio:
      components:
        ingressGateways:
          - name: istio-ingressgateway
            enabled: false
        cni:
          enabled: false
  edgeruntime:
    enabled: false
    kubeedge:
      enabled: false
      cloudCore:
        cloudHub:
          advertiseAddress:
            - ""
        service:
          cloudhubNodePort: "30000"
          cloudhubQuicNodePort: "30001"
          cloudhubHttpsNodePort: "30002"
          cloudstreamNodePort: "30003"
          tunnelNodePort: "30004"
        # resources: {}
        # hostNetWork: false
      iptables-manager:
        enabled: true
        mode: "external"
        # resources: {}
      # edgeService:
      #   resources: {}
  gatekeeper:
    enabled: false
    # controller_manager:
    #   resources: {}
    # audit:
    #   resources: {}
  terminal:
    timeout: 600
A clear and concise description of what happened.
k8sn1 - LB (VIP: 192.168.122.110)
k8sn2 - LB
k8sn3 - master
k8sn4 - master
k8sn5 - master
k8sn6 - worker
k8sn7 - worker
k8sn8 - worker
- Created a config with: ./kk create config --with-kubesphere v3.4.1 --with-kubernetes v1.29.1
- Edited the file: config-sample.yaml
- Tried to create the cluster, which failed: ./kk create cluster -f config-sample.yaml
06:17:10 CET [ConfirmModule] Display confirmation form
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| name  | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time         |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| k8sn3 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:09 |
| k8sn4 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:10 |
| k8sn5 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:09 |
| k8sn6 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:10 |
| k8sn7 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:10 |
| k8sn8 | y    | y    | y       | y        | y     | y     |         | y         | y      | 25.0.3 | 1.6.28     |            |             |                  | CET 06:17:09 |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
This is a simple check of your environment. Before installation, ensure that your machines meet all requirements specified at https://github.com/kubesphere/kubekey#requirements-and-recommendations
Continue this installation? [yes/no]: yes
The whole setup is behind a proxy, so I set up the proxy environment variables and the Docker proxy configuration:
docker:
[root@k8sn1 ~]# systemctl show --property=Environment docker
Environment=HTTP_PROXY=http://user:[email protected]:3128 HTTPS_PROXY=http://user:[email protected]:3128 NO_PROXY=localhost,127.0.0.1,.do>
ENV:
no_proxy=localhost,127.0.0.1,.domain.tld,192.168.122.0/24
https_proxy=http://user:[email protected]:3128
http_proxy=http://user:[email protected]:3128
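For reference, the Docker side of this is typically done with a systemd drop-in along these lines (a sketch reconstructed from the systemctl output above; the drop-in path is an assumption, not taken from the hosts):

# /etc/systemd/system/docker.service.d/http-proxy.conf  (assumed path)
[Service]
Environment="HTTP_PROXY=http://user:[email protected]:3128"
Environment="HTTPS_PROXY=http://user:[email protected]:3128"
Environment="NO_PROXY=localhost,127.0.0.1,.domain.tld,192.168.122.0/24"

# applied with:
systemctl daemon-reload && systemctl restart docker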
Relevant log output
14:18:30 CET retry: [k8sn4]
14:18:30 CET message: [k8sn5]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn5.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn5-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
: Process exited with status 1
14:18:30 CET retry: [k8sn5]
14:18:35 CET message: [k8sn3]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn3.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn3-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
: Process exited with status 1
14:18:35 CET retry: [k8sn3]
14:18:35 CET message: [k8sn4]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn4.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn4-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
: Process exited with status 1
14:18:35 CET retry: [k8sn4]
14:18:35 CET message: [k8sn5]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn5.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn5-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
: Process exited with status 1
14:18:35 CET retry: [k8sn5]
14:18:40 CET message: [k8sn3]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn3.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn3-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
: Process exited with status 1
14:18:40 CET retry: [k8sn3]
14:18:40 CET message: [k8sn4]
etcd health check failed: Failed to exec command: sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn4.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn4-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
Additional information
Docker container downloads on the nodes work:
# docker pull hello-world
Using default tag: latest
latest: Pulling from library/hello-world
c1ec31eb5944: Pull complete
Digest: sha256:d000bc569937abbe195e20322a0bde6b2922d805332fd6d8a68b19f524b7d21d
Status: Downloaded newer image for hello-world:latest
docker.io/library/hello-world:latest
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
hello-world latest d2c94e258dcb 10 months ago 13.3kB
The nodes are running as VMs inside of libvirt/qemu.
SELinux (on all nodes)
# getenforce
Permissive
On all nodes (k8sn3 - k8sn8):
# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
I followed this documentation:
https://kubesphere.io/docs/v3.4/installing-on-linux/high-availability-configurations/set-up-ha-cluster-using-keepalived-haproxy/
# firewall-cmd --state
running
Firewall Config on all Nodes
# firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: enp1s0
sources:
services: cockpit dhcpv6-client ssh
ports: 6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10259/tcp 10257/tcp 179/tcp 4789/udp 30000-32767/tcp
protocols:
forward: yes
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
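For completeness, the ports listed above can be opened with firewall-cmd along these lines (a sketch inferred from the --list-all output; not necessarily the exact commands that were run):

# firewall-cmd --permanent --add-port=6443/tcp --add-port=2379-2380/tcp --add-port=10250/tcp --add-port=10251/tcp --add-port=10257/tcp --add-port=10259/tcp --add-port=179/tcp --add-port=4789/udp --add-port=30000-32767/tcp
# firewall-cmd --reload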
Snippet of /var/log/messages from node k8sn3:
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_CLIENT_URLS","variable-value":"https://192.168.122.102:2379,https://127.0.0.1:2379"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_LISTEN_PEER_URLS","variable-value":"https://192.168.122.102:2380"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_METRICS","variable-value":"basic"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_NAME","variable-value":"etcd-k8sn3"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_CERT_FILE","variable-value":"/etc/ssl/etcd/ssl/member-k8sn3.pem"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_CLIENT_CERT_AUTH","variable-value":"true"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_KEY_FILE","variable-value":"/etc/ssl/etcd/ssl/member-k8sn3-key.pem"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PEER_TRUSTED_CA_FILE","variable-value":"/etc/ssl/etcd/ssl/ca.pem"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_PROXY","variable-value":"off"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_SNAPSHOT_COUNT","variable-value":"10000"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.009+0100","caller":"flags/flag.go:113","msg":"recognized and used environment variable","variable-name":"ETCD_TRUSTED_CA_FILE","variable-value":"/etc/ssl/etcd/ssl/ca.pem"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.010+0100","caller":"etcdmain/etcd.go:73","msg":"Running: ","args":["/usr/local/bin/etcd"]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.010+0100","caller":"etcdmain/etcd.go:116","msg":"server has been already initialized","data-dir":"/var/lib/etcd","dir-type":"member"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.010+0100","caller":"embed/etcd.go:124","msg":"configuring peer listeners","listen-peer-urls":["https://192.168.122.102:2380"]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.011+0100","caller":"embed/etcd.go:484","msg":"starting with peer TLS","tls-info":"cert = /etc/ssl/etcd/ssl/member-k8sn3.pem, key = /etc/ssl/etcd/ssl/member-k8sn3-key.pem, client-cert=, client-key=, trusted-ca = /etc/ssl/etcd/ssl/ca.pem, client-cert-auth = true, crl-file = ","cipher-suites":[]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.014+0100","caller":"embed/etcd.go:132","msg":"configuring client listeners","listen-client-urls":["https://127.0.0.1:2379","https://192.168.122.102:2379"]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.016+0100","caller":"embed/etcd.go:306","msg":"starting an etcd server","etcd-version":"3.5.6","git-sha":"cecbe35ce","go-version":"go1.16.15","go-os":"linux","go-arch":"amd64","max-cpu-set":4,"max-cpu-available":4,"member-initialized":true,"name":"etcd-k8sn3","data-dir":"/var/lib/etcd","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/lib/etcd/member","force-new-cluster":false,"heartbeat-interval":"250ms","election-timeout":"5s","initial-election-tick-advance":true,"snapshot-count":10000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["https://192.168.122.102:2380"],"listen-peer-urls":["https://192.168.122.102:2380"],"advertise-client-urls":["https://192.168.122.102:2379"],"listen-client-urls":["https://127.0.0.1:2379","https://192.168.122.102:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"","initial-cluster-state":"existing","initial-cluster-token":"","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"8h0m0s","auto-compaction-interval":"8h0m0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.016+0100","caller":"fileutil/fileutil.go:53","msg":"check file permission","error":"directory \"/var/lib/etcd\" exist, but the permission is \"drwxr-xr-x\". The recommended permission is \"-rwx------\" to prevent possible unprivileged access to the data"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.018+0100","caller":"etcdserver/backend.go:81","msg":"opened backend db","path":"/var/lib/etcd/member/snap/db","took":"825.438µs"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.020+0100","caller":"etcdserver/server.go:530","msg":"No snapshot found. Recovering WAL from scratch!"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.021+0100","caller":"etcdserver/raft.go:529","msg":"restarting local member","cluster-id":"14dd0e4add552497","local-member-id":"3085fdf1dd51aa14","commit-index":0}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.022+0100","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"3085fdf1dd51aa14 switched to configuration voters=()"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.022+0100","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"3085fdf1dd51aa14 became follower at term 0"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.022+0100","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"newRaft 3085fdf1dd51aa14 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.024+0100","caller":"auth/store.go:1234","msg":"simple token is not cryptographically signed"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.030+0100","caller":"mvcc/kvstore.go:393","msg":"kvstore restored","current-rev":1}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.031+0100","caller":"etcdserver/quota.go:94","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.033+0100","caller":"etcdserver/server.go:854","msg":"starting etcd server","local-member-id":"3085fdf1dd51aa14","local-server-version":"3.5.6","cluster-version":"to_be_decided"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.034+0100","caller":"etcdserver/server.go:754","msg":"starting initial election tick advance","election-ticks":20}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.034+0100","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.034+0100","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/snap","suffix":"snap","max":5,"interval":"30s"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.034+0100","caller":"fileutil/purge.go:44","msg":"started to purge file","dir":"/var/lib/etcd/member/wal","suffix":"wal","max":5,"interval":"30s"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.038+0100","caller":"embed/etcd.go:687","msg":"starting with client TLS","tls-info":"cert = /etc/ssl/etcd/ssl/member-k8sn3.pem, key = /etc/ssl/etcd/ssl/member-k8sn3-key.pem, client-cert=, client-key=, trusted-ca = /etc/ssl/etcd/ssl/ca.pem, client-cert-auth = true, crl-file = ","cipher-suites":[]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.038+0100","caller":"embed/etcd.go:700","msg":"Flag `enable-v2` is deprecated and will get removed in etcd 3.6."}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.038+0100","caller":"embed/etcd.go:586","msg":"serving peer traffic","address":"192.168.122.102:2380"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.038+0100","caller":"embed/etcd.go:275","msg":"now serving peer/client/metrics","local-member-id":"3085fdf1dd51aa14","initial-advertise-peer-urls":["https://192.168.122.102:2380"],"listen-peer-urls":["https://192.168.122.102:2380"],"advertise-client-urls":["https://192.168.122.102:2379"],"listen-client-urls":["https://127.0.0.1:2379","https://192.168.122.102:2379"],"listen-metrics-urls":[]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.038+0100","caller":"embed/etcd.go:558","msg":"cmux::serve","address":"192.168.122.102:2380"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.108+0100","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"3085fdf1dd51aa14","remote-peer-id":"60e9873ece57263"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.108+0100","caller":"rafthttp/transport.go:286","msg":"added new remote peer","local-member-id":"3085fdf1dd51aa14","remote-peer-id":"60e9873ece57263","remote-peer-urls":["https://192.168.122.103:2380"]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.108+0100","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"3085fdf1dd51aa14","remote-peer-id-stream-handler":"3085fdf1dd51aa14","remote-peer-id-from":"60e9873ece57263","cluster-id":"14dd0e4add552497"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.110+0100","caller":"rafthttp/pipeline.go:72","msg":"started HTTP pipelining with remote peer","local-member-id":"3085fdf1dd51aa14","remote-peer-id":"e46b4a1bfc06a3ab"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.110+0100","caller":"rafthttp/transport.go:286","msg":"added new remote peer","local-member-id":"3085fdf1dd51aa14","remote-peer-id":"e46b4a1bfc06a3ab","remote-peer-urls":["https://192.168.122.104:2380"]}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.110+0100","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"3085fdf1dd51aa14","remote-peer-id-stream-handler":"3085fdf1dd51aa14","remote-peer-id-from":"e46b4a1bfc06a3ab","cluster-id":"14dd0e4add552497"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.114+0100","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"3085fdf1dd51aa14","remote-peer-id-stream-handler":"3085fdf1dd51aa14","remote-peer-id-from":"60e9873ece57263","cluster-id":"14dd0e4add552497"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"warn","ts":"2024-03-08T06:20:36.115+0100","caller":"rafthttp/http.go:413","msg":"failed to find remote peer in cluster","local-member-id":"3085fdf1dd51aa14","remote-peer-id-stream-handler":"3085fdf1dd51aa14","remote-peer-id-from":"e46b4a1bfc06a3ab","cluster-id":"14dd0e4add552497"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.158+0100","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"3085fdf1dd51aa14 [term: 0] received a MsgHeartbeat message with higher term from e46b4a1bfc06a3ab [term: 2]"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"info","ts":"2024-03-08T06:20:36.158+0100","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"3085fdf1dd51aa14 became follower at term 2"}
Mar 8 06:20:36 k8sn3 etcd[24851]: {"level":"panic","ts":"2024-03-08T06:20:36.158+0100","logger":"raft","caller":"etcdserver/zap_raft.go:101","msg":"tocommit(344) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost?","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf\n\tgo.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:101\ngo.etcd.io/etcd/raft/v3.(*raftLog).commitTo\n\tgo.etcd.io/etcd/raft/[email protected]/log.go:237\ngo.etcd.io/etcd/raft/v3.(*raft).handleHeartbeat\n\tgo.etcd.io/etcd/raft/[email protected]/raft.go:1508\ngo.etcd.io/etcd/raft/v3.stepFollower\n\tgo.etcd.io/etcd/raft/[email protected]/raft.go:1434\ngo.etcd.io/etcd/raft/v3.(*raft).Step\n\tgo.etcd.io/etcd/raft/[email protected]/raft.go:975\ngo.etcd.io/etcd/raft/v3.(*node).run\n\tgo.etcd.io/etcd/raft/[email protected]/node.go:356"}
Mar 8 06:20:36 k8sn3 etcd[24851]: panic: tocommit(344) is out of range [lastIndex(0)]. Was the raft log corrupted, truncated, or lost?
Mar 8 06:20:36 k8sn3 etcd[24851]: goroutine 166 [running]:
Mar 8 06:20:36 k8sn3 etcd[24851]: go.uber.org/zap/zapcore.(*CheckedEntry).Write(0xc000564000, 0x0, 0x0, 0x0)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.uber.org/[email protected]/zapcore/entry.go:234 +0x58d
Mar 8 06:20:36 k8sn3 etcd[24851]: go.uber.org/zap.(*SugaredLogger).log(0xc00000f240, 0x4, 0x1268ad6, 0x5d, 0xc000562400, 0x2, 0x2, 0x0, 0x0, 0x0)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.uber.org/[email protected]/sugar.go:227 +0x111
Mar 8 06:20:36 k8sn3 etcd[24851]: go.uber.org/zap.(*SugaredLogger).Panicf(...)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.uber.org/[email protected]/sugar.go:159
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/server/v3/etcdserver.(*zapRaftLogger).Panicf(0xc0000a4a60, 0x1268ad6, 0x5d, 0xc000562400, 0x2, 0x2)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/server/v3/etcdserver/zap_raft.go:101 +0x7d
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/raft/v3.(*raftLog).commitTo(0xc000196310, 0x158)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/log.go:237 +0x135
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/raft/v3.(*raft).handleHeartbeat(0xc00057c2c0, 0x8, 0x3085fdf1dd51aa14, 0xe46b4a1bfc06a3ab, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/raft.go:1508 +0x54
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/raft/v3.stepFollower(0xc00057c2c0, 0x8, 0x3085fdf1dd51aa14, 0xe46b4a1bfc06a3ab, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/raft.go:1434 +0x478
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/raft/v3.(*raft).Step(0xc00057c2c0, 0x8, 0x3085fdf1dd51aa14, 0xe46b4a1bfc06a3ab, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/raft.go:975 +0xa55
Mar 8 06:20:36 k8sn3 etcd[24851]: go.etcd.io/etcd/raft/v3.(*node).run(0xc000586fc0)
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/node.go:356 +0x798
Mar 8 06:20:36 k8sn3 etcd[24851]: created by go.etcd.io/etcd/raft/v3.RestartNode
Mar 8 06:20:36 k8sn3 etcd[24851]: #011go.etcd.io/etcd/raft/[email protected]/node.go:244 +0x330
Mar 8 06:20:36 k8sn3 systemd[1]: etcd.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 8 06:20:36 k8sn3 systemd[1]: etcd.service: Failed with result 'exit-code'.
Mar 8 06:20:36 k8sn3 systemd[1]: Failed to start etcd.
On the nodes k8sn3 -> k8sn8:
# ps -ef | grep etc
root 9369 1 3 07:04 ? 00:00:02 /usr/local/bin/etcd
When running:
ssh [email protected] sudo -E /bin/bash -c "export ETCDCTL_API=2;export ETCDCTL_CERT_FILE='/etc/ssl/etcd/ssl/admin-k8sn5.pem';export ETCDCTL_KEY_FILE='/etc/ssl/etcd/ssl/admin-k8sn5-key.pem';export ETCDCTL_CA_FILE='/etc/ssl/etcd/ssl/ca.pem';/usr/local/bin/etcdctl --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 cluster-health | grep -q 'cluster is healthy'"
I got this:
Error: unknown command "cluster-health" for "etcdctl"
Run 'etcdctl --help' for usage.
Error: unknown command "cluster-health" for "etcdctl"
# /usr/local/bin/etcdctl
NAME:
etcdctl - A simple command line client for etcd3.
USAGE:
etcdctl [flags]
VERSION:
3.5.6
API VERSION:
3.5
COMMANDS:
alarm disarm Disarms all alarms
alarm list Lists all alarms
auth disable Disables authentication
auth enable Enables authentication
auth status Returns authentication status
check datascale Check the memory usage of holding data for different workloads on a given server endpoint.
check perf Check the performance of the etcd cluster
compaction Compacts the event history in etcd
defrag Defragments the storage of the etcd members with given endpoints
del Removes the specified key or range of keys [key, range_end)
elect Observes and participates in leader election
endpoint hashkv Prints the KV history hash for each endpoint in --endpoints
endpoint health Checks the healthiness of endpoints specified in `--endpoints` flag
endpoint status Prints out the status of endpoints specified in `--endpoints` flag
get Gets the key or a range of keys
help Help about any command
lease grant Creates leases
lease keep-alive Keeps leases alive (renew)
lease list List all active leases
lease revoke Revokes leases
lease timetolive Get lease information
lock Acquires a named lock
make-mirror Makes a mirror at the destination etcd cluster
member add Adds a member into the cluster
member list Lists all members in the cluster
member promote Promotes a non-voting member in the cluster
member remove Removes a member from the cluster
member update Updates a member in the cluster
move-leader Transfers leadership to another etcd cluster member.
put Puts the given key into the store
role add Adds a new role
role delete Deletes a role
role get Gets detailed information of a role
role grant-permission Grants a key to a role
role list Lists all roles
role revoke-permission Revokes a key from a role
snapshot restore Restores an etcd member snapshot to an etcd directory
snapshot save Stores an etcd node backend snapshot to a given file
snapshot status [deprecated] Gets backend snapshot status of a given file
txn Txn processes all the requests in one transaction
user add Adds a new user
user delete Deletes a user
user get Gets detailed information of a user
user grant-role Grants a role to a user
user list Lists all users
user passwd Changes password of user
user revoke-role Revokes a role from a user
version Prints the version of etcdctl
watch Watches events stream on keys or prefixes
OPTIONS:
--cacert="" verify certificates of TLS-enabled secure servers using this CA bundle
--cert="" identify secure client using this TLS certificate file
--command-timeout=5s timeout for short running command (excluding dial timeout)
--debug[=false] enable client-side debug logging
--dial-timeout=2s dial timeout for client connections
-d, --discovery-srv="" domain name to query for SRV records describing cluster endpoints
--discovery-srv-name="" service name to query when using DNS discovery
--endpoints=[127.0.0.1:2379] gRPC endpoints
-h, --help[=false] help for etcdctl
--hex[=false] print byte strings as hex encoded strings
--insecure-discovery[=true] accept insecure SRV records describing cluster endpoints
--insecure-skip-tls-verify[=false] skip server certificate verification (CAUTION: this option should be enabled only for testing purposes)
--insecure-transport[=true] disable transport security for client connections
--keepalive-time=2s keepalive time for client connections
--keepalive-timeout=6s keepalive timeout for client connections
--key="" identify secure client using this TLS key file
--password="" password for authentication (if this option is used, --user option shouldn't include password)
--user="" username[:password] for authentication (prompt if password is not supplied)
-w, --write-out="simple" set the output format (fields, json, protobuf, simple, table)
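Note: "cluster-health" was an etcdctl v2 command, and the etcdctl 3.5.6 binary on the nodes only exposes v3 commands (see the help output above), which matches the "unknown command" error. A v3-style health check using the same certificates (a sketch that simply reuses the paths and endpoints from the failing KubeKey command) would be:

# export ETCDCTL_API=3
# /usr/local/bin/etcdctl \
    --cert=/etc/ssl/etcd/ssl/admin-k8sn5.pem \
    --key=/etc/ssl/etcd/ssl/admin-k8sn5-key.pem \
    --cacert=/etc/ssl/etcd/ssl/ca.pem \
    --endpoints=https://192.168.122.102:2379,https://192.168.122.103:2379,https://192.168.122.104:2379 \
    endpoint health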