Running ./kk create cluster -f config-sample.yaml -a kubesphere.tar.gz hangs partway through with no log output
Which version of KubeKey has the issue?
kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.7", GitCommit:"e755baf67198d565689d7207378174f429b508ba", GitTreeState:"clean", BuildDate:"2023-01-18T01:57:24Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}
What is your OS environment?
ky10_x86
KubeKey config file
The main configuration is as follows:
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: master70, address: 192.168.1.1, internalAddress: 192.168.1.1, user: root, password: "Aa12345!"}
  - {name: master72, address: 192.168.1.2, internalAddress: 192.168.1.2, user: root, password: "Aa12345!"}
  - {name: master74, address: 192.168.1.3, internalAddress: 192.168.1.3, user: root, password: "Aa12345!"}
  - {name: node76, address: 192.168.1.4, internalAddress: 192.168.1.4, user: root, password: "Aa12345!"}
  - {name: node68, address: 192.168.1.5, internalAddress: 192.168.1.5, user: root, password: "Aa12345!"}
  roleGroups:
    etcd:
    - master70
    - master72
    - master74
    control-plane:
    - master70
    - master72
    - master74
    worker:
    - node76
    - node68
    registry:
    - master70
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.13
    clusterName: cluster.local
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    type: harbor
    auths:
      "dockerhub.kubekey.local":
        username: admin
        password: Harbor12345
    privateRegistry: "dockerhub.kubekey.local"
    namespaceOverride: "cytech_pf"
    registryMirrors: []
    insecureRegistries: ["dockerhub.kubekey.local"]
    #privateRegistry: ""
    #namespaceOverride: ""
    #registryMirrors: []
    #insecureRegistries: []
  addons: []
A clear and concise description of what happened.
After running init there were no errors, but create hangs at the line [InstallContainerModule] Add auths to container runtime and shows no further activity. How should I troubleshoot this?
20:45:38 CST success: [node76]
20:45:38 CST success: [master74]
20:45:38 CST success: [master70]
20:45:38 CST success: [node68]
20:45:38 CST success: [master72]
20:45:38 CST [RepositoryModule] Reset repository to the original repository
20:45:39 CST success: [node68]
20:45:39 CST success: [master74]
20:45:39 CST success: [node76]
20:45:39 CST success: [master70]
20:45:39 CST success: [master72]
20:45:39 CST [RepositoryModule] Umount ISO file
20:45:39 CST success: [node68]
20:45:39 CST success: [master74]
20:45:39 CST success: [node76]
20:45:39 CST success: [master70]
20:45:39 CST success: [master72]
20:45:39 CST [NodeBinariesModule] Download installation binaries
20:45:39 CST message: [localhost] downloading amd64 kubeadm v1.21.13 ...
20:45:39 CST message: [localhost] kubeadm is existed
20:45:39 CST message: [localhost] downloading amd64 kubelet v1.21.13 ...
20:45:40 CST message: [localhost] kubelet is existed
20:45:40 CST message: [localhost] downloading amd64 kubectl v1.21.13 ...
20:45:41 CST message: [localhost] kubectl is existed
20:45:41 CST message: [localhost] downloading amd64 helm v3.9.0 ...
20:45:41 CST message: [localhost] helm is existed
20:45:41 CST message: [localhost] downloading amd64 kubecni v0.9.1 ...
20:45:41 CST message: [localhost] kubecni is existed
20:45:41 CST message: [localhost] downloading amd64 crictl v1.24.0 ...
20:45:41 CST message: [localhost] crictl is existed
20:45:41 CST message: [localhost] downloading amd64 etcd v3.4.13 ...
20:45:42 CST message: [localhost] etcd is existed
20:45:42 CST message: [localhost] downloading amd64 docker 20.10.8 ...
20:45:42 CST message: [localhost] docker is existed
20:45:42 CST success: [LocalHost]
20:45:42 CST [ConfigureOSModule] Get OS release
20:45:42 CST success: [node68]
20:45:42 CST success: [master74]
20:45:42 CST success: [node76]
20:45:42 CST success: [master70]
20:45:42 CST success: [master72]
20:45:42 CST [ConfigureOSModule] Prepare to init OS
20:45:48 CST success: [node68]
20:45:48 CST success: [node76]
20:45:48 CST success: [master74]
20:45:48 CST success: [master70]
20:45:48 CST success: [master72]
20:45:48 CST [ConfigureOSModule] Generate init os script
20:45:49 CST success: [node68]
20:45:49 CST success: [master74]
20:45:49 CST success: [node76]
20:45:49 CST success: [master70]
20:45:49 CST success: [master72]
20:45:49 CST [ConfigureOSModule] Exec init os script
20:45:50 CST stdout: [node68]
setenforce: SELinux is disabled
Disabled
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:51 CST stdout: [master74]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:51 CST stdout: [node76]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
kernel.sem = 5010 641280 5010 256
kernel.shmall = 2097152
kernel.shmmax = 53687091200
kernel.shmmni = 8192
vm.mmap_min_addr = 65536
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 10
vm.dirty_ratio = 60
vm.min_free_kbytes = 512000
vm.vfs_cache_pressure = 200
fs.aio-max-nr = 1048576
fs.file-max = 76724600
fs.nr_open = 2097152
net.core.netdev_max_backlog = 32768
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.somaxconn = 4096
net.core.wmem_default = 262144
net.core.wmem_max = 4194304
net.ipv4.ip_local_port_range = 9000 65500
net.ipv4.route.gc_timeout = 100
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_wmem = 8192 436600 873200
net.ipv4.tcp_rmem = 32768 436600 873200
net.ipv4.tcp_mem = 94500000 91500000 92700000
net.ipv4.tcp_max_orphans = 3276800
20:45:52 CST stdout: [master72]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:53 CST stdout: [master70]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:53 CST success: [node68]
20:45:53 CST success: [master74]
20:45:53 CST success: [node76]
20:45:53 CST success: [master72]
20:45:53 CST success: [master70]
20:45:53 CST [ConfigureOSModule] configure the ntp server for each node
20:45:53 CST skipped: [master70]
20:45:53 CST skipped: [node76]
20:45:53 CST skipped: [master74]
20:45:53 CST skipped: [node68]
20:45:53 CST skipped: [master72]
20:45:53 CST [KubernetesStatusModule] Get kubernetes cluster status
20:45:55 CST success: [master70]
20:45:55 CST success: [master72]
20:45:55 CST success: [master74]
20:45:55 CST [InstallContainerModule] Sync docker binaries
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Generate docker service
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Generate docker config
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Enable docker
20:45:57 CST skipped: [node68]
20:45:57 CST skipped: [master74]
20:45:57 CST skipped: [node76]
20:45:57 CST skipped: [master72]
20:45:57 CST skipped: [master70]
20:45:57 CST [InstallContainerModule] Add auths to container runtime
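When a run stalls like this, it can help to confirm which step was the last one started and which hosts have reported back since. The sketch below assumes the kk output was saved to a file with one log entry per line (for example by piping the command through `tee kk-create.log`, a hypothetical file name) and that result lines look like the `success: [host]` and `skipped: [host]` entries above. The function names are illustrative and not part of KubeKey.

```shell
#!/usr/bin/env bash
# Assumption: $1 is a saved kk log with one entry per line.

# Print the last "[XxxModule] step name" header found in the log.
last_step() {
  grep -oE '\[[A-Za-z]+Module\] .*' "$1" | tail -n 1
}

# Print the hosts that reported a result after the last step header.
# An empty result means no host has finished the step that is hanging.
hosts_after_last_step() {
  awk '
    /\[[A-Za-z]+Module\] / { n = 0; next }   # new step header: reset the list
    /(success|skipped): \[/ {
      host = $0
      sub(/^.*: \[/, "", host)               # drop everything up to the bracket
      sub(/\].*$/, "", host)                 # drop the closing bracket and rest
      hosts[n++] = host
    }
    END { for (i = 0; i < n; i++) print hosts[i] }
  ' "$1"
}
```

For the log above, `last_step kk-create.log` would print `[InstallContainerModule] Add auths to container runtime`, and `hosts_after_last_step kk-create.log` would print nothing, since no host reported a result for that step.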
Relevant log output
Additional information
No response
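Judging from the log, Docker was already installed on your machines before kk create cluster was run (the Sync docker binaries, Generate docker service, Generate docker config, and Enable docker steps were all skipped). It is quite possible that the locally installed Docker version is incompatible with the cluster environment. When Docker is not installed on the system, kk create cluster automatically downloads it from the official site, then installs and configures a compatible version. I recommend uninstalling the existing Docker first and then letting kk install it, to keep the environment consistent and compatible.
So far I have narrowed the problem down to the node76 machine, but I cannot find the cause. All of the environments were initialized and installed offline.
The default software versions installed by kk v3.0.7 are listed here: https://github.com/kubesphere/kubekey/blob/e755baf67198d565689d7207378174f429b508ba/cmd/kk/apis/kubekey/v1alpha2/default.go#L44-L46
You can check the Docker system logs on the corresponding machines, node76 and node68, to see whether they contain any useful information.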
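One way to check the version concern is to compare the Docker version running on each node with the one kk reported in the log (`downloading amd64 docker 20.10.8`). This is a minimal sketch: `EXPECTED_DOCKER_VERSION` and `check_docker_version` are illustrative names, and an exact string match is stricter than real compatibility, so a mismatch is only a hint and not proof of a problem.

```shell
#!/usr/bin/env bash
# Expected version, taken from the "downloading amd64 docker 20.10.8" log line.
EXPECTED_DOCKER_VERSION="20.10.8"

# Compare an installed version string (first argument) with the expected one.
check_docker_version() {
  if [ "$1" = "$EXPECTED_DOCKER_VERSION" ]; then
    echo "match: $1"
  else
    echo "mismatch: installed=${1:-unknown} expected=$EXPECTED_DOCKER_VERSION"
  fi
}

# Ask the local daemon for its version. The timeout guards against an
# unresponsive dockerd; an empty result is reported as "unknown".
installed="$(timeout 10 docker version --format '{{.Server.Version}}' 2>/dev/null || true)"
check_docker_version "$installed"
```

Run it on each node in turn. If the output is `installed=unknown`, the daemon did not answer within ten seconds or the `docker` CLI is missing, which is worth investigating on its own.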
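A rough sketch of that check, meant to be run directly on node76 or node68. It assumes Docker runs as a systemd unit named `docker`; `docker_diag` is an illustrative name. Every command is wrapped in a timeout so the script itself does not hang if the daemon is unresponsive.

```shell
#!/usr/bin/env bash
# Collect basic Docker diagnostics on the current node. Each check is guarded
# so the function still finishes when a tool is missing or a command stalls.
docker_diag() {
  echo "== docker service status =="
  if command -v systemctl >/dev/null 2>&1; then
    timeout 10 systemctl status docker --no-pager 2>&1 || true
  else
    echo "systemctl not found"
  fi

  echo "== recent docker journal entries =="
  if command -v journalctl >/dev/null 2>&1; then
    timeout 10 journalctl -u docker --no-pager -n 100 2>&1 || true
  else
    echo "journalctl not found"
  fi

  echo "== docker daemon reachability =="
  if command -v docker >/dev/null 2>&1; then
    # A failure or timeout here suggests the daemon itself is not responding.
    timeout 10 docker info 2>&1 || echo "docker info failed or timed out"
  else
    echo "docker not found"
  fi
}

docker_diag
```

Comparing the output from node76 with that from a node that behaves normally may make any difference easier to spot.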