
Running ./kk create cluster -f config-sample.yaml -a kubesphere.tar.gz hangs partway through, with no log output

Open guoxingliang opened this issue 1 month ago • 3 comments

Which version of KubeKey has the issue?

kk version: &version.Info{Major:"3", Minor:"0", GitVersion:"v3.0.7", GitCommit:"e755baf67198d565689d7207378174f429b508ba", GitTreeState:"clean", BuildDate:"2023-01-18T01:57:24Z", GoVersion:"go1.19.2", Compiler:"gc", Platform:"linux/amd64"}

What is your os environment?

ky10_x86

KubeKey config file

The main configuration is as follows:
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: master70, address: 192.168.1.1, internalAddress: 192.168.1.1, user: root, password: "Aa12345!"}
  - {name: master72, address: 192.168.1.2, internalAddress: 192.168.1.2, user: root, password: "Aa12345!"}
  - {name: master74, address: 192.168.1.3, internalAddress: 192.168.1.3, user: root, password: "Aa12345!"}
  - {name: node76, address: 192.168.1.4, internalAddress: 192.168.1.4, user: root, password: "Aa12345!"}
  - {name: node68, address: 192.168.1.5, internalAddress: 192.168.1.5, user: root, password: "Aa12345!"}
  roleGroups:
    etcd:
    - master70
    - master72
    - master74
    control-plane:
    - master70
    - master72
    - master74
    worker:
    - node76
    - node68
    registry:
    - master70
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    internalLoadbalancer: haproxy

    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.21.13
    clusterName: cluster.local
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    type: harbor
    auths:
      "dockerhub.kubekey.local":
        username: admin
        password: Harbor12345
    privateRegistry: "dockerhub.kubekey.local"
    namespaceOverride: "cytech_pf"
    registryMirrors: []
    insecureRegistries: ["dockerhub.kubekey.local"]
    #privateRegistry: ""
    #namespaceOverride: ""
    #registryMirrors: []
    #insecureRegistries: []
  addons: []

A clear and concise description of what happened.

After running init there were no errors, but when I run create it hangs at the line "[InstallContainerModule] Add auths to container runtime" and nothing further is printed. How should I troubleshoot this?

20:45:38 CST success: [node76]
20:45:38 CST success: [master74]
20:45:38 CST success: [master70]
20:45:38 CST success: [node68]
20:45:38 CST success: [master72]
20:45:38 CST [RepositoryModule] Reset repository to the original repository
20:45:39 CST success: [node68]
20:45:39 CST success: [master74]
20:45:39 CST success: [node76]
20:45:39 CST success: [master70]
20:45:39 CST success: [master72]
20:45:39 CST [RepositoryModule] Umount ISO file
20:45:39 CST success: [node68]
20:45:39 CST success: [master74]
20:45:39 CST success: [node76]
20:45:39 CST success: [master70]
20:45:39 CST success: [master72]
20:45:39 CST [NodeBinariesModule] Download installation binaries
20:45:39 CST message: [localhost] downloading amd64 kubeadm v1.21.13 ...
20:45:39 CST message: [localhost] kubeadm is existed
20:45:39 CST message: [localhost] downloading amd64 kubelet v1.21.13 ...
20:45:40 CST message: [localhost] kubelet is existed
20:45:40 CST message: [localhost] downloading amd64 kubectl v1.21.13 ...
20:45:41 CST message: [localhost] kubectl is existed
20:45:41 CST message: [localhost] downloading amd64 helm v3.9.0 ...
20:45:41 CST message: [localhost] helm is existed
20:45:41 CST message: [localhost] downloading amd64 kubecni v0.9.1 ...
20:45:41 CST message: [localhost] kubecni is existed
20:45:41 CST message: [localhost] downloading amd64 crictl v1.24.0 ...
20:45:41 CST message: [localhost] crictl is existed
20:45:41 CST message: [localhost] downloading amd64 etcd v3.4.13 ...
20:45:42 CST message: [localhost] etcd is existed
20:45:42 CST message: [localhost] downloading amd64 docker 20.10.8 ...
20:45:42 CST message: [localhost] docker is existed
20:45:42 CST success: [LocalHost]
20:45:42 CST [ConfigureOSModule] Get OS release
20:45:42 CST success: [node68]
20:45:42 CST success: [master74]
20:45:42 CST success: [node76]
20:45:42 CST success: [master70]
20:45:42 CST success: [master72]
20:45:42 CST [ConfigureOSModule] Prepare to init OS
20:45:48 CST success: [node68]
20:45:48 CST success: [node76]
20:45:48 CST success: [master74]
20:45:48 CST success: [master70]
20:45:48 CST success: [master72]
20:45:48 CST [ConfigureOSModule] Generate init os script
20:45:49 CST success: [node68]
20:45:49 CST success: [master74]
20:45:49 CST success: [node76]
20:45:49 CST success: [master70]
20:45:49 CST success: [master72]
20:45:49 CST [ConfigureOSModule] Exec init os script
20:45:50 CST stdout: [node68]
setenforce: SELinux is disabled
Disabled
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:51 CST stdout: [master74]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:51 CST stdout: [node76]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
kernel.sem = 5010 641280 5010 256
kernel.shmall = 2097152
kernel.shmmax = 53687091200
kernel.shmmni = 8192
vm.mmap_min_addr = 65536
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 10
vm.dirty_ratio = 60
vm.min_free_kbytes = 512000
vm.vfs_cache_pressure = 200
fs.aio-max-nr = 1048576
fs.file-max = 76724600
fs.nr_open = 2097152
net.core.netdev_max_backlog = 32768
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.somaxconn = 4096
net.core.wmem_default = 262144
net.core.wmem_max = 4194304
net.ipv4.ip_local_port_range = 9000 65500
net.ipv4.route.gc_timeout = 100
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_wmem = 8192 436600 873200
net.ipv4.tcp_rmem = 32768 436600 873200
net.ipv4.tcp_mem = 94500000 91500000 92700000
net.ipv4.tcp_max_orphans = 3276800
20:45:52 CST stdout: [master72]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:53 CST stdout: [master70]
setenforce: SELinux is disabled
Disabled
kernel.sysrq = 0
net.ipv4.ip_forward = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.tcp_syncookies = 1
kernel.dmesg_restrict = 1
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_local_reserved_ports = 30000-32767
vm.max_map_count = 262144
vm.swappiness = 1
fs.inotify.max_user_instances = 524288
kernel.pid_max = 65535
20:45:53 CST success: [node68]
20:45:53 CST success: [master74]
20:45:53 CST success: [node76]
20:45:53 CST success: [master72]
20:45:53 CST success: [master70]
20:45:53 CST [ConfigureOSModule] configure the ntp server for each node
20:45:53 CST skipped: [master70]
20:45:53 CST skipped: [node76]
20:45:53 CST skipped: [master74]
20:45:53 CST skipped: [node68]
20:45:53 CST skipped: [master72]
20:45:53 CST [KubernetesStatusModule] Get kubernetes cluster status
20:45:55 CST success: [master70]
20:45:55 CST success: [master72]
20:45:55 CST success: [master74]
20:45:55 CST [InstallContainerModule] Sync docker binaries
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Generate docker service
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Generate docker config
20:45:56 CST skipped: [node68]
20:45:56 CST skipped: [master74]
20:45:56 CST skipped: [node76]
20:45:56 CST skipped: [master72]
20:45:56 CST skipped: [master70]
20:45:56 CST [InstallContainerModule] Enable docker
20:45:57 CST skipped: [node68]
20:45:57 CST skipped: [master74]
20:45:57 CST skipped: [node76]
20:45:57 CST skipped: [master72]
20:45:57 CST skipped: [master70]
20:45:57 CST [InstallContainerModule] Add auths to container runtime

Relevant log output


Additional information

No response

guoxingliang, Nov 04 '25 12:11

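Judging from the log, Docker was already installed on your machines before you ran kk create cluster, and the locally installed Docker version is quite possibly incompatible with the cluster environment. When Docker is not installed on the system, kk create cluster automatically downloads it from the official site, then installs and configures a compatible version. I suggest uninstalling the existing Docker first and then installing it through kk, to keep the environment consistent and compatible.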
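A minimal sketch for checking this on each node, assuming the target is Docker 20.10.8 (the version the kk log above reports) and that an exact string comparison is good enough. The function names are hypothetical helpers, not part of kk.

```shell
# Sketch only: compare the Docker version on a node with the one kk expects.
# compare_docker_version and installed_docker_version are illustrative names.

# Print "match" or a "mismatch: ..." line for an installed vs expected version.
compare_docker_version() {
  if [ "$1" = "$2" ]; then
    echo "match"
  else
    echo "mismatch: installed=${1:-unknown} expected=$2"
  fi
}

# Ask the local Docker daemon for its version.
# `timeout` keeps the check from blocking if the daemon does not respond.
installed_docker_version() {
  timeout 10 docker version --format '{{.Server.Version}}' 2>/dev/null
}

# Usage on each node (20.10.8 is taken from the kk log; adjust if yours differs):
#   compare_docker_version "$(installed_docker_version)" "20.10.8"
```

If the daemon does not answer within the timeout, the installed version comes back empty and is reported as unknown, which is itself a hint that the problem is on that node's Docker rather than in kk.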

redscholar, Nov 05 '25 04:11

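So far I have narrowed the problem down to the node76 machine, but I cannot find the cause. The whole environment was initialized and installed offline.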

Image

guoxingliang, Nov 06 '25 08:11

The default software versions installed by kk v3.0.7 are listed here: https://github.com/kubesphere/kubekey/blob/e755baf67198d565689d7207378174f429b508ba/cmd/kk/apis/kubekey/v1alpha2/default.go#L44-L46

You can check Docker's system logs on the corresponding machines (node76 and node68) to see whether they contain any useful information.
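One possible way to do that, sketched under the assumption that Docker runs as the systemd unit `docker` on these nodes and logs in its usual `level=...` format. The filter function and its keyword list are illustrative guesses, not a definitive set of patterns.

```shell
# Sketch only: narrow Docker daemon logs down to lines worth reading first.
# filter_docker_errors is a hypothetical helper; the keyword list is an assumption.

# Keep only lines that look like warnings or errors, or that mention
# common registry/login trouble (timeouts, TLS certificate errors, denials).
filter_docker_errors() {
  grep -Ei 'level=(warning|error|fatal)|timeout|x509|denied' || true
}

# Usage on node76 and node68:
#   journalctl -u docker --no-pager --since "today" | filter_docker_errors
#   systemctl status docker --no-pager
```

Comparing the filtered output from node76 with a node that behaves normally may show whether the daemon on node76 is failing to respond or failing to reach the registry.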

redscholar, Nov 06 '25 11:11