kk 3.1.9: offline deployment of the ARM version runs into the following problem
Your current KubeKey version
3.1.9
Describe this feature
The full deployment log is as follows:
08:43:56 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/pause:3.9
08:43:56 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-apiserver:v1.33.0
08:43:56 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-controller-manager:v1.33.0
08:43:56 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-scheduler:v1.33.0
08:43:56 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-proxy:v1.33.0
08:43:56 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-proxy:v1.33.0
08:43:57 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/coredns:1.9.3
08:43:57 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.22.20
08:43:57 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/coredns:1.9.3
08:43:57 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-controllers:v3.27.4
08:43:57 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/cni:v3.27.4
08:43:57 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/k8s-dns-node-cache:1.22.20
08:43:57 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/node:v3.27.4
08:43:58 CST message: [host]
downloading image: dockerhub.kubekey.local/kubesphereio/pod2daemon-flexvol:v3.27.4
08:43:58 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/kube-controllers:v3.27.4
08:43:58 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/cni:v3.27.4
08:43:59 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/node:v3.27.4
08:43:59 CST message: [member1]
downloading image: dockerhub.kubekey.local/kubesphereio/pod2daemon-flexvol:v3.27.4
08:44:00 CST success: [host]
08:44:00 CST success: [member1]
08:44:00 CST [ETCDPreCheckModule] Get etcd status
08:44:00 CST success: [host]
08:44:00 CST [CertsModule] Fetch etcd certs
08:44:00 CST success: [host]
08:44:00 CST [CertsModule] Generate etcd Certs
08:44:01 CST success: [LocalHost]
08:44:01 CST [CertsModule] Synchronize certs file
08:44:05 CST success: [host]
08:44:05 CST [CertsModule] Synchronize certs file to master
08:44:05 CST skipped: [host]
08:44:05 CST [InstallETCDBinaryModule] Install etcd using binary
08:44:06 CST success: [host]
08:44:06 CST [InstallETCDBinaryModule] Generate etcd service
08:44:07 CST success: [host]
08:44:07 CST [InstallETCDBinaryModule] Generate access address
08:44:07 CST success: [host]
08:44:07 CST [ETCDConfigureModule] Health check on exist etcd
08:44:07 CST skipped: [host]
08:44:07 CST [ETCDConfigureModule] Generate etcd.env config on new etcd
08:44:08 CST success: [host]
08:44:08 CST [ETCDConfigureModule] Refresh etcd.env config on all etcd
08:44:08 CST success: [host]
08:44:08 CST [ETCDConfigureModule] Restart etcd
08:44:14 CST success: [host]
08:44:14 CST [ETCDConfigureModule] Health check on all etcd
08:44:14 CST success: [host]
08:44:14 CST [ETCDConfigureModule] Refresh etcd.env config to exist mode on all etcd
08:44:14 CST success: [host]
08:44:14 CST [ETCDConfigureModule] Health check on all etcd
08:44:15 CST success: [host]
08:44:15 CST [ETCDBackupModule] Backup etcd data regularly
08:44:15 CST success: [host]
08:44:15 CST [ETCDBackupModule] Generate backup ETCD service
08:44:16 CST success: [host]
08:44:16 CST [ETCDBackupModule] Generate backup ETCD timer
08:44:16 CST success: [host]
08:44:16 CST [ETCDBackupModule] Enable backup etcd service
08:44:17 CST success: [host]
08:44:17 CST [InstallKubeBinariesModule] Synchronize kubernetes binaries
08:44:35 CST success: [host]
08:44:35 CST success: [member1]
08:44:35 CST [InstallKubeBinariesModule] Change kubelet mode
08:44:35 CST success: [host]
08:44:35 CST success: [member1]
08:44:35 CST [InstallKubeBinariesModule] Generate kubelet service
08:44:36 CST success: [host]
08:44:36 CST success: [member1]
08:44:36 CST [InstallKubeBinariesModule] Enable kubelet service
08:44:37 CST success: [host]
08:44:37 CST success: [member1]
08:44:37 CST [InstallKubeBinariesModule] Generate kubelet env
08:44:38 CST success: [host]
08:44:38 CST success: [member1]
08:44:38 CST [InitKubernetesModule] Generate kubeadm config
08:44:39 CST success: [host]
08:44:39 CST [InitKubernetesModule] Generate audit policy
08:44:39 CST skipped: [host]
08:44:39 CST [InitKubernetesModule] Generate audit webhook
08:44:39 CST skipped: [host]
08:44:39 CST [InitKubernetesModule] Init cluster using kubeadm
08:48:44 CST stdout: [host]
W0515 08:44:39.180162 36833 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:44:39.181096 36833 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:44:39.183997 36833 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:44:39.276264 36833 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.620193ms
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-controller-manager is not healthy after 4m0.000648186s
[control-plane-check] kube-scheduler is not healthy after 4m0.000809016s
[control-plane-check] kube-apiserver is not healthy after 4m0.000881116s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused, kube-apiserver check failed at https://20.48.5.133:6443/livez: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline]
To see the stack trace of this error execute with --v=5 or higher
08:48:47 CST stdout: [host]
[reset] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[reset] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0515 08:48:45.060538 22601 reset.go:137] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://lb.kubesphere.local:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused
[preflight] Running pre-flight checks
W0515 08:48:45.060637 22601 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not perform cleanup of CNI plugin configuration,
network filtering rules and kubeconfig files.
For information on how to perform this cleanup manually, please see:
https://k8s.io/docs/reference/setup-tools/kubeadm/kubeadm-reset/
08:48:47 CST message: [host]
init kubernetes cluster failed: Failed to exec command: sudo -E /bin/bash -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl,ImagePull"
W0515 08:44:39.180162 36833 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:44:39.181096 36833 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:44:39.183997 36833 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:44:39.276264 36833 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.620193ms
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-controller-manager is not healthy after 4m0.000648186s
[control-plane-check] kube-scheduler is not healthy after 4m0.000809016s
[control-plane-check] kube-apiserver is not healthy after 4m0.000881116s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused, kube-apiserver check failed at https://20.48.5.133:6443/livez: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline]
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1
08:48:47 CST retry: [host]
08:53:00 CST stdout: [host]
W0515 08:48:52.270301 22698 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:48:52.271802 22698 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:48:52.273727 22698 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:48:52.368232 22698 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.501424809s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-apiserver is not healthy after 4m0.001364706s
[control-plane-check] kube-scheduler is not healthy after 4m0.001438136s
[control-plane-check] kube-controller-manager is not healthy after 4m0.001405066s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-apiserver check failed at https://20.48.5.133:6443/livez: Get "https://lb.kubesphere.local:6443/livez?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused, kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused]
To see the stack trace of this error execute with --v=5 or higher
08:53:03 CST stdout: [host]
[reset] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[reset] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0515 08:53:01.090705 8437 reset.go:137] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://lb.kubesphere.local:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused
[preflight] Running pre-flight checks
W0515 08:53:01.090833 8437 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not perform cleanup of CNI plugin configuration,
network filtering rules and kubeconfig files.
For information on how to perform this cleanup manually, please see:
https://k8s.io/docs/reference/setup-tools/kubeadm/kubeadm-reset/
08:53:03 CST message: [host]
init kubernetes cluster failed: Failed to exec command: sudo -E /bin/bash -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl,ImagePull"
W0515 08:48:52.270301 22698 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:48:52.271802 22698 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:48:52.273727 22698 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:48:52.368232 22698 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.501424809s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-apiserver is not healthy after 4m0.001364706s
[control-plane-check] kube-scheduler is not healthy after 4m0.001438136s
[control-plane-check] kube-controller-manager is not healthy after 4m0.001405066s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-apiserver check failed at https://20.48.5.133:6443/livez: Get "https://lb.kubesphere.local:6443/livez?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused, kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused]
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1
08:53:03 CST retry: [host]
08:57:14 CST stdout: [host]
W0515 08:53:08.543796 8540 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:53:08.544785 8540 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:53:08.546786 8540 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:53:08.645407 8540 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.50191261s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-apiserver is not healthy after 4m0.001045575s
[control-plane-check] kube-controller-manager is not healthy after 4m0.000959595s
[control-plane-check] kube-scheduler is not healthy after 4m0.001149145s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-apiserver check failed at https://20.48.5.133:6443/livez: Get "https://lb.kubesphere.local:6443/livez?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused, kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused]
To see the stack trace of this error execute with --v=5 or higher
08:57:17 CST stdout: [host]
[reset] Reading configuration from the "kubeadm-config" ConfigMap in namespace "kube-system"...
[reset] Use 'kubeadm init phase upload-config --config your-config-file' to re-upload it.
W0515 08:57:15.426257 59593 reset.go:137] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get config map: Get "https://lb.kubesphere.local:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused
[preflight] Running pre-flight checks
W0515 08:57:15.426359 59593 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /var/lib/kubelet /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
The reset process does not perform cleanup of CNI plugin configuration,
network filtering rules and kubeconfig files.
For information on how to perform this cleanup manually, please see:
https://k8s.io/docs/reference/setup-tools/kubeadm/kubeadm-reset/
08:57:17 CST message: [host]
init kubernetes cluster failed: Failed to exec command: sudo -E /bin/bash -c "/usr/local/bin/kubeadm init --config=/etc/kubernetes/kubeadm-config.yaml --ignore-preflight-errors=FileExisting-crictl,ImagePull"
W0515 08:53:08.543796 8540 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "ClusterConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:53:08.544785 8540 common.go:101] your configuration file uses a deprecated API spec: "kubeadm.k8s.io/v1beta3" (kind: "InitConfiguration"). Please use 'kubeadm config migrate --old-config old-config-file --new-config new-config-file', which will write the new, similar spec using a newer API version.
W0515 08:53:08.546786 8540 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]
[init] Using Kubernetes version: v1.33.0
[preflight] Running pre-flight checks
[WARNING SystemVerification]: cgroups v1 support is in maintenance mode, please migrate to cgroups v2
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W0515 08:53:08.645407 8540 checks.go:846] detected that the sandbox image "dockerhub.kubekey.local/kubesphereio/pause:3.9" of the container runtime is inconsistent with that used by kubeadm.It is recommended to use "dockerhub.kubekey.local/kubesphereio/pause:3.10" as the CRI sandbox image.
[WARNING ImagePull]: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: failed to pull image dockerhub.kubekey.local/kubesphereio/pause:3.10: rpc error: code = NotFound desc = failed to pull and unpack image "dockerhub.kubekey.local/kubesphereio/pause:3.10": failed to resolve reference "dockerhub.kubekey.local/kubesphereio/pause:3.10": dockerhub.kubekey.local/kubesphereio/pause:3.10: not found
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [host host.cluster.local kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local lb.kubesphere.local localhost member1 member1.cluster.local] and IPs [10.233.0.1 20.48.5.133 127.0.0.1 20.48.1.130]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 1.50191261s
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://20.48.5.133:6443/livez
[control-plane-check] Checking kube-controller-manager at https://0.0.0.0:10257/healthz
[control-plane-check] Checking kube-scheduler at https://0.0.0.0:10259/livez
[control-plane-check] kube-apiserver is not healthy after 4m0.001045575s
[control-plane-check] kube-controller-manager is not healthy after 4m0.000959595s
[control-plane-check] kube-scheduler is not healthy after 4m0.001149145s
A control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: failed while waiting for the control plane to start: [kube-apiserver check failed at https://20.48.5.133:6443/livez: Get "https://lb.kubesphere.local:6443/livez?timeout=10s": dial tcp 20.48.5.133:6443: connect: connection refused, kube-controller-manager check failed at https://0.0.0.0:10257/healthz: Get "https://0.0.0.0:10257/healthz": dial tcp 0.0.0.0:10257: connect: connection refused, kube-scheduler check failed at https://0.0.0.0:10259/livez: Get "https://0.0.0.0:10259/livez": dial tcp 0.0.0.0:10259: connect: connection refused]
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1
08:57:17 CST failed: [host]
Describe the solution you'd like
The error occurs when running the following installation command: ./kk create cluster -f config-sample.yaml -a kubesphere.tar.gz --with-local-storage
Additional information
No response
The log shows that the kube-apiserver container failed to start. Troubleshoot it with the two crictl commands the kubeadm output itself suggests: list all Kubernetes containers, find the failing one, and then inspect its logs (a ready-to-run sketch follows below).
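For reference, here is a minimal troubleshooting sketch built from the two commands quoted in the kubeadm output above. It assumes the default containerd endpoint shown in the log; CONTAINERID is a placeholder for the ID reported by the first command.

```bash
# List all Kubernetes control-plane containers, including ones that have already exited
crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause

# Inspect the logs of the failing container (replace CONTAINERID with an ID from the listing above)
crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID
```

If those logs point at the sandbox image, note the preflight warning earlier in the log: kubeadm for v1.33.0 expects dockerhub.kubekey.local/kubesphereio/pause:3.10, which is reported as not found in the offline registry (only pause:3.9 was pushed), so pushing a pause:3.10 image to the local registry may be worth trying.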