install k8s 1.23 failed

Open slzzz opened this issue 3 years ago • 13 comments

What is version of KubeKey has the issue?

1.17.3

What is your os environment?

centos7.6

KubeKey config file

No response

A clear and concise description of what happened.

images 404

17:22:00 CST [PullModule] Start to pull images on all nodes
17:22:00 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:00 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:02 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:02 CST retry: [node2]
17:22:02 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:02 CST retry: [node1]
17:22:07 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:07 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:08 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:08 CST retry: [node2]
17:22:08 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:08 CST retry: [node1]
17:22:13 CST message: [node2] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:13 CST message: [node1] downloading image: registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
17:22:14 CST message: [node2] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:14 CST retry: [node2]
17:22:14 CST message: [node1] pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
17:22:14 CST retry: [node1]
17:22:14 CST failed: [node2]
17:22:14 CST failed: [node1]
error: Pipeline[CreateClusterPipeline] execute failed: Module[PullModule] exec failed:
failed: [node2] [PullImages] exec failed after 3 retires: pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1
failed: [node1] [PullImages] exec failed after 3 retires: pull image failed: Failed to exec command: sudo -E /bin/bash -c "env PATH=$PATH docker pull registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6" Error response from daemon: manifest for registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6 not found: manifest unknown: manifest unknown: Process exited with status 1

Relevant log output

No response

Additional information

No response

slzzz avatar Dec 17 '21 09:12 slzzz

The images have been synchronized; you can try again.

registry.cn-beijing.aliyuncs.com/kubesphereio/pause:3.6
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-apiserver:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-controller-manager:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-scheduler:v1.23.0
registry.cn-beijing.aliyuncs.com/kubesphereio/kube-proxy:v1.23.0

pixiake avatar Dec 17 '21 13:12 pixiake
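
To confirm that the synchronized images can be pulled before rerunning the installer, a small helper along these lines may be useful. This is only a sketch: the function name `k8s_images` is hypothetical, and the registry prefix and the `pause:3.6` tag are taken from the list in this thread rather than from KubeKey itself.

```shell
#!/usr/bin/env bash
# Print the image references quoted in this thread for one Kubernetes version.
# k8s_images is a hypothetical helper, not part of KubeKey.
k8s_images() {
  local version="$1"
  local repo="registry.cn-beijing.aliyuncs.com/kubesphereio"
  local component

  # The pause image has its own tag (3.6 in this thread).
  echo "${repo}/pause:3.6"

  # The control-plane images and kube-proxy are tagged with the Kubernetes version.
  for component in kube-apiserver kube-controller-manager kube-scheduler kube-proxy; do
    echo "${repo}/${component}:${version}"
  done
}

# To try the pulls on a node (requires Docker), something like:
#   k8s_images v1.23.0 | while read -r image; do
#     docker pull "$image" || echo "missing: $image"
#   done
```

Running the commented loop on each node should show quickly whether any of the five images still returns "manifest unknown".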

Thanks, it ran successfully. Why is Docker still present in version 1.23?

slzzz avatar Dec 20 '21 10:12 slzzz

The runtime is still Docker:

System Info:
  Machine ID: 24c86563b2ff45a6abafe73bb089e42e
  System UUID: 7C954085-22E5-8044-8539-C376191E5676
  Boot ID: 98f4fe25-e670-412b-8535-1a462b6b41db
  Kernel Version: 3.10.0-1062.el7.x86_64
  OS Image: CentOS Linux 7 (Core)
  Operating System: linux
  Architecture: amd64
  Container Runtime Version: docker://20.10.8

slzzz avatar Dec 20 '21 11:12 slzzz

https://kubernetes.io/blog/2021/11/12/are-you-ready-for-dockershim-removal/

If you want to use containerd as the runtime, you can specify --container-manager containerd.

pixiake avatar Dec 20 '21 15:12 pixiake
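
For context, dockershim was still part of the kubelet in v1.23 and was removed in v1.24, so a Docker runtime on a 1.23 cluster is expected unless another runtime is selected. A rough way to check which runtime a node reports is to read the "Container Runtime Version" field shown above. The helper name `runtime_name` below is hypothetical; the `kubectl` command in the comment uses the standard `nodeInfo.containerRuntimeVersion` field.

```shell
#!/usr/bin/env bash
# Extract the runtime name from a "Container Runtime Version" value,
# for example "docker://20.10.8" -> "docker".
# runtime_name is a hypothetical helper for illustration.
runtime_name() {
  # Strip everything from the first "://" onward.
  echo "${1%%://*}"
}

# On a live cluster (requires kubectl), the value can be listed per node with:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
```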

Do I need to manually modify the kubelet startup parameters? Will subsequent versions automate this?

slzzz avatar Dec 22 '21 02:12 slzzz

  1. Delete the current cluster: ./kk delete cluster -f config.yaml
  2. Download the latest master branch source code and build it. (We fixed a command-line flag bug yesterday.)
  3. Create the new cluster with the runtime you want: ./kk create cluster -f config.yaml --container-manager containerd

24sama avatar Dec 22 '21 02:12 24sama
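
The delete and create steps above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is torn down. The function name `recreate_plan` is hypothetical, and the build step is left out because the thread does not give the build command.

```shell
#!/usr/bin/env bash
# Print (do not execute) the two kk commands from the steps above.
# recreate_plan is a hypothetical helper; config.yaml is the file name used in the thread.
recreate_plan() {
  local config="${1:-config.yaml}"

  # Step 1: remove the existing Docker-based cluster.
  echo "./kk delete cluster -f ${config}"

  # Step 3: recreate it with containerd as the container manager.
  # (Step 2, rebuilding kk from the master branch, happens in between.)
  echo "./kk create cluster -f ${config} --container-manager containerd"
}

recreate_plan config.yaml
```

Printing first is a deliberate choice here: deleting a cluster is destructive, so it may be worth reading the exact commands before running them by hand.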

Creating the cluster failed.

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.

Here is one example how you may list all Kubernetes containers running in cri-o/containerd using crictl:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1

slzzz avatar Dec 23 '21 06:12 slzzz

22643 server.go:205] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory

slzzz avatar Dec 23 '21 06:12 slzzz

It looks like there are some residual files on your host. I tried again on a clean host and everything worked. Could you please try the installation again on a clean host, as I did?

24sama avatar Dec 23 '21 06:12 24sama
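
One way to look for leftovers before reinstalling is to list which known paths still exist on the host. The sketch below is an assumption about what "residual files" might mean: the function name `list_existing` is hypothetical, and the example paths are simply the ones that appear in the log output in this thread, not an exhaustive cleanup list.

```shell
#!/usr/bin/env bash
# Print each given path that still exists on this host.
# list_existing is a hypothetical helper for illustration.
list_existing() {
  local path
  for path in "$@"; do
    if [ -e "$path" ]; then
      echo "$path"
    fi
  done
}

# Example, using paths mentioned in the logs above:
#   list_existing /var/lib/kubelet /etc/kubernetes/manifests
```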

Hi, please check your containerd log. I'm facing the same issue. If you find the error "apparmor_parser": executable file not found in $PATH in your log file, please install apparmor-parser.

chaunceyjiang avatar Dec 23 '21 09:12 chaunceyjiang
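
A quick check for this condition might look like the following. It is a sketch only: `have_cmd` is a hypothetical helper, and the `journalctl` line in the comment assumes containerd runs as a systemd unit named `containerd`.

```shell
#!/usr/bin/env bash
# Return success if the named executable can be found in PATH.
# have_cmd is a hypothetical helper for illustration.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd apparmor_parser; then
  echo "apparmor_parser found"
else
  echo "apparmor_parser not found in PATH"
fi

# To look for the error in the containerd log (assumes a systemd unit named containerd):
#   journalctl -u containerd --no-pager | grep apparmor_parser
```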

Thanks. I cleaned up containerd and reinstalled it, and the installation succeeded.

slzzz avatar Dec 24 '21 09:12 slzzz

Hi @chaunceyjiang @24sama ,

I just downloaded KubeKey v2.0.0-alpha.3 and tried to install Kubernetes 1.23 with it, but it returns the following error:

22:00:57 CST message: [LocalHost]
No SHA256 found for kubeadm. v1.23 is not supported.
22:00:57 CST retry: [LocalHost]
22:00:57 CST failed: [LocalHost]
error: Pipeline[CreateClusterPipeline] execute failed: Module[NodeBinariesModule] exec failed:
failed: [LocalHost] [DownloadBinaries] exec failed after 1 retires: No SHA256 found for kubeadm. v1.23 is not supported.

How should I deal with it?

FeynmanZhou avatar Dec 25 '21 14:12 FeynmanZhou

Resolved, the specified K8s version should be v1.23.0 instead of v1.23.

FeynmanZhou avatar Dec 25 '21 14:12 FeynmanZhou
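
Since the installer looks up checksums by the exact version string, a small guard can catch a missing patch number before the download step. This is a sketch under that assumption; `is_full_version` is a hypothetical helper, not a KubeKey function.

```shell
#!/usr/bin/env bash
# Succeed only for versions of the form vMAJOR.MINOR.PATCH, such as v1.23.0.
# is_full_version is a hypothetical helper for illustration.
is_full_version() {
  echo "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

for candidate in v1.23 v1.23.0; do
  if is_full_version "$candidate"; then
    echo "${candidate}: ok"
  else
    echo "${candidate}: missing patch number, use e.g. ${candidate}.0"
  fi
done
```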