k8s cluster does not start properly after the virtual machines are rebooted

Open yasyx opened this issue 3 years ago • 3 comments

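Three virtual machines: 1 master and 2 nodes. After the virtual machines were shut down and started again, the k8s cluster did not start properly.

  1. Running kubectl get nodes fails with: Unable to connect to the server: dial tcp: lookup apiserver.cluster.local on 127.0.0.53:53: server misbehaving
  2. Running nerdctl ps -a --namespace k8s.io shows that some containers have not started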

image

yasyx avatar Aug 25 '22 07:08 yasyx

The apiserver.cluster.local entry is missing from /etc/hosts.

cuisongliu avatar Aug 26 '22 02:08 cuisongliu

The hosts file does contain the apiserver.cluster.local record, but the problem still occurs.

HereHaveAnPeople avatar Aug 29 '22 09:08 HereHaveAnPeople

We will add a feature later to restore the missing hosts entries.

cuisongliu avatar Sep 02 '22 06:09 cuisongliu

https://github.com/labring/sealos/issues/1894

cuisongliu avatar Oct 20 '22 14:10 cuisongliu

Adding the following entries to /etc/hosts on the master solves this problem:

xxxxx(your local ip) sealos.hub
xxxx(your local ip)  apiserver.cluster.local

Then run systemctl restart kubelet. Afterwards the containers are running again:

root@yyj-master1:/home/ubuntu# crictl ps 
CONTAINER           IMAGE               CREATED             STATE               NAME                        ATTEMPT             POD ID              POD
a4181e96588f4       75392e3500e36       3 minutes ago       Running             calico-node                 1                   4aa9ed07ebf9f       calico-node-sr7ql
b13e459e63cad       417ab3368bad1       3 minutes ago       Running             csi-node-driver-registrar   1                   a05a13bb6ee05       csi-node-driver-cqg5l
8babe1f11c9f7       f9c3c1813269c       3 minutes ago       Running             calico-kube-controllers     1                   e3ddfb9051ebf       calico-kube-controllers-85666c5b94-59vhq
43c4440450318       5185b96f0becf       3 minutes ago       Running             coredns                     1                   ae1d2ef40e7d4       coredns-565d847f94-9nw54
23404b638cfcd       5185b96f0becf       3 minutes ago       Running             coredns                     1                   534156c6f8fd6       coredns-565d847f94-ct57v
aba612a5ff59b       792ec15461e78       3 minutes ago       Running             calico-apiserver            1                   7e9e67bb582c0       calico-apiserver-5dcd979558-45qx5
3a534b4ce18b9       6a8c8f9f60dc6       3 minutes ago       Running             calico-csi                  1                   a05a13bb6ee05       csi-node-driver-cqg5l
3b12d9771a59e       58a9a0c6d96f2       3 minutes ago       Running             kube-proxy                  1                   89125db25f563       kube-proxy-gxb54
41bd8cf428985       a8a176a5d5d69       3 minutes ago       Running             etcd                        1                   45c955cea9f1a       etcd-yyj-master1
db2839f9d8b2c       1a54c86c03a67       3 minutes ago       Running             kube-controller-manager     1                   ee4802ff864f9       kube-controller-manager-yyj-master1
8b6efea0164ff       4d2edfd10d3e3       3 minutes ago       Running             kube-apiserver              1                   3060d422d3588       kube-apiserver-yyj-master1
14c575e87b438       bef2cf3115095       3 minutes ago       Running             kube-scheduler              1                   99b66de42651b       kube-scheduler-yyj-master1
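
The manual fix above can be sketched as a small idempotent helper, so repeated runs do not duplicate entries. This is only a sketch: `ensure_hosts_entry` is a hypothetical function name, not part of sealos, and the IP address in the usage note is a placeholder that must be replaced with the master's real address.

```shell
#!/bin/sh
# Hypothetical helper (not part of sealos): append "IP NAME" to a hosts file
# only when NAME is not already mapped on an uncommented line.
# Usage: ensure_hosts_entry IP NAME [FILE]   (FILE defaults to /etc/hosts)
ensure_hosts_entry() {
    ip="$1"
    name="$2"
    file="${3:-/etc/hosts}"
    # Look for NAME as a whole field (field 2 or later) on a non-comment line.
    if awk -v n="$name" '
        !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}
```

Run as root on the master, this would be called as `ensure_hosts_entry 192.0.2.10 sealos.hub` and `ensure_hosts_entry 192.0.2.10 apiserver.cluster.local` (with the placeholder address replaced), followed by `systemctl restart kubelet`.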

@yasyx

xiao-jay avatar Oct 21 '22 02:10 xiao-jay

It seems that only virtual machines have this problem, which is caused by cloud-init.
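
One way to check whether cloud-init is involved is to look for its `manage_etc_hosts` setting, which controls whether cloud-init regenerates /etc/hosts. The sketch below assumes the common config locations /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/; `show_manage_etc_hosts` is a hypothetical helper name, and an image may configure this elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: print every uncommented manage_etc_hosts setting
# found in the given cloud-init config files, prefixed with the file name.
# Returns non-zero when no such setting is found.
show_manage_etc_hosts() {
    grep -HE '^[[:space:]]*manage_etc_hosts[[:space:]]*:' "$@" 2>/dev/null
}
```

For example, `show_manage_etc_hosts /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.d/*.cfg` lists the active settings. If a line such as `manage_etc_hosts: true` appears, entries added by hand to /etc/hosts may be overwritten on reboot.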

cuisongliu avatar Oct 25 '22 14:10 cuisongliu

@cuisongliu Yes, but we still need to fix it.

fanux avatar Oct 26 '22 02:10 fanux

https://cloud.tencent.com/document/product/213/34698

cuisongliu avatar Nov 07 '22 14:11 cuisongliu