Nodes become NotReady after vagrant reload
Hi, thanks for your work. I'm using the latest version of this repo, and the cluster works after vagrant up. After vagrant reload, however, node-01 and node-02 become NotReady. I found this in the log of the kubelet container on node-02:
E0510 11:47:24.151857 1236 event.go:209] Unable to write event: 'Post https://__MASTER_IP__:443/api/v1/namespaces/default/events: dial tcp: lookup __MASTER_IP__ on 10.0.2.3:53: server misbehaving' (may retry after sleeping)
E0510 11:47:24.363225 1236 reflector.go:205] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Service: Get https://__MASTER_IP__:443/api/v1/services?limit=500&resourceVersion=0: dial tcp: lookup __MASTER_IP__ on 10.0.2.3:53: server misbehaving
It seems that the __MASTER_IP__ placeholder is not being replaced with the real value.
@liubin I just tried with the following instructions and everything seems OK:
$ NODES=2 vagrant halt
$ NODES=2 vagrant up
This is equivalent to NODES=2 vagrant reload. Can you please provide the exact instructions you followed since creating the cluster?
I only ran vagrant reload, or vagrant halt followed by vagrant up, a few times.
After watching it for a while, I think the problem may be that the kubelet container starts before the __MASTER_IP__ placeholder is replaced.
I can see that the file /etc/kubernetes/node-kubeconfig.yaml has the correct master IP, but the kubelet log shows that it is still using __MASTER_IP__. After restarting the kubelet with docker restart kubelet, the node becomes Ready.
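If the race described above is the cause, one possible workaround is to wait until the placeholder is gone from the kubeconfig before restarting the kubelet. The following is only a minimal sketch under that assumption, not something taken from this repo: the helper names kubeconfig_ready and wait_for_kubeconfig and the 60-second default timeout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds once the given kubeconfig exists and no
# longer contains the literal __MASTER_IP__ placeholder.
kubeconfig_ready() {
    [ -f "$1" ] && ! grep -q '__MASTER_IP__' "$1"
}

# Hypothetical helper: polls once per second until the kubeconfig is ready.
# Returns 1 if the timeout (in seconds, default 60) expires first.
wait_for_kubeconfig() {
    file=$1
    timeout=${2:-60}
    elapsed=0
    until kubeconfig_ready "$file"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

On a node it could be used like this: wait_for_kubeconfig /etc/kubernetes/node-kubeconfig.yaml 60 && docker restart kubelet. This only works around the ordering problem; a proper fix would make sure the kubelet is not started until the replacement has finished.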