kubernetes-the-hard-way-aws
PLEG is not healthy: pleg was last seen active 42m53.338575586s ago
I created the cluster using the commands from the master branch. Two nodes started reporting this error in the kubelet logs, even though the kubelet service status is active.
This happens when I deploy an untrusted pod on the cluster.
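For context, the untrusted pod comes from the tutorial's smoke test; it is declared roughly like this (the annotation routes the pod to the gVisor/runsc runtime; my pod was named untrusted3, and the manifest below is the tutorial's shape, not a verbatim copy of mine):

```shell
# Untrusted-workload pod from the kubernetes-the-hard-way smoke test.
# The annotation tells containerd to run this pod under runsc (gVisor).
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: untrusted
  annotations:
    io.kubernetes.cri.untrusted-workload: "true"
spec:
  containers:
    - name: webserver
      image: gcr.io/hightowerlabs/helloworld:2.0.0
EOF
```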
Log showing the untrusted pod starting, followed by the PLEG error:
Jan 31 15:43:35 ip-10-240-0-20 kubelet[3072]: I0131 15:43:35.158465 3072 provider.go:116] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
Jan 31 15:43:35 ip-10-240-0-20 kubelet[3072]: I0131 15:43:35.414429 3072 kubelet.go:1953] SyncLoop (PLEG): "untrusted3_default(70a6b852-4440-11ea-9a3e-0624b25d9900)", event: &pleg.PodLifecycleEvent{ID:"70a6b852-4440-11ea-9a3e-0624b25d9900", Type:"ContainerStarted", Data:"8dcbaa8a513e8cd85b2c26e14b09f8204383b77cb4b603308fcd98af5dc0a76d"}
Jan 31 15:46:40 ip-10-240-0-20 kubelet[3072]: I0131 15:46:40.357340 3072 kubelet_node_status.go:446] Recording NodeNotReady event message for node ip-10-240-0-20
Jan 31 15:46:40 ip-10-240-0-20 kubelet[3072]: I0131 15:46:40.357386 3072 setters.go:518] Node became not ready: {Type:Ready Status:False LastHeartbeatTime:2020-01-31 15:46:40.357320576 +0000 UTC m=+6391.653089090 LastTransitionTime:2020-01-31 15:46:40.357320576 +0000 UTC m=+6391.653089090 Reason:KubeletNotReady Message:PLEG is not healthy: pleg was last seen active 3m4.945529133s ago; threshold is 3m0s}
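The NotReady transition above follows directly from kubelet's PLEG health check: the relist loop last ran 3m4.9s ago, which exceeds the 3m0s threshold. A minimal sketch of that threshold logic (numbers taken from the log above; the variable names are mine, not kubelet's):

```shell
# Simplified sketch of the PLEG health check (assumption: kubelet marks the
# node NotReady once the PLEG relist has not run within the 3m0s threshold).
threshold_secs=180    # "threshold is 3m0s" from the log
last_relist_ago=185   # "pleg was last seen active 3m4.945529133s ago", rounded

if [ "$last_relist_ago" -gt "$threshold_secs" ]; then
  pleg_status="PLEG is not healthy"
else
  pleg_status="PLEG is healthy"
fi
echo "$pleg_status"
```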
k get nodes
NAME             STATUS     ROLES    AGE   VERSION
ip-10-240-0-20   Ready      <none>   95m   v1.13.4
ip-10-240-0-21   NotReady   <none>   95m   v1.13.4
ip-10-240-0-22   NotReady   <none>   95m   v1.13.4
I restarted the worker services, and kubelet started failing.
sudo systemctl restart containerd kubelet kube-proxy
Kubelet service status (after restart):
root@ip-10-240-0-21:/home/ubuntu# service kubelet status
● kubelet.service - Kubernetes Kubelet
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Fri 2020-01-31 15:38:51 UTC; 236ms ag
Docs: https://github.com/kubernetes/kubernetes
Process: 8935 ExecStart=/usr/local/bin/kubelet --config=/var/lib/kubelet/kubelet-config.yaml --con
Main PID: 8935 (code=exited, status=255)
Jan 31 15:38:51 ip-10-240-0-21 systemd[1]: kubelet.service: Unit entered failed state.
Jan 31 15:38:51 ip-10-240-0-21 systemd[1]: kubelet.service: Failed with result 'exit-code'.
containerd status: active
root@ip-10-240-0-21:/home/ubuntu# service containerd status
● containerd.service - containerd container runtime
Loaded: loaded (/etc/systemd/system/containerd.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-01-31 15:37:59 UTC; 1min 24s ago
kube-proxy status: active
root@ip-10-240-0-21:/home/ubuntu# service kube-proxy status
● kube-proxy.service - Kubernetes Kube Proxy
Loaded: loaded (/etc/systemd/system/kube-proxy.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2020-01-31 15:37:59 UTC; 1min 46s ago