coreos-kubernetes
kube-dns and dashboard get into a crash loop
I've deployed the DNS and Dashboard add-ons according to Step 5: Deploy Add-ons, but both pods get into a crash loop. My installation follows the Manual Installation guide, except that I added "--storage-backend=etcd2" and "--storage-media-type=application/json" to kube-apiserver.yaml because the apiserver pod was periodically restarting.
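For reference, this is roughly where those two flags ended up in the apiserver manifest, a sketch assuming the hyperkube-based static pod manifest from the Manual Installation guide (all other flags omitted):

# /etc/kubernetes/manifests/kube-apiserver.yaml (excerpt)
spec:
  containers:
  - name: kube-apiserver
    command:
    - /hyperkube
    - apiserver
    # ...flags from the guide, unchanged...
    - --storage-backend=etcd2                # keep using the etcd2 storage API
    - --storage-media-type=application/json  # serialize objects as JSON instead of protobuf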
Details below:
Hardware configuration: KVM virtual machines on a bare-metal host
master01 ~ # cat /etc/*release*
DISTRIB_ID="Container Linux by CoreOS"
DISTRIB_RELEASE=1353.7.0
DISTRIB_CODENAME="Ladybug"
DISTRIB_DESCRIPTION="Container Linux by CoreOS 1353.7.0 (Ladybug)"
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1353.7.0
VERSION_ID=1353.7.0
BUILD_ID=2017-04-26-2154
PRETTY_NAME="Container Linux by CoreOS 1353.7.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
master01 ~ # kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:44:38Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1+coreos.0", GitCommit:"9212f77ed8c169a0afa02e58dce87913c6387b3e", GitTreeState:"clean", BuildDate:"2017-04-04T00:32:53Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
master01 ~ # kubectl get pods --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE
kube-apiserver-172.30.1.100 1/1 Running 0 3m 172.30.1.100 172.30.1.100
kube-controller-manager-172.30.1.100 1/1 Running 7 3h 172.30.1.100 172.30.1.100
kube-dns-v20-gg303 2/3 CrashLoopBackOff 56 1h 10.2.68.2 172.30.1.101
kube-proxy-172.30.1.100 1/1 Running 3 3h 172.30.1.100 172.30.1.100
kube-proxy-172.30.1.101 1/1 Running 2 3h 172.30.1.101 172.30.1.101
kube-proxy-172.30.1.102 1/1 Running 2 3h 172.30.1.102 172.30.1.102
kube-proxy-172.30.1.103 1/1 Running 2 3h 172.30.1.103 172.30.1.103
kube-scheduler-172.30.1.100 1/1 Running 5 3h 172.30.1.100 172.30.1.100
kubernetes-dashboard-v1.6.0-p98p6 0/1 CrashLoopBackOff 24 1h 10.2.2.2 172.30.1.102
kube-dns pod description and logs:
master01 ~ # kubectl describe pods kube-dns-v20-gg303 --namespace=kube-system
Name: kube-dns-v20-gg303
Namespace: kube-system
Node: 172.30.1.101/172.30.1.101
Start Time: Thu, 18 May 2017 20:13:40 +0900
Labels: k8s-app=kube-dns
version=v20
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"kube-system","name":"kube-dns-v20","uid":"0cb39011-3bbb-11e7-87e9-5254...
scheduler.alpha.kubernetes.io/critical-pod=
scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status: Running
IP: 10.2.68.2
Controllers: ReplicationController/kube-dns-v20
Containers:
kubedns:
Container ID: docker://85b373f7ae5a106d995f47a2c8c4af6ff1d531f2033a193043e6e380a904a10c
Image: gcr.io/google_containers/kubedns-amd64:1.8
Image ID: docker-pullable://gcr.io/google_containers/kubedns-amd64@sha256:39264fd3c998798acdf4fe91c556a6b44f281b6c5797f464f92c3b561c8c808c
Ports: 10053/UDP, 10053/TCP
Args:
--domain=cluster.local.
--dns-port=10053
State: Running
Started: Thu, 18 May 2017 22:04:46 +0900
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Thu, 18 May 2017 22:04:46 +0900
Ready: False
Restart Count: 30
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-722hj (ro)
dnsmasq:
Container ID: docker://f298d29fb13e2c398c5736478f237c55b4117681d2fa7111f1aba86eb867a247
Image: gcr.io/google_containers/kube-dnsmasq-amd64:1.4
Image ID: docker-pullable://gcr.io/google_containers/kube-dnsmasq-amd64@sha256:a722df15c0cf87779aad8ba2468cf072dd208cb5d7cfcaedd90e66b3da9ea9d2
Ports: 53/UDP, 53/TCP
Args:
--cache-size=1000
--no-resolv
--server=127.0.0.1#10053
--log-facility=-
State: Running
Started: Thu, 18 May 2017 22:05:26 +0900
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Thu, 18 May 2017 22:05:26 +0900
Ready: True
Restart Count: 30
Liveness: http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-722hj (ro)
healthz:
Container ID: docker://47b39617e5fbaa7ab3a01ecdb4d2fa666948dcbbff1a6c71854dc66fae603d2a
Image: gcr.io/google_containers/exechealthz-amd64:1.2
Image ID: docker-pullable://gcr.io/google_containers/exechealthz-amd64@sha256:503e158c3f65ed7399f54010571c7c977ade7fe59010695f48d9650d83488c0a
Port: 8080/TCP
Args:
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
--url=/healthz-dnsmasq
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
--url=/healthz-kubedns
--port=8080
--quiet
State: Running
Started: Thu, 18 May 2017 21:57:28 +0900
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Thu, 18 May 2017 21:56:43 +0900
Ready: True
Restart Count: 2
Limits:
memory: 50Mi
Requests:
cpu: 10m
memory: 50Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-722hj (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-722hj:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-722hj
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
58m 58m 1 kubelet, 172.30.1.101 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
58m 58m 1 kubelet, 172.30.1.101 spec.containers{healthz} Normal Pulled Container image "gcr.io/google_containers/exechealthz-amd64:1.2" already present on machine
...
51m 51m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Killing Killing container with id docker://26d92966bcc447662734359705db6f226fa5140b4e0937434c1113ab5883ac2d:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "dnsmasq" is unhealthy, it will be killed and re-created.
50m 50m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Started Started container with id cd42a0e9a0e12547d61c661eb91b6e5d4afd22b10c00a86ad099295f8f71161f
50m 50m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Created Created container with id cd42a0e9a0e12547d61c661eb91b6e5d4afd22b10c00a86ad099295f8f71161f
50m 50m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Killing Killing container with id docker://d6010d6aba3b1c2fb918a6f35fdd2631ad77f968b1f2f560e04b0375ab82fe5a:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "kubedns" is unhealthy, it will be killed and re-created.
50m 50m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Created Created container with id 7e8f43084708ce0ddf8dbb7ef0b4216fe186df8b90bb9048776c9435eebbe00c
50m 50m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Started Started container with id 7e8f43084708ce0ddf8dbb7ef0b4216fe186df8b90bb9048776c9435eebbe00c
49m 49m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Killing Killing container with id docker://cd42a0e9a0e12547d61c661eb91b6e5d4afd22b10c00a86ad099295f8f71161f:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "dnsmasq" is unhealthy, it will be killed and re-created.
48m 48m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Killing Killing container with id docker://7e8f43084708ce0ddf8dbb7ef0b4216fe186df8b90bb9048776c9435eebbe00c:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "kubedns" is unhealthy, it will be killed and re-created.
47m 47m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Killing Killing container with id docker://61be0b1575a99f2140b1baf98137af50ac49d2cd02b6d0221f594870301a8423:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "dnsmasq" is unhealthy, it will be killed and re-created.
46m 45m 8 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"
44m 44m 1 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: [failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)", failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"]
44m 44m 6 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: [failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)", failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"]
41m 16m 5 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: [failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)", failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"]
42m 15m 40 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: [failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)", failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"]
58m 13m 14 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Pulled Container image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4" already present on machine
39m 12m 56 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubedns" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubedns pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"
58m 9m 14 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Pulled Container image "gcr.io/google_containers/kubedns-amd64:1.8" already present on machine
46m 9m 18 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Killing (events with common reason combined)
48m 9m 20 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Created (events with common reason combined)
48m 9m 20 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Started (events with common reason combined)
43m 9m 59 kubelet, 172.30.1.101 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=dnsmasq pod=kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)"
46m 9m 227 kubelet, 172.30.1.101 spec.containers{kubedns} Warning BackOff Back-off restarting failed container
8m 8m 1 kubelet, 172.30.1.101 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
...
2m 2m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Created Created container with id 1f74017422e1585ff27295056382736fc005befcb17886136e24d711e7587427
2m 2m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Started Started container with id 7f3b8ed565c7125e8e53c827b6eb5922e2bee30fc3f220505e1a322ec3d8b213
2m 2m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Killing Killing container with id docker://c11131b03e29341502a783713f103026a27ad3bcab39932906ba2d477593883c:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "dnsmasq" is unhealthy, it will be killed and re-created.
2m 2m 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Created Created container with id 7f3b8ed565c7125e8e53c827b6eb5922e2bee30fc3f220505e1a322ec3d8b213
8m 1m 5 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Pulled Container image "gcr.io/google_containers/kubedns-amd64:1.8" already present on machine
1m 1m 1 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Killing Killing container with id docker://1f74017422e1585ff27295056382736fc005befcb17886136e24d711e7587427:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "kubedns" is unhealthy, it will be killed and re-created.
32s 32s 1 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Killing Killing container with id docker://7f3b8ed565c7125e8e53c827b6eb5922e2bee30fc3f220505e1a322ec3d8b213:pod "kube-dns-v20-gg303_kube-system(0cb52788-3bbb-11e7-87e9-5254000e34f9)" container "dnsmasq" is unhealthy, it will be killed and re-created.
1m 32s 2 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Created (events with common reason combined)
1m 32s 2 kubelet, 172.30.1.101 spec.containers{kubedns} Normal Started (events with common reason combined)
8m 32s 5 kubelet, 172.30.1.101 spec.containers{dnsmasq} Normal Pulled Container image "gcr.io/google_containers/kube-dnsmasq-amd64:1.4" already present on machine
master01 ~ # kubectl logs kube-dns-v20-gg303 -c kubedns --namespace=kube-system
I0518 13:09:09.956186 1 server.go:94] Using https://10.3.0.1:443 for kubernetes master, kubernetes API: <nil>
I0518 13:09:09.958748 1 server.go:99] v1.5.0-alpha.0.1651+7dcae5edd84f06-dirty
I0518 13:09:09.958772 1 server.go:101] FLAG: --alsologtostderr="false"
I0518 13:09:09.958793 1 server.go:101] FLAG: --dns-port="10053"
I0518 13:09:09.958836 1 server.go:101] FLAG: --domain="cluster.local."
I0518 13:09:09.958857 1 server.go:101] FLAG: --federations=""
I0518 13:09:09.958867 1 server.go:101] FLAG: --healthz-port="8081"
I0518 13:09:09.958885 1 server.go:101] FLAG: --kube-master-url=""
I0518 13:09:09.958895 1 server.go:101] FLAG: --kubecfg-file=""
I0518 13:09:09.958905 1 server.go:101] FLAG: --log-backtrace-at=":0"
I0518 13:09:09.958916 1 server.go:101] FLAG: --log-dir=""
I0518 13:09:09.958926 1 server.go:101] FLAG: --log-flush-frequency="5s"
I0518 13:09:09.959007 1 server.go:101] FLAG: --logtostderr="true"
I0518 13:09:09.959020 1 server.go:101] FLAG: --stderrthreshold="2"
I0518 13:09:09.959030 1 server.go:101] FLAG: --v="0"
I0518 13:09:09.959039 1 server.go:101] FLAG: --version="false"
I0518 13:09:09.959049 1 server.go:101] FLAG: --vmodule=""
I0518 13:09:09.959090 1 server.go:138] Starting SkyDNS server. Listening on port:10053
I0518 13:09:09.959184 1 server.go:145] skydns: metrics enabled on : /metrics:
I0518 13:09:09.959208 1 dns.go:166] Waiting for service: default/kubernetes
I0518 13:09:09.959534 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0518 13:09:09.959554 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0518 13:09:39.959638 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.3.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.3.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0518 13:09:39.960119 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.3.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
E0518 13:09:39.960339 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0518 13:10:10.960181 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.3.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.3.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0518 13:10:10.960470 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.3.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
E0518 13:10:10.960532 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0518 13:10:16.832928 1 server.go:133] Received signal: terminated, will exit when the grace period ends
E0518 13:10:41.960854 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0518 13:10:41.960865 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.3.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.3.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0518 13:10:41.960938 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.3.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
kubernetes-dashboard pod description and logs:
master01 ~ # kubectl describe pods kubernetes-dashboard-v1.6.0-p98p6 --namespace=kube-system
Name: kubernetes-dashboard-v1.6.0-p98p6
Namespace: kube-system
Node: 172.30.1.102/172.30.1.102
Start Time: Thu, 18 May 2017 20:30:44 +0900
Labels: k8s-app=kubernetes-dashboard
kubernetes.io/cluster-service=true
version=v1.6.0
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"kube-system","name":"kubernetes-dashboard-v1.6.0","uid":"6e6bd6a7-3bbd...
scheduler.alpha.kubernetes.io/critical-pod=
scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status: Running
IP: 10.2.2.2
Controllers: ReplicationController/kubernetes-dashboard-v1.6.0
Containers:
kubernetes-dashboard:
Container ID: docker://ccda3aa5626ed514c2676481f55dff27f95395cc758d04764c219dc292487054
Image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
Image ID: docker-pullable://gcr.io/google_containers/kubernetes-dashboard-amd64@sha256:4ad64dfa7159ff4a99a65a4f96432f2fdb6542857cf230858b3159017833a882
Port: 9090/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 01 Jan 0001 00:00:00 +0000
Finished: Thu, 18 May 2017 22:12:12 +0900
Ready: False
Restart Count: 27
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 100m
memory: 50Mi
Liveness: http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-722hj (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-722hj:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-722hj
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
59m 57m 13 kubelet, 172.30.1.102 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-v1.6.0-p98p6_kube-system(6e6c7bae-3bbd-11e7-87e9-5254000e34f9)"
57m 57m 1 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Created Created container with id d370ec3d54856b2ddc9c9eae663cd3250f875bab40b44568f6c4ebfb525c4036
...
40m 40m 1 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Started Started container with id d370ec3d54856b2ddc9c9eae663cd3250f875bab40b44568f6c4ebfb525c4036
1h 17m 13 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Pulled Container image "gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0" already present on machine
34m 17m 4 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Created (events with common reason combined)
34m 17m 4 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Started (events with common reason combined)
56m 17m 165 kubelet, 172.30.1.102 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-v1.6.0-p98p6_kube-system(6e6c7bae-3bbd-11e7-87e9-5254000e34f9)"
1h 17m 194 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Warning BackOff Back-off restarting failed container
15m 15m 1 kubelet, 172.30.1.102 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
15m 15m 1 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Started Started container with id 8e8b5d57b22f1e5b77c90169d56de6168ffbe9bd2fec386b80f8ed46cca9402b
15m 15m 1 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Normal Created Created container with id 8e8b5d57b22f1e5b77c90169d56de6168ffbe9bd2fec386b80f8ed46cca9402b
...
6m 2s 30 kubelet, 172.30.1.102 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-v1.6.0-p98p6_kube-system(6e6c7bae-3bbd-11e7-87e9-5254000e34f9)"
15m 2s 58 kubelet, 172.30.1.102 spec.containers{kubernetes-dashboard} Warning BackOff Back-off restarting failed container
master01 ~ # kubectl logs kubernetes-dashboard-v1.6.0-p98p6 -c kubernetes-dashboard --namespace=kube-system
Using HTTP port: 9090
Creating API server client for https://10.3.0.1:443
Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or service accounts configuration) or the --apiserver-host param points to a server that does not exist. Reason: Get https://10.3.0.1:443/version: dial tcp 10.3.0.1:443: i/o timeout
Refer to the troubleshooting guide for more information: https://github.com/kubernetes/dashboard/blob/master/docs/user-guide/troubleshooting.md
etc.
master01 ~ # systemctl status kubelet
kubelet.service
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2017-05-18 21:56:57 KST; 47min ago
Process: 799 ExecStartPre=/usr/bin/rkt rm --uuid-file=/var/run/kubelet-pod.uuid (code=exited, status=254)
Process: 796 ExecStartPre=/usr/bin/mkdir -p /var/log/containers (code=exited, status=0/SUCCESS)
Process: 730 ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests (code=exited, status=0/SUCCESS)
Main PID: 832 (kubelet)
Tasks: 18 (limit: 32768)
Memory: 794.0M
CPU: 54.591s
CGroup: /system.slice/kubelet.service
832 /kubelet --api-servers=http://127.0.0.1:8080 --register-schedulable=false --cni-conf-dir=/etc/kubernetes/cni/net.d --network-plugin=cni --container-runtime=docker --allow-privileged=true --pod-manifest-path=/etc/kubernetes/manifests --hostname-override=172.30.1.100 --cluster_dns=10.3.0.10 --cluster_domain=cluster.local
1383 journalctl -k -f
May 18 21:57:42 master01 kubelet-wrapper[832]: W0518 12:57:42.229140 832 conversion.go:110] Could not get instant cpu stats: different number of cpus
May 18 22:02:11 master01 kubelet-wrapper[832]: E0518 13:02:11.753371 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:07:11 master01 kubelet-wrapper[832]: E0518 13:07:11.774380 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:12:11 master01 kubelet-wrapper[832]: E0518 13:12:11.788341 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:17:11 master01 kubelet-wrapper[832]: E0518 13:17:11.801143 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:22:11 master01 kubelet-wrapper[832]: E0518 13:22:11.813504 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:27:11 master01 kubelet-wrapper[832]: E0518 13:27:11.825092 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:32:11 master01 kubelet-wrapper[832]: E0518 13:32:11.836884 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:37:11 master01 kubelet-wrapper[832]: E0518 13:37:11.848623 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
May 18 22:42:11 master01 kubelet-wrapper[832]: E0518 13:42:11.869606 832 container_manager_linux.go:638] error opening pid file /run/docker/libcontainerd/docker-containerd.pid: open /run/docker/libcontainerd/docker-containerd.pid: no such file or directory
master01 ~ # cat /var/log/containers/kube-apiserver-172.30.1.100_kube-system_kube-apiserver-b156099b99dd3542897773d74f1d5e425c5abe5c5e62d16adb42a635c77eda49.log
{"log":"E0518 12:57:19.215213 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.ResourceQuota: Get https://localhost:443/api/v1/resourcequotas?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.215844426Z"}
{"log":"E0518 12:57:19.215544 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.Secret: Get https://localhost:443/api/v1/secrets?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.215893429Z"}
{"log":"E0518 12:57:19.215767 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.Namespace: Get https://localhost:443/api/v1/namespaces?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.215900977Z"}
{"log":"E0518 12:57:19.216006 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.LimitRange: Get https://localhost:443/api/v1/limitranges?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.216215568Z"}
{"log":"E0518 12:57:19.216226 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *api.ServiceAccount: Get https://localhost:443/api/v1/serviceaccounts?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.216377675Z"}
{"log":"E0518 12:57:19.216427 1 reflector.go:201] k8s.io/kubernetes/pkg/client/informers/informers_generated/internalversion/factory.go:70: Failed to list *storage.StorageClass: Get https://localhost:443/apis/storage.k8s.io/v1beta1/storageclasses?resourceVersion=0: dial tcp [::1]:443: getsockopt: connection refused\n","stream":"stderr","time":"2017-05-18T12:57:19.216627425Z"}
{"log":"[restful] 2017/05/18 12:57:19 log.go:30: [restful/swagger] listing is available at https://172.30.1.100:443/swaggerapi/\n","stream":"stderr","time":"2017-05-18T12:57:19.242919533Z"}
{"log":"[restful] 2017/05/18 12:57:19 log.go:30: [restful/swagger] https://172.30.1.100:443/swaggerui/ is mapped to folder /swagger-ui/\n","stream":"stderr","time":"2017-05-18T12:57:19.242957445Z"}
...
master01 ~ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 52:54:00:0e:34:f9 brd ff:ff:ff:ff:ff:ff
inet 172.30.1.100/24 brd 172.30.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe0e:34f9/64 scope link
valid_lft forever preferred_lft forever
3: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether e6:9d:ba:c3:93:a6 brd ff:ff:ff:ff:ff:ff
inet 10.2.4.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::e49d:baff:fec3:93a6/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:4e:38:b2:65 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 scope global docker0
valid_lft forever preferred_lft forever
Not sure if this helps, but we were also running into kubedns pod crash loops after upgrading to Kubernetes 1.6, and we were able to work around it by switching to Calico. See the last ~10 or so commits in this fork, which show everything we did: https://github.com/rook/coreos-kubernetes/commits/master
@jbw976 I followed the instructions at https://github.com/rook/coreos-kubernetes/blob/master/Documentation/getting-started.md and used Calico. But when I started the kubelet, the COMMAND attribute (in docker ps) of all the containers (proxy, apiserver, controller-manager, and scheduler) was "/pause".
@jbw976 I finally deployed a master node using Calico. But the "Set Up the CNI config (optional)" link is dead in https://github.com/rook/coreos-kubernetes/blob/master/Documentation/deploy-workers.md. Do you know where the "Set Up the CNI config (optional)" guide is? I would like to finish the installation without any auto-configuration tools.
Same problem here. I would prefer to get flannel working before trying Calico.
I have been attempting to solve this for a while now.
When I describe the kube-dns service I get:
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl describe service kube-dns --namespace=kube-system
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: <none>
Selector: k8s-app=kube-dns
Type: ClusterIP
IP: 10.3.0.10
Port: dns 53/UDP
Endpoints:
Port: dns-tcp 53/TCP
Endpoints:
Session Affinity: None
Events: <none>
I noticed in the output above that there are no endpoints, whereas describing the kubernetes service does show endpoints.
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl describe svc kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.3.0.1
Port: https 443/TCP
Endpoints: xx.xx.xx.xx:443
Session Affinity: ClientIP
Events: <none>
In the logs for kubedns, dnsmasq, and healthz, I noticed that kubedns has trouble connecting to 10.3.0.1, and healthz reports "nslookup: can't resolve".
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns
I0531 19:15:18.538655 1 server.go:94] Using https://10.3.0.1:443 for kubernetes master, kubernetes API: <nil>
I0531 19:15:18.539824 1 server.go:99] v1.5.0-alpha.0.1651+7dcae5edd84f06-dirty
I0531 19:15:18.539894 1 server.go:101] FLAG: --alsologtostderr="false"
I0531 19:15:18.539926 1 server.go:101] FLAG: --dns-port="10053"
I0531 19:15:18.540001 1 server.go:101] FLAG: --domain="cluster.local."
I0531 19:15:18.540029 1 server.go:101] FLAG: --federations=""
I0531 19:15:18.540051 1 server.go:101] FLAG: --healthz-port="8081"
I0531 19:15:18.540088 1 server.go:101] FLAG: --kube-master-url=""
I0531 19:15:18.540110 1 server.go:101] FLAG: --kubecfg-file=""
I0531 19:15:18.540128 1 server.go:101] FLAG: --log-backtrace-at=":0"
I0531 19:15:18.540165 1 server.go:101] FLAG: --log-dir=""
I0531 19:15:18.540189 1 server.go:101] FLAG: --log-flush-frequency="5s"
I0531 19:15:18.540210 1 server.go:101] FLAG: --logtostderr="true"
I0531 19:15:18.540244 1 server.go:101] FLAG: --stderrthreshold="2"
I0531 19:15:18.540265 1 server.go:101] FLAG: --v="0"
I0531 19:15:18.540296 1 server.go:101] FLAG: --version="false"
I0531 19:15:18.540338 1 server.go:101] FLAG: --vmodule=""
I0531 19:15:18.540415 1 server.go:138] Starting SkyDNS server. Listening on port:10053
I0531 19:15:18.540533 1 server.go:145] skydns: metrics enabled on : /metrics:
I0531 19:15:18.540598 1 dns.go:166] Waiting for service: default/kubernetes
I0531 19:15:18.541278 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0531 19:15:18.541340 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0531 19:15:48.542106 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.3.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.3.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0531 19:15:48.544209 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.3.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
E0531 19:15:48.544580 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
I0531 19:16:19.543942 1 dns.go:172] Ignoring error while waiting for service default/kubernetes: Get https://10.3.0.1:443/api/v1/namespaces/default/services/kubernetes: dial tcp 10.3.0.1:443: i/o timeout. Sleeping 1s before retrying.
E0531 19:16:19.546421 1 reflector.go:214] pkg/dns/dns.go:155: Failed to list *api.Service: Get https://10.3.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
E0531 19:16:19.546569 1 reflector.go:214] pkg/dns/dns.go:154: Failed to list *api.Endpoints: Get https://10.3.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.3.0.1:443: i/o timeout
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
dnsmasq[1]: started, version 2.76 cachesize 1000
dnsmasq[1]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
dnsmasq[1]: using nameserver 127.0.0.1#10053
dnsmasq[1]: read /etc/hosts - 7 addresses
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz
2017/05/31 19:09:17 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:15.817652465 +0000 UTC, error exit status 1
2017/05/31 19:09:17 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:15.816970338 +0000 UTC, error exit status 1
2017/05/31 19:09:27 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:25.813204033 +0000 UTC, error exit status 1
2017/05/31 19:09:27 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:25.812644469 +0000 UTC, error exit status 1
2017/05/31 19:09:37 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:35.815158454 +0000 UTC, error exit status 1
2017/05/31 19:09:37 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:35.814596885 +0000 UTC, error exit status 1
2017/05/31 19:09:47 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:45.811774257 +0000 UTC, error exit status 1
2017/05/31 19:09:47 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:45.812333 +0000 UTC, error exit status 1
2017/05/31 19:09:57 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:55.814050664 +0000 UTC, error exit status 1
2017/05/31 19:09:57 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:09:55.814489628 +0000 UTC, error exit status 1
2017/05/31 19:12:07 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:12:05.811958358 +0000 UTC, error exit status 1
2017/05/31 19:12:07 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:12:05.812245808 +0000 UTC, error exit status 1
2017/05/31 19:14:17 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:14:15.808253036 +0000 UTC, error exit status 1
2017/05/31 19:14:17 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:14:15.809691998 +0000 UTC, error exit status 1
2017/05/31 19:16:27 Healthz probe on /healthz-dnsmasq error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:16:25.813103916 +0000 UTC, error exit status 1
2017/05/31 19:16:27 Healthz probe on /healthz-kubedns error: Result of last exec: nslookup: can't resolve 'kubernetes.default.svc.cluster.local'
, at 2017-05-31 19:16:25.813576285 +0000 UTC, error exit status 1
Following the kube-dns troubleshooting guide, I used busybox for a simple nslookup and got:
Williams-MacBook-Pro:KubeControl demonfuse$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server: 10.3.0.10
Address 1: 10.3.0.10
nslookup: can't resolve 'kubernetes.default'
Any ideas? I followed the instructions in the Kubernetes guide to the letter, and I'm using flannel without Calico. How would I approach resolving kubernetes.default?
Also, the dashboard (from the Add-ons page of the official guide) seems to have the same problem.
Still no luck.
@Ascendance I think there are a lot of missing parts in the instructions. I recommend you use the Vagrantfile and scripts from https://coreos.com/kubernetes/docs/latest/kubernetes-on-vagrant.html
I am facing the same issue. I am not using Calico, just flanneld. Did you find a solution? @jazzsir @Ascendance
We just had the same issue with kubedns and the dashboard crash looping as well, but we are using Weave. No luck resolving it yet.
@hsteckylf Check the kube-proxy logs. I found some errors there, fixed them, and the issue is gone now.
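For example, on the cluster above the kube-proxy pods are static pods named after the node IPs (see the pod listing at the top), so the logs for one node could be pulled with something like this (illustrative only; pod names will differ per cluster):

kubectl logs kube-proxy-172.30.1.101 --namespace=kube-system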
Thanks! In our case, it turned out to be the same issue as https://github.com/weaveworks/weave/issues/1875, with all of the Weave IPAM IP space allocated to unreachable (old) pods. After looping through weave rmpeer and recovering those IPs, all of the connections and pods were restored.
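For anyone hitting the same Weave IPAM exhaustion, the cleanup described above would look roughly like this (a sketch only; peer names are placeholders, and weaveworks/weave#1875 has the authoritative discussion):

# show IPAM allocations per peer; look for peers of nodes/pods that no longer exist
weave status ipam
# reclaim the address space still held by each dead peer (repeat for every dead peer)
weave rmpeer <dead-peer-name>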
Just change the port from 6443 to 443: on the master, edit /etc/kubernetes/manifests/kube-apiserver.yaml (e.g. vi /etc/kubernetes/manifests/kube-apiserver.yaml), change the liveness probe as shown below, and then restart the kubelet.
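Reconstructed from that comment, the probe stanza looks like this (values exactly as given above; the stock coreos-kubernetes manifest may use a different port, so match it to whatever port your apiserver actually listens on):

livenessProbe:
  failureThreshold: 8
  httpGet:
    host: 127.0.0.1
    path: /healthz
    port: 443    # was 6443
    scheme: HTTPS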
@mfaizanse could you please share more details on how you resolved the issue?
I have the same issue; I think these logs give some hints:
$ kubectl logs kube-dns-86f4d74b45-gb4t7 -n kube-system -c kubedns
reflector.go:201] k8s.io/dns/pkg/dns/dns.go:150: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
reflector.go:201] k8s.io/dns/pkg/dns/dns.go:147: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
Waiting for services and endpoints to be initialized from apiserver...
dns.go:167] Timeout waiting for initialization
$ kubectl get endpoints kubernetes hostnames kube-dns
NAME ENDPOINTS AGE
kubernetes 192.168.56.101:6443 3h
hostnames 10.10.1.3:9376,10.10.2.3:9376,10.10.2.4:9376 46m
Error from server (NotFound): endpoints "kube-dns" not found
I have tried searching and reading more articles, but I still don't have a clear set of steps for troubleshooting and resolving this problem.