routes not being set up for daemonset pods on newly added nodes until the pod is restarted manually
When a new node is added to the cluster (using kubeadm join), the daemonset pods start automatically. Sometimes one of these pods doesn't get its pod network set up properly. Simply deleting the pod and letting it get recreated fixes the issue. The issue occurs randomly and is not consistently reproducible.
# kubectl get pod --all-namespaces -owide | grep 10.102.162.224
healthbot configmanager-f82n2 0/1 Init:ImagePullBackOff 0 2d22h 10.244.85.4 10.102.162.224 <none> <none>
healthbot speaker-dgqbl 0/1 ImagePullBackOff 0 2d22h 10.102.162.224 10.102.162.224 <none> <none>
healthbot udf-farm-7zh8b 0/1 Init:ImagePullBackOff 0 2d22h 10.244.85.3 10.102.162.224 <none> <none>
kube-system calico-node-ssd6p 1/1 Running 0 2d22h 10.102.162.224 10.102.162.224 <none> <none>
kube-system calicoctl 1/1 Running 0 34m 10.102.162.224 10.102.162.224 <none> <none>
kube-system docker-registry-docker-registry-proxy-h4n5q 1/1 Running 0 2d22h 10.244.85.1 10.102.162.224 <none> <none>
kube-system kube-proxy-f9nc2 1/1 Running 0 2d22h 10.102.162.224 10.102.162.224 <none> <none>
This node, 10.102.162.224, is a newly joined node. Observe the three pods with the IPs 10.244.85.1, 10.244.85.3 and 10.244.85.4. The pod with the networking problem is the one with IP 10.244.85.1.
# calicoctl get workloadendpoints -n kube-system
NAMESPACE WORKLOAD NODE NETWORKS INTERFACE
kube-system calico-kube-controllers-5b55f5fcc5-nvfd9 10.102.161.189 10.244.202.197/32 cali0fee7c57f46
kube-system coredns-6955765f44-6rfng 10.102.161.189 10.244.202.194/32 cali2a348b98402
kube-system coredns-6955765f44-xw47r 10.102.161.189 10.244.202.196/32 cali2456c980bec
kube-system docker-registry-docker-registry-86bd4bb577-8bbrd 10.102.161.189 10.244.202.198/32 cali3423c615441
kube-system docker-registry-docker-registry-proxy-7sj4d 10.102.162.125 10.244.178.1/32 calibfa4a90919f
kube-system docker-registry-docker-registry-proxy-8jtdd 10.102.161.201 10.244.38.193/32 cali26d291963b1
kube-system docker-registry-docker-registry-proxy-bk5ts 10.102.161.189 10.244.202.202/32 calic0988a005de
kube-system docker-registry-docker-registry-proxy-f66df 10.102.161.184 10.244.14.193/32 cali2aaf4a7c73b
kube-system docker-registry-docker-registry-proxy-h4n5q 10.102.162.224 10.244.85.1/32 calid14f1515eca
kube-system docker-registry-docker-registry-proxy-vp462 10.102.162.120 10.244.146.1/32 calicb508674e6d
kube-system tiller-deploy-969865475-gp2zp 10.102.161.189 10.244.202.195/32 cali7f8841ab762
calid14f1515eca is the Calico interface assigned to the pod with IP 10.244.85.1.
# ip route
default via 10.102.175.254 dev eth0
10.102.160.0/20 dev eth0 proto kernel scope link src 10.102.162.224
10.244.14.192/26 via 10.102.161.184 dev tunl0 proto bird onlink
10.244.38.192/26 via 10.102.161.201 dev tunl0 proto bird onlink
blackhole 10.244.85.0/26 proto bird
10.244.85.3 dev cali6eb5f2198f0 scope link
10.244.85.4 dev califee61023751 scope link
10.244.146.0/26 via 10.102.162.120 dev tunl0 proto bird onlink
10.244.178.0/26 via 10.102.162.125 dev tunl0 proto bird onlink
10.244.202.192/26 via 10.102.161.189 dev tunl0 proto bird onlink
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
Observe that 10.244.85.3 and 10.244.85.4 have routes configured, but 10.244.85.1 does not.
No interface created on the host:
# ifconfig | grep cali6eb5f2198f0
cali6eb5f2198f0 Link encap:Ethernet HWaddr ee:ee:ee:ee:ee:ee
# ifconfig | grep califee61023751
califee61023751 Link encap:Ethernet HWaddr ee:ee:ee:ee:ee:ee
# ifconfig | grep calid14f1515eca
There are a lot of "interface down" logs from the calico-node pod on this node.
2020-12-03 09:19:55.980 [INFO][56] route_table.go 237: Queueing a resync of routing table. ipVersion=0x4
2020-12-03 09:19:55.981 [INFO][56] route_table.go 577: Syncing routes: adding new route. ifaceName="calid14f1515eca" ipVersion=0x4 targetCIDR=10.244.85.1/32
2020-12-03 09:19:55.981 [WARNING][56] route_table.go 604: Failed to add route error=network is down ifaceName="calid14f1515eca" ipVersion=0x4 targetCIDR=10.244.85.1/32
2020-12-03 09:19:55.981 [INFO][56] route_table.go 247: Trying to connect to netlink
2020-12-03 09:19:55.981 [INFO][56] route_table.go 361: Interface down, will retry if it goes up. ifaceName="calid14f1515eca" ipVersion=0x4
2020-12-03 09:19:55.982 [INFO][56] int_dataplane.go 967: Finished applying updates to dataplane. msecToApply=2.513651
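Note that ifconfig without -a only lists interfaces that are up, and the Felix log above reports "network is down" rather than a missing link. A diagnostic sketch with iproute2 (interface name taken from the workloadendpoint output above) can distinguish a veth that was never created from one that exists but is down:
# ip -d link show calid14f1515eca
"state DOWN" in that output would match Felix's "network is down" error, while an error that the device does not exist would mean the host-side veth really is missing.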
# kubectl describe pod calico-node-ssd6p | grep "Start Time"
Start Time: Mon, 30 Nov 2020 02:52:39 -0800
# kubectl describe pod docker-registry-docker-registry-proxy-h4n5q | grep "Start Time"
Start Time: Mon, 30 Nov 2020 02:52:39 -0800
# kubectl describe pod udf-farm-7zh8b -n healthbot | grep "Start Time"
Start Time: Mon, 30 Nov 2020 02:53:09 -0800
# kubectl describe pod configmanager-f82n2 -n healthbot | grep "Start Time"
Start Time: Mon, 30 Nov 2020 02:53:09 -0800
Observe that calico-node and the problematic pod docker-registry-docker-registry-proxy-h4n5q started at almost the same time, whereas the pods that have no networking issues started ~30 seconds later. So I'm guessing there is a race condition here between whether the calico-node pod starts first or another daemonset's pod starts first?
One difference between the docker-registry-docker-registry-proxy daemonset and the other daemonsets that started a bit later is that both calico-node and docker-registry-docker-registry-proxy are installed in the kube-system namespace, whereas the others are installed in a different namespace. I'm not sure if this makes a difference.
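One rough way to check this hypothesis (a sketch; the conflist path is the default Calico location shown later in this thread, and the pod name is the one from this node) is to compare when the Calico CNI config landed on the node with the problematic pod's start time:
# stat -c '%y %n' /etc/cni/net.d/10-calico.conflist
# kubectl get pod docker-registry-docker-registry-proxy-h4n5q -n kube-system -o jsonpath='{.status.startTime}{"\n"}'
If the pod was sandboxed before the Calico CNI config existed, that would support the race-condition theory.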
# kubectl describe node 10.102.162.224 | grep CIDR
PodCIDR: 10.244.40.0/21
PodCIDRs: 10.244.40.0/21
# kubectl cluster-info dump | grep -m 1 cluster-cidr
"--cluster-cidr=10.244.0.0/16",
# kubectl cluster-info dump | grep -m 1 service-cluster-ip-range
"--service-cluster-ip-range=10.96.0.0/12",
# kubectl get node
NAME STATUS ROLES AGE VERSION
10.102.161.184 Ready <none> 3d v1.17.2
10.102.161.189 Ready master 3d1h v1.17.2
10.102.161.201 Ready,SchedulingDisabled <none> 3d v1.17.2
10.102.162.120 Ready,SchedulingDisabled <none> 3d v1.17.2
10.102.162.125 Ready <none> 3d v1.17.2
10.102.162.224 Ready <none> 3d v1.17.2
Just deleting this pod and letting k8s recreate it solves the networking issue.
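For completeness, a rough sketch of the same workaround applied to every pod-networked pod on the affected node (assumes jq is installed; the node name is just this cluster's example), so that their controllers recreate them and the CNI plugin runs again:
# kubectl get pods -A --field-selector spec.nodeName=10.102.162.224 -o json | jq -r '.items[] | select(.spec.hostNetwork != true) | .metadata.namespace + " " + .metadata.name' | while read ns pod; do kubectl delete pod "$pod" -n "$ns"; done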
Your Environment
- Calico version: v3.12.3
- Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.17.2
- Operating System and version: Ubuntu 16.04
Hi @shashankv02 , thanks for the diags. Have you tried using a more recent Calico version? There have been a lot of improvements to Felix since v3.12 that might have addressed this issue.
Yeah, this is curious. The CNI plugin should be setting up that interface, and is the same component that should be allocating the IP address (which we see is working).
One thought - do you have any other CNI configurations besides the Calico config in /etc/cni/net.d on your hosts? Perhaps this pod is being launched with a different CNI plugin before Calico gets a chance to install its config?
If the host-side veth doesn't exist, I'm guessing another CNI plugin might have created it with a different name?
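A quick way to check that (a sketch, no Calico-specific tooling assumed) is to list every veth on the host, including ones that are down, and look for a host-side interface whose name doesn't start with "cali":
# ip -o link show type veth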
@caseydavenport There is only one CNI.
# /etc/cni/net.d# ls -lrt
total 8
-rw------- 1 root root 2623 Nov 30 02:52 calico-kubeconfig
-rw-r--r-- 1 root root 533 Nov 30 02:52 10-calico.conflist
Is there any other diagnostic information I should collect that might be helpful?
@Imm Thanks for the suggestion. I will try a more recent version of Calico on a development cluster whenever possible. I don't want to upgrade anything on the production clusters without making sure the exact issue is fixed in a later version, to avoid new regressions or behaviour changes. As this is hard to reproduce, it is also hard to verify whether the issue is fixed.
I have been trying to recreate this by removing and adding nodes with a script. I reproduced the same issue, i.e. I cannot access the docker-registry-proxy pod at hostPort 5000 on the new node, but the root cause seems to be different this time. The interface and static routes have been set up properly on the new node, but the hostPort iptables rules are not written. The docker-registry-proxy pod listens on hostPort 5000.
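A quick check from the affected node itself (the URL path is just illustrative) shows the symptom independent of the pod: with the CNI hostPort DNAT rules missing, the connection is refused or times out even though the pod is Running.
# curl -sv --max-time 5 -o /dev/null http://127.0.0.1:5000/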
On a working node:
# iptables -t nat -S | grep 5000
-A CNI-DN-9083734ff9e63d966eb7c -s 10.244.140.202/32 -p tcp -m tcp --dport 5000 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-9083734ff9e63d966eb7c -s 127.0.0.1/32 -p tcp -m tcp --dport 5000 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-9083734ff9e63d966eb7c -p tcp -m tcp --dport 5000 -j DNAT --to-destination 10.244.140.202:80
-A CNI-HOSTPORT-DNAT -p tcp -m comment --comment "dnat name: \"k8s-pod-network\" id: \"11fe14ea06bfb57b3f2be805c242488fb9c668f88f5d8d128d2f79b8238e0c4c\"" -m multiport --dports 5000 -j CNI-DN-9083734ff9e63d966eb7c
-A KUBE-SEP-I4W7JNC5C7ZGNZMS -p tcp -m tcp -j DNAT --to-destination 10.244.140.222:50001
-A KUBE-SEP-RBXXJB6X2JQO4EBC -p tcp -m tcp -j DNAT --to-destination 10.244.140.199:5000
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-SVC-NNGLEMTREGRS54DN
On newly joined node:
# iptables -t nat -S | grep 5000
-A KUBE-SEP-I4W7JNC5C7ZGNZMS -p tcp -m tcp -j DNAT --to-destination 10.244.140.222:50001
-A KUBE-SEP-RBXXJB6X2JQO4EBC -p tcp -m tcp -j DNAT --to-destination 10.244.140.199:5000
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-SVC-NNGLEMTREGRS54DN
Observe the missing NAT rules on the new node. Again, deleting the pod and letting it be recreated fixes the issue.
# kubectl delete pod docker-registry-docker-registry-proxy-q79sk -n kube-system
pod "docker-registry-docker-registry-proxy-q79sk" deleted
# iptables -t nat -S | grep 5000
-A CNI-DN-f2f6741aa0634f9d6a631 -s 10.244.34.72/32 -p tcp -m tcp --dport 5000 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-f2f6741aa0634f9d6a631 -s 127.0.0.1/32 -p tcp -m tcp --dport 5000 -j CNI-HOSTPORT-SETMARK
-A CNI-DN-f2f6741aa0634f9d6a631 -p tcp -m tcp --dport 5000 -j DNAT --to-destination 10.244.34.72:80
-A CNI-HOSTPORT-DNAT -p tcp -m comment --comment "dnat name: \"k8s-pod-network\" id: \"23eb4dadcc65546f5a19342032734ca5544f9a9fde63ef7db6ae94084ff5fad6\"" -m multiport --dports 5000 -j CNI-DN-f2f6741aa0634f9d6a631
-A KUBE-SEP-I4W7JNC5C7ZGNZMS -p tcp -m tcp -j DNAT --to-destination 10.244.140.222:50001
-A KUBE-SEP-RBXXJB6X2JQO4EBC -p tcp -m tcp -j DNAT --to-destination 10.244.140.199:5000
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.103.168.132/32 -p tcp -m comment --comment "kube-system/docker-registry-docker-registry:registry cluster IP" -m tcp --dport 5000 -j KUBE-SVC-NNGLEMTREGRS54DN
@shashankv02 I think that sounds like a different issue than the original one reported? In your original description it appeared that the interface for the pod didn't exist at all on the host - is that correct?
The hostPort plugin is responsible for those rules, and we've seen issues with it in the past. I think there's a strong case for that being part of the problem here, and a similar case for us ditching the upstream plugin and implementing those rules ourselves in Calico.
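For reference, a rough way to confirm the portmap (hostPort) plugin is in play on a node (default Calico CNI config path assumed) is to check the conflist and the plugin's DNAT chain:
# grep -o '"type": *"portmap"' /etc/cni/net.d/10-calico.conflist
# iptables -t nat -S CNI-HOSTPORT-DNAT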
@caseydavenport Yeah, the symptom is similar i.e., not able to connect to a daemonset created pod on a newly joined node but the underlying cause seems to be different.
This issue is stale because it is kind/enhancement or kind/bug and has been open for 180 days with no activity.
This issue was closed because it has been inactive for 30 days since being marked as stale.