Pod is added to a DSR-mode service before the external IP is added to the tunnel interface inside the pod
After a pod comes online, it is added to the IPVS tables on all kube-router hosts before the tunnel interface inside the pod has been configured (note that kube-tunnel-if is DOWN and has no IP address in the output below). Traffic sent to the pod during this window is dropped inside the container:
root@nginx-5f5664c88b-5z7d8:/# ip ad ls
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if47: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 0a:58:0a:74:81:ad brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.116.129.173/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::c0ed:95ff:fe94:647/64 scope link
valid_lft forever preferred_lft forever
5: kube-tunnel-if@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default
link/ipip 10.116.129.173 brd 0.0.0.0
Around 30 seconds later, once the interface has been configured correctly:
root@nginx-5f5664c88b-5z7d8:/# ip ad ls
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000
link/ipip 0.0.0.0 brd 0.0.0.0
4: eth0@if47: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 0a:58:0a:74:81:ad brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.116.129.173/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::c0ed:95ff:fe94:647/64 scope link
valid_lft forever preferred_lft forever
5: kube-tunnel-if@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default
link/ipip 10.116.129.173 brd 0.0.0.0
inet 10.116.4.1/32 brd 10.116.4.1 scope link kube-tunnel-if
valid_lft forever preferred_lft forever
inet6 fe80::5efe:a74:81ad/64 scope link
valid_lft forever preferred_lft forever
@thardie what version of kube-router are you running?
It sounds like you are hitting an old bug that was addressed in https://github.com/cloudnativelabs/kube-router/pull/472
I was running the latest version. On code inspection, it does look like the code adds the pod to IPVS before trying to configure the interface, and there are several sleeps inside the interface bring-up code...
Thanks for the feedback, we will investigate this. PRs are welcome.
I'd love to submit a PR. I just need to get my build environment working first so I can verify any fix I write :)
Closing as stale