Delay in network setup for Pod to Service communication
What happened:
I have a client Pod that wants to communicate with an HTTP server Pod (via a `Service` in front of a `Deployment`). The HTTP server deployment is triggered programmatically, and the client Pod uses the Kubernetes client API to watch the `Available` condition of the `Deployment` resource; when the condition is met, it issues an HTTP request against the `Service` DNS name.
The request fails with a connection refused error. I have to wait quite a while and retry the request before it eventually works.
I even adjusted the client-side waiting logic to inspect the `Endpoints` and `EndpointSlice` resources, to make sure the HTTP server Pod IP address is populated in both before it initiates the HTTP request, but neither helps. A shell sketch of this waiting logic is shown below.
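A minimal shell sketch of that waiting logic (the real client uses the Kubernetes client API; the `httpbin` names match the reproduction further down):

```bash
# Wait for the Deployment to report the Available condition.
kubectl wait --for=condition=Available deployment/httpbin

# Additionally wait until the Service's EndpointSlice carries at least one
# Pod IP; EndpointSlices carry the well-known kubernetes.io/service-name label.
until kubectl get endpointslice -l kubernetes.io/service-name=httpbin \
    -o jsonpath='{.items[*].endpoints[*].addresses[*]}' | grep -q .; do
  sleep 1
done

# Only now issue the request against the Service DNS name (from inside the cluster).
curl http://httpbin:8000/get
```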
What you expected to happen:
After the `Endpoints` or `EndpointSlice` is correctly populated, the Pod -> `Service` -> Pod communication should succeed; this is consistent with the behavior of non-container-based Kubernetes distributions.
How to reproduce it (as minimally and precisely as possible):
- create a KinD cluster:
```
kind create cluster
```
- deploy a Pod with `kubectl` capability by applying the following YAML to the created cluster:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubectl
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kubectl
spec:
  selector:
    matchLabels:
      app: kubectl
  template:
    metadata:
      labels:
        app: kubectl
    spec:
      containers:
      - name: kubectl
        image: bitnami/kubectl
        command:
        - sleep
        - infinity
      serviceAccountName: kubectl
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kubectl
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubectl
  namespace: default
```
- open a shell to the `kubectl` Pod:
```
kubectl exec -it kubectl-78878b75b5-wl5vm -- bash
```
- inside the shell of the `kubectl` Pod, execute the following commands:
```
kubectl apply -f https://raw.githubusercontent.com/istio/istio/master/samples/httpbin/httpbin.yaml
kubectl wait --for=condition=Available deployment/httpbin
until time curl http://httpbin:8000/get; do echo "try again"; done
```
These commands simulate the scenario where a client waits for the deployment to be ready before sending an HTTP request to the server Pod. I keep retrying the `curl` command until it succeeds, so I can see how long it takes for the networking to be set up completely.
Here is the output of my testing:
```
serviceaccount/httpbin created
service/httpbin created
deployment.apps/httpbin created
deployment.apps/httpbin condition met
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m7.099s
user 0m0.005s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m0.010s
user 0m0.003s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m0.009s
user 0m0.003s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m0.009s
user 0m0.002s
sys 0m0.007s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m0.009s
user 0m0.003s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m0.010s
user 0m0.004s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m1.041s
user 0m0.004s
sys 0m0.007s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m1.023s
user 0m0.007s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m1.025s
user 0m0.006s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 0m1.023s
user 0m0.004s
sys 0m0.006s
try again
curl: (7) Failed to connect to httpbin port 8000: Connection refused
real 1m17.621s
user 0m0.011s
sys 0m0.015s
try again
{
"args": {},
"headers": {
"Accept": "*/*",
"Host": "httpbin:8000",
"User-Agent": "curl/7.64.0"
},
"origin": "10.244.0.2",
"url": "http://httpbin:8000/get"
}
real 0m0.017s
user 0m0.007s
sys 0m0.006s
```
Looking at the last failure right before the final success, the `curl` command hung for more than a minute.
Anything else we need to know?:
This issue surfaces in both KinD and Minikube with the Docker driver. If I switch Minikube to the `hyperkit` or `virtualbox` driver, the issue disappears. I'm also not seeing this issue on EKS or AKS, but GKE does exhibit the same issue.
Environment:
- kind version:
```
kind v0.10.0 go1.15.7 darwin/amd64
```
- Kubernetes version:
```
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.0", GitCommit:"cb303e613a121a29364f75cc67d3d580833a7479", GitTreeState:"clean", BuildDate:"2021-04-08T21:16:14Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-21T01:11:42Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
```
- Docker version:
```
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
compose: Docker Compose (Docker Inc., 2.0.0-beta.1)
scan: Docker Scan (Docker Inc., v0.8.0)
Server:
Containers: 6
Running: 6
Paused: 0
Stopped: 0
Images: 266
Server Version: 20.10.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
init version: de40ad0
Security Options:
seccomp
Profile: default
Kernel Version: 5.10.25-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.775GiB
Name: docker-desktop
ID: 2JZY:DWSA:55MA:ZJ43:XRID:XUWW:NCOI:HUDK:PHUA:NZUB:62XR:6STK
Docker Root Dir: /var/lib/docker
Debug Mode: true
File Descriptors: 90
Goroutines: 81
System Time: 2021-05-25T23:17:10.5021701Z
EventsListeners: 3
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
```
- OS:
```
ProductName: macOS
ProductVersion: 11.3.1
BuildVersion: 20E241
```
Please try upgrading to v0.11 and one of the images released with it. We made some improvements to the networking stack since v0.10
Just tried with `kind v0.11.0 go1.16.3 darwin/amd64`; still the same behavior: the last failed `curl` execution took more than a minute to error out, and then the subsequent iteration eventually succeeded.
Also, my colleague didn't see this issue when his Docker Desktop app was a couple of versions older than mine; after he upgraded to the same version as mine, he started seeing the same issue. But we lost track of which exact version of Docker Desktop he had previously.
Without digging into a reproducer (doing other things atm), I would guess it's most likely the Docker embedded DNS acting up.
That would explain it affecting both kind and the derivative docker mode in minikube, and only showing up after you upgraded Docker.
If that is the case though, there's not much we can do about it here.
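(If the embedded DNS is the suspect, a quick hedged way to inspect it from the host, assuming the default node name `kind-control-plane`:)

```bash
# Which resolver does the kind node use?
docker exec kind-control-plane cat /etc/resolv.conf

# Docker's embedded DNS lives at 127.0.0.11 inside the node; the nat table
# shows the DNAT/SNAT plumbing that forwards resolver traffic to it.
docker exec kind-control-plane iptables-save -t nat | grep 127.0.0.11
```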
I dug a bit deeper by taking snapshots of `iptables` on the single node.
The first snapshot was taken right after the `httpbin` deployment became `Available`, which means it had a `Ready` Pod.
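For reproducibility, this is roughly how such a snapshot can be captured from the host (assuming the default node container name `kind-control-plane`):

```bash
# The kind node is a Docker container, so iptables-save runs inside it.
docker exec kind-control-plane iptables-save > snapshot-1.txt
# ... later, take a second snapshot and diff for the httpbin chains.
docker exec kind-control-plane iptables-save > snapshot-2.txt
diff snapshot-1.txt snapshot-2.txt | grep -i httpbin
```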
```
# Generated by iptables-save v1.8.5 on Wed May 26 01:15:18 2021
*mangle
:PREROUTING ACCEPT [1186166:435334751]
:INPUT ACCEPT [1185936:435297997]
:FORWARD ACCEPT [230:36754]
:OUTPUT ACCEPT [1021400:172352482]
:POSTROUTING ACCEPT [1021610:172388056]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-PROXY-CANARY - [0:0]
COMMIT
# Completed on Wed May 26 01:15:18 2021
# Generated by iptables-save v1.8.5 on Wed May 26 01:15:18 2021
*filter
:INPUT ACCEPT [51772:8549330]
:FORWARD ACCEPT [9:744]
:OUTPUT ACCEPT [51821:9422272]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Wed May 26 01:15:18 2021
# Generated by iptables-save v1.8.5 on Wed May 26 01:15:18 2021
*nat
:PREROUTING ACCEPT [3:180]
:INPUT ACCEPT [6:404]
:OUTPUT ACCEPT [561:33704]
:POSTROUTING ACCEPT [567:34224]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
:KIND-MASQ-AGENT - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SEP-6E7XQMQ4RAYOWTTM - [0:0]
:KUBE-SEP-FVQSBIWR5JTECIVC - [0:0]
:KUBE-SEP-LASJGFFJP3UOS6RQ - [0:0]
:KUBE-SEP-LPGSDLJ3FDW46N4W - [0:0]
:KUBE-SEP-QKX4QX54UKWK6JIY - [0:0]
:KUBE-SEP-ZP3FB6NMPNCO4VBJ - [0:0]
:KUBE-SEP-ZXMNUKOKXUTL2MK2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -d 192.168.65.2/32 -j DOCKER_OUTPUT
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -d 192.168.65.2/32 -j DOCKER_OUTPUT
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -d 192.168.65.2/32 -j DOCKER_POSTROUTING
-A POSTROUTING -m addrtype ! --dst-type LOCAL -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -j KIND-MASQ-AGENT
-A DOCKER_OUTPUT -d 192.168.65.2/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:38695
-A DOCKER_OUTPUT -d 192.168.65.2/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:50472
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 38695 -j SNAT --to-source 192.168.65.2:53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 50472 -j SNAT --to-source 192.168.65.2:53
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "kind-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SEP-FVQSBIWR5JTECIVC -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-FVQSBIWR5JTECIVC -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.5:9153
-A KUBE-SEP-LASJGFFJP3UOS6RQ -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-LASJGFFJP3UOS6RQ -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-LPGSDLJ3FDW46N4W -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-LPGSDLJ3FDW46N4W -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-QKX4QX54UKWK6JIY -s 172.18.0.3/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-QKX4QX54UKWK6JIY -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 172.18.0.3:6443
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.3:9153
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ZXMNUKOKXUTL2MK2
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-LASJGFFJP3UOS6RQ
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ZP3FB6NMPNCO4VBJ
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-SEP-FVQSBIWR5JTECIVC
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-QKX4QX54UKWK6JIY
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-6E7XQMQ4RAYOWTTM
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-LPGSDLJ3FDW46N4W
COMMIT
# Completed on Wed May 26 01:15:19 2021
```
The second snapshot was taken a couple of minutes later.
```
# Generated by iptables-save v1.8.5 on Wed May 26 01:18:24 2021
*mangle
:PREROUTING ACCEPT [1217024:440238923]
:INPUT ACCEPT [1216794:440202169]
:FORWARD ACCEPT [230:36754]
:OUTPUT ACCEPT [1052216:177448310]
:POSTROUTING ACCEPT [1052426:177483884]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-PROXY-CANARY - [0:0]
COMMIT
# Completed on Wed May 26 01:18:24 2021
# Generated by iptables-save v1.8.5 on Wed May 26 01:18:24 2021
*filter
:INPUT ACCEPT [29625:4711246]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [29584:4894699]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m comment --comment "kubernetes health check service ports" -j KUBE-NODEPORTS
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A FORWARD -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP
-A KUBE-FORWARD -m conntrack --ctstate INVALID -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Wed May 26 01:18:25 2021
# Generated by iptables-save v1.8.5 on Wed May 26 01:18:25 2021
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [325:19500]
:POSTROUTING ACCEPT [325:19500]
:DOCKER_OUTPUT - [0:0]
:DOCKER_POSTROUTING - [0:0]
:KIND-MASQ-AGENT - [0:0]
:KUBE-KUBELET-CANARY - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-PROXY-CANARY - [0:0]
:KUBE-SEP-6E7XQMQ4RAYOWTTM - [0:0]
:KUBE-SEP-FVQSBIWR5JTECIVC - [0:0]
:KUBE-SEP-LASJGFFJP3UOS6RQ - [0:0]
:KUBE-SEP-LPGSDLJ3FDW46N4W - [0:0]
:KUBE-SEP-QKX4QX54UKWK6JIY - [0:0]
:KUBE-SEP-XT3VQGQ2EKT4GEUU - [0:0]
:KUBE-SEP-ZP3FB6NMPNCO4VBJ - [0:0]
:KUBE-SEP-ZXMNUKOKXUTL2MK2 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-FREKB6WNWYJLKTHC - [0:0]
:KUBE-SVC-JD5MR3NA4I4DYORP - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -d 192.168.65.2/32 -j DOCKER_OUTPUT
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -d 192.168.65.2/32 -j DOCKER_OUTPUT
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -d 192.168.65.2/32 -j DOCKER_POSTROUTING
-A POSTROUTING -m addrtype ! --dst-type LOCAL -m comment --comment "kind-masq-agent: ensure nat POSTROUTING directs all non-LOCAL destination traffic to our custom KIND-MASQ-AGENT chain" -j KIND-MASQ-AGENT
-A DOCKER_OUTPUT -d 192.168.65.2/32 -p tcp -m tcp --dport 53 -j DNAT --to-destination 127.0.0.11:38695
-A DOCKER_OUTPUT -d 192.168.65.2/32 -p udp -m udp --dport 53 -j DNAT --to-destination 127.0.0.11:50472
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p tcp -m tcp --sport 38695 -j SNAT --to-source 192.168.65.2:53
-A DOCKER_POSTROUTING -s 127.0.0.11/32 -p udp -m udp --sport 50472 -j SNAT --to-source 192.168.65.2:53
-A KIND-MASQ-AGENT -d 10.244.0.0/16 -m comment --comment "kind-masq-agent: local traffic is not subject to MASQUERADE" -j RETURN
-A KIND-MASQ-AGENT -m comment --comment "kind-masq-agent: outbound traffic is subject to MASQUERADE (must be last in chain)" -j MASQUERADE
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
-A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-6E7XQMQ4RAYOWTTM -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SEP-FVQSBIWR5JTECIVC -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-FVQSBIWR5JTECIVC -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.5:9153
-A KUBE-SEP-LASJGFFJP3UOS6RQ -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-LASJGFFJP3UOS6RQ -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-LPGSDLJ3FDW46N4W -s 10.244.0.5/32 -m comment --comment "kube-system/kube-dns:dns" -j KUBE-MARK-MASQ
-A KUBE-SEP-LPGSDLJ3FDW46N4W -p udp -m comment --comment "kube-system/kube-dns:dns" -m udp -j DNAT --to-destination 10.244.0.5:53
-A KUBE-SEP-QKX4QX54UKWK6JIY -s 172.18.0.3/32 -m comment --comment "default/kubernetes:https" -j KUBE-MARK-MASQ
-A KUBE-SEP-QKX4QX54UKWK6JIY -p tcp -m comment --comment "default/kubernetes:https" -m tcp -j DNAT --to-destination 172.18.0.3:6443
-A KUBE-SEP-XT3VQGQ2EKT4GEUU -s 10.244.0.7/32 -m comment --comment "default/httpbin:http" -j KUBE-MARK-MASQ
-A KUBE-SEP-XT3VQGQ2EKT4GEUU -p tcp -m comment --comment "default/httpbin:http" -m tcp -j DNAT --to-destination 10.244.0.7:80
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZP3FB6NMPNCO4VBJ -p tcp -m comment --comment "kube-system/kube-dns:metrics" -m tcp -j DNAT --to-destination 10.244.0.3:9153
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -s 10.244.0.3/32 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-ZXMNUKOKXUTL2MK2 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp" -m tcp -j DNAT --to-destination 10.244.0.3:53
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:metrics cluster IP" -m tcp --dport 9153 -j KUBE-SVC-JD5MR3NA4I4DYORP
-A KUBE-SERVICES ! -s 10.244.0.0/16 -d 10.96.136.177/32 -p tcp -m comment --comment "default/httpbin:http cluster IP" -m tcp --dport 8000 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.136.177/32 -p tcp -m comment --comment "default/httpbin:http cluster IP" -m tcp --dport 8000 -j KUBE-SVC-FREKB6WNWYJLKTHC
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ZXMNUKOKXUTL2MK2
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m comment --comment "kube-system/kube-dns:dns-tcp" -j KUBE-SEP-LASJGFFJP3UOS6RQ
-A KUBE-SVC-FREKB6WNWYJLKTHC -m comment --comment "default/httpbin:http" -j KUBE-SEP-XT3VQGQ2EKT4GEUU
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-ZP3FB6NMPNCO4VBJ
-A KUBE-SVC-JD5MR3NA4I4DYORP -m comment --comment "kube-system/kube-dns:metrics" -j KUBE-SEP-FVQSBIWR5JTECIVC
-A KUBE-SVC-NPX46M4PTMTKRN6Y -m comment --comment "default/kubernetes:https" -j KUBE-SEP-QKX4QX54UKWK6JIY
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-6E7XQMQ4RAYOWTTM
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m comment --comment "kube-system/kube-dns:dns" -j KUBE-SEP-LPGSDLJ3FDW46N4W
COMMIT
# Completed on Wed May 26 01:18:26 2021
```
Searching for the keyword `httpbin` in these two snapshots, we can see that `kube-proxy` was not updating the `iptables` rules instantly. Is this delay what's causing the `curl` connection failure?
After timing the `iptables-save` dumps more carefully, it turns out I was wrong: `kube-proxy` managed to update the `iptables` rules within a second.
But even with the `iptables` rules properly updated, the `curl` command still kept failing for a minute or so.
@BenTheElder if I add `-v` to my `curl` command, I can see the `httpbin` service DNS name resolves to its ClusterIP correctly. Does that rule out the Docker DNS problem you suggested?
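(A hedged way to separate DNS resolution from TCP connectivity, run from inside the client Pod; it assumes `getent` is available in the image, as it is in most glibc-based ones:)

```bash
# 1. Resolve the Service name without connecting.
getent hosts httpbin

# 2. Bypass DNS entirely and hit the ClusterIP directly; if this also fails
#    with "connection refused", the problem is in the dataplane, not DNS.
CLUSTER_IP=$(kubectl get svc httpbin -o jsonpath='{.spec.clusterIP}')
curl -v "http://${CLUSTER_IP}:8000/get"
```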
You have Prometheus metrics in kube-proxy that measure exactly that: the time from when the endpoint is created to when the iptables rules are programmed. https://github.com/kubernetes/kubernetes/blob/b496238dd65d86c65183ac7ffa128c5de46705b4/pkg/proxy/metrics/metrics.go#L52
There is an example of how to install Prometheus in kind here: https://github.com/aojea/kind-networking-plugins/tree/main/monitoring
That will give you better insight.
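(If a full Prometheus install is too heavy, a hedged sketch for scraping the metric directly; it assumes kube-proxy's default metrics address `127.0.0.1:10249` and that port-forwarding to the host-network kube-proxy Pod works in your setup:)

```bash
# Pick one kube-proxy pod and forward its metrics port.
KP_POD=$(kubectl -n kube-system get pods -l k8s-app=kube-proxy -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-system port-forward "$KP_POD" 10249:10249 &
sleep 2

# Inspect the network programming latency histogram (the metric may carry a
# kubeproxy_ prefix depending on the Kubernetes version).
curl -s http://127.0.0.1:10249/metrics | grep network_programming_duration_seconds
```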
> @BenTheElder if I add the -v to my curl command, I can see the httpbin service DNS name got resolved to its ClusterIP correctly. Does it rule out the docker DNS problem as you suggested?

Sounds like it.

> But even with the iptables properly updated, the curl command still kept failing for a minute or so.

Is the pod ready and listening?
You may just have less resources available / slow startup in your docker setup. Docker Desktop doesn't allocate much OOTB.
Even if it were iptables programming times, that's in kube-proxy upstream, kind isn't really responsible for that ...
@aojea Thanks for the pointer. I patched the `kube-proxy` configmap and deployed Prometheus; I can see 3 time series related to the `network_programming_duration_seconds` metric.

*(screenshot of the metric list omitted)*

I'm not sure which exact one to look at, but I assume we're talking about `network_programming_duration_seconds_bucket`; the interesting part of the table view is:

*(screenshot of the table view omitted)*

If I read the table correctly, does it mean the `network_programming_duration` is mostly in the range of 10~20 seconds?
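(A hedged back-of-the-envelope check: a histogram's `_sum`/`_count` series give the average latency directly, without reading buckets. This assumes the metrics endpoint from the sketch above:)

```bash
curl -s http://127.0.0.1:10249/metrics \
  | awk '/network_programming_duration_seconds_sum/   {s=$2}
         /network_programming_duration_seconds_count/ {c=$2}
         END {if (c > 0) printf "average programming latency: %.2fs over %d samples\n", s/c, c}'
```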
> > @BenTheElder if I add the -v to my curl command, I can see the httpbin service DNS name got resolved to its ClusterIP correctly. Does it rule out the docker DNS problem as you suggested?
>
> Sounds like it.
>
> > But even with the iptables properly updated, the curl command still kept failing for a minute or so.
>
> Is the pod ready and listening?
The Pod is definitely in `Ready` status and listening; otherwise, the `Deployment` couldn't be `Available`. And in the real product testing setup, I was programmatically waiting for the `Endpoints` resource to be populated with the Pod IP address before initiating communication via the `Service`.
> You may just have less resources available / slow startup in your docker setup. Docker Desktop doesn't allocate much OOTB.
I assigned 6 CPUs, 8 GB of memory, and 184 GB of disk to my Docker Desktop. If you're talking about slow startup of the `httpbin` Pod, I don't believe so; as I mentioned above, I even tried programmatically waiting for the `Endpoints` resource to include the Pod IP address, which wouldn't happen unless the Pod were completely `Ready`.
If there's slowness in `kube-proxy`, I guess @aojea's advice may provide more insight. I've pasted the metrics values above; the question is "do the metrics explain my observation?"
> Even if it were iptables programming times, that's in kube-proxy upstream, kind isn't really responsible for that ...
Oops, I thought that the fix for those metrics was in 1.21, but it is not; it is fixed in 1.22, which has not been released. However, it seems you are using 1.20.
Check `sync_proxy_rules_duration_seconds`; you can make a graph and see the trend. This way we can rule out iptables programming as the problem.
I'm actually on v1.21.1; here's what you are looking for, I guess. It is still a histogram metric, so we only have bucket/count/sum.

*(screenshot omitted)*
We want to know why the delay happens. There are multiple factors here, and we want to rule out kube-proxy's rule programming as the problem, so you have to correlate the metrics with your tests; there is no baseline, and every deployment is different.
If we learn that the delay is caused by kube-proxy taking, say, 6 seconds to flush the rules, we've found the problem.
To do that, you have to graph the time it took to program the rules while you are running the test; this time is also recorded in the kube-proxy logs if it runs with verbosity level 4.
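(A hedged sketch for turning that on in a kind cluster; the JSON patch below appends `--v=4` to the kube-proxy DaemonSet's command and assumes the default kubeadm layout:)

```bash
kubectl -n kube-system patch daemonset kube-proxy --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/command/-","value":"--v=4"}]'

# Once the pods restart, watch for the rule-sync timing lines.
kubectl -n kube-system logs -l k8s-app=kube-proxy -f | grep -i sync
```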
I think this is probably https://github.com/kubernetes-sigs/kind/issues/2208#issuecomment-857064037, which was a bug in Docker Desktop that should be fixed now, according to the user there.
@simingweng can you try with an up-to-date Docker Desktop as referenced in https://github.com/kubernetes-sigs/kind/issues/2280#issuecomment-859683436?
I'm still not seeing anything kind has done here; this seems most likely to be a problem with Docker Desktop or possibly kube-proxy.
Given https://github.com/kubernetes-sigs/kind/issues/2280#issuecomment-848341309 + inactivity since May, I'm inclined to call this https://github.com/kubernetes-sigs/kind/issues/2208#issuecomment-857064037 for now.
Hi, I think this issue needs to be reopened; here is my case:
I am still seeing the issue on the latest (as of Feb 9, 2022) Docker for macOS 4.4.2 and Kind 0.11.1 on an M1 MacBook Pro w/ macOS 12.0.1.
I can reproduce it consistently with our integration test suite, which redeploys DBs and microservice applications on each test.
Tests often fail with networking errors where app `Deployment`s try to reach DB `Deployment`s via `Service`s for the first ~10-15 seconds after a successful rollout of the DB `Deployment`s.
I.e., we have:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: postgresql
  labels:
    app: postgresql
spec:
  ports:
  - port: 5432
    name: postgresql
  clusterIP: None
  selector:
    app: postgresql
```
The DB `Deployment` of course has a `startupProbe`, and we wait for its rollout before deploying the app. DB Deployment:
```yaml
startupProbe:
  exec:
    command:
    - pg_isready
  initialDelaySeconds: 2
  periodSeconds: 1
  failureThreshold: 30
```
Then we wait for the DB rollout: `kubectl rollout -n postgresql status deployment postgresql`
Then the app `Deployment` tries to connect to the `postgresql` `Service`. What we see is that the app `Deployment` frequently crashes with network errors trying to reach the `Service`, so I had to add an `initContainer` like this:
```yaml
spec:
  # No need for this check in Production, it's a Kind/Docker bug: DNS unavailable for a few seconds after deploy
  # See https://github.com/kubernetes-sigs/kind/issues/2280
  initContainers:
  - name: verify-pgsql-reachable
    image: postgres:14-bullseye
    command: [
      '/bin/bash',
      '-c',
      'for i in {1..15}; do if pg_isready --host postgresql.postgresql.svc.cluster.local --timeout 1; then exit 0; else sleep 1; fi; done;'
    ]
```
And here is sample output from the tests:
```
10:26:32.936 [ne#52] INFO - Init container logs
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - accepting connections
```
As you can see, for a significant time, ~12 seconds(!) (6 attempts x (1s timeout + 1s delay)), there is no networking between the app `Deployment` and the DB `Service`, and the check is the same `pg_isready` as in the `startupProbe` of the DB `Deployment`.
Thus, I think this issue needs to be reopened and investigated further; Kind is often used in integration tests, and when two services can't talk to each other, that's an issue :/
What I also found is that using `Service.spec.clusterIP: None` (headless mode, where the Service DNS entry points directly to the selected Pods) results in longer networking unavailability than `Service.spec.type: ClusterIP` (where k8s actually routes traffic through a stable IP to the Pods) with Kind and Docker on macOS.
So for now I'm switching the `Service`s to `type: ClusterIP`, but it's not a fix, just a workaround, and not a stable one either, because I still see unavailability and I'd actually like to use `clusterIP: None`:
("better" than `clusterIP: None` but still bad)
```
10:53:35.583 [ne#52] INFO - Init container logs
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - no response
postgresql.postgresql.svc.cluster.local:5432 - accepting connections
```
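For reference, a sketch of the `type: ClusterIP` variant of the same Service used as the workaround (only the headless setting changes relative to the manifest above):

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: postgresql
  namespace: postgresql
  labels:
    app: postgresql
spec:
  type: ClusterIP      # instead of clusterIP: None
  ports:
  - port: 5432
    name: postgresql
  selector:
    app: postgresql
EOF
```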
I no longer have docker desktop / kind on macOS (due to license change), so I can't reproduce to investigate. My co-maintainer @aojea never had access to this environment.
Based on prior comments:

> This issue surfaces both in KinD and also in Minikube with Docker driver. If I switch Minikube to hyperkit or virtualbox driver, the issue disappears. I'm also not seeing this issue on EKS or AKS, but GKE does exhibit the same issue.

It's not clear that this is even kind-specific, though.
I can reopen this, but I can't work on it.