coreos-kubernetes
kube-dns service ip unreachable from another pod
I have hit a frustrating brick wall, and I'm hoping someone can help. I am using multi-node/generic to set up the worker.
CoreOS-1164.1.0, rktnetes (flannel and cni), rkt-1.14.0, Hyperkube-1.4.0_coreos.1, kube-dns-v19
Kube Cluster DNS Service IP == 10.69.11.9 (pod IP: 10.11.57.5)
Currently, I have reduced the rktnetes cluster to just 1 master and 1 worker to try to pin this down.
The problem is that none of my worker pods can connect to the DNS Service IP to do lookups. If I launch a busybox pod (pod IP:10.11.57.92 and /etc/resolv.conf == nameserver 10.69.11.9) and run:
$ nslookup kubernetes.default.svc.cluster.local
** It tries to contact 10.69.11.9 and just hangs
$ nslookup kubernetes.default.svc.cluster.local 10.69.11.9
** Explicitly specifying ip address just to be sure, still hangs
$ nslookup kubernetes.default.svc.cluster.local 10.11.57.5
** This works perfectly fine
Server: 10.11.57.5
Address 1: 10.11.57.5 kube-dns-v19-ath05
Name: kubernetes.default.svc.cluster.local
Address 1: 10.69.11.1 kubernetes.default.svc.cluster.local
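For anyone trying to reproduce this: the busybox pod above was just launched by hand, and something like the following should give an equivalent debug pod (a rough sketch; the deployment and pod names are arbitrary):
$ kubectl run busybox --image=busybox -- sleep 3600
# then exec into the resulting pod and repeat the lookups
$ kubectl exec -ti <busybox-pod-name> -- cat /etc/resolv.conf
$ kubectl exec -ti <busybox-pod-name> -- nslookup kubernetes.default.svc.cluster.local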
I checked the conntrack table, and see that the packets are in an Unreplied state:
ipv4 2 udp 17 29 src=10.11.57.92 dst=10.69.11.9 sport=43663 dport=53 [UNREPLIED] src=10.69.11.9 dst=10.11.57.92 sport=53 dport=43663 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=0 use=3
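(For reference, that entry comes straight out of the kernel's conntrack table, filtered on the DNS service IP; something like:)
$ sudo grep 10.69.11.9 /proc/net/nf_conntrack
# or, if conntrack-tools happens to be installed:
$ sudo conntrack -L -d 10.69.11.9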
Now if I'm logged directly into the worker node and do the nslookup to 10.69.11.9, it works perfectly fine.
The conntrack table shows:
ipv4 2 udp 17 27 src=10.42.11.111 dst=10.69.11.9 sport=37000 dport=53 src=10.11.57.5 dst=10.11.57.1 sport=53 dport=37000 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=0 use=2
I've looked at the netfilter nat table, and everything there appears to be OK.
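That check amounted to nothing fancier than dumping the nat table and eyeballing the chains kube-proxy programs, roughly:
$ sudo iptables -t nat -L -n -v
# the KUBE-SERVICES chain is where the 10.69.11.0/24 service VIPs should be matched
$ sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.69.11.9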
Has anyone run into this problem before with kube-dns, or with any pod being unable to communicate with the service-cluster-ip-range (10.69.11.0/24)? I spent pretty much the whole weekend trying to find the underlying problem and have come up empty, which leaves me stuck at a formidable wall.
Can you try using --masquerade-all for kube-proxy?
On Oct 4, 2016 06:36, "jelis" [email protected] wrote:
I think I am experiencing the same issue right now.
@mischief just in the cloud-config-worker?
@mischief I added the option to the proxy pod on the workers:
"Entrypoint": [
"/hyperkube",
"proxy",
"--master=https://10.0.0.50:443",
"--kubeconfig=/etc/kubernetes/worker-kubeconfig.yaml",
"--masquerade-all"
],
Then I try to run this against a container in my pod, with no luck:
core@ip-10-0-1-238 ~ $ docker exec -it 302b27442abc curl https://www.google.com
curl: (6) Could not resolve host: www.google.com
but it's not a problem on the worker host to do
curl https://www.google.com
on my container:
core@ip-10-0-1-238 ~ $ docker exec -it 302b27442abc /bin/bash
root@camserver-818956273-x20m2:/# cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ec2.internal
nameserver 10.3.0.10
options ndots:5
@mischief currently, I am ripping out the Calico bits and testing with just cni + flannel. Looking at @jelis cluster.yaml.txt, I see that 'useCalico' is set to true (same for me). Since @jelis and I are running different kubelet runtimes (docker vs rktnetes), that leads me to believe the problem manifests solely in the network layer.
lol im switching to rancher
@TerraTech I just tried to reproduce this using the multi-node vagrant scripts (they use the generic scripts internally) running CoreOS 1164.1.0, with USE_CALICO set to both true and false, and I haven't been able to reproduce the issue so far, but I still need to test with rkt as the runtime. I've been using the master branch of coreos-kubernetes.
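For reference, my repro attempt was just the stock vagrant flow with the Calico toggle flipped, roughly as follows (assuming USE_CALICO is picked up from the environment by the install scripts in your checkout; adjust to however your setup wires it through):
$ git clone https://github.com/coreos/coreos-kubernetes
$ cd coreos-kubernetes/multi-node/vagrant
$ export USE_CALICO=true    # and again with false
$ vagrant up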
I can commiserate that debugging k8s network problems can be difficult. I'll update this if I have better luck reproducing this. Are you setting this up on some existing VMs or baremetal?
Ripping out Calico did not fix the connectivity problem. However, it did remove an unneeded layer (at this point) and made the iptables output easier to grok and follow the flow.
@pbx0 I'm setting this up on a baremetal CoreOS lab cluster: 1 master, 1 etcd, 3 workers (2 disabled). ** I try to avoid VMs so as to reduce any layering complexity, and I find that working as close to baremetal as possible is best, especially with networking problems.
@mischief by adding --masquerade-all, the pods can now talk to the service-cluster-ip-range (10.69.11.0/24):
/ # nslookup kubernetes.default.svc.cluster.local
Server: 10.69.11.9
Address 1: 10.69.11.9 kube-dns.kube-system.svc.cluster.local
Name: kubernetes.default.svc.cluster.local
Address 1: 10.69.11.1 kubernetes.default.svc.cluster.local
/ #
/ # wget -qO- http://10.69.11.198
hostnames-3799501552-8oeb0
The wget is calling upon the debugging service hostnames as described here: http://kubernetes.io/docs/user-guide/debugging-services/#setup
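That setup is basically what the linked doc walks through, roughly (the image and ports are from the doc; the service IP is whatever the cluster hands out, 10.69.11.198 in my case):
$ kubectl run hostnames --image=gcr.io/google_containers/serve_hostname --labels=app=hostnames --port=9376 --replicas=3
$ kubectl expose deployment hostnames --port=80 --target-port=9376
# then, from a pod: wget -qO- http://<hostnames-service-ip>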
@mischief is passing --masquerade-all now a requirement, or is this just to aid further debugging?
I can also submit netfilter TRACE output if that would help. When I traced it last, the packet went through nat and was masqueraded, but there was nothing for the return trip, e.g. nslookup (10.11.57.92) ==> 10.69.11.9 (masq: 10.11.57.5), and no packets for 10.11.57.5 =XXX=> nslookup. I tried with both kube-dns and the debugging hostnames pod to test both UDP and TCP traffic. In nf_conntrack, both UDP and TCP were in an UNREPLIED state.
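In case it's useful to anyone else, the TRACE rules I used were along these lines (a sketch; the IP and port are this cluster's DNS service, and depending on the kernel the nf_log backend may need to be loaded for the output to show up):
$ sudo iptables -t raw -A PREROUTING -p udp --dport 53 -d 10.69.11.9 -j TRACE
$ sudo iptables -t raw -A OUTPUT -p udp --dport 53 -d 10.69.11.9 -j TRACE
# matches are logged to the kernel log as the packet walks each table/chain
$ journalctl -k -f | grep TRACE
# clean up afterwards by repeating the same rules with -D instead of -A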
@TerraTech I suggested it because I had an iptables problem where packets had mismatched source IPs and interfaces. It was simply a guess at the time because I was running out of ideas, but using --masquerade-all fixed it for me.
@TerraTech, @dghubble is our resident baremetal expert as he runs the https://github.com/coreos/coreos-baremetal repo. @dghubble do you happen to recognize this issue?
@mischief I would surmise the question now becomes why adding that option makes this work. I'm hoping someone who knows kube-proxy better than I do can chime in. Maybe it will shine a light on the underlying requirement, or possibly a bug.
@TerraTech unfortunately I'm only a beginner in this area. I followed @thockin's suggestion at http://stackoverflow.com/a/34008477 to log iptables packets, and that's how I realized there was a SRC/interface mismatch. Maybe it can lead to some more info for you...
@TerraTech my problem went away. i tried upgrading some of the pieces:
$ kube-aws version
kube-aws version v0.8.2
kubernetesVersion: v1.4.0_coreos.1
$ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=1122.2.0
VERSION_ID=1122.2.0
BUILD_ID=2016-09-06-1449
PRETTY_NAME="CoreOS 1122.2.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
I had the exact same issue when moving to the kubelet-wrapper (most of which was copied from this repo) and kubedns ..
core@ip-10-50-1-162 ~ $ cat /etc/os-release
NAME=CoreOS
ID=coreos
VERSION=1192.2.0
VERSION_ID=1192.2.0
BUILD_ID=2016-10-21-0026
PRETTY_NAME="CoreOS 1192.2.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
core@ip-10-50-1-162 ~ $ docker images | grep hyperkube
quay.io/coreos/hyperkube v1.4.3_coreos.0 5d3bc50f8157 13 days ago 654.7 MB
The output from the iptables nat table looked fine, but if the pod doing the DNS lookup was on a node where a DNS endpoint existed (i.e. traffic going out and back in on the same box), DNS would fail. Cross-node traffic was OK though, so DNS was sporadic. I shifted to k8s 1.4.3, but it didn't fix the issue. Eventually I pinned it down to the br_netfilter module not being loaded:
core@ip-10-50-1-162 ~ $ docker exec -ti 8330246fbbbe bash
root@nginx-3137573019-ndbbu:/# host www.google.com
;; reply from unexpected source: 10.10.67.11#53, expected 10.200.0.10#53
^Croot@nginx-3137573019-ndbbu:/# host www.google.com^C
root@nginx-3137573019-ndbbu:/# exit
core@ip-10-50-1-162 ~ $ lsmod | grep br_ne
core@ip-10-50-1-162 ~ $ sudo modprobe br_netfilter
core@ip-10-50-1-162 ~ $ lsmod | grep br_ne
br_netfilter 24576 0
bridge 110592 1 br_netfilter
# suddenly starts to work again ...
core@ip-10-50-1-162 ~ $ docker exec -ti 8330246fbbbe bash
root@nginx-3137573019-ndbbu:/# host www.google.com
www.google.com has address 209.85.202.147
www.google.com has address 209.85.202.99
www.google.com has address 209.85.202.103
www.google.com has address 209.85.202.104
www.google.com has address 209.85.202.105
www.google.com has address 209.85.202.106
www.google.com has IPv6 address 2a00:1450:400b:c03::67
root@nginx-3137573019-ndbbu:/# host www.google.com^C
root@nginx-3137573019-ndbbu:/# exit
# remove the module to break it
core@ip-10-50-1-162 ~ $ sudo rmmod br_netfilter
core@ip-10-50-1-162 ~ $ docker exec -ti 8330246fbbbe bash
root@nginx-3137573019-ndbbu:/# host www.google.com
;; reply from unexpected source: 10.10.67.11#53, expected 10.200.0.10#53
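If br_netfilter turns out to be the missing piece for others as well, here is a quick sketch of making it stick across reboots on CoreOS (the file names below are just examples):
$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
# and make sure bridged traffic is actually handed to iptables
$ echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/99-bridge-nf.conf
$ sudo sysctl --system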