
CoreDNS NodeHosts partially breaks resolution of node addresses in a dual-stack configuration.

Open · dmaes opened this issue 1 year ago · 2 comments

Environmental Info: K3s Version:

# k3s -v
k3s version v1.29.1+k3s2 (57482a1c)
go version go1.21.6

Node(s) CPU architecture, OS, and Version:

# uname -a
Linux k3sserver-01-srv.<domain> 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux

Cluster Configuration:

3 servers, 1 agent in a dualstack (IPv4 + IPv6) configuration.

Describe the bug:

Only the first InternalIP of a node is added to the CoreDNS NodeHosts configmap; in my case, this is the IPv4 address. This breaks DNS resolution of the k3s nodes' AAAA records.

Steps To Reproduce:

The issue can be reproduced with a simple single node with dual-stack networking. Use either the args from the official docs (https://docs.k3s.io/installation/network-options#dual-stack-ipv4--ipv6-networking) or my config (see below).

Expected behavior:

AAAA resolution of the k3s nodes should work inside the cluster. Either all InternalIPs should be added to NodeHosts, or upstream DNS resolvers should be consulted when an AAAA record is not found in NodeHosts, or there should be an option to disable NodeHosts altogether and fully rely on upstream DNS resolvers.

Actual behavior:

AAAA records for the k3s nodes are not resolved inside the cluster.

Additional context / logs:

K3s server config:

---
token: "<token>"
server: "https://api.k3s.<domain>:6443"
kubelet-arg: "node-ip=::"


tls-san: "api.k3s.<domain>"
datastore-endpoint: "https://localhost:2379"
datastore-certfile: /etc/rancher/k3s/etcd.crt
datastore-keyfile: /etc/rancher/k3s/etcd.key
kube-apiserver-arg:
  - "oidc-issuer-url=https://auth.<domain>/realms/dmaes"
  - "oidc-client-id=k3s-srv"
  - "oidc-username-claim=username"
  - "oidc-username-prefix=oidc:"
  - "oidc-groups-claim=groups"
  - "oidc-groups-prefix=oidc:"

flannel-backend: "wireguard-native"
flannel-ipv6-masq: "true"
cluster-cidr:
  - "192.168.32.0/22"
  - "2a02:xxxx:yyyy:f20::/62"
service-cidr:
  - "192.168.36.0/22"
  - "2a02:xxxx:yyyy:f24::/108"

disable:
  - traefik

write-kubeconfig: "/etc/rancher/k3s/k3s.yaml"
write-kubeconfig-mode: "0600"

DNS lookups from inside a pod:

# dig -t AAAA k3sserver-01-srv.<domain>; dig -t A k3sserver-01-srv.<domain>; dig -t AAAA k3sserver-01-srv.<domain> @192.168.30.10; dig -t A k3sserver-01-srv.<domain> @192.168.30.10

; <<>> DiG 9.18.24-1-Debian <<>> -t AAAA k3sserver-01-srv.<domain>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10871
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 00b973aa09a4c24c (echoed)
;; QUESTION SECTION:
;k3sserver-01-srv.<domain>.	IN	AAAA

;; Query time: 0 msec
;; SERVER: 192.168.36.10#53(192.168.36.10) (UDP)
;; WHEN: Thu Feb 22 11:29:13 UTC 2024
;; MSG SIZE  rcvd: 66


; <<>> DiG 9.18.24-1-Debian <<>> -t A k3sserver-01-srv.<domain>
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13219
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 47d21deaaeca77d3 (echoed)
;; QUESTION SECTION:
;k3sserver-01-srv.<domain>.	IN	A

;; ANSWER SECTION:
k3sserver-01-srv.<domain>. 30	IN	A	192.168.30.74

;; Query time: 0 msec
;; SERVER: 192.168.36.10#53(192.168.36.10) (UDP)
;; WHEN: Thu Feb 22 11:29:14 UTC 2024
;; MSG SIZE  rcvd: 107


; <<>> DiG 9.18.24-1-Debian <<>> -t AAAA k3sserver-01-srv.<domain> @192.168.30.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2890
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;k3sserver-01-srv.<domain>.	IN	AAAA

;; ANSWER SECTION:
k3sserver-01-srv.<domain>. 31	IN	AAAA	2a02:xxxx:yyyy:f1e::74

;; Query time: 3 msec
;; SERVER: 192.168.30.10#53(192.168.30.10) (UDP)
;; WHEN: Thu Feb 22 11:29:14 UTC 2024
;; MSG SIZE  rcvd: 82


; <<>> DiG 9.18.24-1-Debian <<>> -t A k3sserver-01-srv.<domain> @192.168.30.10
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42660
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;k3sserver-01-srv.<domain>.	IN	A

;; ANSWER SECTION:
k3sserver-01-srv.<domain>. 31	IN	A	192.168.30.74

;; Query time: 0 msec
;; SERVER: 192.168.30.10#53(192.168.30.10) (UDP)
;; WHEN: Thu Feb 22 11:29:14 UTC 2024
;; MSG SIZE  rcvd: 70

CoreDNS config (kubectl -n kube-system get cm coredns):

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import /etc/coredns/custom/*.override
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    192.168.30.74 k3sserver-01-srv.<domain>
    192.168.30.94 k3sserver-02-srv.<domain>
    192.168.30.72 k3sagent-01-srv.<domain>
    192.168.30.95 k3sserver-03-srv.<domain>
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      ...
    objectset.rio.cattle.io/applied: ...
    objectset.rio.cattle.io/id: ""
    objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
    objectset.rio.cattle.io/owner-name: coredns
    objectset.rio.cattle.io/owner-namespace: kube-system
  creationTimestamp: "2023-09-27T16:36:57Z"
  labels:
    objectset.rio.cattle.io/hash: bce283298811743a0386ab510f2f67ef74240c57
  name: coredns
  namespace: kube-system
  resourceVersion: "54819030"
  uid: 4851c84a-822c-4dde-a468-64e2b214d7df

The function that updates the NodeHosts config with the node's IP address: https://github.com/k3s-io/k3s/blob/fae0d998631ec7e001934251900b8b25c5d3cb4d/pkg/node/controller.go#L74

The code that lists all node addresses and passes only the first InternalIP hit to said function: https://github.com/k3s-io/k3s/blob/fae0d998631ec7e001934251900b8b25c5d3cb4d/pkg/node/controller.go#L54-L70
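
For illustration, a minimal sketch of what "adding all InternalIPs" could look like (hypothetical helper functions assuming only the k8s.io/api core/v1 types; this is not the actual k3s controller code):

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// internalIPs returns every InternalIP reported in the node status, rather
// than only the first match, so a dual-stack node yields both its IPv4 and
// its IPv6 address.
func internalIPs(node *corev1.Node) []string {
	var ips []string
	for _, addr := range node.Status.Addresses {
		if addr.Type == corev1.NodeInternalIP {
			ips = append(ips, addr.Address)
		}
	}
	return ips
}

// nodeHostsLines formats one hosts-file line per address, i.e. what the
// NodeHosts entry of the coredns configmap would contain for this node.
func nodeHostsLines(node *corev1.Node) []string {
	var lines []string
	for _, ip := range internalIPs(node) {
		lines = append(lines, fmt.Sprintf("%s %s", ip, node.Name))
	}
	return lines
}

func main() {
	// Example dual-stack node; the addresses are placeholders.
	node := &corev1.Node{}
	node.Name = "k3sserver-01-srv"
	node.Status.Addresses = []corev1.NodeAddress{
		{Type: corev1.NodeInternalIP, Address: "192.168.30.74"},
		{Type: corev1.NodeInternalIP, Address: "2001:db8::74"},
		{Type: corev1.NodeHostName, Address: "k3sserver-01-srv"},
	}
	for _, line := range nodeHostsLines(node) {
		fmt.Println(line) // one line per address family
	}
}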

— dmaes, Feb 22 '24

> or upstream DNS resolvers should be consulted when an AAAA record is not found in NodeHosts

This seems like something that would need to be fixed in CoreDNS. If the hosts file only contains IPv4 entries for a host but the lookup is for an IPv6 record, I don't know why it wouldn't recurse as if the entries did not exist in the hosts file.

We can look at adding all InternalIP entries to the hosts file, but the root cause here seems like a CoreDNS problem.

— brandond, Feb 22 '24

On further investigation, not falling through for IPv6 when only an IPv4 record is defined is also how /etc/hosts behaves, so I don't think CoreDNS should deviate from that. Either adding all InternalIPs to NodeHosts or having an option to disable NodeHosts would be the best solution (but I'm still willing to open an issue for CoreDNS if the k3s maintainers think it should fall through).

Test on local normal linux machine:

> ~ $ cat /etc/hosts | grep k3sserver
> ~ $ ping -4 -c 1 k3sserver-01-srv.<domain>
PING k3sserver-01-srv.<domain> (192.168.30.74) 56(84) bytes of data.
64 bytes from k3sserver-01-srv.<domain> (192.168.30.74): icmp_seq=1 ttl=63 time=0.358 ms

--- k3sserver-01-srv.<domain> ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.358/0.358/0.358/0.000 ms
> ~ $ ping -6 -c 1 k3sserver-01-srv.<domain>
PING k3sserver-01-srv.<domain> (2a02:xxxx:yyyy:f1e::74) 56 data bytes
64 bytes from k3sserver-01-srv.<domain> (2a02:xxxx:yyyy:f1e::74): icmp_seq=1 ttl=63 time=0.920 ms

--- k3sserver-01-srv.<domain> ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.920/0.920/0.920/0.000 ms


> ~ $ echo "$(dig -t A +short k3sserver-01-srv.<domain>) k3sserver-01-srv.<domain>" | sudo tee -a /etc/hosts
192.168.30.74 k3sserver-01-srv.<domain>
> ~ $ cat /etc/hosts | grep k3sserver
192.168.30.74 k3sserver-01-srv.<domain>
> ~ $ ping -4 -c 1 k3sserver-01-srv.<domain>
PING k3sserver-01-srv.<domain> (192.168.30.74) 56(84) bytes of data.
64 bytes from k3sserver-01-srv.<domain> (192.168.30.74): icmp_seq=1 ttl=63 time=0.476 ms

--- k3sserver-01-srv.<domain> ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.476/0.476/0.476/0.000 ms
> ~ $ ping -6 -c 1 k3sserver-01-srv.<domain>
ping: k3sserver-01-srv.<domain>: Address family for hostname not supported
> ~ $ curl -6 -I -k https://k3sserver-01-srv.<domain>:6443
curl: (6) Could not resolve host: k3sserver-01-srv.<domain>
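
The same check can be reproduced programmatically; a minimal sketch using only the Go standard library resolver (hypothetical test program, the hostname is a placeholder):

package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	// Replace <domain> with the real domain. Look up A (ip4) and AAAA (ip6)
	// records through the system resolver; with an IPv4-only /etc/hosts entry
	// the ip6 lookup is expected to fail rather than fall back to DNS,
	// matching the ping/curl behaviour above.
	host := "k3sserver-01-srv.<domain>"
	for _, network := range []string{"ip4", "ip6"} {
		ips, err := net.DefaultResolver.LookupIP(context.Background(), network, host)
		fmt.Printf("%s: ips=%v err=%v\n", network, ips, err)
	}
}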

— dmaes, Feb 26 '24

> not falling through for IPv6 when only an IPv4 record is defined is also how /etc/hosts behaves

That is interesting, thanks for the data point! I think we can address this in the March release cycle.

— brandond, Feb 26 '24

## Environment Details

Reproduced using VERSION=v1.29.2+k3s1
Validated using VERSION=v1.29.3-rc1+k3s1

Infrastructure

  - [x] Cloud

Node(s) CPU architecture, OS, and version:

Linux 5.11.0-1022-aws x86_64 GNU/Linux PRETTY_NAME="Ubuntu 20.04.3 LTS"

Cluster Configuration:

NAME               STATUS   ROLES                       AGE   VERSION
ip-1-1-1-29        Ready    control-plane,etcd,master   12m   v1.29.3-rc1+k3s1

Config.yaml:

$ get_figs

=========== k3s config ===========
node-external-ip: 1.2.3.3,cafd:1ced:cabe:ee48:8488:7755:3379:91cf
token: YOUR_TOKEN_HERE
write-kubeconfig-mode: 644
debug: true
protect-kernel-defaults: true
cluster-init: true
node-ip: 1.1.1.29,cafd:1ced:cabe:ee48:8488:7755:3379:91cf
cluster-cidr: 10.42.0.0/16,2001:cafe:42:0::/56
service-cidr: 10.43.0.0/16,2001:cafe:42:1::/112

Reproduction

$ curl https://get.k3s.io --output install-"k3s".sh
$ sudo chmod +x install-"k3s".sh
$ sudo groupadd --system etcd && sudo useradd -s /sbin/nologin --system -g etcd etcd
$ sudo modprobe ip_vs_rr
$ sudo modprobe ip_vs_wrr
$ sudo modprobe ip_vs_sh
$ sudo printf "vm.panic_on_oom=0 \nvm.overcommit_memory=1 \nkernel.panic=10 \nkernel.panic_on_oops=1 \n" > ~/90-kubelet.conf
$ sudo cp 90-kubelet.conf /etc/sysctl.d/
$ sudo systemctl restart systemd-sysctl
$ VERSION=v1.29.2+k3s1
$ sudo INSTALL_K3S_VERSION=$VERSION INSTALL_K3S_EXEC=server ./install-k3s.sh
$ kgn                # kubectl get nodes, checking status
$ set_kubefig        # export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
$ kg no,po,svc -o wide -A
$ kg cm coredns -n kube-system -o yaml
$ void               # sudo k3s-killall.sh && sudo k3s-uninstall.sh, vacuum logs etc.
$ go_replay          # recreate /etc/rancher/k3s/ directory, re-copy previously good config.yaml
$ get_figs           # view config.yaml in /etc/rancher/k3s/
$ VERSION=v1.29.3-rc1+k3s1
$ sudo INSTALL_K3S_VERSION=$VERSION INSTALL_K3S_EXEC=server ./install-k3s.sh
$ kg no,po,svc -A -o wide
$ kg cm coredns -n kube-system -o yaml

Results:

Previous release behavior on a dual-stack node: the IPv6 address from the host does not get written into the CoreDNS configmap.

$ kg cm coredns -n kube-system -o yaml

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import /etc/coredns/custom/*.override
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    1.1.1.29 ip-1-1-1-29
kind: ConfigMap
metadata:
  annotations:
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/4yQwWrzMBCEX0Xs2fEf20nsX9BDybH02lMva2kdq1Z2g6SkBJN3L8IUCiVtbyNGOzvfzoAn90IhOmHQcKmgAIsJQc+wl0CD8wQaSr1t1PzKSilFIUiIix4JfRoXHQjtdZHTuafAlCgq488xUSi9wK2AybEFDXvhwR2e8QQFHCnh50ZkloTJCcf8lP6NTIqUyuCkNJiSp9LJP5czoLjryztTWB0uE2iYmvjFuVSFenJsHx6tFf41gvGY6Y0Eshz/9D2e0OSZfIJVvMZExwzusSf/I9SIcQQNvaG6a+r/XVdV7abBddPtsN9W66Eedi0N7aberM22zaHf6t0tcPsIAAD//8Ix+PfoAQAA
    objectset.rio.cattle.io/id: ""
    objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
    objectset.rio.cattle.io/owner-name: coredns
    objectset.rio.cattle.io/owner-namespace: kube-system
  creationTimestamp: "2024-03-22T16:51:51Z"
  labels:
    objectset.rio.cattle.io/hash: bce283298811743a0386ab510f2f67ef74240c57
  name: coredns
  namespace: kube-system
  resourceVersion: "320"
  uid: 79eb205f-b320-460c-8f50-59d6b59d12b8

New behavior: the CoreDNS configmap gets the node's full dual-stack IPv4/IPv6 addresses written in.

$ kg cm coredns -n kube-system -o yaml

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        hosts /etc/coredns/NodeHosts {
          ttl 60
          reload 15s
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
        import /etc/coredns/custom/*.override
    }
    import /etc/coredns/custom/*.server
  NodeHosts: |
    1.1.1.29 ip-1-1-1-29
    cafd:1ced:cabe:ee48:8488:7755:3379:91cf ip-1-1-1-29
kind: ConfigMap
metadata:
  annotations:
    objectset.rio.cattle.io/applied: H4sIAAAAAAAA/4yQwWrzMBCEX0Xs2fEf20nsX9BDybH02lMva2kdq1Z2g6SkBJN3L8IUCiVtbyNGOzvfzoAn90IhOmHQcKmgAIsJQc+wl0CD8wQaSr1t1PzKSilFIUiIix4JfRoXHQjtdZHTuafAlCgq488xUSi9wK2AybEFDXvhwR2e8QQFHCnh50ZkloTJCcf8lP6NTIqUyuCkNJiSp9LJP5czoLjryztTWB0uE2iYmvjFuVSFenJsHx6tFf41gvGY6Y0Eshz/9D2e0OSZfIJVvMZExwzusSf/I9SIcQQNvaG6a+r/XVdV7abBddPtsN9W66Eedi0N7aberM22zaHf6t0tcPsIAAD//8Ix+PfoAQAA
    objectset.rio.cattle.io/id: ""
    objectset.rio.cattle.io/owner-gvk: k3s.cattle.io/v1, Kind=Addon
    objectset.rio.cattle.io/owner-name: coredns
    objectset.rio.cattle.io/owner-namespace: kube-system
  creationTimestamp: "2024-03-22T17:17:02Z"
  labels:
    objectset.rio.cattle.io/hash: bce283298811743a0386ab510f2f67ef74240c57
  name: coredns
  namespace: kube-system
  resourceVersion: "409"
  uid: 15b87099-e97d-4580-b426-5b85e4d4db76

— VestigeJ, Mar 22 '24