nodelocaldns rewrite throws NXDOMAIN

Open jsalatiel opened this issue 2 years ago • 3 comments

I am trying to use the rewrite plugin to change the DNS response for pods inside the cluster. My setup consists of coredns + nodelocaldns.

The default nodelocaldns configmap installed by kubespray is as follows:

apiVersion: v1
data:
  Corefile: |
    k8s.cluster:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
        health 169.254.25.10:9254
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 172.16.0.162 172.16.0.163
        prometheus :9253
    }

What I am trying to do is make git.my.domain resolve to git.gogs.svc.k8s.cluster. Before making any changes to the nodelocaldns configmap, this is the response I get from any pod:

For git.my.domain:

# dig git.my.domain

; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> git.my.domain
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1739
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: b969d9a997ba7123 (echoed)
;; QUESTION SECTION:
;git.my.domain.		IN	A

;; ANSWER SECTION:
git.my.domain.	27	IN	CNAME	lbi.my.domain.
lbi.my.domain.	27	IN	A	10.199.0.203

;; Query time: 0 msec
;; SERVER: 169.254.25.10#53(169.254.25.10)
;; WHEN: Tue Jun 21 19:48:23 UTC 2022
;; MSG SIZE  rcvd: 131

For git.gogs.svc.k8s.cluster:

# dig  git.gogs.svc.k8s.cluster

; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> git.gogs.svc.k8s.cluster
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30461
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 5f00ba7144c631a3 (echoed)
;; QUESTION SECTION:
;git.gogs.svc.k8s.cluster.	IN	A

;; ANSWER SECTION:
git.gogs.svc.k8s.cluster. 2	IN	A	10.239.7.63

;; Query time: 0 msec
;; SERVER: 169.254.25.10#53(169.254.25.10)
;; WHEN: Tue Jun 21 19:49:40 UTC 2022
;; MSG SIZE  rcvd: 111

Now I add the line rewrite name git.my.domain git.gogs.svc.k8s.cluster to the .:53 block of the configmap:

apiVersion: v1
data:
  Corefile: |
    k8s.cluster:53 {
        errors
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
        health 169.254.25.10:9254
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        rewrite name git.my.domain git.gogs.svc.k8s.cluster  # <-- this line added
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 172.16.0.162 172.16.0.163
        prometheus :9253
    }

and restart the nodelocaldns pods. After that, all pods get NXDOMAIN for git.my.domain:

# dig  git.my.domain

; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> git.my.domain
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 12586
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;git.my.domain.		IN	A

;; AUTHORITY SECTION:
.			30	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2022062101 1800 900 604800 86400

;; Query time: 3 msec
;; SERVER: 169.254.25.10#53(169.254.25.10)
;; WHEN: Tue Jun 21 19:52:01 UTC 2022
;; MSG SIZE  rcvd: 119

Wasn't that supposed to work?

jsalatiel avatar Jun 21 '22 21:06 jsalatiel

The rewrite plugin doesn't feed the query back into CoreDNS. The query simply continues down the plugin chain in the same server block, with the query name altered. So it is the upstream servers of the .:53 block, 172.16.0.162 and 172.16.0.163, that try to resolve git.gogs.svc.k8s.cluster, and they return NXDOMAIN.

The following adds git.my.domain to the k8s.cluster server block, so the rewritten query is forwarded to 10.239.0.3.

apiVersion: v1
data:
  Corefile: |
    k8s.cluster:53 git.my.domain:53 {
        errors
        rewrite name git.my.domain git.gogs.svc.k8s.cluster
        cache {
            success 9984 30
            denial 9984 5
        }
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
        health 169.254.25.10:9254
    }
    in-addr.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    ip6.arpa:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 10.239.0.3 {
            force_tcp
        }
        prometheus :9253
    }
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.25.10
        forward . 172.16.0.162 172.16.0.163
        prometheus :9253
    }
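
As an alternative sketch that is not taken from the thread, the rewrite could live in its own server block, leaving the default k8s.cluster block untouched. This relies on CoreDNS routing each query to the server block with the most specific matching zone, and it reuses the addresses from the config above. Only the new block is shown; the other four blocks would stay as in the default configmap.

```text
git.my.domain:53 {
    errors
    # Rewrite the query name; the query then continues down this block's plugin chain.
    rewrite name git.my.domain git.gogs.svc.k8s.cluster
    cache 30
    reload
    loop
    bind 169.254.25.10
    # Forward to the cluster DNS service, which can resolve k8s.cluster names.
    forward . 10.239.0.3 {
        force_tcp
    }
    prometheus :9253
}
```

Either layout should give the same result for git.my.domain. The separate block mainly keeps the custom rule apart from the generated defaults.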

chrisohaver avatar Jun 21 '22 22:06 chrisohaver

Thank you very much. It worked! Do you know how many entries I can have on the same block? Is there any string limitation?

k8s.cluster:53 a:53 b:53 c:53 d:53 e:53 .... {
}

jsalatiel avatar Jun 22 '22 11:06 jsalatiel

Is there any string limitation?

No
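
For illustration, a server block that lists several zones, with one rewrite rule per name, might look like the sketch below. The names wiki.my.domain and wiki.docs.svc.k8s.cluster are hypothetical and only show the pattern; everything else is copied from the working config above.

```text
k8s.cluster:53 git.my.domain:53 wiki.my.domain:53 {
    errors
    rewrite name git.my.domain git.gogs.svc.k8s.cluster
    # Hypothetical second mapping, for illustration only.
    rewrite name wiki.my.domain wiki.docs.svc.k8s.cluster
    cache {
        success 9984 30
        denial 9984 5
    }
    reload
    loop
    bind 169.254.25.10
    forward . 10.239.0.3 {
        force_tcp
    }
    prometheus :9253
    health 169.254.25.10:9254
}
```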

chrisohaver avatar Jun 22 '22 12:06 chrisohaver

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot avatar Sep 20 '22 13:09 k8s-triage-robot

/close

dpasiukevich avatar Sep 20 '22 14:09 dpasiukevich