CoreDNS resolution failure for oauth2.googleapis.com with "overflow unpacking uint32"

Open adorn opened this issue 1 year ago • 12 comments

I'm not able to resolve oauth2.googleapis.com in my pods / docker-desktop kubernetes cluster

kubectl logs coredns-5dd5756b68-5cstw --namespace=kube-system
CoreDNS-1.10.1
linux/arm64, go1.20, 055b2c3
[ERROR] plugin/errors: 2 oauth2.googleapis.com. A: dns: overflow unpacking uint32

To get it working, I patched it back to 1.10.0: kubectl patch deployment coredns -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"coredns", "image":"coredns/coredns:1.10.0"}]}}}}'
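
For switching between image tags repeatedly, the patch payload above can be generated instead of hand-edited. This is a minimal sketch; the helper name `coredns_image_patch` is hypothetical, and it only reproduces the JSON structure used in the command above.

```python
import json


def coredns_image_patch(image: str, container: str = "coredns") -> str:
    """Build the JSON patch body used with `kubectl patch deployment`.

    The structure mirrors the payload in the command above: it sets the
    image of the named container in the deployment's pod template.
    """
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": container, "image": image}]
                }
            }
        }
    }
    return json.dumps(patch)


if __name__ == "__main__":
    # Print the payload for the 1.10.0 downgrade described above.
    print(coredns_image_patch("coredns/coredns:1.10.0"))
```

The printed string can be passed as the value of `-p` in `kubectl patch deployment coredns -n kube-system -p '...'`.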

You may close this issue and reopen this one: (https://github.com/coredns/coredns/issues/3305)

adorn avatar Dec 11 '23 09:12 adorn

Can you provide a packet capture of the DNS response from the upstream DNS server?

chrisohaver avatar Dec 11 '23 13:12 chrisohaver

If possible, can you build and test using the latest commit in the master branch? There have been some workarounds recently committed related to overflowed packets received from upstream servers.

chrisohaver avatar Dec 11 '23 14:12 chrisohaver

I went over the changes from 1.10.0 to 1.10.1, and I don't see anything obviously related. And the version of the dns library (miekg/dns) was the same for both these versions (github.com/miekg/dns v1.1.50)

Is this error consistently reproducible in 1.10.1, and consistently not present in 1.10.0?

chrisohaver avatar Dec 11 '23 14:12 chrisohaver

@chrisohaver in this comment I've added a hexdump of the DNS responses (one in front of docker-desktop, one within the docker-desktop Kubernetes cluster targeting CoreDNS). In this case I think it's the fault of docker-desktop not supporting compression and exceeding the maximum UDP datagram size of 512 bytes.
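
To illustrate why missing name compression matters for the 512-byte limit (which applies to UDP responses when no larger EDNS0 buffer size is negotiated), here is a rough size estimate in Python. It is a sketch under stated assumptions, not a reconstruction of the hexdump: it assumes a response containing only A records for the queried name, the helper names are hypothetical, and the field sizes follow the standard DNS wire format.

```python
HEADER_LEN = 12        # fixed DNS header
FIXED_RR_FIELDS = 10   # TYPE(2) + CLASS(2) + TTL(4) + RDLENGTH(2)
A_RDATA_LEN = 4        # one IPv4 address
POINTER_LEN = 2        # compression pointer back to the question name
OPT_RR_LEN = 11        # EDNS0 OPT record carrying no options
UDP_LIMIT = 512        # classic limit without EDNS0


def wire_name_len(name: str) -> int:
    """Length of a domain name on the wire: one length byte per label,
    the label bytes, and a terminating zero byte."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def a_response_size(name: str, answers: int, compressed: bool,
                    with_opt: bool = False) -> int:
    """Estimated size of a response holding `answers` A records for `name`."""
    qname = wire_name_len(name)
    question = qname + 4  # QTYPE(2) + QCLASS(2)
    owner = POINTER_LEN if compressed else qname
    record = owner + FIXED_RR_FIELDS + A_RDATA_LEN
    size = HEADER_LEN + question + answers * record
    if with_opt:
        size += OPT_RR_LEN
    return size


if __name__ == "__main__":
    name = "oauth2.googleapis.com."
    for count in (8, 12, 13, 16):
        plain = a_response_size(name, count, compressed=False)
        packed = a_response_size(name, count, compressed=True)
        print(count, plain, packed, plain > UDP_LIMIT)
```

Under these assumptions each uncompressed A record for this name costs 37 bytes versus 16 bytes with a compression pointer, so an uncompressed response passes 512 bytes at 13 records, while a compressed one stays far below the limit.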

xvzf avatar Dec 13 '23 15:12 xvzf

The error is consistently reproducible in 1.10.1, and consistently not present in 1.10.0! Today I patched between these two versions several times, and it always works in 1.10.0 and not in 1.10.1. It's also not working in Version 1.11.1

adorn avatar Dec 13 '23 19:12 adorn

It's also not working in Version 1.11.1

Thanks, @adorn. If possible, can you build and test using the latest commit in the master branch? There have been some workarounds recently committed related to overflowed packets received from upstream servers.

chrisohaver avatar Dec 13 '23 19:12 chrisohaver

I tested this using a build of the current master branch in a kind cluster running on Docker Desktop for Mac, and it worked. However, I also tested 1.10.1 in the same way and was unable to replicate the error. I was able to query for oauth2.googleapis.com without error in both cases.

cohaver coredns % kubectl -n kube-system logs coredns-fbf49465b-n4v48

.:53
[INFO] plugin/reload: Running configuration SHA512 = 591cf328cccc12bc490481273e738df59329c62c0b729d94e8b61db9961c2fa5f046dd37f1cf888b953814040d180f52594972691cd6ff41be96639138a43908
CoreDNS-1.10.1
linux/amd64, go1.21.1, 055b2c31a

cohaver coredns % kubectl exec -it dnsutils -- bash           
  
root@dnsutils:/# dig oauth2.googleapis.com        

; <<>> DiG 9.9.5-9+deb8u19-Debian <<>> oauth2.googleapis.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17674
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;oauth2.googleapis.com.         IN      A

;; ANSWER SECTION:
oauth2.googleapis.com.  30      IN      A       142.251.111.95
oauth2.googleapis.com.  30      IN      A       172.253.122.95
oauth2.googleapis.com.  30      IN      A       172.253.63.95
oauth2.googleapis.com.  30      IN      A       142.251.163.95
oauth2.googleapis.com.  30      IN      A       142.251.167.95
oauth2.googleapis.com.  30      IN      A       172.253.115.95
oauth2.googleapis.com.  30      IN      A       172.253.62.95
oauth2.googleapis.com.  30      IN      A       142.251.16.95

;; Query time: 154 msec
;; SERVER: 10.96.0.10#53(10.96.0.10)
;; WHEN: Wed Dec 13 21:36:56 UTC 2023
;; MSG SIZE  rcvd: 346

root@dnsutils:/# 

Anyways, I suspect this issue is related to https://github.com/coredns/coredns/issues/5998 (in that issue there is some explanation as to why this occurs). There is a workaround already merged for it, so it will be included in the next CoreDNS release.

chrisohaver avatar Dec 13 '23 21:12 chrisohaver

I did the following:

brew install go
git clone https://github.com/coredns/coredns 
cd coredns
make
docker build -t coredns/coredns:latest .
kubectl patch deployment coredns -n kube-system -p '{"spec":{"template":{"spec":{"containers":[{"name":"coredns", "image":"coredns/coredns:latest"}]}}}}'

kubectl get pods --namespace=kube-system returned:

NAME                                     READY   STATUS             RESTARTS      AGE
coredns-757d49bccd-j8x6n                 0/1     CrashLoopBackOff   6 (16s ago)   6m17s
coredns-757d49bccd-mb8pf                 0/1     CrashLoopBackOff   6 (33s ago)   6m17s
coredns-85d98f4675-wqmbw                 1/1     Running            0             117m

kubectl logs coredns-757d49bccd-j8x6n --namespace=kube-system
exec /coredns: exec format error

Running coredns outside docker is fine: ./coredns -dns.port 5300

dig @127.0.0.1 -p 5300 oauth2.googleapis.com

; <<>> DiG 9.18.20 <<>> @127.0.0.1 -p 5300 oauth2.googleapis.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6365
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 3
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 6a3f8a8fd6eaba72 (echoed)
;; QUESTION SECTION:
;oauth2.googleapis.com.		IN	A

;; ADDITIONAL SECTION:
oauth2.googleapis.com.	0	IN	A	127.0.0.1
_udp.oauth2.googleapis.com. 0	IN	SRV	0 0 57584 .

;; Query time: 0 msec
;; SERVER: 127.0.0.1#5300(127.0.0.1) (UDP)
;; WHEN: Wed Dec 13 23:03:10 CET 2023
;; MSG SIZE  rcvd: 144

logs:

.:5300
CoreDNS-1.11.1
darwin/arm64, go1.21.5, d3e58b3f
[INFO] 127.0.0.1:58215 - 8275 "A IN oauth2.googleapis.com. udp 62 false 1232" NOERROR qr,aa,rd 121 0.00318075s

adorn avatar Dec 13 '23 22:12 adorn

RE:

kubectl logs coredns-757d49bccd-j8x6n --namespace=kube-system
exec /coredns: exec format error

If your cluster is backed by x86-64 machines, and you're using an ARM Mac:

brew install go
...
docker build -t coredns/coredns:latest .

You'll need to build for the correct platform. Try:

docker build -t coredns/coredns:latest --platform=linux/amd64 .

(I think building off of latest upstream will resolve the issue you're facing here - upstream contains https://github.com/coredns/coredns/pull/6277, which I believe fixed this on our end.)
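
A related check that may be worth making: the CoreDNS startup banner earlier in this thread reports linux/arm64 for the cluster's pod, while the locally built binary reports darwin/arm64. If the image build copies that local binary into the image (an assumption about the build, not something confirmed in this thread), a Linux node could not execute it regardless of CPU architecture, which would also show up as an exec format error. Below is a minimal Python sketch for comparing banner platforms; the helper names are hypothetical.

```python
def parse_banner_platform(line: str) -> tuple:
    """Extract (os, arch) from a CoreDNS banner line such as
    'linux/arm64, go1.20, 055b2c3'."""
    platform = line.split(",")[0].strip()
    os_name, arch = platform.split("/")
    return os_name, arch


def can_exec(binary_banner: str, node_platform: str) -> bool:
    """True only if both OS and architecture of the binary match the node.

    `node_platform` uses the same 'os/arch' form as the --platform flag,
    for example 'linux/amd64'.
    """
    return parse_banner_platform(binary_banner) == tuple(node_platform.split("/"))


if __name__ == "__main__":
    # Banner lines copied from the logs in this thread.
    cluster_banner = "linux/arm64, go1.20, 055b2c3"
    local_banner = "darwin/arm64, go1.21.5, d3e58b3f"
    node = "/".join(parse_banner_platform(cluster_banner))
    print(node, can_exec(local_banner, node))
```

The sketch only compares strings; it does not inspect the binary itself, so it is a quick sanity check rather than a diagnosis.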

cjgibson avatar Dec 21 '23 05:12 cjgibson

A small piece of log from the linked issue (https://github.com/docker/for-win/issues/13808):

# ping oauth2.googleapis.com
ping: connect: Cannot assign requested address

# ping oauth2.googleapis.com
ping: oauth2.googleapis.com: Temporary failure in name resolution

# ping login.microsoft.com
ping: login.microsoft.com: Temporary failure in name resolution

# ping google.com
PING google.com (142.250.187.110) 56(84) bytes of data.

ArtemAvramenko avatar Feb 13 '24 13:02 ArtemAvramenko

There is a workaround for this issue already merged as of September 2023 (https://github.com/coredns/coredns/pull/6277).

It will be included in the next CoreDNS release (1.11.2).

chrisohaver avatar Feb 13 '24 14:02 chrisohaver

One confusing point - the release notes say that CoreDNS was updated to v1.10.1 in Docker 4.21, but for some reason the issue only appeared in 4.25.

ArtemAvramenko avatar Feb 14 '24 09:02 ArtemAvramenko