unbound
Dual IPv4/IPv6 resolving not working in container with Alpine image
Hi, I encountered a bug when querying the nameserver directly (not through systemd-resolved) from various Ubuntu and Debian versions (Ubuntu 18.04 - 22.04, Debian 10 and 11).
When the client tries to resolve both A and AAAA at once (the two queries are sent in parallel from the same socket), only one of the two answers comes back from Unbound. The client then waits for a 5-second timeout before it retries the queries one at a time, and those do get resolved.
The resolv.conf on the client (10.0.23.99) contains only:
nameserver 10.0.1.100
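To take ping and glibc out of the picture, the parallel lookup pattern described above can be approximated with a short Python script. This is only a sketch under assumptions: it is not glibc's resolver code, the query IDs are arbitrary, and the helper names `build_query` and `parallel_lookup` are made up for illustration. It sends an A and an AAAA query from one UDP socket and records how long each answer takes to arrive.

```python
import select
import socket
import struct
import time

QTYPE_A = 1
QTYPE_AAAA = 28


def build_query(name: str, qtype: int, query_id: int) -> bytes:
    """Build a minimal DNS query (one question, class IN, RD flag set)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parallel_lookup(server: str, name: str, timeout: float = 5.0) -> dict:
    """Send A and AAAA from the same socket.

    Returns {query_id: seconds until the answer arrived, or None if it
    did not arrive within `timeout`}.
    """
    queries = {0x1001: QTYPE_A, 0x1002: QTYPE_AAAA}  # arbitrary IDs
    answered = {qid: None for qid in queries}
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.monotonic()
        # Both queries leave back to back from the same source port.
        for qid, qtype in queries.items():
            sock.sendto(build_query(name, qtype, qid), (server, 53))
        while None in answered.values():
            remaining = timeout - (time.monotonic() - start)
            if remaining <= 0:
                break
            readable, _, _ = select.select([sock], [], [], remaining)
            if not readable:
                break
            data, _ = sock.recvfrom(4096)
            if len(data) < 2:
                continue
            (qid,) = struct.unpack("!H", data[:2])
            if qid in answered and answered[qid] is None:
                answered[qid] = time.monotonic() - start
    return answered
```

Calling `parallel_lookup("10.0.1.100", "ard.de")` from the affected client should show whether one of the two IDs stays at `None`, which would match the timeout seen with ping. The generated queries are 24 bytes for ard.de and 28 bytes for google.com, the same sizes tcpdump reports for the client's queries.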
Here is a tcpdump showing the problem, captured while running ping on the client for the host ard.de:
11:26:23.597146 IP 10.0.23.99.58527 > 10.0.1.100.domain: 23760+ A? ard.de. (24)
11:26:23.597161 IP 10.0.23.99.58527 > 10.0.1.100.domain: 30934+ AAAA? ard.de. (24)
11:26:23.605194 IP 10.0.1.100.domain > 10.0.23.99.58527: 30934 0/1/0 (96)
.... now the client waits 5 seconds because one query was not answered (here the A query, ID 23760)
11:26:28.601337 IP 10.0.23.99.58527 > 10.0.1.100.domain: 23760+ A? ard.de. (24)
11:26:28.609296 IP 10.0.1.100.domain > 10.0.23.99.58527: 23760 1/0/0 A 34.120.237.106 (40)
11:26:28.609363 IP 10.0.23.99.58527 > 10.0.1.100.domain: 30934+ AAAA? ard.de. (24)
11:26:28.616915 IP 10.0.1.100.domain > 10.0.23.99.58527: 30934 0/1/0 (96)
The same for the host google.com:
11:33:34.604438 IP 10.0.23.99.36378 > 10.0.1.100.domain: 42776+ A? google.com. (28)
11:33:34.604453 IP 10.0.23.99.36378 > 10.0.1.100.domain: 19202+ AAAA? google.com. (28)
11:33:34.612560 IP 10.0.1.100.domain > 10.0.23.99.36378: 42776 1/0/0 A 142.250.180.110 (44)
.... now the client waits 5 seconds because the AAAA query (ID 19202) was not answered
11:33:39.608676 IP 10.0.23.99.36378 > 10.0.1.100.domain: 42776+ A? google.com. (28)
11:33:39.616645 IP 10.0.1.100.domain > 10.0.23.99.36378: 42776 1/0/0 A 142.250.180.110 (44)
11:33:39.616710 IP 10.0.23.99.36378 > 10.0.1.100.domain: 19202+ AAAA? google.com. (28)
11:33:39.623973 IP 10.0.1.100.domain > 10.0.23.99.36378: 19202 1/0/0 AAAA 2a00:1450:4008:805::200e (56)
As you can see, the AAAA answer is only sent back once the query is made on its own.
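To make the captures easier to read, the queries can be paired with their responses by ID. The following is a rough sketch that assumes the tcpdump line format shown above; the regular expressions, the 1-second matching window, and the function name `unanswered_queries` are my own choices for illustration, not part of any tool.

```python
import re

# Assumed format: "<time> IP <src> > <dst>: <id><flags> <rest>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"(?P<id>\d+)\S* (?P<rest>.*)$"
)
QUERY_RE = re.compile(r"^(?P<qtype>[A-Z]+)\? (?P<name>\S+)")
RESPONSE_RE = re.compile(r"^\d+/\d+/\d+")  # answer/authority/additional counts


def to_seconds(ts: str) -> float:
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def unanswered_queries(lines, window=1.0):
    """Return (timestamp, id, qtype, name) for every query that got no
    response with the same ID within `window` seconds."""
    queries, responses = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip annotations and unrelated lines
        record = (to_seconds(m["ts"]), m["src"], m["dst"], int(m["id"]))
        q = QUERY_RE.match(m["rest"])
        if q:
            queries.append((m["ts"],) + record + (q["qtype"], q["name"]))
        elif RESPONSE_RE.match(m["rest"]):
            responses.append(record)

    missing = []
    for ts_text, sent, src, dst, qid, qtype, name in queries:
        answered = any(
            r_id == qid and r_src == dst and r_dst == src
            and 0 <= r_time - sent <= window
            for r_time, r_src, r_dst, r_id in responses
        )
        if not answered:
            missing.append((ts_text, qid, qtype, name))
    return missing
```

Fed with the two captures above, this flags the first A query (ID 23760) for ard.de and the first AAAA query (ID 19202) for google.com. In both cases exactly one of the two parallel queries goes unanswered, and the retries 5 seconds later are all answered.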
A way to confirm this is to add "options single-request" to /etc/resolv.conf, which makes glibc send the A and AAAA queries one after the other instead of in parallel.
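For anyone checking several clients, a small helper can tell whether a given resolv.conf already has the option set. This is a minimal sketch: the function name `resolver_options` and the sample text are illustrative, and the parsing only covers the simple "options ..." line format, treating lines that start with "#" or ";" as comments.

```python
def resolver_options(resolv_conf_text: str) -> set:
    """Collect the option names from all 'options' lines of a resolv.conf."""
    options = set()
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "options":
            options.update(fields[1:])
    return options


# The client's resolv.conf from above with the workaround added.
EXAMPLE = """\
nameserver 10.0.1.100
options single-request
"""

has_workaround = "single-request" in resolver_options(EXAMPLE)
```

On a real client the text would come from reading /etc/resolv.conf. Keep in mind that in containers this file is often generated by the runtime, so manual edits may be overwritten.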
Tested with Unbound 1.13 - 1.15 on Kubernetes with an Alpine image.
It seems to be correlated with the glibc change in version 2.9, years ago, that enabled parallel A and AAAA lookups: https://udrepper.livejournal.com/20948.html
It might also be purely a network issue related to race conditions in UDP connection tracking: https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts