docker-alpine
Only first response from DNS is used
Summary
When the Docker host has several DNS servers configured, applications in an Alpine container use only the first DNS response that arrives.
Details
I have a Debian host with NetworkManager and an OpenVPN VPN. It has 3 nameservers configured in /etc/resolv.conf: one for the intranet (it comes from the VPN), and the other 2 are public:
$ cat /etc/resolv.conf
# Generated by NetworkManager
search lan.*** companylocal
nameserver 10.0.4.1
nameserver 192.168.0.1
nameserver 8.8.8.8
All commands inside an Alpine container are unable to connect to resources available through the VPN whose addresses are provided by the VPN DNS server (10.0.4.1). For example:
$ docker run -ti alpine:3.12.1 wget http://gitlab.***/
wget: bad address 'gitlab.lan.***'
The tcpdump output for the request above is:
12:59:09.304769 IP 172.17.0.2.53625 > 10.0.4.1.53: 3417+ A? gitlab.lan.***. (35)
12:59:09.304813 IP 172.17.0.2.53625 > 192.168.0.1.53: 3417+ A? gitlab.lan.***. (35)
12:59:09.304830 IP 172.17.0.2.53625 > 8.8.8.8.53: 3417+ A? gitlab.lan.***. (35)
12:59:09.304840 IP 172.17.0.2.53625 > 10.0.4.1.53: 3758+ AAAA? gitlab.lan.***. (35)
12:59:09.304847 IP 172.17.0.2.53625 > 192.168.0.1.53: 3758+ AAAA? gitlab.lan.***. (35)
12:59:09.304853 IP 172.17.0.2.53625 > 8.8.8.8.53: 3758+ AAAA? gitlab.lan.***. (35)
12:59:09.305533 IP 192.168.0.1.53 > 172.17.0.2.53625: 3417 NXDomain 0/1/0 (119)
12:59:09.305821 IP 192.168.0.1.53 > 172.17.0.2.53625: 3758 NXDomain 0/1/0 (119)
12:59:09.306421 IP 10.0.4.1.53 > 172.17.0.2.53625: 3758* 0/1/0 (76)
12:59:09.306460 IP 10.0.4.1.53 > 172.17.0.2.53625: 3417* 1/1/1 A 10.0.4.142 (84)
12:59:09.320609 IP 8.8.8.8.53 > 172.17.0.2.53625: 3758 NXDomain 0/1/0 (119)
12:59:09.378054 IP 8.8.8.8.53 > 172.17.0.2.53625: 3417 NXDomain 0/1/0 (119)
But when the response from the intranet DNS comes first, resolving succeeds. Containers use the default 'bridge' network. I checked /etc/resolv.conf inside the containers and it matches the one from the host.
In other containers not based on Alpine (I checked Debian and Ubuntu), everything works fine.
@vsevolod-fedorov @VBelozyorov
IMHO this is a bug in the libc (musl) that provides the DNS query functions on Alpine (busybox).
In the official repo (git://git.musl-libc.org/musl), the function __res_msend_rc in src/network/res_msend.c sends the DNS queries to all nameservers (e.g. 10.0.4.1, 192.168.0.1, 8.8.8.8) in parallel, as the tcpdump snippet in your previous comment also shows. But when the responses arrive, __res_msend_rc keeps only the first response to each A/AAAA query, from whichever nameserver answers first. So if that first response is NXDomain, the upper-layer functions (e.g. getaddrinfo/gethostbyname) that wget on Alpine depends on get no chance to handle the other nameservers' responses, and simply return the error: wget: bad address 'gitlab.lan.***'
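To make the behaviour described above concrete, here is a minimal Python sketch that replays the responses from the tcpdump in the original report. It is only a simulation of the "first response wins" rule as described in this comment, not musl's actual code; the tuple layout and the function name `first_response_wins` are made up for illustration.

```python
# Responses copied from the tcpdump above:
# (arrival time in seconds, nameserver, query id, rcode, resolved address or None)
RESPONSES = [
    (9.305533, "192.168.0.1", 3417, "NXDOMAIN", None),
    (9.305821, "192.168.0.1", 3758, "NXDOMAIN", None),
    (9.306421, "10.0.4.1", 3758, "NOERROR", None),          # AAAA: name exists, no record
    (9.306460, "10.0.4.1", 3417, "NOERROR", "10.0.4.142"),  # A record from the VPN DNS
    (9.320609, "8.8.8.8", 3758, "NXDOMAIN", None),
    (9.378054, "8.8.8.8", 3417, "NXDOMAIN", None),
]


def first_response_wins(responses, query_id):
    """Return the earliest response for query_id and ignore all later ones."""
    for response in sorted(responses, key=lambda r: r[0]):
        if response[2] == query_id:
            return response
    return None


if __name__ == "__main__":
    for query_id, qtype in ((3417, "A"), (3758, "AAAA")):
        _, server, _, rcode, address = first_response_wins(RESPONSES, query_id)
        print(f"{qtype}: accepted {rcode} from {server}, address={address}")
```

With this arrival order, both the A and the AAAA query end up with the NXDomain from 192.168.0.1, so the A record that 10.0.4.1 returns less than a millisecond later is never seen.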
thanks,
@inter169 thank you for the explanation! So is it already fixed there, and should we just wait until the fix is merged? I failed to find it myself in the specified repo.
@VBelozyorov the official musl repo (git://git.musl-libc.org/musl, src/network/res_msend.c) doesn't have any code implementing this yet. I just discussed this issue on IRC #musl, and there's no plan to implement merging of the DNS records in case of NXDomain. They suggested setting up the DNS configuration correctly.
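One possible reading of that advice, sketched here as an assumption rather than something the musl developers spelled out, is to give the container only a nameserver that can answer the intranet names, using Docker's `--dns` option. The helper name `docker_run_with_dns` is hypothetical, and the sketch only builds the command line without running it. It also assumes the VPN DNS server (10.0.4.1) can resolve public names too; if it cannot, public lookups from the container would fail.

```python
from typing import List


def docker_run_with_dns(image: str, command: List[str], dns_servers: List[str]) -> List[str]:
    """Build a `docker run` argv that overrides the container's nameservers."""
    argv = ["docker", "run", "--rm"]
    for server in dns_servers:
        # Each --dns flag sets one nameserver for the container.
        argv += ["--dns", server]
    return argv + [image] + command


if __name__ == "__main__":
    argv = docker_run_with_dns(
        image="alpine:3.12.1",
        command=["wget", "http://gitlab.***/"],  # URL elided as in the original report
        dns_servers=["10.0.4.1"],                # VPN DNS only (assumption, see above)
    )
    print(" ".join(argv))
```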
@VBelozyorov I coded an implementation that returns the first A/AAAA record from another nameserver, if available, in case of NXDomain, and pushed the Docker image: geekidea/alpine-nx:3.12
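The idea behind that change can be sketched in Python roughly as follows. This is not the code in the geekidea/alpine-nx image, only an illustration of the strategy as described in this comment: keep reading responses, return the first one that carries a record, and fall back to the first response received only if no nameserver supplies one. The tuple layout and the name `prefer_record_over_nxdomain` are hypothetical; the data is the A query (id 3417) from the tcpdump in the original report.

```python
# (arrival time in seconds, nameserver, rcode, resolved address or None)
A_RESPONSES = [
    (9.305533, "192.168.0.1", "NXDOMAIN", None),
    (9.306460, "10.0.4.1", "NOERROR", "10.0.4.142"),
    (9.378054, "8.8.8.8", "NXDOMAIN", None),
]


def prefer_record_over_nxdomain(responses):
    """Return the first response with a record, else the first response seen."""
    fallback = None
    for response in sorted(responses, key=lambda r: r[0]):
        if response[3] is not None:
            return response      # some nameserver knows the name: use its record
        if fallback is None:
            fallback = response  # remember the earliest negative answer
    return fallback


if __name__ == "__main__":
    _, server, rcode, address = prefer_record_over_nxdomain(A_RESPONSES)
    print(f"accepted {rcode} from {server}, address={address}")
```

The trade-off is that a failed lookup now has to wait for every nameserver to answer (or time out) before it can report NXDomain, which is slower than accepting the first response.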
thanks
I had a similar error when I updated Alpine from 3.12.3 to 3.13.0 and a Docker restart + update fixed it 😅
Same issue with:
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.13.7
PRETTY_NAME="Alpine Linux v3.13"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"
cpu arch: aarch64