mailcow-dockerized
Docker v26.0.0 breaks DNS
Contribution guidelines
- [X] I've read the contribution guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [X] ... I understand that not following the below instructions will result in immediate closure and/or deletion of my issue.
- [X] ... I have understood that this bug report is dedicated for bugs, and not for support-related inquiries.
- [X] ... I have understood that answers are voluntary and community-driven, and not commercial support.
- [X] ... I have verified that my issue has not been already answered in the past. I also checked previous issues.
Description
Since upgrading to Docker 26.0.0, mailcow has been producing lots of DNS errors.
This may be connected to the following change in Docker 26.0.0:
CVE-2024-29018: Do not forward requests to external DNS servers for a container that is only connected to an 'internal' network. Previously, requests were forwarded if the host's DNS server was running on a loopback address, like systemd's 127.0.0.53. moby/moby#47589
Source: https://docs.docker.com/engine/release-notes/26.0/#bug-fixes-and-enhancements
### Logs:
```plain text
Mar 22 08:02:03 olive dockerd[220407]: time="2024-03-22T08:02:03.235990882Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44521" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44521->172.22.1.254:53: i/o timeout" question=";133.138.180.139.bl.spamcop.net.\tIN\t A" spanID=6a533445a593b326 traceID=e8790345dcf9b4465b776ee33783b6cd
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.492917213Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:43701" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:43701->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.b.barracudacentral.org.\tIN\t A" spanID=af98360abc780b84 traceID=22daca411668dcd55f3f5d773b9b5c1e
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.500301774Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:42026" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:42026->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d18c21e1eb7f6063 traceID=531742d1fafa14d82b97d8b1e3963bf2
Mar 22 08:02:42 olive dockerd[220407]: time="2024-03-22T08:02:42.501940834Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:45499" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:45499->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d1f1e178d36c79f5 traceID=b1e263f5139bbd02f59b05b89f0926e6
Mar 22 08:03:02 olive dockerd[220407]: time="2024-03-22T08:03:02.268089673Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:39105" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:39105->172.22.1.254:53: i/o timeout" question=";11.16.227.165.bl.spamcop.net.\tIN\t A" spanID=d904f6f2b08ce936 traceID=3d00cbfc2ba83136d65b88c0dfc68b85
```
Steps to reproduce:
Run Mailcow on Docker v26.0.0
Which branch are you using?
master
Which architecture are you using?
x86
Operating System:
Ubuntu 22.04 LTS
Server/VM specifications:
2 cores, 4GB RAM
Is Apparmor, SELinux or similar active?
no
Virtualization technology:
KVM I think, it's a VPS server
Docker version:
v26.0.0
docker-compose version or docker compose version:
v2.25.0
mailcow version:
2024-02
Reverse proxy:
None
Logs of git diff:
N/A
Logs of iptables -L -vn:
N/A
Logs of ip6tables -L -vn:
N/A
Logs of iptables -L -vn -t nat:
N/A
Logs of ip6tables -L -vn -t nat:
N/A
DNS check:
104.18.32.7
172.64.155.249
Any update on this please? People on Docker v26 will have no RBL functionality on their server until we find a fix. Thank you
@dogsbody where exactly are you seeing these log messages? I upgraded to Docker v26.0.0 this morning and mailcow runs for several hours without any issues.
With either of these commands:
grep "failed to query external DNS" /var/log/syslog
journalctl --since today | grep "failed to query external DNS"
Here's another line from another Mailcow server from a few minutes ago
Mar 27 09:25:50 mail01 dockerd[355773]: time="2024-03-27T09:25:50.895007331Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44732" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44732->172.22.1.254:53: i/o timeout" question=";169.33.6.112.in-addr.arpa.\tIN\t PTR" spanID=ca8733dffb2eeeb8 traceID=2e6f7ee85a991ced1b514afd24216fd9
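To see at a glance which lookups are failing and why, the matching lines can be broken into fields and tallied. The following is only a rough sketch: it assumes the `key="value"` layout of the dockerd lines quoted in this thread, and the function names `parse_resolver_error` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# key="value" pairs, as they appear in the dockerd lines quoted in this thread
FIELD = re.compile(r'([\w-]+)="((?:[^"\\]|\\.)*)"')


def parse_resolver_error(line):
    """Return a dict for a v26-style resolver error line, or None for other lines."""
    if "failed to query external DNS server" not in line:
        return None
    fields = dict(FIELD.findall(line))
    # The question field looks like ';name.\tIN\t TYPE', with "\t" written
    # out as backslash + t in the journal text.
    parts = fields.get("question", "").lstrip(";").replace("\\t", " ").split()
    return {
        "client": fields.get("client-addr", ""),
        "dns_server": fields.get("dns-server", ""),
        "name": parts[0].rstrip(".") if parts else "",
        "qtype": parts[-1] if parts else "",
        # last ': '-separated piece of the error, e.g. 'i/o timeout'
        "reason": fields.get("error", "").rsplit(": ", 1)[-1],
    }


def summarize(lines):
    """Count failures per (query type, reason) pair."""
    counts = Counter()
    for line in lines:
        rec = parse_resolver_error(line)
        if rec:
            counts[(rec["qtype"], rec["reason"])] += 1
    return counts
```

Feeding it the output of either command above (for example by reading `sys.stdin` in a small script) should show whether the failures are mostly `A` lookups against blocklists, `PTR` lookups, or something else.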
Oh - I only checked the container logs. Thanks! I can also find these messages in the systemd logs on my system (AlmaLinux 9.3)
Hi, I also see these errors:
mars 30 13:11:22 Dell-9010 dockerd[536]: time="2024-03-30T13:11:22.951801038+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:49241" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:49241->172.22.1.254:53: i/o timeout" question=";254.1.168.192.dnsbl.sorbs.net.\tIN\t A" spanID=74c0fb8ccfe6e7b6 traceID=bf95bdcf3b6cb5edf720eb2781f94ffc
mars 30 13:18:04 Dell-9010 dockerd[536]: time="2024-03-30T13:18:04.983303145+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60250" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60250->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=14de718d26666066 traceID=aab42a4edc17280d27b4028b49f42ddd
mars 30 13:18:07 Dell-9010 dockerd[536]: time="2024-03-30T13:18:07.486205113+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:40350" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:40350->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=54772f74dbbea61a traceID=ff8ad12e7e09713747d856a566f57cbd
mars 30 13:18:08 Dell-9010 dockerd[536]: time="2024-03-30T13:18:08.984187656+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60524" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60524->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=fd9cc2cde9d88045 traceID=1677e97317a7f6226bff8aed526956c9
Hey, same observation here:
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.544981137+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:50070" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:50070->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=846ed9332c131ab1 traceID=efeb68ffb83da1ed5362b57509568981
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.545918098+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:39748" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:39748->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=cd72e472b9f3cdb5 traceID=f7339a81051c257e4dc10edea21e0037
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.546740518+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:48636" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:48636->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=9810963c5c9d81c5 traceID=2b2185cc5c91224124cd0eb2e6467d61
It seems others are having this issue as well :-(
I believe that those of us on Docker v26.0 no longer have DNS RBL protection for our mailcow instances.
What do the Postfix logs say? If there are entries showing an address listed by Spamhaus (for example "listed by domain zen.spamhaus.org as 127.0.0.X"), it is working as expected.
Actually, I see these entries in the Postfix log:
postfix-mailcow-1 | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: timeout after AUTH from unknown[80.244.11.199]
postfix-mailcow-1 | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: disconnect from unknown[80.244.11.199] ehlo=1 auth=0/1 rset=1 commands=2/3
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 postfix/postscreen[648]: CONNECT from [80.94.92.112]:59173 to [172.22.1.253]:25
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 whitelist_forwardinghosts: Look up 80.94.92.112 on whitelist, result 200 DUNNO
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[658]: addr 80.94.92.112 listed by domain bl.mailspike.net as 127.0.0.2
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.9
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.4
postfix-mailcow-1 | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.2
postfix-mailcow-1 | Apr 11 07:19:18 70aa683f8222 postfix/dnsblog[666]: addr 80.94.92.112 listed by domain hostkarma.junkemailfilter.com as 127.0.0.2
postfix-mailcow-1 | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DNSBL rank 17 for [80.94.92.112]:59173
postfix-mailcow-1 | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: HANGUP after 0.14 from [80.94.92.112]:59173 in tests after SMTP handshake
postfix-mailcow-1 | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DISCONNECT [80.94.92.112]:59173
postfix-mailcow-1 | Apr 11 07:19:25 70aa683f8222 postfix/dnsblog[664]: warning: dnsblog_query: lookup error for DNS query 112.92.94.80.dnsbl.sorbs.net: Host or domain name not found. Name service error for name=112.92.94.80.dnsbl.sorbs.net type=A: Host not found, try again
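For anyone who wants to check their own Postfix log the same way, here is a small sketch that counts blocklist answers and lookup errors. It assumes the `postfix/dnsblog` line formats shown in the excerpt above; the function name `dnsbl_health` is hypothetical.

```python
import re
from collections import Counter

# Patterns taken from the postfix/dnsblog lines quoted above
LISTED = re.compile(
    r"postfix/dnsblog\[\d+\]: addr (\S+) listed by domain (\S+) as (\S+)"
)
LOOKUP_ERROR = re.compile(
    r"postfix/dnsblog\[\d+\]: warning: dnsblog_query: "
    r"lookup error for DNS query (\S+):"
)


def dnsbl_health(lines):
    """Return (answers per blocklist domain, errors per failed query name)."""
    answered = Counter()
    failed = Counter()
    for line in lines:
        match = LISTED.search(line)
        if match:
            answered[match.group(2)] += 1
            continue
        match = LOOKUP_ERROR.search(line)
        if match:
            failed[match.group(1)] += 1
    return answered, failed
```

If `answered` contains entries for most of your blocklists, lookups are getting through; a blocklist that only ever shows up in `failed` is worth a closer look.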
Yeah that is the expected behaviour.
I'm following this issue with interest, but I'm not sure what the status is now. There are error messages in the journal, but the DNS blocklists still work? I'm a bit confused...
I have seen that there is now a v26.0.1 release that has changed something in the DNS resolution again:
- https://docs.docker.com/engine/release-notes/26.0/#2601
- https://github.com/moby/moby/pull/47705
Does that change anything?
EDIT: Sorry, I'm just now realizing that the changes only affect ipvlan interfaces. I assume that the v26.0.1 update will not change anything.
Confirmed. I updated to v26.0.1 last night and got all the same errors overnight :-(
Digging into this a little, I noticed that the error happens on the initial connect and not during the DNSBL lookup. It seems that when the reverse lookup cannot be completed (which doesn't happen every time), Docker produces the timeout error we are seeing.
If you have verbose logging enabled on unbound, it eventually produces an error like the following:
unbound: [3004587:1] error: SERVFAIL <49.174.15.111.in-addr.arpa. PTR IN>: all servers for this domain failed, at zone 174.15.111.in-addr.arpa. from (inet_ntop_error) upstream server timeout
So, this seems to be an upstream DNS failure that is now being reported differently by docker v26.
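To retry one of these lookups by hand, it helps to know how the query names in the logs are built from the client IP. A minimal sketch using Python's standard `ipaddress` module; the helper names `ptr_name` and `dnsbl_name` are made up here, and the zone is whatever blocklist you want to test.

```python
import ipaddress


def ptr_name(ip):
    """Reverse-lookup name for an IP, e.g. '49.174.15.111.in-addr.arpa'."""
    return ipaddress.ip_address(ip).reverse_pointer


def dnsbl_name(ip, zone):
    """Blocklist query name: the reversed address labels followed by the zone."""
    addr = ipaddress.ip_address(ip)
    suffix = ".in-addr.arpa" if addr.version == 4 else ".ip6.arpa"
    # Strip the arpa suffix to keep only the reversed octets (or nibbles)
    reversed_labels = addr.reverse_pointer[: -len(suffix)]
    return f"{reversed_labels}.{zone}"


print(ptr_name("111.15.174.49"))
print(dnsbl_name("139.180.138.133", "bl.spamcop.net"))
```

The two printed names correspond to the PTR query in the unbound error above and to the first spamcop query in the original report, so they can be passed to `dig` or a similar tool inside and outside the containers.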
EDIT: Hmm, actually I was able to reproduce the error with Docker v25, but the error message did change:
Docker v26:
dockerd[4185627]: time="2024-04-12T10:50:57.097091978+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:33743" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:33743->172.22.1.254:53: i/o timeout" question=";49.174.15.111.in-addr.arpa.\tIN\t PTR" spanID=43318bd4d322b968 traceID=bd4acefe2bd82f26128ec09812f4a5f2
Docker v25:
dockerd[5204]: time="2024-04-12T11:05:51.129790187+02:00" level=error msg="[resolver] failed to query DNS server: 172.22.1.254:53, query: ;49.174.15.111.in-addr.arpa.\tIN\t PTR" error="read udp 172.22.1.14:52220->172.22.1.254:53: i/o timeout
I can confirm the tests of @kilo666mj with Docker v25. What I still wonder: are the error messages now working as designed, or is there really a bug here?
Interestingly, since upgrading from Docker v26.0.0 to v26.0.1 I have also started getting the additional error...
dockerd[468953]: time="2024-04-16T02:55:02.478443307Z" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers"
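That message suggests the daemon found only localhost nameservers in the resolv.conf it looked at. To check what a given resolv.conf contains, something like the following can help. This is a sketch of the idea only, not Docker's actual logic, and `split_nameservers` is a hypothetical helper name.

```python
import ipaddress


def split_nameservers(resolv_conf_text):
    """Split the nameserver entries of a resolv.conf into (loopback, external)."""
    loopback, external = [], []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        parts = line.split()
        if len(parts) < 2 or parts[0] != "nameserver":
            continue
        try:
            # Drop an IPv6 zone suffix such as '%eth0' before parsing
            addr = ipaddress.ip_address(parts[1].split("%", 1)[0])
        except ValueError:
            continue  # not a valid IP address, ignore it
        (loopback if addr.is_loopback else external).append(parts[1])
    return loopback, external


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(split_nameservers(handle.read()))
```

On a host using systemd-resolved's stub resolver you would expect `127.0.0.53` in the first list and nothing in the second, which would match the wording of the message.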
We have come to the conclusion that nothing is actually broken. Docker is now just being more verbose about DNS names that don't resolve (NXDOMAIN).
We have run tests from both inside and outside the Docker containers, and DNS lookups work fine. Only lookups that result in an NXDOMAIN produce the log entry.
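For anyone who wants to repeat that kind of test, here is a rough sketch that tells a "name not found" answer apart from a temporary resolver failure using Python's standard `socket` module. The function name `classify_lookup` is made up, and the mapping is approximate: the system resolver reports `EAI_NONAME` for names that do not exist (and in some cases for names with no address records) and `EAI_AGAIN` for temporary failures such as timeouts.

```python
import socket


def classify_lookup(name, resolver=socket.getaddrinfo):
    """Resolve a name and report 'resolved', 'not found' or 'temporary failure'."""
    try:
        resolver(name, None)
    except socket.gaierror as exc:
        if exc.errno == socket.EAI_NONAME:
            return "not found"
        if exc.errno == socket.EAI_AGAIN:
            return "temporary failure"
        return f"error {exc.errno}"
    return "resolved"


if __name__ == "__main__":
    import sys

    for hostname in sys.argv[1:]:
        print(hostname, classify_lookup(hostname))
```

Running it with the same names on the host and inside a container (if Python is available there) gives a quick comparison. The `resolver` parameter exists only so the function can be exercised without network access.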
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.