mailcow-dockerized

Docker v26.0.0 breaks DNS

Open dogsbody opened this issue 11 months ago • 16 comments

Contribution guidelines

I've found a bug and checked that ...

  • [X] ... I understand that not following the below instructions will result in immediate closure and/or deletion of my issue.
  • [X] ... I have understood that this bug report is dedicated for bugs, and not for support-related inquiries.
  • [X] ... I have understood that answers are voluntary and community-driven, and not commercial support.
  • [X] ... I have verified that my issue has not been already answered in the past. I also checked previous issues.

Description

Since upgrading to Docker 26.0.0, mailcow is producing lots of DNS errors.

This may be connected to the following change in Docker 26.0.0:


CVE-2024-29018: Do not forward requests to external DNS servers for a container that is only connected to an 'internal' network. Previously, requests were forwarded if the host's DNS server was running on a loopback address, like systemd's 127.0.0.53. moby/moby#47589

Source: https://docs.docker.com/engine/release-notes/26.0/#bug-fixes-and-enhancements



### Logs:

```plain text
Mar 22 08:02:03 olive dockerd[220407]: time="2024-03-22T08:02:03.235990882Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44521" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44521->172.22.1.254:53: i/o timeout" question=";133.138.180.139.bl.spamcop.net.\tIN\t A" spanID=6a533445a593b326 traceID=e8790345dcf9b4465b776ee33783b6cd
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.492917213Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:43701" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:43701->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.b.barracudacentral.org.\tIN\t A" spanID=af98360abc780b84 traceID=22daca411668dcd55f3f5d773b9b5c1e
Mar 22 08:02:38 olive dockerd[220407]: time="2024-03-22T08:02:38.500301774Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:42026" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:42026->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d18c21e1eb7f6063 traceID=531742d1fafa14d82b97d8b1e3963bf2
Mar 22 08:02:42 olive dockerd[220407]: time="2024-03-22T08:02:42.501940834Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:45499" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:45499->172.22.1.254:53: i/o timeout" question=";2.4.8.d.7.a.e.f.f.f.1.0.0.0.4.5.f.9.3.2.5.0.0.0.0.f.9.1.1.0.0.2.bl.spamcop.net.\tIN\t A" spanID=d1f1e178d36c79f5 traceID=b1e263f5139bbd02f59b05b89f0926e6
Mar 22 08:03:02 olive dockerd[220407]: time="2024-03-22T08:03:02.268089673Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:39105" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:39105->172.22.1.254:53: i/o timeout" question=";11.16.227.165.bl.spamcop.net.\tIN\t A" spanID=d904f6f2b08ce936 traceID=3d00cbfc2ba83136d65b88c0dfc68b85
```

Steps to reproduce:

Run Mailcow on Docker v26.0.0

Which branch are you using?

master

Which architecture are you using?

x86

Operating System:

Ubuntu 22.04 LTS

Server/VM specifications:

2 cores, 4GB RAM

Is Apparmor, SELinux or similar active?

no

Virtualization technology:

KVM I think, it's a VPS server

Docker version:

v26.0.0

docker-compose version or docker compose version:

v2.25.0

mailcow version:

2024-02

Reverse proxy:

None

Logs of git diff:

N/A

Logs of iptables -L -vn:

N/A

Logs of ip6tables -L -vn:

N/A

Logs of iptables -L -vn -t nat:

N/A

Logs of ip6tables -L -vn -t nat:

N/A

DNS check:

104.18.32.7
172.64.155.249

dogsbody avatar Mar 22 '24 10:03 dogsbody

Any update on this please? People on Docker v26 will have no RBL functionality on their server until we find a fix. Thank you

dogsbody avatar Mar 25 '24 09:03 dogsbody

@dogsbody where exactly are you seeing these log messages? I upgraded to Docker v26.0.0 this morning and mailcow runs for several hours without any issues.

mbu147 avatar Mar 27 '24 09:03 mbu147

With either of these commands:

grep "failed to query external DNS" /var/log/syslog
journalctl --since today | grep "failed to query external DNS"

Here's another line, from a different mailcow server, from a few minutes ago:

Mar 27 09:25:50 mail01 dockerd[355773]: time="2024-03-27T09:25:50.895007331Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:44732" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:44732->172.22.1.254:53: i/o timeout" question=";169.33.6.112.in-addr.arpa.\tIN\t PTR" spanID=ca8733dffb2eeeb8 traceID=2e6f7ee85a991ced1b514afd24216fd9
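
To see which lookups are affected, the matching lines can be grouped by record type and failure reason. This is a minimal sketch, assuming the `key="value"` layout of the dockerd lines quoted in this thread. The function names `parse_resolver_error` and `summarise` are made up for illustration and are not part of Docker or mailcow.

```python
import re
from collections import Counter

# key="value" pairs as they appear in the dockerd lines quoted in this thread.
FIELD_RE = re.compile(r'(\w[\w-]*)="((?:[^"\\]|\\.)*)"')


def parse_resolver_error(line):
    """Return a dict of fields for a resolver error line, or None otherwise."""
    if "failed to query external DNS server" not in line:
        return None
    fields = dict(FIELD_RE.findall(line))
    # The question field looks like ';name.\tIN\t TYPE', where \t is a
    # literal backslash followed by 't' in the journal output.
    parts = fields.get("question", "").lstrip(";").replace("\\t", " ").split()
    if len(parts) == 3:
        fields["qname"], _, fields["qtype"] = parts
    # Keep only the trailing reason, e.g. 'i/o timeout' or 'connection refused'.
    fields["reason"] = fields.get("error", "").rsplit(": ", 1)[-1]
    return fields


def summarise(lines):
    """Count resolver errors per (record type, failure reason)."""
    counts = Counter()
    for line in lines:
        fields = parse_resolver_error(line)
        if fields:
            counts[(fields.get("qtype"), fields["reason"])] += 1
    return counts
```

Passing it the lines from either command above, for example `summarise(open("/var/log/syslog", errors="replace"))`, would give a count for each combination, such as PTR lookups that timed out.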

dogsbody-josh avatar Mar 27 '24 09:03 dogsbody-josh

Oh - I only checked the container logs. Thanks! I can also find these messages in the systemd logs on my system (AlmaLinux 9.3)

mbu147 avatar Mar 27 '24 10:03 mbu147

Hi, I also see these errors:

mars 30 13:11:22 Dell-9010 dockerd[536]: time="2024-03-30T13:11:22.951801038+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.253:49241" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.253:49241->172.22.1.254:53: i/o timeout" question=";254.1.168.192.dnsbl.sorbs.net.\tIN\t A" spanID=74c0fb8ccfe6e7b6 traceID=bf95bdcf3b6cb5edf720eb2781f94ffc
mars 30 13:18:04 Dell-9010 dockerd[536]: time="2024-03-30T13:18:04.983303145+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60250" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60250->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=14de718d26666066 traceID=aab42a4edc17280d27b4028b49f42ddd
mars 30 13:18:07 Dell-9010 dockerd[536]: time="2024-03-30T13:18:07.486205113+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:40350" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:40350->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=54772f74dbbea61a traceID=ff8ad12e7e09713747d856a566f57cbd
mars 30 13:18:08 Dell-9010 dockerd[536]: time="2024-03-30T13:18:08.984187656+01:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:60524" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:60524->172.22.1.254:53: i/o timeout" question=";_25._tcp.mail.datanetwork.cloud.\tIN\t ANY" spanID=fd9cc2cde9d88045 traceID=1677e97317a7f6226bff8aed526956c9

Cisco30 avatar Mar 30 '24 12:03 Cisco30

Hey, same observation here:

Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.544981137+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:50070" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:50070->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=846ed9332c131ab1 traceID=efeb68ffb83da1ed5362b57509568981
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.545918098+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:39748" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:39748->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=cd72e472b9f3cdb5 traceID=f7339a81051c257e4dc10edea21e0037
Apr 01 12:06:42 mail dockerd[2352174]: time="2024-04-01T12:06:42.546740518+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.3:48636" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.3:48636->172.22.1.254:53: read: connection refused" question=";current.cvd.clamav.net.\tIN\t TXT" spanID=9810963c5c9d81c5 traceID=2b2185cc5c91224124cd0eb2e6467d61

MatthieuLeboeuf avatar Apr 01 '24 10:04 MatthieuLeboeuf

It seems others are having this issue as well :-(

I believe that those of us on Docker v26.0 no longer have DNS RBL protection for our mailcow instances.

dogsbody avatar Apr 10 '24 19:04 dogsbody

What do the Postfix logs say? If there are entries from Spamhaus saying that an address is listed as 127.0.0.X, it is working as expected.

DerLinkman avatar Apr 11 '24 05:04 DerLinkman

Actually, I do see these entries in the Postfix log:


postfix-mailcow-1  | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: timeout after AUTH from unknown[80.244.11.199]
postfix-mailcow-1  | Apr 11 07:18:39 70aa683f8222 postfix/smtps/smtpd[630]: disconnect from unknown[80.244.11.199] ehlo=1 auth=0/1 rset=1 commands=2/3
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/postscreen[648]: CONNECT from [80.94.92.112]:59173 to [172.22.1.253]:25
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 whitelist_forwardinghosts: Look up 80.94.92.112 on whitelist, result 200 DUNNO
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[658]: addr 80.94.92.112 listed by domain bl.mailspike.net as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.9
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.4
postfix-mailcow-1  | Apr 11 07:19:17 70aa683f8222 postfix/dnsblog[661]: addr 80.94.92.112 listed by domain zen.spamhaus.org as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:18 70aa683f8222 postfix/dnsblog[666]: addr 80.94.92.112 listed by domain hostkarma.junkemailfilter.com as 127.0.0.2
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DNSBL rank 17 for [80.94.92.112]:59173
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: HANGUP after 0.14 from [80.94.92.112]:59173 in tests after SMTP handshake
postfix-mailcow-1  | Apr 11 07:19:20 70aa683f8222 postfix/postscreen[648]: DISCONNECT [80.94.92.112]:59173
postfix-mailcow-1  | Apr 11 07:19:25 70aa683f8222 postfix/dnsblog[664]: warning: dnsblog_query: lookup error for DNS query 112.92.94.80.dnsbl.sorbs.net: Host or domain name not found. Name service error for name=112.92.94.80.dnsbl.sorbs.net type=A: Host not found, try again
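
The excerpt above contains both successful DNSBL answers ("listed by domain ... as 127.0.0.X") and one lookup error. A rough sketch of how the two could be counted separately, assuming the `postfix/dnsblog` line formats shown in that excerpt; `dnsbl_report` is a hypothetical helper name:

```python
import re
from collections import Counter

# Formats taken from the postfix/dnsblog lines quoted in this thread.
LISTED_RE = re.compile(
    r"postfix/dnsblog\[\d+\]: addr (\S+) listed by domain (\S+) as (\S+)"
)
ERROR_RE = re.compile(
    r"postfix/dnsblog\[\d+\]: warning: dnsblog_query: "
    r"lookup error for DNS query (\S+):"
)


def dnsbl_report(lines):
    """Return (hits per DNSBL zone, lookup errors per query name)."""
    hits = Counter()
    errors = Counter()
    for line in lines:
        match = LISTED_RE.search(line)
        if match:
            hits[match.group(2)] += 1
            continue
        match = ERROR_RE.search(line)
        if match:
            errors[match.group(1)] += 1
    return hits, errors
```

If `hits` is non-empty for the configured blocklists, the lists are being queried and answered. Entries in `errors` point at individual lookups that failed.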

Cisco30 avatar Apr 11 '24 05:04 Cisco30

Yeah, that is the expected behaviour.

DerLinkman avatar Apr 11 '24 10:04 DerLinkman

I'm following this issue with interest, but I'm not sure what the status is now. There are error messages in the journal, but the DNS blocklists still work? I'm a bit confused...

I have seen that there is now a v26.0.1 release that has changed something in the DNS resolution again:

  • https://docs.docker.com/engine/release-notes/26.0/#2601
  • https://github.com/moby/moby/pull/47705

Does that change anything?

EDIT: Sorry, I'm just now realizing that the changes only affect ipvlan interfaces. I assume that the v26.0.1 update will not change anything.

mrclschstr avatar Apr 11 '24 18:04 mrclschstr

Confirmed. I updated to v26.0.1 last night and got all the same errors overnight :-(

dogsbody avatar Apr 12 '24 08:04 dogsbody

Digging into this a little, I noticed that the error happens on the initial connect and not during the DNSBL lookup. It seems that when a reverse lookup cannot be completed (which doesn't happen every time), Docker produces the timeout error we are seeing.

If you have verbose logging enabled on unbound, it eventually produces an error like the following:

unbound: [3004587:1] error: SERVFAIL <49.174.15.111.in-addr.arpa. PTR IN>: all servers for this domain failed, at zone 174.15.111.in-addr.arpa. from (inet_ntop_error) upstream server timeout

So this seems to be an upstream DNS failure that is now being reported differently by Docker v26.

EDIT: Hmm, actually I was able to reproduce the error with docker v25. But the error message did change:

Docker v26:
dockerd[4185627]: time="2024-04-12T10:50:57.097091978+02:00" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:172.22.1.14:33743" dns-server="udp:172.22.1.254:53" error="read udp 172.22.1.14:33743->172.22.1.254:53: i/o timeout" question=";49.174.15.111.in-addr.arpa.\tIN\t PTR" spanID=43318bd4d322b968 traceID=bd4acefe2bd82f26128ec09812f4a5f2

Docker v25:
dockerd[5204]: time="2024-04-12T11:05:51.129790187+02:00" level=error msg="[resolver] failed to query DNS server: 172.22.1.254:53, query: ;49.174.15.111.in-addr.arpa.\tIN\t PTR" error="read udp 172.22.1.14:52220->172.22.1.254:53: i/o timeout
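
The two messages carry the same information in different wording. A small sketch that reduces both to the same (server, question, reason) tuple, assuming only the two formats quoted above; the patterns and the `normalise` helper are illustrative and may not cover other Docker versions:

```python
import re

# Docker v26 wording as quoted above: separate dns-server/error/question fields.
V26_RE = re.compile(
    r'failed to query external DNS server" '
    r'client-addr="[^"]*" dns-server="(?:udp|tcp):(?P<server>[^"]+)" '
    r'error="(?P<error>[^"]*)" question="(?P<question>[^"]*)"'
)
# Docker v25 wording as quoted above: server and query are folded into msg.
V25_RE = re.compile(
    r'failed to query DNS server: (?P<server>[^,]+), query: (?P<question>[^"]*)" '
    r'error="(?P<error>[^"]*)'
)


def normalise(line):
    """Return (server, question, reason) for either format, or None."""
    match = V26_RE.search(line) or V25_RE.search(line)
    if not match:
        return None
    # Keep only the trailing reason, e.g. 'i/o timeout'.
    reason = match.group("error").rsplit(": ", 1)[-1]
    # \t is a literal backslash followed by 't' in the log text.
    question = match.group("question").lstrip(";").replace("\\t", " ").split()
    return match.group("server"), " ".join(question), reason
```

Both quoted lines reduce to the same server, the same PTR question, and the same `i/o timeout` reason, which fits the observation that the failure itself is not new in v26.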

kilo666mj avatar Apr 12 '24 09:04 kilo666mj

I can confirm the tests of @kilo666mj with Docker v25. What I still wonder: are the error messages now working as designed, or is there really a bug here?

mrclschstr avatar Apr 13 '24 07:04 mrclschstr

Interestingly, since upgrading from Docker v26.0.0 to v26.0.1 I have also started getting an additional message:

dockerd[468953]: time="2024-04-16T02:55:02.478443307Z" level=info msg="No non-localhost DNS nameservers are left in resolv.conf. Using default external servers"

dogsbody avatar Apr 16 '24 10:04 dogsbody

We have come to the conclusion that nothing is actually broken. Docker is now just being more verbose about DNS entries that don't resolve (NXDOMAIN).

We have run tests from both inside and outside the Docker containers, and DNS lookups seem to work fine. Only lookups that result in an NXDOMAIN produce the log message.
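
For anyone who wants to repeat this kind of check, here is a minimal sketch, assuming Python 3 is available wherever the test is run (on the host or inside a container). It builds the name a DNSBL client would look up and classifies the resolver's response. `dnsbl_query_name` and `classify_lookup` are hypothetical helper names, and mapping `EAI_NONAME` to "nxdomain" is an approximation of how the system resolver reports a name that does not exist.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name a DNSBL client looks up for `ip` in `zone`."""
    pointer = ipaddress.ip_address(ip).reverse_pointer
    # reverse_pointer ends in .in-addr.arpa (IPv4) or .ip6.arpa (IPv6);
    # DNSBLs use the same reversed labels with their own zone appended.
    for suffix in (".in-addr.arpa", ".ip6.arpa"):
        if pointer.endswith(suffix):
            pointer = pointer[: -len(suffix)]
    return f"{pointer}.{zone}"


def classify_lookup(name):
    """Resolve `name` and return ('answer' | 'nxdomain' | 'temporary-failure', address)."""
    try:
        return "answer", socket.gethostbyname(name)
    except socket.gaierror as exc:
        if exc.errno == socket.EAI_AGAIN:
            # Resolver timed out or the upstream returned a temporary failure.
            return "temporary-failure", None
        if exc.errno == socket.EAI_NONAME:
            # Approximation: the name does not exist (not listed on the DNSBL).
            return "nxdomain", None
        raise
```

For example, `classify_lookup(dnsbl_query_name("80.94.92.112", "zen.spamhaus.org"))` could be run once on the host and once inside the Postfix container, and the results compared with what appears in the dockerd log at the same time.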

dogsbody avatar Apr 23 '24 08:04 dogsbody

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

milkmaker avatar Jun 23 '24 00:06 milkmaker