[BUG] Networking seems broken in 2.40.3
Description
Hi,
I upgraded from 2.40.2 to 2.40.3:
APT-Log:
Upgrade: docker-compose-plugin:amd64 (2.40.2-1~ubuntu.24.04~noble, 2.40.3-1~ubuntu.24.04~noble)
Since the upgrade, networking seems to be completely broken. Services can't reach each other, or they reach the wrong service.
After downgrading from 2.40.3 to 2.40.2 everything is working again.
I looked through the existing issues and didn't find a matching one. Does anyone else have the same problem?
Steps To Reproduce
Sorry, I am a little bit unsure about this.
In fact we didn't change our compose setup at all; networking simply went haywire after the v2.40.3 release.
- We start our compose setup (normally working)
- Services that contact each other reach the wrong service or cannot reach the other service at all.
Compose Version
$ docker compose version
Docker Compose version v2.40.3
Docker Environment
$ docker info
Client: Docker Engine - Community
Version: 28.5.1
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.29.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.40.3
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 32
Running: 31
Paused: 0
Stopped: 1
Images: 78
Server Version: 28.5.1
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
CDI spec directories:
/etc/cdi
/var/run/cdi
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: b98a3aace656320842a23f4a392a33f46af97866
runc version: v1.3.0-0-g4ca628d1
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.14.0-114036-tuxedo
Operating System: Ubuntu 24.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 62.5GiB
Name: -----
ID: 83acb5da-2c02-4489-9042-4fecaefc1089
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
::1/128
127.0.0.0/8
Live Restore Enabled: false
Anything else?
No response
Without a reproducible example I can hardly help here. Can you inspect the containers and networks created by Compose 2.40.3 and compare them with an earlier version?
I am just trying to figure out more details and prepare an example. I'll come back to you.
FYI: Maybe this is related to #13346 because we also have a lot of extends and depends_on in our code.
For me, internal DNS resolution between containers (using container names) completely broke when upgrading to Docker 29 while using docker compose: Docker hostnames are no longer resolvable and containers can't find each other. After downgrading to 28.5.2 it works again. I am using quite vanilla containers and compose files.
I can confirm the same thing. Updating to Docker 29 broke internal DNS resolution, and containers can't seem to talk to each other anymore. Will try to downgrade.
Edit: can confirm that downgrading to 28.5.2 solves it.
I am not sure if it's related to compose, though, or to the Docker engine in general.
It'd be great to know more about this so we can investigate. An example compose file that reproduces the issue would be ideal.
Otherwise - inspect outputs for containers and networks would be good. Ideally, as @ndeloof says, working and broken versions that we can compare.
Or, if you can enable debugging in Docker, and send the logs from a broken service starting up - perhaps we can find clues in that.
@robmry I could do some tests in a few days or next week, although how would I get meaningful output in this instance? The containers all start up normally, they just can't communicate with each other because no Docker hostnames are reachable. If I bash into a running container and try to ping the hostname of another running container, the Docker DNS says that hostname does not exist (even though they do run on the same network and this has worked up to Docker 29), so I would assume the problem lies somewhere in the Docker DNS part. In my case it has nothing to do with extends or depends_on. I am using Docker on Debian 12.
@iquito can you try a plain docker run --network xx alpine ping <other> to check that a non-compose container can communicate over the network with your other service?
Then please attach the docker inspect output for such a working container and for the container from your compose stack that can't communicate.
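For example, a rough sketch of those checks (the network and container names below are placeholders, substitute whatever Compose created for your project):

# check that a plain (non-compose) container can reach a compose service over the project network
docker network ls | grep myproject_default
docker run --rm --network myproject_default alpine ping -c 3 service1

# collect inspect output to compare a working state against a broken one
docker inspect myproject-service1-1 > service1.inspect.json
docker network inspect myproject_default > network.inspect.json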
Thanks @iquito - a way to repro the issue would be ideal, there doesn't seem to be any issue with a simple compose file that just starts a couple of containers on a network. Otherwise, the inspect outputs and logs mentioned above might tell us something.
If it's happening on a swarm node (?), it could be https://github.com/moby/moby/issues/51491. (That'd make it separate from the original issue, reported against docker 28.5.1.)
We have Docker 29.0.0 / compose 2.40.3 and also see internal name resolution problems, but so far only on one of three servers, as far as I can tell. They all run Debian 12 with a closely analogous setup though, so differences might be hard to tease apart.
Processes report an errno = -11 when that happens. According to netdb.h that would be EAI_SYSTEM. Interestingly, it works for us immediately after process start, but errors develop after somewhere around 40-50 minutes of run time.
Restart fixes it, for a time.
Just observed it in action – internal DNS in the compose environment seems to fail pretty precisely 40 minutes after each container start, and only for some containers, and on some hosts. In this case it was a service built on server-side Dart.
@dilbernd can you please confirm this issue only applies to containers created by compose, and that you don't get any issue running a container with a plain docker run --network xx ... command?
More facts found in internal communication and experimentation:
- Other hosts are affected as well; that only one env was affected turned out to be a miscommunication.
- Downgrading docker-ce to 28.5.2 did not solve the issue for us. However, we only restarted the containers that had been started under 29.0.0 and did not recreate them (which we cannot easily do due to operational constraints), so a metadata issue may be involved. Compose is still at 2.40.3.
- The issues do not seem to be exclusively DNS related: we have a db <- service1 <- service2 structure. "Fixing" the issue for ~half an hour requires restarting the db and service2, not service1. The DB is only connected to from service1, and service1 is only connected to from service2. The DNS error was discovered in service2; service1 does not seem to have that problem, and the DB does not attempt to resolve clients (explicitly set in its config) but also seems to end up in a bad state.
- The issue has occurred between ~25 and ~50 minutes after container start so far.
@ndeloof I have another container on that network now doing docker run --rm -ti --network affected-network alpine watch ping -c 1 service1 – should be enough, right?
Thanks @dilbernd,
Processes report an errno = -11 when that happens.
Any idea what system call is returning that errno?
The Issues seem to not be exclusively DNS related: [...], and DB does not attempt to resolve clients (explicitly set in config), but also seems to come into a bad state.
What is that bad state / what's not working apart from DNS?
Have you managed to collect any container/network inspect outputs or daemon logs?
@robmry
Any idea what system call is returning that errno?
No, I can't properly strace in the prod env. My guess would be that it's calling through to gethostbyname in the libc, since it happens during name resolution, but admittedly that's a guess. I haven't looked at the implementation details there or in our code so far. We don't have a stack trace logged for that error, so it would take a while to say with certainty.
What is that bad state / what's not working apart from DNS?
Hard to tell; it doesn't seem to log anything around that time in service1 or db. It's just an observation by a colleague that restarting service2 does not fully resolve the error, but restarting only the db also makes it work again.
Oh yeah,
Have you managed to collect any container/network inspect outputs or daemon logs?
I can only provide heavily redacted output, are there any values you’re interested in in particular?
Ok, the errno 11 is from name resolution - I wondered if it was related to the other issues you mentioned. So there may not be a non-DNS issue, we don't know yet. (For example, maybe restarting the db container restores its DNS entry.)
I can only provide heavily redacted output, are there any values you’re interested in in particular?
Hard to say, we've not got much to go on yet. But from network/container inspect outputs, any differences between when it's working and after it's failed might be interesting. Daemon (or host) logs from the point where it fails might be useful.
Daemon logs (with debug enabled) from a failed DNS lookup could be good.
Does anything else happen on the system at the point where it fails ... maybe another container or service starting/stopping, a Docker daemon or firewall reload, anything like that?
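For reference, one way to capture that is to enable daemon debug logging and follow the logs while reproducing a failed lookup (a sketch assuming a systemd-based install; merge the setting by hand if /etc/docker/daemon.json already has other options):

# enable debug logging for dockerd
echo '{ "debug": true }' | sudo tee /etc/docker/daemon.json
sudo systemctl restart docker        # or send SIGHUP to dockerd to reload the config
# follow the daemon logs while triggering the failing DNS lookup
journalctl -u docker.service -f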
Ok, the errno 11 is from name resolution - I wondered if it was related to the other issues you mentioned. So there may not be a non-DNS issue, we don't know yet. (For example, maybe restarting the db container restores its DNS entry.)
Yeah, very possible. It's very unclear to us what causes this, since the only error that actually shows up in logs is in one specific type of service, the backend Dart HTTP server.
The thing is that service2 and the DB never talk to each other; service2 only talks to service1, which talks to the DB, and that one is a happy camper in the middle that does not require a restart to restore service, which makes it even more confusing.
Does anything else happen on the system at the point where it fails ... maybe another container or service starting/stopping, a Docker daemon or firewall reload, anything like that?
No, nothing. I'm looking into what I can make observable aside from that.
@ndeloof I have another container on that network now doing docker run --rm -ti --network affected-network alpine watch ping -c 1 service1 – should be enough, right?
This has now run for 2 hours without issue. I don't think that tells us much about docker vs compose though, since most compose services here also seem to use the network and DNS without issue.
It seems to take very particular behaviour to trigger.
The bigger indication (IMO) that it's compose is that on one machine we have already downgraded docker-ce, but not docker-compose-plugin, from the official apt repos, and the issue still reoccurs there.
Also experiencing this problem. Test that failed:
docker-compose.yml:
services:
  a:
    image: alpine:3.20
    command: ["sh", "-c", "sleep 1000000"]
  b:
    image: alpine:3.20
    command: ["sh", "-c", "sleep 1000000"]
docker compose up -d
docker compose exec a sh -c "apk add --no-cache bind-tools >/dev/null && getent hosts b || nslookup b"
Output:
docker compose exec a sh -c "apk add --no-cache bind-tools >/dev/null && getent hosts b || nslookup b"
;; Got SERVFAIL reply from 127.0.0.11
Server:   127.0.0.11
Address:  127.0.0.11#53

** server can't find b: SERVFAIL
Downgrading from 29.0.0 to 28.5.2 fixes internal DNS issues...
Thanks @blackadar ... I'm not able to reproduce the issue using that compose file - does it fail every time for you?
Could you enable debugging and send the logs?
We have downgraded our servers to staunch the bleeding and recreated the affected containers under the old version:
Client: Docker Engine - Community
Version: 28.5.2
[…]
Server: Docker Engine - Community
Engine:
Version: 28.5.2
[…]
containerd:
Version: v1.7.29
GitCommit: 442cb34bda9a6a0fed82a2ca7cade05c5c749582
runc:
Version: 1.3.3
GitCommit: v1.3.3-0-gd842d771
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Compose version:
Docker Compose version v2.40.2
However, we still observe the same problem(s), albeit a bit more rarely, since the new version was not involved with anything related to these containers, except for the docker networks, which could not be recreated in place. Maybe that narrows it down: might it be the network definitions and the metadata associated with them?
I'd be interested to know whether the people who reported it as fixed downed and re-upped their whole environments, including the docker networks.
We have also recently started seeing name resolution errors with errno = 24 (out of file descriptors); services that managed with the default (1024) before the upgrade have suddenly started seeing spikes that occasionally even 64K descriptors did not suffice for. We cannot be totally confident that it's not related to changes on our side, so I wanted to ask whether anyone else is seeing vastly higher fd consumption, or whether that is even potentially related to the issue under discussion here (it might just come up in a similar context by blind chance)?
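(For what it's worth, a rough way to compare fd usage per container, sketched below; service2 is just a placeholder for the container name:)

# resolve the container's main PID, then check the limits it sees and its current fd count
pid=$(docker inspect --format '{{.State.Pid}}' service2)
grep 'open files' /proc/$pid/limits   # soft and hard limits as seen by the process
sudo ls /proc/$pid/fd | wc -l         # number of descriptors currently open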
At this point we're wondering whether the issues could also be related to the new kernel release stemming from the recent kernel security advisories, which is actually what prompted the apt upgrade that preceded the problems in the first place.
Did the people who downgraded roll back all the recent upgrades, or only docker?
We have also recently started seeing name resolution errors with errno = 24 (out of file descriptors); services that managed with the default (1024) before the upgrade have suddenly started seeing spikes that occasionally even 64K descriptors did not suffice for. We cannot be totally confident that it’s not related to changes on our side
This is a good observation :)
From the looks of it, Docker 29.0.0 is when Docker Engine finally upgraded from the containerd 1.7.x series to 2.x (the release notes are a bit vague on the previous version, and the linked PR seems wrong as it's about CI usage and lists 2.x as the previous version, while the v28 series mentions 1.7.x).
containerd 2.0 carries a change that can be considered breaking: Docker Engine itself landed the equivalent change on its side back in the v25 release, removing LimitNOFILE=infinity from its systemd service file.
I've been waiting for this to land myself for a complete fix, but I am aware of it affecting large enterprise-scale deployments, such as when Amazon adopted the change early and reverted it due to the impact on their customers.
For a quick gist of it: prior to the change in both projects, they bumped not only the hard limit for file descriptors but also the soft limit, and infinity was too large on various Linux hosts due to a systemd change in late 2018 IIRC. That resulted in a soft limit of over 1 billion, which regressed quite a few services when containerized (causing OOM or significant processing delays with unnecessary CPU load).
Some services like Envoy relied upon the bug implicitly at the time, as they apparently needed more than a million FDs (which is also the default hard limit in Debian IIRC). Near the time of this change landing in Docker v25, Go also made a change to automatically raise the soft limit to the hard limit (although this had some conditions IIRC, so if Docker Compose relied upon that but it wasn't applicable, it may be stuck at 1024). The default hard limit is inherited from systemd, which should be about half a million, while the soft limit should be 1024 (for compatibility reasons).
If this is the cause, the affected application needs to raise the soft limit at runtime in its code (the proper fix), but you could also override the containerd systemd service file to have LimitNOFILE=infinity and restart that service as a quick test. For software that can raise the soft limit (as Go does implicitly for developers), if the default hard limit is not high enough you could bump that to infinity AFAIK and you'd be alright, but the soft limit should only be bumped per service (Nginx, for example, keeps the 1024 limit but raises it for child processes where appropriate).
If you would like further insight into the change at both projects, I am the PR author of both and pushed for the change.
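For anyone wanting to try the quick test described above, a minimal sketch (assuming a systemd host with the standard containerd.service unit; this restores the old unlimited fd limit for testing only):

# add a drop-in that overrides the containerd unit's fd limit
sudo mkdir -p /etc/systemd/system/containerd.service.d
printf '[Service]\nLimitNOFILE=infinity\n' | sudo tee /etc/systemd/system/containerd.service.d/limit-nofile.conf
sudo systemctl daemon-reload
sudo systemctl restart containerd
# containers created after the restart inherit the raised limit, so recreate the affected services
docker compose up -d --force-recreate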
@mschop Are you using docker compose "profiles"? https://docs.docker.com/compose/how-tos/profiles/
In my case that's what seems to break networking. i.e. Given the docker compose file
docker-compose.development.yaml
x-default-environment: &default-environment
  NODE_ENV: development
  TZ: "UTC"
  DB_HOST: db
  DB_USER: sa
  DB_NAME: elcc_development
  DB_PASS: &default-db-password DevPwd99!
  DB_PORT: &default-db-port 1433
  DB_TRUST_SERVER_CERTIFICATE: "true"
  DB_HEALTH_CHECK_INTERVAL_SECONDS: 5
  DB_HEALTH_CHECK_TIMEOUT_SECONDS: 10
  DB_HEALTH_CHECK_RETRIES: 3
  DB_HEALTH_CHECK_START_PERIOD_SECONDS: 5

services:
  api:
    build:
      context: ./api
      dockerfile: development.Dockerfile
    env_file:
      - ./api/.env.development
    environment:
      <<: *default-environment
      RELEASE_TAG: ${RELEASE_TAG:-development}
      GIT_COMMIT_HASH: ${GIT_COMMIT_HASH:-not-set}
    tty: true # allows attaching debugger, equivalent of docker exec -t
    init: true
    # stdin_open: true # equivalent of docker exec -i
    ports:
      - "3000:3000"
    volumes:
      - ./api:/usr/src/api
      - ./.gitignore:/usr/src/.gitignore
      - ./.prettierrc.yaml:/usr/src/.prettierrc.yaml
    depends_on:
      - db

  web:
    build:
      context: ./web
      dockerfile: development.Dockerfile
    environment:
      <<: *default-environment
      VITE_API_BASE_URL: "http://localhost:3000"
    ports:
      - "8080:8080"
    volumes:
      - ./web:/usr/src/web
      - ./.gitignore:/usr/src/.gitignore
      - ./.prettierrc.yaml:/usr/src/.prettierrc.yaml
    depends_on:
      - api

  test_api:
    build:
      context: ./api
      dockerfile: development.Dockerfile
    command: /bin/true
    env_file:
      - ./api/.env.development
    environment:
      <<: *default-environment
      NODE_ENV: test
      DB_NAME: elcc_test
      DB_HEALTH_CHECK_START_PERIOD_SECONDS: 0
    tty: true
    volumes:
      - ./api:/usr/src/api
    depends_on:
      - db

  test_web:
    build:
      context: ./web
      dockerfile: development.Dockerfile
    command: /bin/true
    environment:
      <<: *default-environment
      NODE_ENV: test
    tty: true
    volumes:
      - ./web:/usr/src/web

  db:
    image: mcr.microsoft.com/mssql/server:2019-CU28-ubuntu-20.04
    user: root
    environment:
      <<: *default-environment
      DB_HOST: "localhost"
      MSSQL_SA_PASSWORD: *default-db-password
      ACCEPT_EULA: "Y"
    ports:
      - "1433:1433"
    volumes:
      - db_data:/var/opt/mssql/data

  # For easily generating large PlantUML diagrams
  # Not relevant to production environment.
  # Accessible at http://localhost:9999
  plantuml:
    image: plantuml/plantuml-server:jetty
    ports:
      - 9999:8080
    environment:
      PLANTUML_LIMIT_SIZE: 8192
    profiles:
      - design

volumes:
  db_data:
docker compose up plantuml no longer works with the other services running. I can now only boot one profile at a time.
Previously, I could simply boot new services even if they had a different "profile".
docker compose up plantuml no longer works with the other services running. I can now only boot one profile at a time. Previously, I could simply boot new services even if they had a different "profile".
Just to verify as you haven't stated it, did you try with the profile for that service commented out so it's always up regardless? I assume it works then? What if you add another profile to a different service? Does it still work?
A simpler / smaller reproduction would be better to isolate that, if it's really the case. traefik/whoami is probably all you need on different ports with the multiple profiles.
docker-ce v29.0.1 seems to have fixed my networking issue, so it wasn't related to compose, at least in my case.
docker compose up plantuml no longer works with the other services running. I can now only boot one profile at a time. Previously, I could simply boot new services even if they had a different "profile".
Just to verify as you haven't stated it, did you try with the profile for that service commented out so it's always up regardless? I assume it works then? What if you add another profile to a different service? Does it still work?
A simpler / smaller reproduction would be better to isolate that, if it's really the case.
traefik/whoami is probably all you need on different ports with the multiple profiles.
@polarathene Sorry! That was sloppy of me. Here is a minimal example using traefik/whoami. I'm not sure it actually reproduces the same issue, but it definitely fails in the same way.
- Create the file docker-compose.yaml:
services:
  # Always running (no profile)
  service-a:
    image: traefik/whoami
    ports:
      - "8081:80"
    container_name: service-a

  # Profile: extras
  service-b:
    image: traefik/whoami
    ports:
      - "8082:80"
    container_name: service-b
    profiles:
      - extras
- Boot the container via docker compose up; it will build the first time.
- Boot the secondary container via docker compose up service-b. This will work the first time.
- Stop the containers via docker compose down or ctrl+c.
- Remove any trailing stuff via docker compose down -v just to make sure it's a clean setup.
- Boot the app again with docker compose up.
- Boot the second container with docker compose up service-b. This will now fail with the message:
Attaching to service-b
Error response from daemon: failed to set up container networking: network 86cb754f592cdb1e4cf71441a8d9e207f53e1aad20b188b998484ae5857eefa6 not found
It doesn't seem to happen 100% of the time, so maybe it's just that the network for the service-b profile isn't cleaned up by docker compose down since that's a different profile?
It appears as though doing a docker compose down service-b will clean up the conflicting network ... so maybe this is me just using docker compose profiles incorrectly?
In regards to
did you try with the profile for that service commented out so it's always up regardless? I assume it works then?
Yes, when I remove the profile entirely it works as normal.
@klondikemarlen We do not use profiles.
But: I was now able to find the root cause (for our problem).
We have a base service using a network alias:
site1_server.local:
  networks:
    default:
      aliases:
        - oss-0193244c-0285-75a8-96c5-eed41f6dd5db.local
and a second service extending the first one:
site2_server.local:
  extends:
    service: site1_server.local
  networks: !override
    default:
      aliases:
        - oss-01932034-41e5-75e1-9d5c-47cd2867ee5b.local
When running docker compose config, in docker compose v2.40.2 the service site2_server.local has the following aliases:
networks:
  default:
    aliases:
      - oss-01932034-41e5-75e1-9d5c-47cd2867ee5b.local
In version v2.40.3 docker compose config outputs the following:
networks:
  default:
    aliases:
      - oss-0193244c-0285-75a8-96c5-eed41f6dd5db.local
      - oss-01932034-41e5-75e1-9d5c-47cd2867ee5b.local
This explains our problem: other services trying to call oss-0193244c-0285-75a8-96c5-eed41f6dd5db.local are connecting to the wrong service.
👉 So the conclusion is that !override behaves differently since v2.40.3.
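(A quick way to pin this down, as a sketch: render the canonical config with each plugin version and diff the result.)

# with docker-compose-plugin 2.40.3 installed
docker compose config > config-2.40.3.yaml
# downgrade the plugin to 2.40.2, then
docker compose config > config-2.40.2.yaml
diff config-2.40.2.yaml config-2.40.3.yaml   # the extra inherited alias shows up under the service's networks section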
@klondikemarlen I think that's a potentially valid bug to raise (if it changed with 2.40.3 it may have been an intentional fix for something else, such that you can't have both use-cases satisfied with default logic).
The actual cause you want to report is docker compose down removing a network that still has containers configured for it (such as one that is stopped from CTRL + C). You can avoid that bug by ensuring the stopped container is recreated and assigned the newly created network with docker compose up --force-recreate ....
8. Boot the second container with docker compose up service-b. This will now fail with message:
Attaching to service-b
Error response from daemon: failed to set up container networking: network 86cb754f592cdb1e4cf71441a8d9e207f53e1aad20b188b998484ae5857eefa6 not found
It doesn't seem to happen 100% of the time, so maybe it's just that the network for the service-b profile isn't cleaned up by docker compose down since that's a different profile?
This isn't so much an issue about profiles. You'll get the same problem if you use docker compose up with each service individually, similar to how you did for service-b.
The reason this happens is that both containers are brought up on the same network, but when you bring one of them down that network gets removed. docker compose down will destroy the container and perform cleanup operations.
However, when you have scoped these operations, such as to individual services (or used profiles to filter them out of the default set), then from Docker Compose's point of view all containers associated with that network were brought down, so the network could be removed. It probably shouldn't have been, though, as the other service's container still exists (even if it's stopped, and visible in docker container ls -a).
Anyway, because of this, when you try to bring the stopped container back up it is still configured for the now-removed network and fails because that network doesn't exist anymore. You can avoid this with --force-recreate, which destroys the existing container and replaces it with a new one.
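In practice that recovery looks something like this (a sketch using the service name from the example above):

# the old service-b container still references the deleted network; recreate it against the new one
docker compose up -d --force-recreate service-b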
Worth noting: docker compose down service-a will still remove the network; even if there is no container associated with service-a, it still performs the cleanup task. When you start service-b you will notice a new network is created, but it fails without --force-recreate since the existing service-b container is still associated with the removed network (they share the same network name, but the actual network ID differs, which is what matters).
I learned about the importance of --force-recreate in other projects where some images produce containers that don't handle restarts well, because their entrypoints run container initialization that mutates internal state. When you CTRL + C a container it is not equivalent to docker compose down (which removes the container); you get the effect of restarting the container, so it keeps any internal changes (no volume required). It's similar to the importance of docker compose down -v for removing any unexpected volumes (like those from images built with the VOLUME instruction, which persist data across changes to a compose service via an implicit anonymous volume).
For added context: when you change a network setting such as its subnet in your compose config, --force-recreate alone won't be sufficient when the network hasn't been removed (in a typical deployment, rather than the one we've been discussing where it was), because --force-recreate only applies to containers, not networks. In that scenario you do need docker compose down or similar to remove the existing network, so that it's recreated on the next up with your settings in place.
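A sketch of that last scenario (names illustrative; note this removes the project's containers and networks):

# --force-recreate replaces containers but not networks, so a changed subnet needs the network removed first
docker compose down      # removes the project's containers and its networks
docker compose up -d     # recreates the network with the new settings and attaches fresh containers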