Container fails to start because of missing network

Open travnick opened this issue 1 year ago • 2 comments

Description

I have issues with disappearing networks from time to time. Containers fail to start with an error message like this:

Error response from daemon: Cannot restart container xxx-ci-runner-1: network bfa07ea9ec2242cdbf06521090d8a8d1e39f6b1461718dade544babf5997ecb1 not found

There are three Windows 10 Pro (22H2) machines with Windows containers running (3-4 containers on each machine).

The issue appears randomly on each machine and randomly for each container, so a single machine can have one broken container and three running ones.

The same issue occurred with an external network:

networks:
  default:
    external: true
    name: nat

It also occurs with a separate network (the defaults, without explicitly defining a networks section in docker-compose.yml). With the separate network, it seems to occur less often.

It looks similar to, or the same as, https://github.com/docker/for-win/issues/3076

I have been facing this issue since I first started using Docker for my solution. As far as I remember, that was version 20. Versions 26, 25, and 24 are definitely affected.

Reproduce

No idea. After some time (probably after a machine restart, but I'm not sure that is the only trigger) the container is stopped and its network is missing.
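One way to spot this state before a container fails to start is to compare the network IDs each container references with the networks the daemon currently knows about. The following is only a rough sketch, not part of the original report: the helper names `missing_networks`, `docker`, and `report` are made up for illustration, and it assumes that `docker inspect` still reports the stored `NetworkID` for stopped containers on Windows.

```python
import json
import subprocess


def missing_networks(inspect_data, existing_ids):
    """Map container name -> attached network names whose ID no longer exists.

    inspect_data: parsed JSON list as printed by `docker inspect <containers>`.
    existing_ids: set of full network IDs currently known to the daemon.
    """
    broken = {}
    for container in inspect_data:
        name = container.get("Name", "").lstrip("/")
        networks = (container.get("NetworkSettings") or {}).get("Networks") or {}
        gone = [
            net_name
            for net_name, settings in networks.items()
            if settings.get("NetworkID") and settings["NetworkID"] not in existing_ids
        ]
        if gone:
            broken[name] = gone
    return broken


def docker(*args):
    """Run the docker CLI and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        ["docker", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def report():
    """Print every container that references a network that is gone."""
    existing = set(
        docker("network", "ls", "--no-trunc", "--format", "{{.ID}}").split()
    )
    container_ids = docker("ps", "-a", "-q").split()
    if not container_ids:
        return
    data = json.loads(docker("inspect", *container_ids))
    for name, nets in missing_networks(data, existing).items():
        print(f"{name}: missing network(s) {', '.join(nets)}")
```

Calling `report()` on an affected machine should list the containers that would hit the "network ... not found" error, provided the assumption about stopped containers holds.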

Expected behavior

No response

docker version

Client:
 Version:           26.0.0
 API version:       1.45
 Go version:        go1.21.8
 Git commit:        2ae903e
 Built:             Wed Mar 20 15:18:56 2024
 OS/Arch:           windows/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.0.0
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.8
  Git commit:       8b79278
  Built:            Wed Mar 20 15:17:49 2024
  OS/Arch:          windows/amd64
  Experimental:     false

docker info

Client:
 Version:    26.0.0
 Context:    default
 Debug Mode: false
 Plugins:
  compose: Docker Compose (Docker Inc.)
    Version:  v2.26.1
    Path:     C:\Users\my_user\.docker\cli-plugins\docker-compose.exe

Server:
 Containers: 3
  Running: 2
  Paused: 0
  Stopped: 1
 Images: 5
 Server Version: 26.0.0
 Storage Driver: windowsfilter
  Windows:
 Logging Driver: json-file
 Plugins:
  Volume: local
  Network: ics internal l2bridge l2tunnel nat null overlay private transparent
  Log: awslogs etwlogs fluentd gcplogs gelf json-file local splunk syslog
 Swarm: inactive
 Default Isolation: hyperv
 Kernel Version: 10.0 19045 (19041.1.amd64fre.vb_release.191206-1406)
 Operating System: Microsoft Windows Version 22H2 (OS Build 19045.4291)
 OSType: windows
 Architecture: x86_64
 CPUs: 4
 Total Memory: 15.72GiB
 Name: FP-PC2800
 ID: 7a03009d-0d5d-4b92-8c23-376259f9a2f8
 Docker Root Dir: C:\ProgramData\docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Diagnostics ID

no such option

Additional Info

No response

travnick avatar Apr 10 '24 14:04 travnick

I have the exact same issue on Windows Server 2022 with Windows containers and compose as well.

It seems to happen when you restart the machine. The nat network seems to be recreated on restart with a different network ID in the backend. Because the containers were created before the reboot and still reference the old network ID, they fail to start afterwards. The only way I can get the containers to start again is to manually run docker-compose down and then docker-compose up -d. The problem is that you have to script this via Task Scheduler (when it should just work), and if any state is kept in the running containers, bringing them down and up again might remove it.
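The down-then-up workaround described above could be scripted roughly like this. It is a minimal sketch under assumptions: the function names are hypothetical, it uses the `docker compose` plugin form rather than the standalone `docker-compose` binary, and the compose file path is whatever your project uses. Note that `down` removes the containers, so any state kept only inside them is lost, as mentioned above.

```python
import subprocess


def recreate_commands(compose_file):
    """Build the two Compose invocations of the workaround: down, then up -d."""
    base = ["docker", "compose", "-f", compose_file]
    return [base + ["down"], base + ["up", "-d"]]


def recreate(compose_file, run=subprocess.run):
    """Run the workaround; `run` is injectable so the logic can be dry-run."""
    for cmd in recreate_commands(compose_file):
        # check=True stops at the first failing command instead of continuing.
        run(cmd, check=True)
```

A script like this could be invoked at startup, for example from Task Scheduler, with `recreate("docker-compose.yml")`.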

This issue needs to be fixed, as it's a pretty fundamental feature that is broken.

sikhness avatar May 25 '24 16:05 sikhness

Yeah, I have the same issue. After a reboot, the custom networks are gone. We are not using Docker Compose or anything like it; we normally create the container for Business Central with NavContainerHelper. That's really annoying. When will it be fixed?

DobbyNator94 avatar Mar 14 '25 09:03 DobbyNator94

Hi everyone,

I was able to solve the issue by restoring the HNS networks after a server reboot. This prevents Docker from losing its network configurations on restart, allowing containers to start as expected without any manual intervention.

As a result, any previous network-related workarounds are no longer necessary.

I've made this solution available as a Docker plugin, which can also automatically register itself with the Windows Task Scheduler to ensure the networks are restored automatically after every Windows boot.

You can find the plugin here: https://github.com/Masterwow3/docker-netrestore

Masterwow3 avatar May 28 '25 09:05 Masterwow3