"endpoint with name XX already exists in network" can't disconnect container from bridge

Open browncrane opened this issue 5 years ago • 24 comments

Description

First, mycontainer could not be stopped with docker stop mycontainer. After docker rm -f mycontainer, a container with the same name could not be run; it failed with docker: Error response from daemon: endpoint with name mycontainer already exists in network bridge. I tried docker network disconnect bridge mycontainer, but the error persisted, and docker network inspect bridge still listed mycontainer.

Steps to reproduce the issue:
1. docker rm -f somecontainer
2. Try to run the same container again
3. It can't start

Describe the results you received: docker: Error response from daemon: endpoint with name mycontainer already exists in network bridge.

Describe the results you expected: start normally

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:           18.09.5
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        e8ff056
 Built:             Thu Apr 11 04:43:34 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.5
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       e8ff056
  Built:            Thu Apr 11 04:13:40 2019
  OS/Arch:          linux/amd64
  Experimental:     false

Output of docker info:

Containers: 2
 Running: 1
 Paused: 0
 Stopped: 1
Images: 12
Server Version: 18.09.5
Storage Driver: overlay2
 Backing Filesystem: xfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-957.10.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.692GiB
Name: 35.localdomain
ID: LZQO:XNBR:HQBY:AUOJ:G4VG:4SYY:FQBB:SI2W:SPU2:4D56:GHAN:HHHU
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Additional environment details (AWS, VirtualBox, physical, etc.): VirtualBox

browncrane avatar May 17 '19 02:05 browncrane

After service docker restart, things back to normal.

browncrane avatar May 17 '19 02:05 browncrane

/cc @arkodg

thaJeztah avatar May 21 '19 16:05 thaJeztah

Hi @browncrane, I would really appreciate it if you could share the exact steps to recreate this issue.

docker run --name somecontainer -d alpine
Step2
Step3
.........
docker rm -f somecontainer

arkodg avatar May 21 '19 16:05 arkodg

Hi @arkodg, the container was started quite a long time ago, so I may be missing some details. I think it ran normally for a while. Later I replaced the jar file in it, and after that it couldn't be stopped; that is probably my application's problem. So I had to use rm -f. That led to the state described here: I tried to disconnect the "ghost" container, but it did not work.

The run command is: docker run -dit --restart always --name dems --mount type=bind,src=/home/dockerlogs/dems,target=/log -p 9132:9132 -p 5005:5005 dems

browncrane avatar May 22 '19 07:05 browncrane

$ history | grep dems
    2  docker cp dems-1.0-SNAPSHOT.jar dems:/app.jar
    3  sudo docker cp dems-1.0-SNAPSHOT.jar dems:/app.jar
    4  sudo docker start dems
    6  docker logs dems
    7  sudo docker logs dems
   14  sudo docker cp dems-1.0-SNAPSHOT.jar dems:/app.jar
   17  docker stop dems
   22  docker rm -f dems

My user account was not in the docker group at first. But if docker network disconnect had failed for lack of privileges, I would have noticed and rerun it with sudo.

browncrane avatar May 22 '19 07:05 browncrane

Ran into the same issue, any update?

CrazyNash avatar Sep 30 '19 06:09 CrazyNash

@CrazyNash can you please share the repro steps as well?

arkodg avatar Sep 30 '19 21:09 arkodg

We are running into the same issue on multiple systems:

docker info
Client:
 Debug Mode: false

Server:
 Containers: 5
  Running: 4
  Paused: 0
  Stopped: 1
 Images: 49
 Server Version: 19.03.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: journald
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc version: 425e105d5a03fabd737a126ad93d62a9eeede87f
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 4.15.0-54-generic
 Operating System: Ubuntu 18.04.2 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.56GiB
 Name: <hostname withheld>
 ID: MX5M:CAI7:4IYC:ZZ5E:3VJH:MERS:OCHO:ZJN6:DOHZ:ESEP:OYHA:BU7X
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No swap limit support

Once the system is in this state, we are unable to start a container with the same name, even though the previous container was removed with docker rm -f <container-name>.

The container was started with:

docker run -it --name=maint-gui --hostname=maint-gui --privileged=true --detach=true -p 4444:4444 -p 5900:5900 ubuntu-firefox-image:latest

This container only reaches a created state.

Starting a second container with a different name results in:

docker run -it --name=maint-gui2 --hostname=maint-gui2 --privileged=true --detach=true -p 4444:4444 -p 5900:5900 ubuntu-firefox-image:latest
docker: Error response from daemon: driver failed programming external connectivity on endpoint maint-gui2 (e54af42ad0864a3d967c5402bdf2199c505fc716f12ad2c5ef9c32bee38b24f7): Bind for 0.0.0.0:5900 failed: port is already allocated.

Other containers are starting / stopping normally during this time.

docker daemon logs at this time:

Dec 04 02:28:37 <hostname withheld> dockerd[3143]: time="2019-12-04T02:28:37.494182240-07:00" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Dec 04 02:28:37 <hostname withheld> dockerd[3143]: time="2019-12-04T02:28:37.754479753-07:00" level=warning msg="f2044708ce36ced782ab74610b5e6c17dcc348564d0f9bf769a3ea9757310e8a cleanup: failed to unmount IPC: umount /var/lib/docker/containers/f2044708ce36ced782ab74610b5e6c17dcc348564d0f9bf769a3ea9757310e8a/mounts/shm, flags: 0x2: no such file or directory"
Dec 04 02:28:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:28:40.421704364-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:28:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:28:40.549398377-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:28:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:28:40.680705407-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
--
Dec 04 02:29:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:40.473362896-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:29:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:40.612798129-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:29:40 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:40.750933343-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:29:43 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:43.727075980-07:00" level=warning msg="689c8a431bd32e6889588b2270bcbed0d0f4c485891029b8b55a54de6a516d2b cleanup: failed to unmount IPC: umount /var/lib/docker/containers/689c8a431bd32e6889588b2270bcbed0d0f4c485891029b8b55a54de6a516d2b/mounts/shm, flags: 0x2: no such file or directory"
Dec 04 02:29:43 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:43.886848463-07:00" level=error msg="689c8a431bd32e6889588b2270bcbed0d0f4c485891029b8b55a54de6a516d2b cleanup: failed to delete container from containerd: no such container"
Dec 04 02:29:45 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:45.376542902-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:29:45 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:45.515083420-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
Dec 04 02:29:45 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:45.652682289-07:00" level=warning msg="Error while looking up for image <image name withheld>:latest"
--
Dec 04 02:29:58 <hostname withheld> dockerd[3143]: time="2019-12-04T02:29:58.811592136-07:00" level=info msg="Container 60eeb8b35919045b8a0e91952eb9a0a7b1107d9557e1d047e4e29751a9ad6e0b failed to exit within 10 seconds of signal 15 - using the force"

e2designs avatar Dec 05 '19 17:12 e2designs

@e2designs your error-message is different, and (as far as I can see) is expected, because port 5900 is already assigned to the first container;

Bind for 0.0.0.0:5900 failed: port is already allocated

thaJeztah avatar Dec 09 '19 12:12 thaJeztah

@thaJeztah I understand that the error message states that it is already allocated. The container that was using 5900 had been torn down and removed. There is no indication within the docker daemon that it is still allocated and no containers show in a docker ps -a. Starting a new container with a different port also fails.

It appears to be a partial cleanup issue.

e2designs avatar Dec 09 '19 15:12 e2designs

I have run into the same issue. My server running jwilder/nginx-proxy with docker-letsencrypt had run out of disk space, so it stopped updating certificates:

nginx-letsencrypt    | sed: can't create temp file '/etc/nginx/vhost.d/defaultXXXXXX': No space left on device
nginx-letsencrypt exited with code 0

I freed up a terabyte, but restart did not work:

docker-compose up nginx-letsencrypt
Starting nginx-letsencrypt ... error

ERROR: for nginx-letsencrypt  Cannot start service nginx-letsencrypt: b'endpoint with name nginx-letsencrypt already exists in network nginxproxy_default'

ERROR: for nginx-letsencrypt  Cannot start service nginx-letsencrypt: b'endpoint with name nginx-letsencrypt already exists in network nginxproxy_default'

sudo service docker restart helped.

Redsandro avatar Dec 25 '19 09:12 Redsandro

Ran into this after recreating a container that had locked up and wasn't responding to stops or SIGKILL:

ERROR: for mongo Cannot start service mongo: endpoint with name integration-test-mongo already exists in network

Command history:

docker ps
docker stop e6cbee8efb39 6b05798cf949 a198a43ee674 e5d6084b4625 //stopping a bunch of stuff
docker ps
docker stop a198a43ee674 //huh, it didn't stop, this was the problem mongo container
docker ps
docker stop -f a198a43ee674 //thought f was an option to force
docker stop --help //figured out it wasn't
docker stop -t 1 a198a43ee674
docker ps
docker kill a198a43ee674 //trying to kill it
docker ps
docker kill a198a43ee674 //it was still there
docker kill --signal=SIGKILL a198a43ee674
docker ps //still there after SIGKILL
docker rm a198a43ee674
docker rm -f a198a43ee674 //finally was able to force remove

docker info

 Server Version: 19.03.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.9.184-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64

The container happened to be mongo, but I don't think it matters.

mongo:
  image: mongo:4.1.4
  container_name: integration-test-mongo
  hostname: mongo
  ports:
    - '27017:27017'

Restarting the service (obviously) resets the network, and then allows me to relaunch the container.

Might it be that a forced rm of a running container doesn't clean up its network connection?

aedelbro avatar Jan 07 '20 14:01 aedelbro

I faced the same problem.

autostand-> docker ps -a
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS                    NAMES
846c9a39652f        jarkt/docker-remote-api   "/bin/sh -c 'socat T…"   17 months ago       Up About an hour    0.0.0.0:8888->2375/tcp   remote_access

autostand-> docker run -d --network TestNet6 --name DstSF801_19 dummy_host
60e3e1891d0adab728d070c5a4763206cf59ad2f08d51c7c81d8349dda8a4f95
docker: Error response from daemon: endpoint with name DstSF801_19 already exists in network TestNet6.

As far as I understand, this happens when Docker force-kills a container but, for some reason, the container is still listed in the network's information:

autostand-> docker network inspect TestNet6 
        "Containers": {
            "755c9aebc12682e7a3c171eaeaebc24909d921c9c62d073719ca2ad326818dff": {
                "Name": "DstSF801_19",
                "EndpointID": "c746007b03b8478fc22474064abd53929ef070b93c4dd53299448fc73cedd2a3",
                "MacAddress": "02:42:c0:a8:80:83",
                "IPv4Address": "192.168.128.131/24",
                "IPv6Address": "fd02:2b59:23af:1006::5/64"
            }

A containerd-shim process for that container is also still present:

autostand-> ps ax |grep 755c9aebc12682e7a3c171eaeaebc24909d921c9c62d073719ca2ad326818dff
20145 ?        Sl     0:04 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/755c9aebc12682e7a3c171eaeaebc24909d921c9c62d073719ca2ad326818dff -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc

An attempt to disconnect the endpoint from this network does nothing, because the container was deleted:

autostand-> docker network disconnect TestNet6 DstSF801_19
Error response from daemon: No such container: DstSF801_19

The only way I found to resolve this problem was to restart Docker.
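
The state described above (the name is still listed on the network, but no container with that name exists) can be checked with a small shell function. This is only a sketch under assumptions: is_stale_endpoint is a hypothetical helper name, the docker CLI is assumed to be on PATH, and the endpoint name is assumed to equal the container name, as in the output above.

```shell
# Sketch: report whether NAME is still listed as an endpoint on NETWORK
# even though no container with that name exists.
# is_stale_endpoint is a hypothetical helper name, not a Docker command.
is_stale_endpoint() {
  local network="$1" name="$2"
  local attached existing

  # Names of all endpoints the network still lists, one per line.
  attached="$(docker network inspect \
    --format '{{range .Containers}}{{println .Name}}{{end}}' "$network")" || return 2

  # Names of all containers (running or stopped) known to the daemon.
  existing="$(docker ps -a --format '{{.Names}}')" || return 2

  if printf '%s\n' "$attached" | grep -Fxq -- "$name" &&
     ! printf '%s\n' "$existing" | grep -Fxq -- "$name"; then
    echo "stale: $name is on $network but no such container exists"
    return 0
  fi
  echo "ok: $name is not a stale endpoint on $network"
  return 1
}
```

For the case above it would be called as is_stale_endpoint TestNet6 DstSF801_19. It only diagnoses the state and changes nothing.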

waterwarrior avatar Jan 23 '20 03:01 waterwarrior

After service docker restart, things back to normal.

highly under-rated comment.

5hanth avatar Jan 23 '20 09:01 5hanth

This helped me (I use docker-compose):

First of all, remove the invalid container: docker container rm <container>, or docker-compose rm <service> if you are using docker-compose.

Ensure that the container is gone: docker container ls | grep <container> or docker-compose ps | grep <container>. There must be no such container in the output.

It may still be connected to the network, so disconnect it: docker network disconnect -f <network> <container>. You must use the -f flag to force disconnecting the nonexistent container.

Ensure that it worked: docker network inspect <network> | grep <container>. There must be no such container in the output.

Then you can create and run the new container. In my docker-compose case I ran docker-compose up -d <service>; this command creates the container and automatically adds it to the network.
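
The steps above can be sketched as one shell function for the plain docker case. Treat it as a sketch, not a tested fix: cleanup_stale_endpoint is a hypothetical helper name, and it assumes the endpoint is listed under the container's name.

```shell
# Sketch of the workaround: remove the container, force-disconnect the
# leftover endpoint, then verify that the network no longer lists it.
# cleanup_stale_endpoint is a hypothetical helper name.
cleanup_stale_endpoint() {
  local network="$1" container="$2"

  # 1. Remove the container; ignore the error if it is already gone.
  docker container rm -f "$container" >/dev/null 2>&1 || true

  # 2. Force-disconnect the endpoint. -f is needed because the container
  #    no longer exists. A failure here is not fatal; step 3 decides.
  docker network disconnect -f "$network" "$container" ||
    echo "disconnect reported an error, checking the network anyway" >&2

  # 3. Verify that the endpoint name is no longer listed on the network.
  if docker network inspect \
       --format '{{range .Containers}}{{println .Name}}{{end}}' "$network" |
     grep -Fxq -- "$container"; then
    echo "endpoint $container is still present on $network" >&2
    return 1
  fi
  echo "endpoint $container removed from $network"
}
```

After cleanup_stale_endpoint <network> <container> succeeds, the container can be created again as usual. If it fails, restarting the Docker service is the fallback reported earlier in this thread.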

SavostinVladimir avatar Feb 03 '20 16:02 SavostinVladimir

Hello! I have the same issue: after the container is stopped, its endpoint is still in the network configuration, which produces this error on run:

endpoint with name agitated_nobel already exists in network bridge

See docker network inspect:

$ docker network inspect bridge
....
"ead2e4080f561c2212c00fb1340c2f4532c1bdec5e638671c7a09cad7c40f414": {
                "Name": "agitated_nobel",
                "EndpointID": "f36c06d17e206cbf932a52be0b80f06fd8621b57e98fa77caec83cbeb9584804",
                "MacAddress": "02:42:ac:11:00:0f",
                "IPv4Address": "172.17.0.15/16",
                "IPv6Address": ""
            }
....

And try to find a container:

$ docker ps -a | grep agitated_nobel

Nothing....

Disconnecting it from the network with the -f flag:

$ docker network disconnect -f bridge agitated_nobel

agitated_nobel disappeared from the bridge network.

My docker info:

$ docker info
Containers: 61
 Running: 31
 Paused: 0
 Stopped: 30
Images: 65
Server Version: 17.05.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 apparmor
Kernel Version: 4.9.164-0409164-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 30.42GiB
Name: jenkins-agents-1
ID: QE4A:FTD5:X7KN:CV6G:5OEE:U26I:EDBH:X24J:GOW7:YN6J:CRIH:XL3E
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 229
 Goroutines: 462
 System Time: 2020-02-04T06:53:02.844260111Z
 EventsListeners: 22
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 registry:5000
 registry2:5000
 127.0.0.0/8
Live Restore Enabled: false

ghost avatar Feb 04 '20 06:02 ghost

It may still be connected to the network, so disconnect it: docker network disconnect -f <network> <container>. You must use the -f flag to force disconnecting the nonexistent container.

This is a very helpful addition! Thanks!

danyanya avatar Aug 26 '20 09:08 danyanya

I came across this issue when I started the container with docker-compose up but stopped it with docker stop <id>. I restarted the docker service to remove the stale network configuration. I then used docker-compose up to start the container and docker-compose down to stop it. That did a clean stop, and I did not get this issue when starting the container again.

screenpanda avatar Sep 10 '20 08:09 screenpanda

Guys, I've run into the same issue, and the steps below worked for me.

1 - docker rm -f container_name

2 - docker network disconnect -f network_name container_name

(In my case the name of the network was bridge, but you can list all networks with docker network ls.)

3 - (check if it was removed) - docker network inspect network_name | grep container_name

4 - (if it was removed) - just run the container again.

I hope this information is useful for more people.
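
If it is not obvious which network still holds the name, the docker network ls lookup from step 2 can be turned into a loop. This is a sketch: networks_with_endpoint is a hypothetical helper name, and it assumes the endpoint is listed under the container's name.

```shell
# Sketch: print every network that still lists an endpoint called NAME.
# networks_with_endpoint is a hypothetical helper name.
networks_with_endpoint() {
  local name="$1" network

  docker network ls --format '{{.Name}}' | while IFS= read -r network; do
    # List the endpoint names on this network, one per line, and look
    # for an exact match. stdin is redirected so the loop input is not consumed.
    if docker network inspect \
         --format '{{range .Containers}}{{println .Name}}{{end}}' "$network" </dev/null |
       grep -Fxq -- "$name"; then
      printf '%s\n' "$network"
    fi
  done
}
```

Each network name it prints is a candidate for docker network disconnect -f network_name container_name.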

antoniodesenvolvedor avatar Apr 29 '21 12:04 antoniodesenvolvedor

docker container prune may help recover after a partial cleanup.

bfellman avatar Jun 06 '21 10:06 bfellman

$ docker ps -a | grep  xxxxxxxxx-certserver
$ docker-compose up -d
Creating xxxxxxxxx-certserver ... error

ERROR: for xxxxxxxxx-certserver  Cannot start service xxxxxxxxx-certserver: endpoint with name xxxxxxxxx-certserver already exists in network host

ERROR: for xxxxxxxxx-certserver  Cannot start service xxxxxxxxx-certserver: endpoint with name xxxxxxxxx-certserver already exists in network host
ERROR: Encountered errors while bringing up the project.
$ docker ps -a | grep  xxxxxxxxx-certserver
0f8709c76733        xxxxxxxxxx:5000/wxx/xxxxxxxxx-certserver:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx         "/bin/sh /data/webof…"   3 seconds ago       Created                                              xxxxxxxxx-certserver
$ docker-compose down
Removing xxxxxxxxx-certserver ... done
$ docker network disconnect --force host xxxxxxxxx-certserver
$ docker-compose up -d
Creating xxxxxxxxx-certserver ... done
$ docker ps -a | grep  xxxxxxxxx-certserver
85443ce71285        xxxxxxxxxx:5000/wxx/xxxxxxxxx-certserver:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx         "/bin/sh /data/webof…"   4 seconds ago       Up 3 seconds                                         xxxxxxxxx-certserver

zhangguanzhang avatar Nov 02 '21 07:11 zhangguanzhang

Why doesn't the Docker "prune everything" command, as documented here, actually prune everything?

timdonovanuk avatar Mar 23 '22 09:03 timdonovanuk

Is this assigned to the correct component, docker/cli? Is this really a bug in the cli, or is it rather a bug in the big fat daemon (as Red Hat puts it)?

Yes, we are also seeing this.

karniemi avatar Jun 08 '22 05:06 karniemi

I'm experiencing this issue with docker 20.10.17, build 100c701 on the latest ubuntu 22.04

naioja avatar Jul 01 '22 16:07 naioja