Docker Overlay Network is not showing driver

Open zerowebcorp opened this issue 8 years ago • 11 comments

Background

I had previously created an overlay network with the same name and then removed it. A few minutes later I re-created it with a different subnet, and that is when this issue started.

The "driver" column in docker network ls is not displaying a driver for this overlay network.

[root@dev-swarm-1 ~]# docker network create --driver overlay --subnet 10.0.0.0/16 proxy
retqaqtrgpy0jb8gv95yi9tnd
[root@dev-swarm-1 ~]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
6aoxjivzciy5        appcenter           overlay             swarm
e589d6edf84a        bridge              bridge              local
9a09c998ddc7        docker_gwbridge     bridge              local
b4e4fd1f88d3        host                host                local
5pftkoze5ghf        ingress             overlay             swarm
25b0807c3054        none                null                local
retqaqtrgpy0        proxy                                   swarm
ijb20hyt6412        proxy2              overlay             swarm
[root@dev-swarm-1 ~]# 

[root@dev-swarm-1 ~]# docker network inspect proxy
[
    {
        "Name": "proxy",
        "Id": "retqaqtrgpy0jb8gv95yi9tnd",
        "Created": "0001-01-01T00:00:00Z",
        "Scope": "swarm",
        "Driver": "",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/16"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "Containers": null,
        "Options": null,
        "Labels": null
    }
]

Describe the results you received:

As seen above, the "Driver" is empty in both docker network ls and docker network inspect; only the IPAM driver shows "default".

The Created date is also incorrect (0001-01-01T00:00:00Z).

Describe the results you expected: I expect this to be overlay

Output of docker version:

[root@dev-swarm-1 ~]# docker --version
Docker version 17.05.0-ce, build 89658be
[root@dev-swarm-1 ~]#

Output of docker info

[root@dev-swarm-1 ~]# docker info
Containers: 11
 Running: 5
 Paused: 0
 Stopped: 6
Images: 21
Server Version: 17.05.0-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: 7jf67652x14t547tz07pflj31
 Is Manager: true
 ClusterID: 68enz5gw9isvo6kfuoalohypf
 Managers: 1
 Nodes: 6
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 172.24.19.100
 Manager Addresses:
  172.24.19.100:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-514.26.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.795GiB
Name: dev-swarm-1.tdmarketplace.net
ID: AFNQ:BDRT:PIZS:K557:LFR3:MMJS:QKFT:32OG:MGFN:FS32:5URF:PA3X
Docker Root Dir: /mnt/volume/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

This is also causing containers not to start up if attached to this network.

zerowebcorp avatar Aug 12 '17 15:08 zerowebcorp

I suspect this happens because the network is not yet completely removed when you try to recreate it. An overlay network involves coordinating multiple network sandboxes on all the nodes involved. That said, in my experience I have not seen such slowdowns even in modest installations.

What you can do to help us:

You can start the daemon in debug mode and post here what appears in the logs (journalctl -u docker) when you remove the network. If something is going wrong, it should show up there.
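To narrow a long journalctl dump down to the interesting lines, a small filter can help. The following is a minimal sketch, assuming dockerd writes key=value pairs (time=, level=, msg=, error=) on each line; the helper names parse_fields and error_lines are made up for illustration.

```python
import re

# Matches key=value pairs where the value is either double-quoted or a bare token.
_PAIR = re.compile(r'([\w.]+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line: str) -> dict:
    """Extract key=value fields from one log line (hypothetical helper)."""
    fields = {}
    for key, value in _PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def error_lines(lines):
    """Yield the parsed fields of every line logged at level=error."""
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") == "error":
            yield fields


if __name__ == "__main__":
    import sys

    # Example: journalctl -u docker | python3 this_script.py
    for fields in error_lines(sys.stdin):
        print(fields.get("time"), fields.get("msg"), "|", fields.get("error"))
```

Debug-level lines are dropped by this filter, so it is only a first pass; the surrounding debug output may still be needed to understand an error.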

fntlnz avatar Aug 14 '17 04:08 fntlnz

[root@dev-swarm-1 ~]# docker network ls 
NETWORK ID          NAME                DRIVER              SCOPE
6aoxjivzciy5        appcenter           overlay             swarm
e589d6edf84a        bridge              bridge              local
9a09c998ddc7        docker_gwbridge     bridge              local
b4e4fd1f88d3        host                host                local
5pftkoze5ghf        ingress             overlay             swarm
25b0807c3054        none                null                local
retqaqtrgpy0        proxy                                   swarm
ijb20hyt6412        proxy2              overlay             swarm
[root@dev-swarm-1 ~]# docker network rm proxy
proxy
[root@dev-swarm-1 ~]# docker network ls 
NETWORK ID          NAME                DRIVER              SCOPE
6aoxjivzciy5        appcenter           overlay             swarm
e589d6edf84a        bridge              bridge              local
9a09c998ddc7        docker_gwbridge     bridge              local
b4e4fd1f88d3        host                host                local
5pftkoze5ghf        ingress             overlay             swarm
25b0807c3054        none                null                local
ijb20hyt6412        proxy2              overlay             swarm
[root@dev-swarm-1 ~]# 



Aug 15 05:20:50 dev-swarm-1 dockerd[828]: time="2017-08-15T05:20:50.228951248Z" level=error msg="Failed during network free for network retqaqtrgpy0jb8gv95yi9tnd" error="could not get networker state for network retqaqtrgpy0jb8gv95yi9tnd" module=node node.id=7jf67652x14t547tz07pflj31
Aug 15 05:20:50 dev-swarm-1 dockerd[828]: time="2017-08-15T05:20:50.229054349Z" level=debug msg="task allocation failure" error="failed to retrieve network a1m7wxbyxhe9wsfh6u00c5yax while allocating task bxtlr6j0a9ehziy3gy6vw8628" module=node node.id=7jf67652x14t547tz07pflj31
Aug 15 05:20:50 dev-swarm-1 dockerd[828]: time="2017-08-15T05:20:50.229108962Z" level=debug msg="task allocation failure" error="failed to retrieve network a1m7wxbyxhe9wsfh6u00c5yax while allocating task 6udtw98aynxxgarlhk3mqxrdy" module=node node.id=7jf67652x14t547tz07pflj31
Aug 15 05:20:50 dev-swarm-1 dockerd[828]: time="2017-08-15T05:20:50.229151748Z" level=debug msg="task allocation failure" error="failed to retrieve network a1m7wxbyxhe9wsfh6u00c5yax while allocating task d1jrwoe1pl96nicnm6as2wjdg" module=node node.id=7jf67652x14t547tz07pflj31

The error is

could not get networker state for network

Is there any way we can do a clean up of this network?

zerowebcorp avatar Aug 15 '17 05:08 zerowebcorp

I see you're running docker 17.05; docker 17.06 contains many fixes for networking; could you try updating to 17.06 to see if it's still an issue there?

thaJeztah avatar Sep 25 '17 17:09 thaJeztah

having same issue with 18.09.3 :cry:

goetas avatar Jun 05 '19 08:06 goetas

I had the same behavior when attempting to create a network with perhaps too large a subnet, though I did not get an error or warning when doing so. As soon as I created a smaller subnet range, the driver appeared and the network could be used.

gudlyf avatar Jun 24 '19 13:06 gudlyf

/cc @arkodg

thaJeztah avatar Jun 24 '19 15:06 thaJeztah

By default the max SubnetSize is 24

docker info | grep Subnet
  SubnetSize: 24

This value can be changed by reinitializing the swarm manager

docker swarm init --default-addr-pool 10.0.0.0/8 --default-addr-pool-mask-length 16

But I agree, the error reporting is completely missing for this case and needs to be fixed
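To make the effect of the mask length concrete, here is a minimal sketch of the arithmetic, assuming the pool is simply split into equally sized subnets of the given prefix length. The function name subnets_in_pool is made up for illustration and is not part of Docker.

```python
import ipaddress


def subnets_in_pool(pool: str, mask_length: int) -> int:
    """Number of /mask_length subnets that fit into the address pool."""
    net = ipaddress.ip_network(pool)
    if not net.prefixlen <= mask_length <= net.max_prefixlen:
        raise ValueError("mask length must lie between the pool prefix and the address width")
    # Each extra prefix bit doubles the number of subnets.
    return 2 ** (mask_length - net.prefixlen)


if __name__ == "__main__":
    print(subnets_in_pool("10.0.0.0/8", 24))  # pool split into /24 subnets
    print(subnets_in_pool("10.0.0.0/8", 16))  # pool split into /16 subnets
```

With a 10.0.0.0/8 pool this gives 65536 subnets at mask length 24 and 256 subnets at mask length 16, so a larger per-network subnet trades away the number of networks the pool can serve.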

arkodg avatar Jun 24 '19 17:06 arkodg

This came to my attention only today. A little background: Docker has a default address pool that is initialized internally (by default 10.0.0.0/8 with a subnet size of 24). So what you are doing is creating a network with a subnet range that is already allocated internally in /24 blocks, while you are trying to create a /16 subnet.

There are two ways to fix this issue:

  1. Use the default address pool feature and create the swarm with whatever range you like (10.0.0.0/8 with subnet size 16, or 20.0.0.0, etc.).
  2. Or pick a different subnet range that is not part of the default range when creating the network, e.g. docker network create --driver overlay --subnet 20.0.0.0/16 proxy (note that I am using 20.0.0.0 instead of 10.0.0.0).

Kindly let me or @arkodg know if you are still having issues creating the overlay network.
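The conflict described above can be checked before creating a network. This is only a sketch of the reasoning, not Docker's allocator logic: it assumes the default pool 10.0.0.0/8 and subnet size 24 mentioned in this thread, and check_request is a made-up helper name.

```python
import ipaddress


def check_request(requested: str, pool: str = "10.0.0.0/8", subnet_size: int = 24) -> dict:
    """Report how a requested --subnet relates to the swarm's address pool."""
    req = ipaddress.ip_network(requested)
    pool_net = ipaddress.ip_network(pool)
    return {
        # True if the requested range shares any address with the pool.
        "overlaps_pool": req.overlaps(pool_net),
        # True if the requested range is larger than one pool block (e.g. /16 vs /24).
        "wider_than_pool_blocks": req.prefixlen < subnet_size,
    }


if __name__ == "__main__":
    print(check_request("10.0.0.0/16"))  # the subnet from the original report
    print(check_request("20.0.0.0/16"))  # the subnet used in option 2
```

The originally reported 10.0.0.0/16 both overlaps the pool and is wider than a /24 block, while 20.0.0.0/16 does not overlap the pool at all.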

selansen avatar Jun 28 '19 16:06 selansen

Thanks a lot for the hint @selansen - option 2 totally worked for me (using 19.3.12).

DMIII avatar Aug 05 '20 15:08 DMIII

This is an old issue already, but I am adding a comment on something I just needed to clarify for a colleague:

create swarm with whatever range you like ( 10.0.0.0/8 , subnetsize 16) or 20.0.0.0, etc

Do NOT use 20.0.0.0 or anything else outside of the private IPv4 ranges, unless it is a public IP range that you own and dedicate to this use case.

20.0.0.0 is owned by Microsoft, so by using it inside your swarm you are potentially breaking connectivity to Azure services (in case some of those you need have IPs in that range).

There is plenty of room inside the 10.0.0.0/8 network, so if you want a /16 network I highly recommend picking 10.x.0.0/16, where x can be anything between 0 and 255. Just make sure that it does not conflict with any network inside or outside the swarm in your environment. That means none of them may have any IP from the range 10.x.0.0 - 10.x.255.255 in use.
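A rough way to apply this advice is sketched below, assuming IPv4 only and that you can list the networks already in use in your environment. The helper names and the example in_use list are hypothetical.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n) for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(subnet: str) -> bool:
    """True if the IPv4 subnet lies entirely inside a private RFC 1918 block."""
    net = ipaddress.ip_network(subnet)
    return any(net.subnet_of(block) for block in RFC1918)


def free_slash16_candidates(in_use):
    """Yield every 10.x.0.0/16 that overlaps none of the networks in in_use."""
    used = [ipaddress.ip_network(n) for n in in_use]
    for x in range(256):
        candidate = ipaddress.ip_network(f"10.{x}.0.0/16")
        if not any(candidate.overlaps(u) for u in used):
            yield candidate


if __name__ == "__main__":
    print(is_rfc1918("20.0.0.0/16"))  # public range, not private
    print(is_rfc1918("10.5.0.0/16"))  # inside 10.0.0.0/8
    # Example only: two networks assumed to be taken already.
    in_use = ["10.0.0.0/16", "10.10.0.0/16"]
    print(next(free_slash16_candidates(in_use)))
```

Note that if the swarm's address pool itself is 10.0.0.0/8 and is listed as in use, no candidate remains, which matches the point made earlier in this thread about subnets colliding with the default pool.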

olljanat avatar Feb 01 '23 09:02 olljanat

Still the same issue in 2024. I have initialized two swarm clusters: the first with defaults (just docker swarm init) and the second with docker swarm init --advertise-addr 10.0.1.70 --cert-expiry 917490h0m0s --default-addr-pool 10.10.0.0/16. The docker network create --driver overlay new_overlay command worked on the first cluster, but on the second no driver is shown, as described above.

I tried:

docker network create --driver overlay \
  --subnet 10.10.0.0/16 \
  --gateway 10.10.0.1 \
  --opt IPAM.Driver=default \
  --opt com.docker.network.driver.overlay.vxlanid_list=4097 \
  new_overlay

but still no driver:

# docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
61f4c48ffb74   bridge            bridge    local
47eeb091618a   docker_gwbridge   bridge    local
3a20ecc314c4   host              host      local
19as8z8fw0kx   ingress           overlay   swarm
mxun9kmxzfmu   new_overlay                 swarm
2445b425e4d7   none              null      local

and the docker inspect output is mostly empty:

[
    {
        "Name": "new_overlay",
        "Id": "mxun9kmxzfmuv2r4mk8s8nm89",
        "Created": "2024-06-23T23:46:37.204788875Z",
        "Scope": "swarm",
        "Driver": "",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "",
            "Options": null,
            "Config": null
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": null,
        "Options": null,
        "Labels": null
    }
]
# docker info
Client: Docker Engine - Community                                                                                                       
 Version:    26.1.4   
 Context:    default               
 Debug Mode: false
 Plugins:                             
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.1 
    Path:     /usr/libexec/docker/cli-plugins/docker-compose
                                                                    
Server:               
 Containers: 0             
  Running: 0                             
  Paused: 0                      
  Stopped: 0        
 Images: 0          
 Server Version: 26.1.4       
 Storage Driver: overlay2
  Backing Filesystem: btrfs  
  Supports d_type: true
  Using metacopy: false       
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: active
  NodeID: 6edunzm5ry5l0hm3f3hksci71
  Is Manager: true
  ClusterID: l3riavuuaorkkr1rgc05729am
  Managers: 1
  Nodes: 1
  Default Address Pool: 10.10.0.0/16  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 104 years
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 10.0.1.70
  Manager Addresses:
   10.0.1.70:2377
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: d2d58213f83a351ca8f528a95fbd145f5654e957
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 5.15.0-112-generic
 Operating System: Ubuntu 22.04.4 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 3.82GiB
 Name: tswarm0
 ID: e57a80d8-2f5e-44f0-8d36-e30479e4eb27
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

I can't use the default IP range because the swarm nodes are in 10.0.0.0/16.

Any ideas on how to fix this? Thanks in advance.

alexanderbazhenoff avatar Jun 23 '24 23:06 alexanderbazhenoff