cleanup duplicate ipvlan and macvlan network IDs during createNetwork to prevent collisions
- check whether the new network uses the same parent interface as an existing network, since different networks of this type cannot share the same interface
- if the parent interfaces match but the new network ID does not match the existing network ID, return an error, as the new network is genuinely distinct and cannot co-exist with the existing one
- if the IDs match, attempt to delete the old duplicate network first; return an error if deletion fails, or log an informational message if it succeeds
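The steps above can be sketched in Go. This is a minimal illustration, not the driver's actual code: `netConfig` and `driver` are simplified stand-ins for the ipvlan/macvlan driver types, which carry many more fields (MTU, subnets, mode, etc.).

```go
package main

import "fmt"

// netConfig is a simplified stand-in for the driver's network configuration.
type netConfig struct {
	ID     string
	Parent string
}

type driver struct {
	networks map[string]*netConfig
}

// createNetwork sketches the cleanup described above: a different network on
// the same parent interface is a hard error, while a stale copy with the same
// ID (left over from an unclean shutdown) is deleted so creation can proceed.
func (d *driver) createNetwork(cfg *netConfig) error {
	for id, existing := range d.networks {
		if existing.Parent != cfg.Parent {
			continue
		}
		if existing.ID != cfg.ID {
			return fmt.Errorf("network %s is already using parent interface %s",
				existing.ID, existing.Parent)
		}
		// Same ID on the same parent: remove the stale duplicate first.
		delete(d.networks, id)
		fmt.Printf("cleaned up duplicate network %s\n", id)
	}
	d.networks[cfg.ID] = cfg
	return nil
}

func main() {
	d := &driver{networks: map[string]*netConfig{
		"hwqzu5i4kmrd2uzi34zpnpy1m": {ID: "hwqzu5i4kmrd2uzi34zpnpy1m", Parent: "eth0"},
	}}
	// Re-creating the same network succeeds after the stale copy is removed.
	if err := d.createNetwork(&netConfig{ID: "hwqzu5i4kmrd2uzi34zpnpy1m", Parent: "eth0"}); err != nil {
		fmt.Println("unexpected:", err)
	}
	// A different network on the same interface is rejected.
	if err := d.createNetwork(&netConfig{ID: "othernet", Parent: "eth0"}); err != nil {
		fmt.Println(err)
	}
}
```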
This resolves the issue shown below, where an existing copy of a network was not removed on Docker Engine shutdown and still persists when the config-only network is created again by a subsequent createNetwork call:
existing network config:

```
(*ipvlan.configuration)(0xc420322e80)({
 ID: (string) (len=25) "hwqzu5i4kmrd2uzi34zpnpy1m",
 Mtu: (int) 0,
 dbIndex: (uint64) 6,
 dbExists: (bool) true,
 Internal: (bool) false,
 Parent: (string) (len=4) "eth0",
 IpvlanMode: (string) (len=2) "l2",
 CreatedSlaveLink: (bool) false,
 Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=4) {
  (*ipvlan.ipv4Subnet)(0xc4206413a0)({
   SubnetIP: (string) (len=13) "10.100.0.0/16",
   GwIP: (string) (len=13) "10.100.0.1/16"
  })
 },
 Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>
})
```

new network config:

```
(*ipvlan.configuration)(0xc4203c5b80)({
 ID: (string) (len=25) "hwqzu5i4kmrd2uzi34zpnpy1m",
 Mtu: (int) 0,
 dbIndex: (uint64) 0,
 dbExists: (bool) false,
 Internal: (bool) false,
 Parent: (string) (len=4) "eth0",
 IpvlanMode: (string) (len=2) "l2",
 CreatedSlaveLink: (bool) false,
 Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=1) {
  (*ipvlan.ipv4Subnet)(0xc42149af00)({
   SubnetIP: (string) (len=13) "10.100.0.0/16",
   GwIP: (string) (len=13) "10.100.0.1/16"
  })
 },
 Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>
})
```
The symptom of an existing, now-conflicting network usually looks like the following, also surfacing as a rejected task's error:

```
docker.err.log:time="2018-01-15T23:32:21.993891743Z" level=error msg="fatal task error" error="network di-hwqzu5i4kmrd is already using parent interface eth0
```
I mentioned this to @mavenugo quite a while ago. I'm not sure whether cleanup can be done on the shutdown side, but I think cleaning up on creation, when needed, may be the safer approach.
Codecov Report

:exclamation: No coverage uploaded for pull request base (master@fcf1c3b). The diff coverage is 0%.

```diff
@@           Coverage Diff            @@
##             master    #2055   +/- ##
=======================================
  Coverage          ?    40.5%
=======================================
  Files             ?      138
  Lines             ?    22185
  Branches          ?        0
=======================================
  Hits              ?     8985
  Misses            ?    11877
  Partials          ?     1323
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| drivers/macvlan/macvlan_network.go | 0% <0%> (ø) | |
| drivers/ipvlan/ipvlan_network.go | 0% <0%> (ø) | |
@eyz just a clarification, what do you mean with the existing copy not being deleted on daemon shutdown? I guess the state is persisted in the local store.
Hi @fcrisciani, and thanks for looking at this issue! I can clarify further if needed, but the typical behavior is this: intermittently, when I stop Docker Engine on Swarm nodes that have instantiated a config-only ipvlan network in global/swarm scope, there is often a collision when the service that uses that network comes back up. When the container assigned to that service (and thus using that network from before) starts, the network is apparently created once again, and a collision occurs in createNetwork. The existing behavior, prior to this PR, simply throws an error on re-creation: the same network ID is presumably already in the store, the executor tries to create the network again, and a fatal error occurs in createNetwork, which surfaces as a task-scoped error and prevents the container from being started from the task.
With this PR, if the existing network ID is found while the network is being created a second time, the "old" network, the duplicate from before, is removed first, and createNetwork can then proceed to create the same network under the same network ID, as confirmed during my testing.
I fully expect that there is another side to this solution, where perhaps there can be proper cleanup at shutdown. I spoke with Madhu (@mavenugo) quite a while ago, and we found a workaround for the issue by deleting the network store and re-creating it, but that approach isn't sustainable. I found that the linked PR allows createNetwork to proceed without a fatal error, and thus the containers can use the existing network once again. A more creative solution may be the actual best answer, but this PR seems to work as a solution as well.
Is that enough detail to investigate this PR as a potential desired solution?
So, generally speaking, I would prefer an approach similar to the bridge driver, which, simplistically, just reuses the bridge interface if it has already been created.
Here, in the same way, I would consider starting by changing the if condition like this:
`if config.Parent == nw.config.Parent && config.ID != nw.config.ID` ==> then error
Otherwise we continue with normal processing, knowing that the network being created is actually the same one.
Now comes the tricky part: we have to check that methods like createDummyLink and createVlanLink do not bail out if the interface they are creating is already there.
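A rough Go sketch of that suggestion follows. The condition mirrors the one quoted above; `checkParentConflict` and `ensureVlanLink` are hypothetical helpers standing in for the real driver code (the actual methods are createDummyLink/createVlanLink in the ipvlan and macvlan drivers), with link state modeled as a plain map rather than real netlink calls.

```go
package main

import "fmt"

type configuration struct {
	ID     string
	Parent string
}

type network struct{ config *configuration }

// checkParentConflict applies the suggested condition: error only when a
// *different* network (different ID) already uses the same parent interface;
// an identical network is treated as already created and processing continues.
func checkParentConflict(config *configuration, networks []*network) error {
	for _, nw := range networks {
		if config.Parent == nw.config.Parent && config.ID != nw.config.ID {
			return fmt.Errorf("network %s is already using parent interface %s",
				nw.config.ID, nw.config.Parent)
		}
	}
	return nil
}

// ensureVlanLink is an idempotent stand-in for link creation: if the
// sub-interface already exists it is reused instead of bailing out, the way
// the bridge driver reuses an existing bridge interface.
func ensureVlanLink(name string, existing map[string]bool) error {
	if existing[name] {
		fmt.Printf("link %s already exists, reusing it\n", name)
		return nil
	}
	existing[name] = true
	fmt.Printf("created link %s\n", name)
	return nil
}

func main() {
	nets := []*network{{config: &configuration{ID: "abc", Parent: "eth0"}}}
	// Same ID on the same parent: no conflict, processing continues.
	fmt.Println(checkParentConflict(&configuration{ID: "abc", Parent: "eth0"}, nets))
	// Different ID on the same parent: conflict error.
	fmt.Println(checkParentConflict(&configuration{ID: "xyz", Parent: "eth0"}, nets))

	links := map[string]bool{"eth0.100": true}
	_ = ensureVlanLink("eth0.100", links) // existing link is reused, not an error
}
```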
@fcrisciani my organization is probably moving to Kubernetes, so please adopt or close as applicable. Thanks! That said, let me know if you need any more clarification, but I probably won't be submitting code updates to this PR
@fcrisciani Did you mean something like this? This issue is very predictable.
https://gist.github.com/rnataraja/d987b61d8acc8ac738641f55ebe5af0d
@fcrisciani and @mavenugo, is anyone available to take this over?