cleanup duplicate ipvlan and macvlan network IDs during createNetwork to prevent collisions
- check whether the new network uses the same parent interface as an existing network, since different networks of this type cannot share the same interface
- if the parent interfaces match but the new network ID does not match the existing network ID, return an error, as the new network is genuinely distinct and cannot co-exist with the existing one
- if the IDs match, attempt to delete the old duplicate network first; return an error if deletion fails, or log an informational message if it succeeds
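The steps above can be sketched in Go. This is a minimal illustration, not the driver's actual code: `netConfig` and `driver` are simplified stand-ins for the ipvlan/macvlan driver types, which carry many more fields (MTU, subnets, mode, etc.).

```go
package main

import "fmt"

// netConfig is a simplified stand-in for the driver's network configuration.
type netConfig struct {
	ID     string
	Parent string
}

type driver struct {
	networks map[string]*netConfig
}

// createNetwork sketches the cleanup described above: a different network on
// the same parent interface is a hard error, while a stale copy with the same
// ID (left over from an unclean shutdown) is deleted so creation can proceed.
func (d *driver) createNetwork(cfg *netConfig) error {
	for id, existing := range d.networks {
		if existing.Parent != cfg.Parent {
			continue
		}
		if existing.ID != cfg.ID {
			return fmt.Errorf("network %s is already using parent interface %s",
				existing.ID, existing.Parent)
		}
		// Same ID on the same parent: remove the stale duplicate first.
		delete(d.networks, id)
		fmt.Printf("cleaned up duplicate network %s\n", id)
	}
	d.networks[cfg.ID] = cfg
	return nil
}

func main() {
	d := &driver{networks: map[string]*netConfig{
		"hwqzu5i4kmrd2uzi34zpnpy1m": {ID: "hwqzu5i4kmrd2uzi34zpnpy1m", Parent: "eth0"},
	}}
	// Re-creating the same network succeeds after the stale copy is removed.
	if err := d.createNetwork(&netConfig{ID: "hwqzu5i4kmrd2uzi34zpnpy1m", Parent: "eth0"}); err != nil {
		fmt.Println("unexpected:", err)
	}
	// A different network on the same interface is rejected.
	if err := d.createNetwork(&netConfig{ID: "othernet", Parent: "eth0"}); err != nil {
		fmt.Println(err)
	}
}
```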
This resolves the issue shown below, where an existing copy of a network was not removed on Docker Engine shutdown and still persists when the config-only network is created again by a subsequent createNetwork call:
existing network config:

```
(*ipvlan.configuration)(0xc420322e80)({
 ID: (string) (len=25) "hwqzu5i4kmrd2uzi34zpnpy1m",
 Mtu: (int) 0,
 dbIndex: (uint64) 6,
 dbExists: (bool) true,
 Internal: (bool) false,
 Parent: (string) (len=4) "eth0",
 IpvlanMode: (string) (len=2) "l2",
 CreatedSlaveLink: (bool) false,
 Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=4) {
  (*ipvlan.ipv4Subnet)(0xc4206413a0)({
   SubnetIP: (string) (len=13) "10.100.0.0/16",
   GwIP: (string) (len=13) "10.100.0.1/16"
  })
 },
 Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>
})
```

new network config:

```
(*ipvlan.configuration)(0xc4203c5b80)({
 ID: (string) (len=25) "hwqzu5i4kmrd2uzi34zpnpy1m",
 Mtu: (int) 0,
 dbIndex: (uint64) 0,
 dbExists: (bool) false,
 Internal: (bool) false,
 Parent: (string) (len=4) "eth0",
 IpvlanMode: (string) (len=2) "l2",
 CreatedSlaveLink: (bool) false,
 Ipv4Subnets: ([]*ipvlan.ipv4Subnet) (len=1 cap=1) {
  (*ipvlan.ipv4Subnet)(0xc42149af00)({
   SubnetIP: (string) (len=13) "10.100.0.0/16",
   GwIP: (string) (len=13) "10.100.0.1/16"
  })
 },
 Ipv6Subnets: ([]*ipvlan.ipv6Subnet) <nil>
})
```
The symptom of an existing, now-conflicting network usually looks like the following, also surfacing as a rejected task's error:

```
docker.err.log:time="2018-01-15T23:32:21.993891743Z" level=error msg="fatal task error" error="network di-hwqzu5i4kmrd is already using parent interface eth0
```
I mentioned this to @mavenugo quite a while ago. I'm not sure whether cleanup can be done on the shutdown side, but I think cleaning up on creation, when needed, may be the safer approach.
Codecov Report

:exclamation: No coverage uploaded for pull request base (master@fcf1c3b). The diff coverage is 0%.

```diff
@@           Coverage Diff            @@
##             master    #2055   +/- ##
=======================================
  Coverage          ?    40.5%
=======================================
  Files             ?      138
  Lines             ?    22185
  Branches          ?        0
=======================================
  Hits              ?     8985
  Misses            ?    11877
  Partials          ?     1323
```

| Impacted Files | Coverage Δ | |
|---|---|---|
| drivers/macvlan/macvlan_network.go | 0% <0%> (ø) | |
| drivers/ipvlan/ipvlan_network.go | 0% <0%> (ø) | |
@eyz just a clarification, what do you mean with the existing copy not being deleted on daemon shutdown? I guess the state is persisted in the local store.
Hi @fcrisciani, and thanks for looking at this issue! I can clarify further if needed, but the typical behavior is this: intermittently, when I stop Docker Engine on Swarm nodes that have instantiated a config-only ipvlan network in global/swarm scope, there is often a collision when the service that uses that network comes back up. When the container assigned to that service (and thus using that network from before) starts, the network is apparently created once again, and a collision occurs in createNetwork. The existing behavior, prior to this PR, simply throws an error on re-creation: the same network ID is presumably already in the store, the executor tries to create the network again, and a fatal error occurs in createNetwork, which surfaces as a task-scoped error and prevents the container from being started from the task.
With this PR, if the existing network ID is found while the network is being created a second time, the "old" network, the duplicate from before, is removed first, and createNetwork can then proceed to create the same network under the same network ID, as confirmed during my testing.
I fully expect that there is another side to this solution, where perhaps there can be proper cleanup at shutdown. I spoke with Madhu (@mavenugo) quite a while ago, and we found a workaround for the issue by deleting the network store and re-creating it, but that approach isn't sustainable. I found that the linked PR allows createNetwork to proceed without a fatal error, and thus the containers can use the existing network once again. A more creative solution may be the actual best answer, but this PR seems to work as a solution as well.
Is that enough detail to investigate this PR as a potential desired solution?
So, generally speaking, I would prefer an approach similar to the bridge driver, which, simplistically, just reuses the bridge interface if it has already been created.
Here, in the same way, I would consider starting by changing the if condition like this:
`if config.Parent == nw.config.Parent && config.ID != nw.config.ID` ==> then error
Otherwise we continue with normal processing, knowing that the network being created is actually the same one.
Now comes the tricky part: we have to check that methods like createDummyLink and createVlanLink do not bail out if the interface they are creating is already there.
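A rough Go sketch of that suggestion follows. The condition mirrors the one quoted above; `checkParentConflict` and `ensureVlanLink` are hypothetical helpers standing in for the real driver code (the actual methods are createDummyLink/createVlanLink in the ipvlan and macvlan drivers), with link state modeled as a plain map rather than real netlink calls.

```go
package main

import "fmt"

type configuration struct {
	ID     string
	Parent string
}

type network struct{ config *configuration }

// checkParentConflict applies the suggested condition: error only when a
// *different* network (different ID) already uses the same parent interface;
// an identical network is treated as already created and processing continues.
func checkParentConflict(config *configuration, networks []*network) error {
	for _, nw := range networks {
		if config.Parent == nw.config.Parent && config.ID != nw.config.ID {
			return fmt.Errorf("network %s is already using parent interface %s",
				nw.config.ID, nw.config.Parent)
		}
	}
	return nil
}

// ensureVlanLink is an idempotent stand-in for link creation: if the
// sub-interface already exists it is reused instead of bailing out, the way
// the bridge driver reuses an existing bridge interface.
func ensureVlanLink(name string, existing map[string]bool) error {
	if existing[name] {
		fmt.Printf("link %s already exists, reusing it\n", name)
		return nil
	}
	existing[name] = true
	fmt.Printf("created link %s\n", name)
	return nil
}

func main() {
	nets := []*network{{config: &configuration{ID: "abc", Parent: "eth0"}}}
	// Same ID on the same parent: no conflict, processing continues.
	fmt.Println(checkParentConflict(&configuration{ID: "abc", Parent: "eth0"}, nets))
	// Different ID on the same parent: conflict error.
	fmt.Println(checkParentConflict(&configuration{ID: "xyz", Parent: "eth0"}, nets))

	links := map[string]bool{"eth0.100": true}
	_ = ensureVlanLink("eth0.100", links) // existing link is reused, not an error
}
```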
@fcrisciani my organization is probably moving to Kubernetes, so please adopt or close as applicable. Thanks! That said, let me know if you need any more clarification, but I probably won't be submitting code updates to this PR
@fcrisciani Did you mean something like this? This issue is very predictable.
https://gist.github.com/rnataraja/d987b61d8acc8ac738641f55ebe5af0d
@fcrisciani and @mavenugo, is anyone available to take this over?