libnetwork icon indicating copy to clipboard operation
libnetwork copied to clipboard

Fix HNS policylist error "network not found" during network removal

Open trapier opened this issue 4 years ago • 5 comments

Removal of PolicyLists from Windows VFP must be performed prior to removing the corresponding HNS network. Otherwise PolicyList removal fails with HNS error "The network was not found."

This ordering requirement was introduced to Windows Server 2019 in an update some time in 2020. Have reached out to Microsoft to request additional context with respect to what OS version(s) in the change was shipped with and the rationale for the change.

Accommodate the OS sequencing requirement by delaying network deletion until after cleaning up service bindings.

This PR also:

  • Introduces error logging for the affected HNS calls.
  • Provides a performance improvement to Linux overlay network cleanup by making explicit service loadbalancer dataplane cleanup logic conditional to Windows (where it is required). On Linux, the analogous ipvs dataplane cleanup occurs implicitly with the deletion of the loadbalancer net namespace.

Opening as draft to take a look at the following kv error I'm seeing during deletion on Windows. Am also seeing this on 55e924b8a84231a065879156c0de95aefc5f5435 from bump_19.03, so it's likely not related to this PR.

2/3/2021 11:04:10 PM Error (Unable to complete atomic operation, key modified) deleting object [endpoint qyuvcy6qzwnsj09ybf9cvs19h c29da54656238a07de515f732e424d9f72e0b466bd83ff0b622ad632cdc2239d], retrying...

Mirantis Ref: FIELD-3310

trapier avatar Feb 04 '21 09:02 trapier

Looking forward to this fix.

Good work!

JohanSpannare avatar Feb 04 '21 17:02 JohanSpannare

Fixes moby/moby#41354

This ordering requirement was introduced to Windows Server 2019 in an update some time in 2020.

I think that it was KB4551853 + fixes done to that one which also caused https://github.com/moby/moby/issues/40998

Have reached out to Microsoft to request additional context with respect to what OS version(s) in the change was shipped with and the rationale for the change.

FYI. I also opened question to https://github.com/microsoft/hcsshim/issues/988 about how HNS changes are tested as part of Windows build process. Would be nice to get some responses to that one too from your Microsoft contacts.

olljanat avatar Mar 30 '21 19:03 olljanat

Note we have migrated this codebase over to github.com/moby/moby/libnetwork. We are not accepting PR's on this repo anymore except for backports to be included in moby 20.10

cpuguy83 avatar Jun 18 '21 22:06 cpuguy83

We are not accepting PR's on this repo anymore except for backports to be included in moby 20.10

FYI for everyone who is waiting for this fix. It looks that Mirantis did backport this on using their own copy of this repo with https://github.com/Mirantis/libnetwork/pull/3 and https://github.com/Mirantis/libnetwork/pull/4 and if I understand correctly it is already included Docker EE versions 19.03.16 and 20.10.5

olljanat avatar Jun 24 '21 09:06 olljanat

@thaJeztah this one was included to moby by my PR https://github.com/moby/moby/pull/43502 before 22.06 split so perhaps this original can be them merged as "backport to 20.10" to here?

olljanat avatar Jul 25 '22 05:07 olljanat

This LGTM to me and should likely be merged as we accepted it on the 20.10 branch. This would additionally let Mirantis stop maintaining a fork of libnetwork and better align what we are doing with upstream.

neersighted avatar Jan 18 '23 04:01 neersighted