
Support for adding and removing rules without service disruptions

Open nick-oconnor opened this issue 7 years ago • 7 comments

Description

Currently, adding or removing rules that are linked to an ANP (policy => EPG => ANP) causes that ANP to be deleted and recreated. While this has little effect on network connectivity in Contiv-only setups, the effect is much more pronounced when using Contiv with ACI and can lead to noticeable downtime.

Expected Behavior

When using Contiv with ACI, adding and removing a rule creates the required contract in ACI and links it to the specified EPGs without removing the associated ANP.

Use case

You have an EPG A that is currently communicating with EPG B in a Contiv with ACI setup. You would like to add a new EPG C. You would like to allow EPG A to access EPG C and for EPG C to access EPG A without disrupting communication between EPGs A and B.
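One way to picture the expected behavior is as an incremental diff rather than a delete-and-recreate. The following is only a minimal sketch under stated assumptions: `Contract`, `diffContracts`, and `sortContracts` are hypothetical names made up for illustration, not netplugin or ACI APIs, and the actual calls that would program ACI are left out entirely.

```go
package main

import (
	"fmt"
	"sort"
)

// Contract is a hypothetical stand-in for an ACI contract that lets
// one EPG (From) reach another (To). It is not a netplugin type.
type Contract struct {
	From, To string
}

// sortContracts orders contracts by From, then To, so results are deterministic.
func sortContracts(cs []Contract) {
	sort.Slice(cs, func(i, j int) bool {
		if cs[i].From != cs[j].From {
			return cs[i].From < cs[j].From
		}
		return cs[i].To < cs[j].To
	})
}

// diffContracts compares the contracts currently programmed with the
// desired set and returns only those to add and those to remove.
// Contracts present in both sets are not touched.
func diffContracts(current, desired []Contract) (toAdd, toRemove []Contract) {
	cur := make(map[Contract]bool, len(current))
	for _, c := range current {
		cur[c] = true
	}
	want := make(map[Contract]bool, len(desired))
	for _, c := range desired {
		want[c] = true
	}
	for c := range want {
		if !cur[c] {
			toAdd = append(toAdd, c)
		}
	}
	for c := range cur {
		if !want[c] {
			toRemove = append(toRemove, c)
		}
	}
	sortContracts(toAdd)
	sortContracts(toRemove)
	return toAdd, toRemove
}

func main() {
	// Assumed starting state: EPG A already talks to EPG B.
	current := []Contract{{"A", "B"}}
	// Desired state: keep A->B and add A->C and C->A.
	desired := []Contract{{"A", "B"}, {"A", "C"}, {"C", "A"}}

	toAdd, toRemove := diffContracts(current, desired)
	fmt.Println("add:", toAdd)
	fmt.Println("remove:", toRemove)
}
```

In this sketch the existing A-to-B contract appears in neither list, so an update built on such a diff would have no reason to touch it while the A-to-C and C-to-A contracts are created.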

nick-oconnor avatar Aug 18 '17 21:08 nick-oconnor

@ngpitt, are you seeing service disruptions during any EPG rule update in a Contiv-ACI setup? From your description, it seems that any rule update in the Contiv system causes a service disruption. If possible, can you share the netplugin/netmaster logs from your setup during the error state?

g1rana avatar Feb 12 '18 16:02 g1rana

@g1rana I don't think any errors were thrown. The issue stems from the fact that ANPs are completely deleted and re-created in ACI when any of their associated objects are changed in Contiv (i.e., rule/EPG creation/deletion). This causes any container using that ANP to lose network connectivity until the sync completes. This behavior is confirmed by the code here: https://github.com/contiv/netplugin/blob/1146f6875894706998d2d671aaa730e5e92f1726/netmaster/objApi/apiController.go#L1516 Let me know if you need any more information.

nick-oconnor avatar Feb 12 '18 17:02 nick-oconnor

@ngpitt Can you share the following information?

  1. Docker version. Is this Docker EE or Docker CE?
  2. ACI version.
  3. Is this with Docker Swarm?

g1rana avatar Feb 20 '18 20:02 g1rana

  1. Docker CE 17.06.1 (CentOS)
  2. ACI 2.3(1f)
  3. Kubernetes

Unfortunately I no longer have access to a functioning Contiv setup with these versions (we've switched to the native ACI CNI and updated Docker, ACI, and K8s versions). I'll try to provide as much info as possible.

nick-oconnor avatar Feb 20 '18 20:02 nick-oconnor

This is a flaw in the design of netplugin, because it doesn't contemplate the scenario described in the OP, wherein you have two ANPs in communication (A and B) with AB contracts in use. Adding a new ANP C with AC contracts will delete A and re-create it, causing AB contracts to lose connectivity for the duration of the update process.

This is independent of the versions of the components being orchestrated.

TetsujinOni avatar Feb 20 '18 22:02 TetsujinOni

Yes, I agree with your description, but before making changes to the code or design I want to reproduce it on more or less the same versions you had.

g1rana avatar Feb 22 '18 18:02 g1rana

@ngpitt, it looks like you don't have any issues with the native ACI CNI plugin in K8s. Let me know if you still want me to work on this, since the issue relates to the original Contiv ACI gw plugin, which I believe you are no longer using.

g1rana avatar Feb 24 '18 00:02 g1rana