netplugin
Support for adding and removing rules without service disruptions
Description
Currently, adding or removing a rule that is linked to an ANP (policy => EPG => ANP) causes that ANP to be deleted and recreated. While this has little effect on network connectivity in Contiv-only setups, the effect is much more pronounced when using Contiv with ACI and can lead to noticeable downtime.
Expected Behavior
When using Contiv with ACI, adding or removing a rule should update only the required contract in ACI and its links to the specified EPGs, without deleting and recreating the associated ANP.
Use case
You have an EPG A that is currently communicating with EPG B in a Contiv with ACI setup. You would like to add a new EPG C. You would like to allow EPG A to access EPG C and for EPG C to access EPG A without disrupting communication between EPGs A and B.
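The expected behavior for this use case can be sketched as a diff between the current and desired contract sets, so that only the changes are pushed to ACI. This is a minimal illustration, not netplugin's or ACI's actual API: the `Contract` type and the `diffContracts` function are hypothetical names introduced here, and a real implementation would translate the result into ACI calls.

```go
package main

import (
	"fmt"
	"sort"
)

// Contract is a hypothetical stand-in for an ACI contract that lets
// one EPG reach another. It is not a netplugin or ACI type.
type Contract struct {
	From, To string
}

// sortContracts orders contracts by From, then To, for stable output.
func sortContracts(cs []Contract) {
	sort.Slice(cs, func(i, j int) bool {
		if cs[i].From != cs[j].From {
			return cs[i].From < cs[j].From
		}
		return cs[i].To < cs[j].To
	})
}

// diffContracts returns the contracts that must be created and the ones
// that must be deleted to move from current to desired. Contracts present
// in both sets are left untouched, so their traffic is not interrupted.
func diffContracts(current, desired []Contract) (toAdd, toRemove []Contract) {
	cur := make(map[Contract]bool, len(current))
	for _, c := range current {
		cur[c] = true
	}
	want := make(map[Contract]bool, len(desired))
	for _, c := range desired {
		if want[c] {
			continue // ignore duplicates in the desired set
		}
		want[c] = true
		if !cur[c] {
			toAdd = append(toAdd, c)
		}
	}
	for c := range cur {
		if !want[c] {
			toRemove = append(toRemove, c)
		}
	}
	sortContracts(toAdd)
	sortContracts(toRemove)
	return toAdd, toRemove
}

func main() {
	// EPG A and EPG B already communicate in both directions.
	current := []Contract{{"A", "B"}, {"B", "A"}}
	// EPG C is added and should communicate with EPG A in both directions.
	desired := append([]Contract{}, current...)
	desired = append(desired, Contract{"A", "C"}, Contract{"C", "A"})

	toAdd, toRemove := diffContracts(current, desired)
	fmt.Printf("add: %v\nremove: %v\n", toAdd, toRemove)
}
```

In this sketch the A/B contracts appear in neither result, which is the property the request asks for: adding EPG C only creates the A/C contracts and leaves the existing A/B communication alone.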
ngpitt, are you seeing service disruptions during every EPG rule update in your Contiv-ACI setup? From your description, it seems that any rule update in Contiv causes a service disruption. If possible, can you share the netplugin/netmaster logs from your setup during the error state?
@g1rana I don't think any errors were thrown. The issue stems from the fact that ANPs are completely deleted and re-created in ACI when any of their associated objects are changed in Contiv (i.e. rule/EPG creation/deletion). This causes any container using that ANP to lose network connectivity until the sync completes. This behavior is confirmed by the code here: https://github.com/contiv/netplugin/blob/1146f6875894706998d2d671aaa730e5e92f1726/netmaster/objApi/apiController.go#L1516 Let me know if you need any more information.
@ngpitt Can you share the following info? 1. Docker version: is this Docker EE or Docker CE? 2. ACI version. 3. Is this with Docker Swarm?
- Docker CE 17.06.1 (CentOS)
- ACI 2.3(1f)
- Kubernetes
Unfortunately I no longer have access to a functioning Contiv setup with these versions (we've switched to the native ACI CNI and updated Docker, ACI, and K8s versions). I'll try to provide as much info as possible.
This is a flaw in the design of netplugin, because it doesn't contemplate the scenario described in the OP, wherein you have two ANPs in communication (A and B) with AB contracts in use. Adding a new ANP C with AC contracts will delete A and re-create it, causing traffic over the AB contracts to lose connectivity for the duration of the update process.
This is independent of the versions of the components being orchestrated.
Yes, I agree with your description, but I want to reproduce it on more or less the same versions you had before making changes to the code/design.
ngpitt, it looks like you don't have any issue with the native ACI CNI plugin in k8s. Let me know if you still want me to work on this, since the issue is related to the original Contiv ACI gw plugin, which I believe you are no longer using.