deployments-k8s
After component restart there are more interfaces in NSE than expected
Expected Behavior
We are testing restarts of different combinations of NSM components while traffic is running. Besides the traffic recovering once the components are up and running again, the NSE should end up with the same number of interfaces as before the restart.
Current Behavior
I attached logs from a case where we restarted an NSC and a forwarder-vpp in the system and there was one more interface (5) in the NSE than the expected 4.
We observed the same behavior several times with different combinations of restarted components, but the NSC was always one of them. So it seems that the NSC restart influences this.
Failure Information (for bugs)
Described above in the Current behavior section.
Steps to Reproduce
- The setup is based on the NSM v1.11.2 basic kernel to ethernet to kernel example with 4 worker nodes. There is an NSC on each node and there are 2 NSEs (one for IPv4 and one for IPv6).
- Started traffic between the NSCs and NSEs.
- An NSC and a forwarder-vpp were restarted. After both came up, there was an additional interface in the NSE (nse-ipv4-c9cd8cf77-gwnth) besides the original 4.
Context
- Kubernetes Version: 1.27.1
- NSM Version: v1.11.2 / v1.12.1-rc.1
Failure Logs
Related to https://github.com/networkservicemesh/sdk/issues/1020
@edwarnicke, @denis-tingaikin: I got the information that the extra allocated IPs are not cleaned up even after 10 minutes.
@szvincze This is not expected and looks unhealthy, so we will look into the problem. Also, if possible, it would be helpful to have logs from the NSE right after the request and again 10 minutes after the forwarder died.
I have tried several cluster configurations on Kubernetes 1.27.1 and NSM v1.13.2. Some results:
1 NSE 1 NSC
- local - on restart cleanup
- remote - reproduced original issue, but interface cleaned up after timeout
1 NSC 2 NSE
- local - on restart cleanup
- remote - on restart cleanup
2 NSC 2 NSE
- local - on restart cleanup
- remote - on restart cleanup
So, the most similar steps for me were:
- Set up 1 NSC and 1 NSE on different nodes
- Restart the forwarder on the same node as the client
- Immediately afterwards, restart the NSC
- Check the NSE interfaces
As a result, there was an additional interface on the NSE, but there were no additional refresh requests for it, and it was cleaned up after 10 minutes. This looks like expected behavior.
So we need additional steps to reproduce the case where the leaked interface is not cleaned up after 10 minutes.
@Ex4amp1e: I have checked with my colleagues whether the same behavior can be reproduced with the NSM v1.13.2 release. It seems the leaking interface is always removed within 10 minutes. We will have a meeting to clarify and validate the test cases used. I will come back with feedback soon, so please keep this issue open until then.
Hi @Ex4amp1e, I have checked the situation with my colleagues. In some cases everything works as expected and the extra interface is deleted within 10 minutes, but they have still seen occurrences where the interface remains. The most recent one they mentioned was when the NSE and the registry-k8s pods were restarted. No particular trick was needed; they just ran this test several times. We are checking the frequency of the occurrences and I will share it soon.
As far as the test results show, in the vast majority of the successful reproductions the combination of components was registry-k8s and NSE, and in a few cases registry-k8s and NSC.
Hi @szvincze, I tried restarting the NSE and registry-k8s pods many times and have not reproduced the issue. Also, a restart of the NSE/registry is more about healing; are you sure that we get an interface leak in that case? We need to know the overall use case to get more details for reproducing it. Maybe we are missing some preconditions, or everything is working as expected. We are still blocked, so could you please provide more details:
- exact steps to reproduce
- observed behaviour
- expected behaviour
Here are the reproduction, the respective logs, and an analysis of the issue.
Note that the tests cover several different component restart combinations, but in our experience the failing cases happened when registry-k8s was one of the components and it was the first in the row.
OK - "After 10 min, TC shows the expected num of interfaces"
FAIL - "After 10 min, TC num of interfaces still doesn't match"
Test #1363:
robustness-multiple-component-restart.sh nse + registry-k8s FAIL
robustness-multiple-component-restart.sh nsc + forwarder-vpp OK
robustness-multiple-component-restart.sh nsc + nsmgr OK
Investigation of "nse + registry-k8s FAIL" scenario
The test VM had 6 worker nodes, "n4" to "n9":
NAME STATUS ROLES AGE VERSION INTERNAL-IP
pool1-n185-vpod4-pool1-n4 Ready worker 325d v1.27.1 10.0.40.104
pool1-n185-vpod4-pool1-n5 Ready worker 325d v1.27.1 10.0.40.105
pool1-n185-vpod4-pool1-n6 Ready worker 325d v1.27.1 10.0.40.103
pool1-n185-vpod4-pool1-n7 Ready worker 325d v1.27.1 10.0.40.107
pool1-n185-vpod4-pool1-n8 Ready worker 325d v1.27.1 10.0.40.106
pool1-n185-vpod4-pool1-n9 Ready worker 318d v1.27.1 10.0.40.108
Registry-k8s had 2 replicas running on nodes "n4" and "n8":
[2024-08-28T02:21:36.242Z] NAME READY STATUS RESTARTS AGE IP NODE
[2024-08-28T02:21:36.242Z] registry-k8s-cc57559db-h7cqw 2/2 Running 1 (20m ago) 58m 192.168.167.28 pool1-n185-vpod4-pool1-n8
[2024-08-28T02:21:36.242Z] registry-k8s-cc57559db-pt5gj 2/2 Running 1 (23m ago) 58m 192.168.172.49 pool1-n185-vpod4-pool1-n4
6 NSC and 2 NSE endpoints were running:
[2024-08-28T02:21:36.497Z] NAME READY STATUS RESTARTS AGE IP NODE
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-9d9wp 1/1 Running 0 57m 192.168.172.76 pool1-n185-vpod4-pool1-n6
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-hxv9s 1/1 Running 0 57m 192.168.5.86 pool1-n185-vpod4-pool1-n9
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-mbh92 1/1 Running 1 (23m ago) 57m 192.168.172.54 pool1-n185-vpod4-pool1-n4
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-nhstj 1/1 Running 1 (20m ago) 57m 192.168.167.55 pool1-n185-vpod4-pool1-n8
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-spgkc 1/1 Running 0 57m 192.168.71.242 pool1-n185-vpod4-pool1-n5
[2024-08-28T02:21:36.497Z] nsc-8dbb45d97-vl9zq 1/1 Running 0 57m 192.168.235.18 pool1-n185-vpod4-pool1-n7
[2024-08-28T02:21:36.497Z] nse-ipv4-77c55d5b94-s8r87 1/1 Running 1 (28m ago) 57m 192.168.172.113 pool1-n185-vpod4-pool1-n6
[2024-08-28T02:21:36.497Z] nse-ipv6-6ccb65cdf7-djvvt 1/1 Running 1 (25m ago) 57m 192.168.167.15 pool1-n185-vpod4-pool1-n8
One NSE and one registry-k8s pod running on the same node were chosen:
[2024-08-28T02:21:37.315Z] robustness-multiple-component-restart.sh: The following pods will be killed:
nse-ipv6-6ccb65cdf7-djvvt and registry-k8s-cc57559db-h7cqw which are located on the node: pool1-n185-vpod4-pool1-n8
The NSC that was also on the same node "n8":
[2024-08-28T02:21:37.315Z] robustness-multiple-component-restart.sh: These endpoint pods share the same worker:
[2024-08-28T02:21:37.315Z] nsc-8dbb45d97-nhstj
[2024-08-28T02:21:37.315Z] nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:21:37.315Z] robustness-multiple-component-restart.sh: Starting traffic
[2024-08-28T02:21:38.640Z] robustness-multiple-component-restart.sh: IPv6 server IP in nsc-8dbb45d97-9d9wp: 100:100::7
[2024-08-28T02:21:39.199Z] robustness-multiple-component-restart.sh: kubectl exec -n nsm-ep nsc-8dbb45d97-hxv9s -- ip -o addr show dev nsm-ipv4 scope global
[2024-08-28T02:21:39.474Z] robustness-multiple-component-restart.sh: kubectl exec -n nsm-ep nsc-8dbb45d97-hxv9s -- ip -o addr show dev nsm-ipv6 scope global
[2024-08-28T02:21:39.749Z] robustness-multiple-component-restart.sh: IPv4 server IP in nsc-8dbb45d97-hxv9s: 172.16.1.99
[2024-08-28T02:21:40.314Z] robustness-multiple-component-restart.sh: IPv6 server IP in nsc-8dbb45d97-hxv9s: 100:100::1
[2024-08-28T02:21:42.502Z] robustness-multiple-component-restart.sh: kubectl exec -n nsm-ep nsc-8dbb45d97-nhstj -- ip -o addr show dev nsm-ipv4 scope global
[2024-08-28T02:21:42.757Z] robustness-multiple-component-restart.sh: kubectl exec -n nsm-ep nsc-8dbb45d97-nhstj -- ip -o addr show dev nsm-ipv6 scope global
[2024-08-28T02:21:43.012Z] robustness-multiple-component-restart.sh: IPv4 server IP in nsc-8dbb45d97-nhstj: 172.16.1.101
[2024-08-28T02:21:43.571Z] robustness-multiple-component-restart.sh: IPv6 server IP in nsc-8dbb45d97-nhstj: 100:100::1
[2024-08-28T02:21:45.316Z] robustness-multiple-component-restart.sh: IPv6 server IP in nsc-8dbb45d97-spgkc: 100:100::5
[2024-08-28T02:21:46.946Z] robustness-multiple-component-restart.sh: IPv6 server IP in nsc-8dbb45d97-vl9zq: 100:100::3
...
[2024-08-28T02:21:48.429Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-9d9wp(100:100::7) from nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:21:48.989Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-hxv9s(172.16.1.99) from nse-ipv4-77c55d5b94-s8r87
[2024-08-28T02:21:49.549Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-hxv9s(100:100::1) from nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:21:51.596Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-nhstj(172.16.1.101) from nse-ipv4-77c55d5b94-s8r87
[2024-08-28T02:21:52.156Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-nhstj(100:100::1) from nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:21:53.641Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-spgkc(100:100::5) from nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:21:55.125Z] robustness-multiple-component-restart.sh: sending traffic to nsc-8dbb45d97-vl9zq(100:100::3) from nse-ipv6-6ccb65cdf7-djvvt
...
[2024-08-28T02:21:55.685Z] ctraffic: starting(nsc-8dbb45d97-9d9wp_nsc_86b): ./ctraffic -address [100:100::7]:5003 -server
[2024-08-28T02:21:55.940Z] ctraffic: server(nsc-8dbb45d97-9d9wp_nsc_86b): 2024/08/28 02:21:55 Listen on address; [100:100::7]:5003
[2024-08-28T02:21:55.940Z] ctraffic: starting(nsc-8dbb45d97-hxv9s_nsc_f43): ./ctraffic -address 172.16.1.99:5003 -server
[2024-08-28T02:21:55.940Z] ctraffic: starting(nsc-8dbb45d97-hxv9s_nsc_1d0): ./ctraffic -address [100:100::1]:5003 -server
[2024-08-28T02:21:56.195Z] ctraffic: server(nsc-8dbb45d97-hxv9s_nsc_f43): 2024/08/28 02:21:55 Listen on address; 172.16.1.99:5003
[2024-08-28T02:21:56.195Z] ctraffic: server(nsc-8dbb45d97-hxv9s_nsc_1d0): 2024/08/28 02:21:56 Listen on address; [100:100::1]:5003
[2024-08-28T02:21:56.195Z] ctraffic: starting(nsc-8dbb45d97-nhstj_nsc_b2b): ./ctraffic -address 172.16.1.101:5003 -server
[2024-08-28T02:21:56.449Z] ctraffic: starting(nsc-8dbb45d97-nhstj_nsc_01a): ./ctraffic -address [100:100::1]:5003 -server
[2024-08-28T02:21:56.449Z] ctraffic: server(nsc-8dbb45d97-nhstj_nsc_b2b): 2024/08/28 02:21:56 Listen on address; 172.16.1.101:5003
[2024-08-28T02:21:56.449Z] ctraffic: server(nsc-8dbb45d97-nhstj_nsc_01a): 2024/08/28 02:21:56 Listen on address; [100:100::1]:5003
[2024-08-28T02:21:56.704Z] ctraffic: starting(nsc-8dbb45d97-vl9zq_nsc_2c7): ./ctraffic -address [100:100::3]:5003 -server
[2024-08-28T02:21:56.704Z] ctraffic: starting(nsc-8dbb45d97-spgkc_nsc_5a0): ./ctraffic -address [100:100::5]:5003 -server
[2024-08-28T02:21:56.704Z] ctraffic: server(nsc-8dbb45d97-spgkc_nsc_5a0): 2024/08/28 02:21:56 Listen on address; [100:100::5]:5003
[2024-08-28T02:21:56.961Z] ctraffic: server(nsc-8dbb45d97-vl9zq_nsc_2c7): 2024/08/28 02:21:56 Listen on address; [100:100::3]:5003
[2024-08-28T02:21:57.473Z] ctraffic: starting(nse-ipv6-6ccb65cdf7-djvvt_nse_30c): ./ctraffic -address [100:100::7]:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:57.727Z] ctraffic: starting(nse-ipv6-6ccb65cdf7-djvvt_nse_070): ./ctraffic -address [100:100::1]:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:57.982Z] ctraffic: starting(nse-ipv4-77c55d5b94-s8r87_nse_415): ./ctraffic -address 172.16.1.101:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:58.237Z] ctraffic: starting(nse-ipv4-77c55d5b94-s8r87_nse_32e): ./ctraffic -address 172.16.1.101:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:58.497Z] ctraffic: starting(nse-ipv6-6ccb65cdf7-djvvt_nse_0e7): ./ctraffic -address [100:100::1]:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:58.752Z] ctraffic: starting(nse-ipv6-6ccb65cdf7-djvvt_nse_6df): ./ctraffic -address [100:100::5]:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
[2024-08-28T02:21:59.006Z] ctraffic: starting(nse-ipv6-6ccb65cdf7-djvvt_nse_53e): ./ctraffic -address [100:100::3]:5003 -nconn 100 -rate 50 -timeout 240s -monitor -stats all
Kill the chosen pods at the same time:
[2024-08-28T02:22:04.637Z] robustness-multiple-component-restart.sh: kubectl exec nse-ipv6-6ccb65cdf7-djvvt -n nsm-ep -c nse -- kill 1
[2024-08-28T02:22:05.148Z] robustness-multiple-component-restart.sh: kubectl exec registry-k8s-cc57559db-h7cqw -n nsm -c registry-k8s -- bash -c 'kill -9 14'
[2024-08-28T02:22:05.403Z] robustness-multiple-component-restart.sh: kubectl wait --for=condition=ready --timeout=3m pod -n nsm-ep -l endpoint-type=nse
[2024-08-28T02:22:06.679Z] robustness-multiple-component-restart.sh: kubectl wait --for=condition=ready --timeout=3m pod -n nsm -l app=registry-k8s
The NSE was up 1 second after the kill:
[2024-08-28T02:22:05.658Z] robustness-multiple-component-restart.sh: nse-ipv6-6ccb65cdf7-djvvt latest container was started at null
[2024-08-28T02:22:06.679Z] robustness-multiple-component-restart.sh: nse-ipv6-6ccb65cdf7-djvvt latest container was started at 2024-08-28T02:22:05Z
Registry-k8s was up 6 seconds after the kill:
[2024-08-28T02:22:11.872Z] pod/registry-k8s-cc57559db-h7cqw condition met
[2024-08-28T02:22:11.872Z] pod/registry-k8s-cc57559db-pt5gj condition met
[2024-08-28T02:22:11.872Z] robustness-multiple-component-restart.sh: registry-k8s-cc57559db-h7cqw latest container was started at 2024-08-28T02:22:05Z
Traffic stopped smoothly:
[2024-08-28T02:22:11.872Z] robustness-multiple-component-restart.sh: Stopping traffic due to NSE/NSC was rebooted, currently the evaluation is not advanced enough
[2024-08-28T02:22:19.125Z] robustness-multiple-component-restart.sh: Executing "check_num_of_endpoint_IPs nsm-ep 2>&1" until return is 0 or 150 second passed
[2024-08-28T02:25:10.516Z] robustness-multiple-component-restart.sh: WARNING: Time is up, giving up (after 11 retries)
[2024-08-28T02:37:38.499Z] DEBUG: nse-ipv6-6ccb65cdf7-djvvt
[2024-08-28T02:37:38.499Z] 100:100::
[2024-08-28T02:37:38.499Z] 100:100::2
[2024-08-28T02:37:38.499Z] 100:100::
[2024-08-28T02:37:38.499Z] 100:100::4
[2024-08-28T02:37:38.499Z] 100:100::2
[2024-08-28T02:37:38.499Z] 100:100::6
[2024-08-28T02:37:38.499Z] 100:100::4
[2024-08-28T02:37:38.499Z] 100:100::8
[2024-08-28T02:37:38.499Z] 100:100::a
[2024-08-28T02:37:38.499Z] 100:100::6
[2024-08-28T02:25:10.516Z] robustness-multiple-component-restart.sh: Errors detected:
[2024-08-28T02:25:10.516Z] nse-ipv6-6ccb65cdf7-djvvt has not the same number of IPs as the number of NSCs (6)
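The wait step in the log above ("until return is 0 or 150 second passed") can be sketched as a generic retry loop. This is only an illustration and not the actual test script: the function name `retry_until`, its parameters, and any retry interval passed to it are assumptions; only the 150-second limit comes from the log.

```python
import time
from typing import Callable, Tuple


def retry_until(
    check: Callable[[], bool],
    timeout_s: float,
    interval_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Call check() repeatedly until it returns True or the time budget runs out.

    Returns (succeeded, attempts). `clock` and `sleep` are injectable so the
    loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if check():
            return True, attempts
        # Give up if the next attempt would start after the deadline.
        if clock() + interval_s > deadline:
            return False, attempts
        sleep(interval_s)
```

A caller would pass a `check` function that counts the addresses on the NSE and compares the count with the number of NSCs (6 in this run).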
The following shows which IP was added to which interface (interfaces_nsm-ep_nse-ipv6-6ccb65cdf7-djvvt.txt):
3: eth0@if2518: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
link/ether 5e:31:de:48:97:64 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.167.15/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fd00:eccd:18:ffff:6c5e:414c:2f14:a70f/128 scope global
valid_lft forever preferred_lft forever
46: icmp-respo-9083: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:38:33:ea:99 brd ff:ff:ff:ff:ff:ff
inet6 100:100::/128 scope global nodad
valid_lft forever preferred_lft forever
47: icmp-respo-87f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:10:63:51:9e brd ff:ff:ff:ff:ff:ff
inet6 100:100::2/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::/128 scope global nodad
valid_lft forever preferred_lft forever
48: icmp-respo-0c07: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:fc:c6:63:4f brd ff:ff:ff:ff:ff:ff
inet6 100:100::4/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::2/128 scope global nodad
valid_lft forever preferred_lft forever
49: icmp-respo-7315: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:de:83:d8:12 brd ff:ff:ff:ff:ff:ff
inet6 100:100::6/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::4/128 scope global nodad
valid_lft forever preferred_lft forever
50: icmp-respo-54b6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:d6:25:5a:87 brd ff:ff:ff:ff:ff:ff
inet6 100:100::8/128 scope global nodad
valid_lft forever preferred_lft forever
51: icmp-respo-d2d2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:84:03:76:9d brd ff:ff:ff:ff:ff:ff
inet6 100:100::a/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::6/128 scope global nodad
valid_lft forever preferred_lft forever
The expectation is to have 6 IP addresses in total, as indicated in the error; the output above shows 10.
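To make the counting reproducible, the `ip addr` output can be parsed into a map of interface name to global addresses. The following is a minimal sketch under assumptions: the function name `addresses_by_interface` and the regular expressions are illustrative, and the sample is a shortened excerpt of the output above (only three of the interfaces).

```python
import re

# "46: icmp-respo-9083: <...>" or "3: eth0@if2518: <...>"
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)(?:@\S+)?:")
# "inet6 100:100::2/128 scope global nodad" or "inet 192.168.167.15/32 scope global eth0"
ADDR_RE = re.compile(r"^inet6?\s+(\S+?)/\d+\s+scope\s+global")


def addresses_by_interface(ip_addr_output: str) -> dict:
    """Map each interface name to the list of its global-scope addresses."""
    result = {}
    current = None
    for raw in ip_addr_output.splitlines():
        line = raw.strip()
        m = IFACE_RE.match(line)
        if m:
            current = m.group(1)
            result[current] = []
            continue
        m = ADDR_RE.match(line)
        if m and current is not None:
            result[current].append(m.group(1))
    return result


SAMPLE = """\
3: eth0@if2518: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
    link/ether 5e:31:de:48:97:64 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.167.15/32 scope global eth0
       valid_lft forever preferred_lft forever
46: icmp-respo-9083: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:38:33:ea:99 brd ff:ff:ff:ff:ff:ff
    inet6 100:100::/128 scope global nodad
       valid_lft forever preferred_lft forever
47: icmp-respo-87f4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
    link/ether 02:fe:10:63:51:9e brd ff:ff:ff:ff:ff:ff
    inet6 100:100::2/128 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 100:100::/128 scope global nodad
       valid_lft forever preferred_lft forever
"""

parsed = addresses_by_interface(SAMPLE)
# Count only the NSM interfaces (named icmp-respo-* in this setup).
nsm_total = sum(len(a) for name, a in parsed.items() if name.startswith("icmp-respo-"))
print(parsed, nsm_total)
```

Running the same function on the full dump and comparing the total with the number of NSCs reproduces the check that failed in the test.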
Part 2:
The following NSCs also failed the check; their interface configurations are shown below:
NSC #1
ctraffic: nse-ipv6-6ccb65cdf7-djvvt_nse_30c was not running
[2024-08-28T02:37:38.499Z] DEBUG: nsc-8dbb45d97-9d9wp
[2024-08-28T02:37:38.499Z] 172.16.1.97
[2024-08-28T02:37:38.499Z] 100:100::b
[2024-08-28T02:37:38.499Z] 100:100::7
[2024-08-28T02:25:10.516Z] nsc-8dbb45d97-9d9wp has not the same number of IPs as the number of NSEs (2)
pnes/endpoint_pod_info//interfaces_nsm-ep_nsc-8dbb45d97-9d9wp_nsc.txt
3: eth0@if8538: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
link/ether be:ec:c2:ca:a5:93 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.172.76/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fd00:eccd:18:ffff:552e:f937:bfc6:3cee/128 scope global
valid_lft forever preferred_lft forever
13: nsm-ipv4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:4e:43:00:59 brd ff:ff:ff:ff:ff:ff
inet 172.16.1.97/32 scope global nsm-ipv4
valid_lft forever preferred_lft forever
16: nsm-ipv6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:58:8f:be:41 brd ff:ff:ff:ff:ff:ff
inet6 100:100::b/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::7/128 scope global nodad
valid_lft forever preferred_lft forever
Address 100:100::7 on the nsm-ipv6 interface was receiving traffic from the restarted NSE nse-ipv6-6ccb65cdf7-djvvt.
NSC #2
[2024-08-28T02:37:38.499Z] DEBUG: nsc-8dbb45d97-hxv9s
[2024-08-28T02:37:38.499Z] 172.16.1.99
[2024-08-28T02:37:38.499Z] 100:100::3
[2024-08-28T02:37:38.499Z] 100:100::1
[2024-08-28T02:25:10.516Z] nsc-8dbb45d97-hxv9s has not the same number of IPs as the number of NSEs (2)
pnes/endpoint_pod_info/interfaces_nsm-ep_nsc-8dbb45d97-hxv9s_nsc.txt
3: eth0@if22674: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
link/ether fa:eb:95:7c:33:ac brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.5.86/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fd00:eccd:18:ffff:7a1b:ca35:be07:141b/128 scope global
valid_lft forever preferred_lft forever
14: nsm-ipv4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:24:98:49:19 brd ff:ff:ff:ff:ff:ff
inet 172.16.1.99/32 scope global nsm-ipv4
valid_lft forever preferred_lft forever
17: nsm-ipv6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:d0:db:49:6d brd ff:ff:ff:ff:ff:ff
inet6 100:100::3/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::1/128 scope global nodad
valid_lft forever preferred_lft forever
Address 100:100::1 on the nsm-ipv6 interface was receiving traffic from the restarted NSE nse-ipv6-6ccb65cdf7-djvvt.
NSC #3
[2024-08-28T02:37:38.499Z] DEBUG: nsc-8dbb45d97-spgkc
[2024-08-28T02:37:38.499Z] 172.16.1.107
[2024-08-28T02:37:38.499Z] 100:100::7
[2024-08-28T02:37:38.499Z] 100:100::5
[2024-08-28T02:25:10.516Z] nsc-8dbb45d97-spgkc has not the same number of IPs as the number of NSEs (2)
pnes/endpoint_pod_info/interfaces_nsm-ep_nsc-8dbb45d97-spgkc_nsc.txt
3: eth0@if332: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
link/ether 36:cc:81:b5:81:2e brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.71.242/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fd00:eccd:18:ffff:4ab8:5e1b:ecf1:cbb2/128 scope global
valid_lft forever preferred_lft forever
14: nsm-ipv4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:04:c6:94:7a brd ff:ff:ff:ff:ff:ff
inet 172.16.1.107/32 scope global nsm-ipv4
valid_lft forever preferred_lft forever
17: nsm-ipv6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:88:34:a0:80 brd ff:ff:ff:ff:ff:ff
inet6 100:100::7/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::5/128 scope global nodad
valid_lft forever preferred_lft forever
Address 100:100::5 on the nsm-ipv6 interface was receiving traffic from the restarted NSE nse-ipv6-6ccb65cdf7-djvvt.
NSC #4
[2024-08-28T02:37:38.499Z] DEBUG: nsc-8dbb45d97-vl9zq
[2024-08-28T02:37:38.499Z] 172.16.1.103
[2024-08-28T02:37:38.499Z] 100:100::5
[2024-08-28T02:37:38.499Z] 100:100::3
[2024-08-28T02:25:10.516Z] nsc-8dbb45d97-vl9zq has not the same number of IPs as the number of NSEs (2)
pnes/endpoint_pod_info/interfaces_nsm-ep_nsc-8dbb45d97-vl9zq_nsc.txt
3: eth0@if9490: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default
link/ether ba:da:a7:f3:e6:65 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 192.168.235.18/32 scope global eth0
valid_lft forever preferred_lft forever
inet6 fd00:eccd:18:ffff:10a:1ba0:a209:2fa8/128 scope global
valid_lft forever preferred_lft forever
14: nsm-ipv4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:a2:86:b9:eb brd ff:ff:ff:ff:ff:ff
inet 172.16.1.103/32 scope global nsm-ipv4
valid_lft forever preferred_lft forever
17: nsm-ipv6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8946 qdisc mq state UNKNOWN group default qlen 1000
link/ether 02:fe:5c:47:e1:4a brd ff:ff:ff:ff:ff:ff
inet6 100:100::5/128 scope global nodad
valid_lft forever preferred_lft forever
inet6 100:100::3/128 scope global nodad
valid_lft forever preferred_lft forever
Address 100:100::3 on the nsm-ipv6 interface was receiving traffic from the restarted NSE nse-ipv6-6ccb65cdf7-djvvt.
SUMMARY:
On the NSC side, these were the interface addresses that were expected to be removed after the NSE (nse-ipv6-6ccb65cdf7-djvvt) went down:
nsc-8dbb45d97-hxv9s 100:100::1
nsc-8dbb45d97-vl9zq 100:100::3
nsc-8dbb45d97-spgkc 100:100::5
nsc-8dbb45d97-9d9wp 100:100::7
On the NSE side, the expected interface addresses were the following (taken from a successful result in a previous test case):
[2024-08-28T01:24:15.428Z] 100:100::
[2024-08-28T01:24:15.428Z] 100:100::2
[2024-08-28T01:24:15.428Z] 100:100::4
[2024-08-28T01:24:15.428Z] 100:100::6
[2024-08-28T01:24:15.428Z] 100:100::8
[2024-08-28T01:24:15.428Z] 100:100::a
But the old IPv6 addresses were neither removed nor replaced.
nse-ipv6-6ccb65cdf7-djvvt ifconfig:
46: icmp-respo-9083: ( 100:100:: )
47: icmp-respo-87f4: ( 100:100::2 100:100:: )
48: icmp-respo-0c07: ( 100:100::4 100:100::2 )
49: icmp-respo-7315: ( 100:100::6 100:100::4 )
50: icmp-respo-54b6: ( 100:100::8 )
51: icmp-respo-d2d2: ( 100:100::a 100:100::6 )
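The leftover addresses in the summary above can be identified mechanically: any address that appears on more than one interface is a candidate for a stale entry. Below is a small sketch of that comparison; the data is copied from the summary, while the function name `find_duplicates` and the interpretation of duplicates as leaked addresses are assumptions made for illustration.

```python
from collections import defaultdict

# Interface -> addresses, as listed in the summary above.
observed = {
    "icmp-respo-9083": ["100:100::"],
    "icmp-respo-87f4": ["100:100::2", "100:100::"],
    "icmp-respo-0c07": ["100:100::4", "100:100::2"],
    "icmp-respo-7315": ["100:100::6", "100:100::4"],
    "icmp-respo-54b6": ["100:100::8"],
    "icmp-respo-d2d2": ["100:100::a", "100:100::6"],
}

# Addresses from the earlier successful test case.
expected = ["100:100::", "100:100::2", "100:100::4",
            "100:100::6", "100:100::8", "100:100::a"]


def find_duplicates(by_interface: dict) -> dict:
    """Return {address: [interfaces]} for addresses seen on more than one interface."""
    owners = defaultdict(list)
    for iface, addrs in by_interface.items():
        for addr in addrs:
            owners[addr].append(iface)
    return {addr: ifaces for addr, ifaces in owners.items() if len(ifaces) > 1}


duplicates = find_duplicates(observed)
total = sum(len(addrs) for addrs in observed.values())
unique = {addr for addrs in observed.values() for addr in addrs}

print("total addresses:", total)
print("unique addresses match expected:", unique == set(expected))
for addr, ifaces in duplicates.items():
    print(addr, "is present on", ", ".join(ifaces))
```

With this data the set of unique addresses equals the expected set, so the surplus consists entirely of addresses that are configured twice.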