frr
frr copied to clipboard
nhrpd: fixes core dump on shutdown
When nhrpd is shutdown via nhrp_request_stop() the shutdown sequence was not handling the case where there are active shortcut routes installed. The zebra client and shortcut rib were being cleaned up before vrf_terminate() had an opportunity to delete the active routes.
would be nice to have a topotest for this
Having a coredump inside the commit would be helpful too.
Having a coredump inside the commit would be helpful too.
core.txt link has the backtrace. Next time I'll include in line.
Having a coredump inside the commit would be helpful too.
core.txt link has the backtrace. Next time I'll include in line.
Why not this time?
@Mergifyio backport stable/10.0 stable/9.1 stable/9.0 stable/8.5
backport stable/10.0 stable/9.1 stable/9.0 stable/8.5
✅ Backports have been created
-
#16116 nhrpd: fixes core dump on shutdown (backport #15879) has been created for branch
stable/10.0
-
#16117 nhrpd: fixes core dump on shutdown (backport #15879) has been created for branch
stable/9.1
but encountered conflicts -
#16118 nhrpd: fixes core dump on shutdown (backport #15879) has been created for branch
stable/9.0
but encountered conflicts -
#16119 nhrpd: fixes core dump on shutdown (backport #15879) has been created for branch
stable/8.5
but encountered conflicts
Having a coredump inside the commit would be helpful too.
core.txt link has the backtrace. Next time I'll include in line.
Why not this time?
Hi Donatas, I'd be happy to add coredump and associated nhrpd. I looked for a previous pull request example of best practices how to do that and couldn't find another PR with a coredump. Can you suggest how FRR community would like that done. Thanks.
The zebra client and shortcut rib were being cleaned up before vrf_terminate() had an opportunity to delete the active routes.
(This is non-blocking re. this PR) — we may have a more general problem here with ordering shutdown operations. vrf_init
is called before any NHRP initialization (not counting nhrp_error_init
), and normally the shutdown calls should be in reverse order to init calls. If we need to clear out routes first, don't do that, and that causes problems… something is wrong with our setup of operations.
The topotest added by this commit is failing on Ubuntu 18.04 arm8 because 'iptables' is not installed on the image. The test leverages 'iptables' in order to be able to test nhrp shortcuts
2024/05/02-00:05:51 2024-05-02 00:05:51,225 INFO: topo: input: iptables -A FORWARD -i r1-gre0 -o r1-gre0 -m hashlimit --hashlimit-upto 4/minute --hashlimit-burst 1 --hashlimit-mode srcip,dstip --hashlimit-srcmask 24 --hashlimit-dstmask 24 --hashlimit-name loglimit-0 -j NFLOG --nflog-group 1 --nflog-range 128 2024/05/02-00:05:51 2024-05-02 00:05:51,237 WARNING: r1: Router(r1): proc failed: rc 127 pid 51424 2024/05/02-00:05:51 args: /usr/bin/nsenter --mount=/proc/50823/ns/mnt --net=/proc/50823/ns/net --uts=/proc/50823/ns/uts -F /bin/bash -c iptables -A FORWARD -i r1-gre0 -o r1-gre0 -m hashlimit --hashlimit-upto 4/minute --hashlimit-burst 1 --hashlimit-mode srcip,dstip --hashlimit-srcmask 24 --hashlimit-dstmask 24 --hashlimit-name loglimit-0 -j NFLOG --nflog-group 1 --nflog-range 128 2024/05/02-00:05:51 stdout: /bin/bash: iptables: command not found 2024/05/02-00:05:51 stderr: empty 2024/05/02-00:05:51 2024-05-02 00:05:51,238 INFO: topo: output: /bin/bash: iptables: command not found
If you are adding usage of iptables and it is not present you should skip the test on that platform. See the bgp_as_allow_in/test_bgp_as_allow_in.py script and look at setup_module and add a similiar test to the kernel version and pytest.skip(...) if its not available.
If you are adding usage of iptables and it is not present you should skip the test on that platform. See the bgp_as_allow_in/test_bgp_as_allow_in.py script and look at setup_module and add a similiar test to the kernel version and pytest.skip(...) if its not available.
Thanks Donald. I pushed the fix to ignore unsupported platforms already. Now just trying to get all the unrelated basics tests to pass. i.e.
🚫 ospf6_ecmp_inter_area.test_ospf6_ecmp_inter_area test_ecmp_inter_area
E AssertionError: 'r1' wrong number of route nexthops assert [1, 1, 1, 1, 1, 1, 2, 2, 2] == [1, 1, 1, 1, 1, 2, 2, 2, 2] At index 5 diff: 1 != 2 Full diff: - [1, 1, 1, 1, 1, 2, 2, 2, 2] ? --- + [1, 1, 1, 1, 1, 1, 2, 2, 2] ? +++
@mergifyio backport stable/8.4
backport stable/8.4
✅ Backports have been created
-
#16120 nhrpd: fixes core dump on shutdown (backport #15879) has been created for branch
stable/8.4
but encountered conflicts