zebra: fix removing kernel and connected routes on interface linkdown
Hello, this is an attempt to fix the issue https://github.com/FRRouting/frr/issues/7299.
TL;DR: unless link detection is disabled completely, FRR currently treats link down and administrative down on an interface as the same event. The kernel (at least Linux), however, deletes routes on administrative down and does not delete them on link down. This causes "kernel" routes to disappear from the RIB on link down (connected routes are recreated in the RIB on if_up).
I found that the code related to my simple test scenario could be guarded with a few conditions, but it seems too easy and probably ignores some edge cases.
Notably, one thing that my fix ignores is the procfs option mentioned here: https://github.com/FRRouting/frr/issues/7299#issuecomment-1095858414.
I have considered additionally setting some flag (new or existing) to indicate that the route is linkdown/blackhole, but I don't know which option to choose as I am not very familiar with the code base.
I also thought about splitting if_down into if_linkdown and if_down, if there are more things that should only be done on an administrative down.
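To make the link down versus administrative down distinction concrete, here is a minimal Python sketch that classifies an interface from the flag list printed by `ip -br l`, as captured in the test scenario below. It is only an illustration and not zebra code: the function name and the three state labels are made up for this example, and the third sample (a flag list without UP) is an assumption about what an administratively down interface prints, since that output is not captured in this PR.

```python
def classify_link_state(flags: str) -> str:
    """Classify an interface from the flag list printed by `ip -br link`.

    `flags` is the bracketed list, e.g. "<NO-CARRIER,BROADCAST,MULTICAST,UP>".
    The returned labels are hypothetical names used only in this sketch.
    """
    names = set(flags.strip("<>").split(","))
    if "UP" not in names:
        # Administrative down: the kernel deletes routes in this case.
        return "admin-down"
    if "LOWER_UP" not in names:
        # Administratively up but no carrier: the kernel keeps the routes
        # and marks them "linkdown".
        return "link-down"
    return "up"


for sample in (
    "<BROADCAST,MULTICAST,UP,LOWER_UP>",    # from the starting configuration
    "<NO-CARRIER,BROADCAST,MULTICAST,UP>",  # from the link down step
    "<BROADCAST,MULTICAST>",                # assumed admin-down flag list
):
    print(sample, "->", classify_link_state(sample))
```

Splitting on commas and testing set membership avoids confusing UP with LOWER_UP, which a plain substring check would do.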
This is the test scenario that I used (Linux 5.10.70, x86_64, OpenWrt 21+, FRR 8.1.0):
Starting configuration
root@root:~# ip r
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1
10.10.20.0/24 via 10.10.10.2 dev eth0 metric 16777216
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
root@root:~# vtysh -c "sh ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:46:56
C>* 10.10.10.0/24 is directly connected, eth0, 00:46:56
K>* 10.10.20.0/24 [1/0] via 10.10.10.2, eth0, 00:46:56
C>* 192.168.56.0/24 is directly connected, eth3, 00:46:56
root@root:~# ip -br l show dev eth0
eth0 UP 0c:13:fa:95:00:00 <BROADCAST,MULTICAST,UP,LOWER_UP>
root@root:~# ip -br a show dev eth0
eth0 UP 10.10.10.1/24 fd88:f72d:f91e::1/60 fe80::e13:faff:fe95:0/64
shutdown link
root@root:~# ip -br l show dev eth0
eth0 DOWN 0c:13:fa:95:00:00 <NO-CARRIER,BROADCAST,MULTICAST,UP>
root@root:~# ip r
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1 linkdown
10.10.20.0/24 via 10.10.10.2 dev eth0 metric 16777216 linkdown
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
root@root:~# vtysh -c "sh ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:48:31
C>* 10.10.10.0/24 is directly connected, eth0, 00:48:31
K>* 10.10.20.0/24 [1/0] via 10.10.10.2, eth0, 00:48:31
C>* 192.168.56.0/24 is directly connected, eth3, 00:48:31
bring link back up
root@root:~# ip -br l show dev eth0
eth0 UP 0c:13:fa:95:00:00 <BROADCAST,MULTICAST,UP,LOWER_UP>
root@root:~# ip r
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1
10.10.20.0/24 via 10.10.10.2 dev eth0 metric 16777216
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
root@root:~# vtysh -c "sh ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:52:15
C>* 10.10.10.0/24 is directly connected, eth0, 00:52:15
K>* 10.10.20.0/24 [1/0] via 10.10.10.2, eth0, 00:52:15
C>* 192.168.56.0/24 is directly connected, eth3, 00:52:15
administratively shutdown interface
root@root:~# ip -br l set dev eth0 down
root@root:~# ip r
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
root@root:~# vtysh -c "sh ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:53:34
C>* 192.168.56.0/24 is directly connected, eth3, 00:53:34
administratively enable interface
root@root:~# ip -br l set dev eth0 up
root@root:~# ip r
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
root@root:~# vtysh -c "sh ip route"
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
f - OpenFabric,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
t - trapped, o - offload failure
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:53:49
C>* 10.10.10.0/24 is directly connected, eth0, 00:00:10
C>* 192.168.56.0/24 is directly connected, eth3, 00:53:49
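In the link down step of the scenario above, the kernel keeps the routes on eth0 and appends a linkdown flag instead of deleting them. A small Python sketch for pulling those prefixes out of `ip r` output follows. The function name is hypothetical and the parsing assumes the plain one-route-per-line format shown in the transcripts; the sample input is the `ip r` output from that step.

```python
from typing import List


def linkdown_routes(ip_route_output: str) -> List[str]:
    """Return the prefixes of routes that `ip r` flags as linkdown."""
    routes = []
    for line in ip_route_output.splitlines():
        fields = line.split()
        # The first field is the prefix (or "default"); flags follow it.
        if fields and "linkdown" in fields[1:]:
            routes.append(fields[0])
    return routes


# `ip r` output copied from the link down step of the test scenario.
IP_ROUTE_LINK_DOWN = """\
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1 linkdown
10.10.20.0/24 via 10.10.10.2 dev eth0 metric 16777216 linkdown
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
"""

print(linkdown_routes(IP_ROUTE_LINK_DOWN))
```

Both the connected route and the route via 10.10.10.2 are reported, which matches the two entries that stay in the kernel table during link down.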
Continuous Integration Result: FAILED
See below for issues. CI System Testrun URL: https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/
This is a comment from an automated CI system. For questions and feedback in regards to this CI system, please feel free to email Martin Winter - mwinter (at) opensourcerouting.org.
Get source / Pull Request: Successful
Building Stage: Successful
Basic Tests: Failed
Topotests Ubuntu 18.04 i386 part 5: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5U18I386-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 i386 part 5 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO5U18I386/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 i386 part 9: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18I386-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 i386 part 9 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO9U18I386/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 arm8 part 8: Failed (click for details)
Topotests Ubuntu 18.04 arm8 part 8: No useful log found
Topotests Ubuntu 18.04 i386 part 8: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO8U18I386-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 i386 part 8 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO8U18I386/ErrorLog/log_topotests.txt
Topotests debian 10 amd64 part 8: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO8DEB10AMD64-7480/test
Topology Tests failed for Topotests debian 10 amd64 part 8 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO8DEB10AMD64/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 arm8 part 9: Failed (click for details)
Topotests Ubuntu 18.04 arm8 part 9: No useful log found
Topotests debian 10 amd64 part 9: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9DEB10AMD64-7480/test
Topology Tests failed for Topotests debian 10 amd64 part 9 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO9DEB10AMD64/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 amd64 part 8: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO8U18ARM64-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 8 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO8U18ARM64/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 amd64 part 9: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO9U18AMD64-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 9 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO9U18AMD64/ErrorLog/log_topotests.txt
Topotests Ubuntu 18.04 arm8 part 5: Failed (click for details)
Topotests Ubuntu 18.04 arm8 part 5: No useful log found
Topotests Ubuntu 18.04 amd64 part 5: Failed (click for details)
Topology Test Results are at https://ci1.netdef.org/browse/FRR-PULLREQ2-TOPO5U18AMD64-7480/test
Topology Tests failed for Topotests Ubuntu 18.04 amd64 part 5 see full log at https://ci1.netdef.org/browse/FRR-PULLREQ2-7480/artifact/TOPO5U18AMD64/ErrorLog/log_topotests.txt
Successful on other platforms/tests
- Topotests debian 10 amd64 part 6
- Topotests debian 10 amd64 part 1
- Topotests Ubuntu 18.04 arm8 part 1
- Topotests Ubuntu 18.04 amd64 part 7
- Fedora 29 rpm pkg check
- Addresssanitizer topotests part 3
- Topotests Ubuntu 18.04 i386 part 0
- Topotests Ubuntu 18.04 amd64 part 4
- CentOS 7 rpm pkg check
- Addresssanitizer topotests part 2
- Topotests Ubuntu 18.04 amd64 part 0
- Topotests debian 10 amd64 part 7
- Debian 9 deb pkg check
- Addresssanitizer topotests part 8
- Topotests debian 10 amd64 part 5
- Topotests Ubuntu 18.04 i386 part 7
- Topotests Ubuntu 18.04 amd64 part 6
- Topotests Ubuntu 18.04 i386 part 2
- Topotests Ubuntu 18.04 arm8 part 6
- Addresssanitizer topotests part 6
- Topotests Ubuntu 18.04 amd64 part 1
- Ubuntu 18.04 deb pkg check
- Topotests Ubuntu 18.04 amd64 part 2
- Addresssanitizer topotests part 5
- Addresssanitizer topotests part 4
- Topotests Ubuntu 18.04 i386 part 3
- Addresssanitizer topotests part 0
- Topotests Ubuntu 18.04 arm8 part 4
- IPv6 protocols on Ubuntu 18.04
- Topotests debian 10 amd64 part 4
- Topotests debian 10 amd64 part 3
- Topotests Ubuntu 18.04 arm8 part 3
- Topotests Ubuntu 18.04 amd64 part 3
- Addresssanitizer topotests part 1
- Topotests Ubuntu 18.04 i386 part 4
- Topotests Ubuntu 18.04 arm8 part 7
- Addresssanitizer topotests part 9
- IPv4 protocols on Ubuntu 18.04
- Static analyzer (clang)
- Topotests Ubuntu 18.04 arm8 part 0
- Topotests debian 10 amd64 part 0
- IPv4 ldp protocol on Ubuntu 18.04
- Ubuntu 16.04 deb pkg check
- Topotests Ubuntu 18.04 arm8 part 2
- Topotests debian 10 amd64 part 2
- Debian 10 deb pkg check
- Topotests Ubuntu 18.04 i386 part 1
- Addresssanitizer topotests part 7
- Topotests Ubuntu 18.04 i386 part 6
- Ubuntu 20.04 deb pkg check
Warnings Generated during build:
Checkout code: Successful with additional warnings
Report for interface.c | 8 issues
===============================================
< WARNING: Block comments use * on subsequent lines
< #1073: FILE: /tmp/f1-8173/interface.c:1073:
< WARNING: Block comments use a trailing */ on a separate line
< #1073: FILE: /tmp/f1-8173/interface.c:1073:
< WARNING: Block comments use * on subsequent lines
< #1106: FILE: /tmp/f1-8173/interface.c:1106:
< WARNING: Block comments use a trailing */ on a separate line
< #1106: FILE: /tmp/f1-8173/interface.c:1106:
@yar-fed -> I believe I fixed this issue already. Can you please try to reproduce the problem on latest master without your code to see if it still exists?
If the issue still exists, can you give me a sequence of commands that shows the problem so that I can better understand what I am missing?
@donaldsharp Thanks, I tested on latest master and the removal of kernel routes is fixed. But:
- connected routes are still being removed from the RIB, while they remain in the kernel table with the linkdown flag.
- there is a new, separate bug: those linkdown kernel routes are not removed when the IP address is deleted from the interface with "ip address del".
Also, can you link the PR (or PRs) that addressed the original issue (or maybe there is already a backport to 8.1 or 8.3)?
I believe the connected routes come back on link up, correct? And for #2, can you show me a series of commands that shows the issue?
@donaldsharp Can you please point to the commit that fixes this behaviour? I use FRR 8.4.2 and have some reasons not to use the latest master. If I understand correctly, this PR is not quite correct?
In my case I receive BGP routes from my neighbor. Then I delete the IP address from the interface and configure it again with the same address. The interface stays in the up state. After that I do not see the routes from my neighbor in "ip r", but I do see them when I run "show ip route" in vtysh.
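A mismatch like the one described here (routes present in "show ip route" but absent from "ip r") can be checked mechanically. The following Python sketch diffs the two listings; it is a rough illustration under several assumptions. The function names are hypothetical, only RIB entries marked with `*` (FIB route, per the legend) are expected in the kernel, and host routes that `ip r` prints without a prefix length are not normalized. The sample inputs reuse listings from the PR description (the RIB from the starting configuration and the kernel table from the last step) purely to have a difference to show; they do not reproduce the BGP case.

```python
from typing import Set


def kernel_prefixes(ip_route_output: str) -> Set[str]:
    """Collect prefixes from `ip r` output, mapping "default" to 0.0.0.0/0."""
    prefixes = set()
    for line in ip_route_output.splitlines():
        fields = line.split()
        if fields:
            prefixes.add("0.0.0.0/0" if fields[0] == "default" else fields[0])
    return prefixes


def rib_fib_prefixes(show_ip_route_output: str) -> Set[str]:
    """Collect prefixes of RIB entries marked `*` in `show ip route` output."""
    prefixes = set()
    for line in show_ip_route_output.splitlines():
        fields = line.split()
        # Route lines look like "K>* 10.10.20.0/24 [1/0] via ...";
        # legend lines have no "*" in the first field and are skipped.
        if len(fields) >= 2 and "*" in fields[0] and "/" in fields[1]:
            prefixes.add(fields[1])
    return prefixes


RIB = """\
Codes: K - kernel route, C - connected, S - static, R - RIP,
> - selected route, * - FIB route, q - queued, r - rejected, b - backup
K>* 0.0.0.0/0 [254/1] via 192.168.56.1, eth3, src 192.168.56.231, 00:46:56
C>* 10.10.10.0/24 is directly connected, eth0, 00:46:56
K>* 10.10.20.0/24 [1/0] via 10.10.10.2, eth0, 00:46:56
C>* 192.168.56.0/24 is directly connected, eth3, 00:46:56
"""

KERNEL = """\
default via 192.168.56.1 dev eth3 proto static src 192.168.56.231 metric 4261412865
10.10.10.0/24 dev eth0 proto kernel scope link src 10.10.10.1
192.168.56.0/24 dev eth3 proto kernel scope link src 192.168.56.231
"""

missing_in_kernel = sorted(rib_fib_prefixes(RIB) - kernel_prefixes(KERNEL))
print(missing_in_kernel)
```

Feeding it the live output of `vtysh -c "sh ip route"` and `ip r` after re-adding the address would list the prefixes zebra believes are installed but the kernel does not have.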