frr
frr copied to clipboard
bgpd - Exclude case for remote prefix w/o link-local #16198
This is another potential fix for #16198
- First we modify the original
(from == bgp->peer_self || peer->sort == BGP_PEER_EBGP)to an&&operator. - Then we create a second case for a remote_peer and check
IN6_IS_ADDR_LINKLOCAL(&attr->mp_nexthop_local)(e.g. link-local nh is actually included by remote peer.
In rfc2545 section 3 we have:
The link-local address shall be included in the Next Hop field if and only if the BGP speaker shares a common subnet with the entity identified by the global IPv6 address carried in the Network Address of Next Hop field and the peer the route is being advertised to. In all other cases a BGP speaker shall advertise to its peer in the Network Address field only the global IPv6 address of the next hop (the value of the Length of Network Address of Next Hop field shall be set to 16).
The code in FRR assumes that a remote prefix from a device on the same segment as an eBGP peer already has it's link-local next-hop field set and sets attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL_AND_LL. If this field is empty our own link-local will be added later in the update formation code HOWEVER the global next-hop is left unchanged.
This proposal expands the truth of (A OR B) to "(A AND B) OR (!A AND B) OR (A AND !B)" such that we can isolate (!A AND B) and add an extra test to ensure that original next-hop actually contains a link-local.
Fixes: https://github.com/FRRouting/frr/issues/16198
How does one include topology tests?
Some topo test are failing and I can see why this would break some other valid scenarios. Maybe the previous proposal to put this check behind PEER_FLAG_NEXTHOP_LOCAL_UNCHANGED is a better fix?
An alternative potential fix:
diff --git a/bgpd/bgp_route.c b/bgpd/bgp_route.c
index 94c21e1861..1ed9133742 100644
--- a/bgpd/bgp_route.c
+++ b/bgpd/bgp_route.c
@@ -2482,7 +2482,12 @@ bool subgroup_announce_check(struct bgp_dest *dest, struct bgp_path_info *pi,
*/
if (NEXTHOP_IS_V6) {
attr->mp_nexthop_len = BGP_ATTR_NHLEN_IPV6_GLOBAL;
- if ((CHECK_FLAG(peer->af_flags[afi][safi],
+ if (CHECK_FLAG(peer->af_flags[afi][safi],
+ PEER_FLAG_NEXTHOP_LOCAL_UNCHANGED)
+ && !IN6_IS_ADDR_LINKLOCAL(&attr->mp_nexthop_local)) {
+ if (bgp_debug_update(NULL, p, subgrp->update_group, 0))
+ zlog_debug("No link-local next-hop. Ignoring other checks");
+ } else if ((CHECK_FLAG(peer->af_flags[afi][safi],
PEER_FLAG_NEXTHOP_LOCAL_UNCHANGED)
&& IN6_IS_ADDR_LINKLOCAL(&attr->mp_nexthop_local))
|| (!reflect && !transparent
/cc @ton31337, @mjstapp, @riw777 - I really don't think I have the skills and knowledge to fix this and I am struggling to get resources to run tests locally. Would anyone be willing to guide me and help get a resolution for us?
I dug into the failing test some more and I believe that this code breaks that valid scenario, where the destination_peer is EBGP and we are actually replacing the next-hop with the outgoing interface global IPv6, so adding our own link-link is valid.
I am finding it difficult to isolate our specific case which is not represented in these topology tests. In our case the prefix originator is also on the same network as the EBGP peer and so our DUT is NOT replacing the global next-hop, but that decision (not to update the global next-hop) is made in a different part of the code :-/
I think the test needs to be modified to support this change ... :-(
Thanks @riw777 . Do you already ready have an idea how those test changes would look like? Happy to add them to get this PR in but I don’t have the resources to build a test environment currently.
@riw777 - After a rebase to pick up the latest test refactors looks like all tests are now passing. Is this now good to go?
Hey @riw777 - as the tests are passing now, can we kindly ask you to recheck the code and give us a stamp, please? We have a blocker using FRR w/ IPv6 now for a while that is soon reaching the deadline. So your help would be more than appreciated. Thank you!
This PR now removes the link-local in cases it was valid.
Why doesn't this pull-request have any topotests? I cannot reproduce the issue that was described in the initial issue.
About the change:
((from == bgp->peer_self && peer->sort == BGP_PEER_EBGP) ||
(from == bgp->peer_self && peer->sort != BGP_PEER_EBGP) ||
(from != bgp->peer_self &&
IN6_IS_ADDR_LINKLOCAL(&attr->mp_nexthop_local) &&
peer->sort == BGP_PEER_EBGP))
is actually:
((from == bgp->peer_self) ||
(IN6_IS_ADDR_LINKLOCAL(&attr->mp_nexthop_local) &&
peer->sort == BGP_PEER_EBGP))
@cunningr @anaperic could you provide the failing configuration ? And a topotests to check the issue
If you use a BGPClient other than FRR (ExaBGP, BIRD) and advertise a locally originated IPv6 prefix, it will NOT add the link-local, even though the global NH is on the same network segment. Then FRR, when onward advertising the IPv6 prefix to a peer also on the same network segment, FRR will now insert it's own link-local NH.
FRR
|
BGPClient --- | --- Cisco
So now the "Cisco" router gets a prefix with the Global NH of the BGPClient (OK) but a link-local NH of the FRR (NOK). If the link-local is present the link-local is preferred and connectivity breaks (since our FRR is not forwarding traffic). Since the link-local is not present when advertised by the BGPClient, we expect FRR to NOT insert it's own in this case.
@cunningr I cannot reproduce the issue with the described use case. Please provide more info to reproduce.
I tried to do the reproduce the use case in the https://github.com/FRRouting/frr/pull/17037 topotest