frr
frr copied to clipboard
bgpd: Revalidate locally originated routes also when RPKI state changes
If we have something like:
router bgp 65001
network 10.10.10.0/24 route-map rpki
!
route-map rpki permit 10
match rpki valid
set local-preference 150
route-map rpki permit 20
match rpki notfound
set local-preference 50
!
Then 10.10.10.0/24 is never revalidated when RPKI state changes. E.g. if it was valid, and moves to notfound => local-preference remains 150, but should be 50.
With this patch we force BGP network (static) routes to be revalidated when revalidation event is triggered from RPKI module.
Fixes https://github.com/FRRouting/frr/issues/16474
I know we have other places in the code where we have timers to regenerate data in BGP but I am personally not happy about that approach as that it adds unnecessary load to already loaded systems when there are a bunch of prefixes in the system. Is there a way we can have routes revalidated automatically? what is going on that we have to set a timer?
what is going on that we have to set a timer?
Routes are never revalidated actually (including static routes (network ...), redistributed routes, ...). When a callback is received from librtr, we validate per-prefix and per-neighbor (using soft inbound) only.
waiting on https://github.com/FRRouting/frr/issues/16474 to be tested to make certain it resolves this issue
https://github.com/FRRouting/frr/issues/16474 is fixed using this approach. @donaldsharp do you have any other solution or we can go with this?
this might need to be redesigned so it's not timer based ...
It's not really possible easily. RPKI cache server sends only changed prefixes. E.g. using StayRTR:
{
"roas": [
{
"prefix": "100.100.100.100/32",
"maxLength": 32,
"asn": "AS65001"
}
]
}
If I change to:
{
"roas": [
{
"prefix": "100.100.100.101/32",
"maxLength": 32,
"asn": "AS65001"
}
]
}
Then 100.100.100.101/32 will be sent as a prefix to be validated. If this route does not exist in the RIB, then no changes will be triggered and again we have a stale state.
With the timer, it's deterministic, we always know we have the right state eventually. It can be controlled with high timer values and does not need to be every 30 seconds.
UPDATE: Opened an issue, but I doubt it's even possible because it's more like up to the RPKI cache server to send those removed prefixes.
@donaldsharp are we still looking into another way to do this, or should we go with this commit for the moment?
I would really really prefer that we fix this problem the correct way the first time. I don't think we are in too much of a hurry since this is a day one problem with the rpki code.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Where are we on this? Still waiting on a better way to resolve it?
TBH, there is no other proper way of handling this other than scanning and revalidating. For the same reason, RPKI cache server daemons just do not send "removed" prefixes, and rtrlib is just a middleware here.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
@Mergifyio rebase
rebase
❌ Pull request can't be updated with latest base branch changes
Mergify needs the author permission to update the base branch of the pull request. @opensourcerouting needs to authorize modification on its head branch.
We need to figure out what to do with this ... do we merge it as is, or do we wait on something from the RPKI folks to resolve it?
Russ let's bring it up during today's meeting
Waiting on RPKI folks to answer whether this will be fixed on their end or not ...
I assume we're still waiting on the RPKI folks here?
Yeah...
Should we try to have a side meeting with the RPKI folks to get this one pushed?
Anything happening here?
Can someone from FRR please reopen and merge this? It's still a problem, at least in nightly VyOS builds.
It's possible to reopen, but we don't have an agreement here. @donaldsharp, what do you think if we do this with a separate configuration knob?