Native BFD support
Hi GoBGP team,
Thanks for your great progress on developing GoBGP.
Do you have any plans to implement BFD natively in GoBGP? Single-hop and multihop sessions would be quite useful over less reliable links, because BGP on its own cannot provide fast (millisecond-scale) failure detection.
If someone sends a pull request for it, I would be happy to merge.
I have been working on this issue for a bit, here is what I have so far.
As of writing, I have basic BFD support: I can establish sessions with FRR over a single-hop connection, but there is no echo, multihop, or authentication support yet (all of which are optional, so this is technically a minimal implementation of the spec).
I extended the gRPC API and hooked everything up on the server side. As a demonstration, I have added two flags, in the global and neighbor add|update commands, to enable BFD with default settings. However, I am unsure what the best way is to expose all of the BFD configuration and state information in the CLI. I would love some feedback on this, @fujita, if you can spare the time.
For the settings, the naive approach would be to add all settings to the existing param list:
usage: gobgp neighbor add [ <neighbor-address> | interface <neighbor-interface> ] as <VALUE> [ local-as <VALUE> | family <address-families-list> | vrf <vrf-name> | route-reflector-client [<cluster-id>] | route-server-client | allow-own-as <num> | remove-private-as (all|replace) | replace-peer-as | ebgp-multihop-ttl <ttl> | enable-bfd | bfd-passive | bfd-detect-multi <1-255> | bfd-tx-interval <ms> | bfd-rx-interval <ms> | bfd-demand | bfd-echo | bfd-echo-tx-interval <ms> | bfd-echo-rx-interval <ms>]
Given the number of settings, this seems like it would hurt the user experience.
A second approach could be to define a separate BFD command.
Here too, I can go two ways. The first would be to require users to define and use a "profile".
usage: gobgp bfd profile add <name> [passive | detect-multi <1-255> | tx-interval <ms> | rx-interval <ms> | demand | echo | echo-tx-interval <ms> | echo-rx-interval <ms>]
Then, when enabling BFD for a neighbor, a user can specify a profile if they want a non-default config.
usage: gobgp neighbor add [ <neighbor-address> | interface <neighbor-interface> ] as <VALUE> [ local-as <VALUE> | family <address-families-list> | vrf <vrf-name> | route-reflector-client [<cluster-id>] | route-server-client | allow-own-as <num> | remove-private-as (all|replace) | replace-peer-as | ebgp-multihop-ttl <ttl> | enable-bfd | bfd-profile <name> ]
Alternatively, we could decouple BFD sessions completely from the BGP neighbors and let users optionally link a given BFD session to a BGP neighbor.
usage: gobgp bfd session add <name> <remote> [multihop | max-ttl <1-254> | passive | detect-multi <1-255> | tx-interval <ms> | rx-interval <ms> | demand | echo | echo-tx-interval <ms> | echo-rx-interval <ms>]
usage: gobgp neighbor add [ <neighbor-address> | interface <neighbor-interface> ] as <VALUE> [ local-as <VALUE> | family <address-families-list> | vrf <vrf-name> | route-reflector-client [<cluster-id>] | route-server-client | allow-own-as <num> | remove-private-as (all|replace) | replace-peer-as | ebgp-multihop-ttl <ttl> | bfd-session <name> ]
I would really appreciate some thoughts on this.
@dylandreimerink thanks for the update. I personally like the global BFD profile definitions, with BFD enabled per BGP session and optional timer definitions at the peer level superseding the global profile.
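A minimal sketch of how peer-level settings could supersede a global profile, assuming a simple "override if set" rule. The types, field names and values below are hypothetical and only illustrate the idea; they are not GoBGP's API or its defaults.

```go
package main

import "fmt"

// Profile is a hypothetical global BFD profile.
type Profile struct {
	DetectMulti  uint8
	TxIntervalMs uint32
	RxIntervalMs uint32
	Passive      bool
}

// PeerOverrides holds optional per-neighbor settings.
// A nil pointer means "inherit the value from the profile".
type PeerOverrides struct {
	DetectMulti  *uint8
	TxIntervalMs *uint32
	RxIntervalMs *uint32
}

// Resolve returns the effective settings for one neighbor:
// the profile values, with any peer-level value taking precedence.
func Resolve(base Profile, o PeerOverrides) Profile {
	out := base
	if o.DetectMulti != nil {
		out.DetectMulti = *o.DetectMulti
	}
	if o.TxIntervalMs != nil {
		out.TxIntervalMs = *o.TxIntervalMs
	}
	if o.RxIntervalMs != nil {
		out.RxIntervalMs = *o.RxIntervalMs
	}
	return out
}

func main() {
	// Illustrative values, not defaults from any implementation.
	profile := Profile{DetectMulti: 3, TxIntervalMs: 300, RxIntervalMs: 300}
	tx := uint32(100)
	effective := Resolve(profile, PeerOverrides{TxIntervalMs: &tx})
	fmt.Printf("%+v\n", effective)
}
```

Using pointers for the overrides keeps "not set" distinct from a zero value, so a neighbor only needs to state the settings it wants to change.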
@dylandreimerink I can't open the PR for some reason. I'd be happy to contribute as needed to this PR... I like the second approach as well.
Out of curiosity, why is integrating BFD into GoBGP preferable? Wouldn't a separate BFD daemon that manages GoBGP via the gRPC APIs be simpler?
Wouldn't a separate BFD daemon that manages GoBGP via the gRPC APIs be simpler?
Yes, for a few reasons. BFD is all about the actual forwarding, so if you are running a route server, you want the actual BFD daemon to run on the devices that do the actual forwarding, not the control plane. If the BFD daemons detect trouble they contact the BGP daemon to perform some action.
Since BFD is routing-protocol agnostic, it can also work with non-BGP protocols like IS-IS and OSPF, or even with things like load balancers.
It's for these two reasons that I decided not to implement BFD in GoBGP directly.
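A rough sketch of the separate-daemon arrangement described above, where BFD state changes are translated into actions on the BGP daemon. Everything here is hypothetical: the PeerController interface stands in for a thin wrapper around GoBGP's gRPC client, and disabling or enabling the peer is just one possible action, chosen for illustration.

```go
package main

import "fmt"

// State is a simplified BFD session state for this sketch.
type State int

const (
	Down State = iota
	Up
)

// Event is a hypothetical notification emitted by the BFD daemon.
type Event struct {
	Peer  string
	State State
}

// PeerController is an assumed abstraction over the BGP daemon.
// A real implementation would call the BGP daemon's management API.
type PeerController interface {
	DisablePeer(addr string) error
	EnablePeer(addr string) error
}

// HandleEvents applies one action per BFD event until the channel closes.
func HandleEvents(events <-chan Event, ctl PeerController) error {
	for ev := range events {
		var err error
		switch ev.State {
		case Down:
			err = ctl.DisablePeer(ev.Peer)
		case Up:
			err = ctl.EnablePeer(ev.Peer)
		}
		if err != nil {
			return fmt.Errorf("peer %s: %w", ev.Peer, err)
		}
	}
	return nil
}

// recorder is a stand-in controller that only records what was requested.
type recorder struct{ calls []string }

func (r *recorder) DisablePeer(addr string) error {
	r.calls = append(r.calls, "disable "+addr)
	return nil
}

func (r *recorder) EnablePeer(addr string) error {
	r.calls = append(r.calls, "enable "+addr)
	return nil
}

func main() {
	events := make(chan Event, 2)
	events <- Event{Peer: "192.0.2.1", State: Down}
	events <- Event{Peer: "192.0.2.1", State: Up}
	close(events)

	ctl := &recorder{}
	if err := HandleEvents(events, ctl); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ctl.calls)
}
```

Keeping the BGP side behind an interface means the same BFD daemon could drive other consumers, such as an IGP or a load balancer, by swapping the controller.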
Yeah, makes sense. In the past, I implemented BFD in the same way. If your BFD implementation needs a new gRPC API to manage GoBGP, feel free to make a pull request.