frr
frr copied to clipboard
bgpd: cleanup for path_info extra
The handling of MPLS labels in BGP faces an issue due to the way labels are stored in memory. They are stored in bgp_path_info but not in bgp_adj_in and bgp_adj_out structures. As a consequence, some configuration changes result in losing labels or even a bgpd crash. For example, when retrieving routes from the Adj-RIB-in table ("soft-reconfiguration inbound" enabled), labels are missing.
bgp_path_info stores the MPLS labels, as shown below:
struct bgp_path_info { struct bgp_path_info_extra *extra; [...] struct bgp_path_info_extra { mpls_label_t label[BGP_MAX_LABELS]; uint32_t num_labels; [...]
Since the bgp_path_info, bgp_adj_in and bgp_adj_out structures share a common reference to an attr structure pointer, a move of labels from the extra to the attr structure would be a solution to fix those issues.
However, an issue in the code prevents us from doing that without a preliminary rework. The extra->num_labels field, which is intended to indicate the number of labels in extra->label[], is not reliably checked or set. The code often incorrectly assumes that if the extra pointer is present, then a label must also be present, leading to direct access to extra->label[] without verifying extra->num_labels. This assumption usually works because extra->label[0] is set to MPLS_INVALID_LABEL when a new bgp_path_info_extra is created, but it is technically incorrect.
The attr field, in contrast to extra, is always present in bgp_path_info so it is not possible to condition the presence of labels to the validity of the attr pointer. Furthermore, the attr structure is usually declared on the stack with all fields to 0 but 0 is a valid label. Direct access to attr labels without any checks would not be possible.
Cleanup the label code by setting num_labels each time values are set in extra->label[] and checking extra->num_labels before accessing the labels.
ci:rerun
ci:rerun
I've just added the reason for the rework in the first commit log as requested in the last meeting
@donaldsharp is concerned about the information passed to route map processing; need to think about this more before merging
@donaldsharp is concerned about the information passed to route map processing; need to think about this more before merging
I think we can do the cleanup rework even we don't move labels from extra to attr. There is definitively something wrong in the way label[] is accessed.
To write down my problem:
Currently FRR bgp sends updates to peers where we track the attribute + the list of prefixes that use that attribute. This allows FRR to pack data via nlri very efficiently to peers. You can get 100's-1000's of prefixes per bgp update packet because of this. With this change it's possible to have a label per prefix, thus generating a attribute per prefix. BGP memory allocation would take a huge hit and in addition packing data to the peers would take a huge hit as well. I am not comfortable with this approach of moving the label information into the attribute.
To write down my problem:
Currently FRR bgp sends updates to peers where we track the attribute + the list of prefixes that use that attribute. This allows FRR to pack data via nlri very efficiently to peers. You can get 100's-1000's of prefixes per bgp update packet because of this. With this change it's possible to have a label per prefix, thus generating a attribute per prefix. BGP memory allocation would take a huge hit and in addition packing data to the peers would take a huge hit as well. I am not comfortable with this approach of moving the label information into the attribute.
I agree with you @donaldsharp.
The alternative approach to solve the following issues would be the solution from https://github.com/FRRouting/frr/pull/15205
The handling of MPLS labels in BGP faces an issue due to the way labels are stored in memory. They are stored in bgp_path_info but not in bgp_adj_in and bgp_adj_out structures. As a consequence, some configuration changes result in losing labels or even a bgpd crash. For example, when retrieving routes from the Adj-RIB-in table ("soft-reconfiguration inbound" enabled), labels are missing.
https://github.com/FRRouting/frr/pull/15205 duplicates the labels to bgp_adj_in and bgp_adj_out structures. A better approach would be to reference the extra pointer in bgp_adj_in and bgp_adj_out structures. What do you think of that ?
If labels are kept in extra, it seems the cleanup of this PR is still needed.
@donaldsharp In the new https://github.com/FRRouting/frr/pull/15434 pull-request, the approach is now to store labels in a new bgp_labels structure. Reference bgp_labels pointer in bgp_adj_in
and bgp_adj_out
structures in addition to the
bgp_path_info_extra
structure and to use a bgp_label hash list to save memory.
@donaldsharp In the new #15434 pull-request, the approach is now to store labels in a new bgp_labels structure. Reference bgp_labels pointer in
bgp_adj_in
andbgp_adj_out
structures in addition to thebgp_path_info_extra
structure and to use a bgp_label hash list to save memory.
Should we close this one?
This pull request has conflicts, please resolve those before we can evaluate the pull request.
replaced by #15434