frr icon indicating copy to clipboard operation
frr copied to clipboard

bgpd: cleanup for path_info extra

Open louis-6wind opened this issue 1 year ago • 3 comments

The handling of MPLS labels in BGP faces an issue due to the way labels are stored in memory. They are stored in bgp_path_info but not in bgp_adj_in and bgp_adj_out structures. As a consequence, some configuration changes result in losing labels or even a bgpd crash. For example, when retrieving routes from the Adj-RIB-in table ("soft-reconfiguration inbound" enabled), labels are missing.

bgp_path_info stores the MPLS labels, as shown below:

struct bgp_path_info { struct bgp_path_info_extra *extra; [...] struct bgp_path_info_extra { mpls_label_t label[BGP_MAX_LABELS]; uint32_t num_labels; [...]

Since the bgp_path_info, bgp_adj_in and bgp_adj_out structures share a common reference to an attr structure pointer, a move of labels from the extra to the attr structure would be a solution to fix those issues.

However, an issue in the code prevents us from doing that without a preliminary rework. The extra->num_labels field, which is intended to indicate the number of labels in extra->label[], is not reliably checked or set. The code often incorrectly assumes that if the extra pointer is present, then a label must also be present, leading to direct access to extra->label[] without verifying extra->num_labels. This assumption usually works because extra->label[0] is set to MPLS_INVALID_LABEL when a new bgp_path_info_extra is created, but it is technically incorrect.

The attr field, in contrast to extra, is always present in bgp_path_info so it is not possible to condition the presence of labels to the validity of the attr pointer. Furthermore, the attr structure is usually declared on the stack with all fields to 0 but 0 is a valid label. Direct access to attr labels without any checks would not be possible.

Cleanup the label code by setting num_labels each time values are set in extra->label[] and checking extra->num_labels before accessing the labels.

louis-6wind avatar Feb 05 '24 16:02 louis-6wind

ci:rerun

louis-6wind avatar Feb 06 '24 09:02 louis-6wind

ci:rerun

louis-6wind avatar Feb 12 '24 09:02 louis-6wind

I've just added the reason for the rework in the first commit log as requested in the last meeting

louis-6wind avatar Feb 15 '24 09:02 louis-6wind

@donaldsharp is concerned about the information passed to route map processing; need to think about this more before merging

riw777 avatar Feb 20 '24 16:02 riw777

@donaldsharp is concerned about the information passed to route map processing; need to think about this more before merging

I think we can do the cleanup rework even we don't move labels from extra to attr. There is definitively something wrong in the way label[] is accessed.

louis-6wind avatar Feb 20 '24 16:02 louis-6wind

To write down my problem:

Currently FRR bgp sends updates to peers where we track the attribute + the list of prefixes that use that attribute. This allows FRR to pack data via nlri very efficiently to peers. You can get 100's-1000's of prefixes per bgp update packet because of this. With this change it's possible to have a label per prefix, thus generating a attribute per prefix. BGP memory allocation would take a huge hit and in addition packing data to the peers would take a huge hit as well. I am not comfortable with this approach of moving the label information into the attribute.

donaldsharp avatar Feb 20 '24 16:02 donaldsharp

To write down my problem:

Currently FRR bgp sends updates to peers where we track the attribute + the list of prefixes that use that attribute. This allows FRR to pack data via nlri very efficiently to peers. You can get 100's-1000's of prefixes per bgp update packet because of this. With this change it's possible to have a label per prefix, thus generating a attribute per prefix. BGP memory allocation would take a huge hit and in addition packing data to the peers would take a huge hit as well. I am not comfortable with this approach of moving the label information into the attribute.

I agree with you @donaldsharp.

The alternative approach to solve the following issues would be the solution from https://github.com/FRRouting/frr/pull/15205

The handling of MPLS labels in BGP faces an issue due to the way labels are stored in memory. They are stored in bgp_path_info but not in bgp_adj_in and bgp_adj_out structures. As a consequence, some configuration changes result in losing labels or even a bgpd crash. For example, when retrieving routes from the Adj-RIB-in table ("soft-reconfiguration inbound" enabled), labels are missing.

https://github.com/FRRouting/frr/pull/15205 duplicates the labels to bgp_adj_in and bgp_adj_out structures. A better approach would be to reference the extra pointer in bgp_adj_in and bgp_adj_out structures. What do you think of that ?

If labels are kept in extra, it seems the cleanup of this PR is still needed.

louis-6wind avatar Feb 22 '24 16:02 louis-6wind

@donaldsharp In the new https://github.com/FRRouting/frr/pull/15434 pull-request, the approach is now to store labels in a new bgp_labels structure. Reference bgp_labels pointer in bgp_adj_in and bgp_adj_out structures in addition to the bgp_path_info_extra structure and to use a bgp_label hash list to save memory.

louis-6wind avatar Feb 27 '24 09:02 louis-6wind

@donaldsharp In the new #15434 pull-request, the approach is now to store labels in a new bgp_labels structure. Reference bgp_labels pointer in bgp_adj_in and bgp_adj_out structures in addition to the bgp_path_info_extra structure and to use a bgp_label hash list to save memory.

Should we close this one?

riw777 avatar Feb 27 '24 15:02 riw777

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions[bot] avatar Feb 27 '24 15:02 github-actions[bot]

replaced by #15434

riw777 avatar Feb 27 '24 16:02 riw777