
Clarify AFTS state-synced

Open dplore opened this issue 8 months ago • 4 comments

https://github.com/openconfig/public/blob/b2258aa30dcf939144c094dacd8da239fc090855/release/models/aft/openconfig-aft.yang#L288-L304

The description should be clarified to state that this leaf indicates the device's internal AFT consistency. Examples should be given, such as whether the RIB is corrupt or known to be incomplete due to device bootup, internal processes starting or failing to start, or other internal issues known to cause the RIB to be incomplete. The state-synced container does not reflect the status of device-to-client streaming state, which may be handled by network management protocols such as gNMI's sync_response.

Other examples of events that are, and are not, expected to affect state-synced should be added, such as:

  • Expected to affect state-synced
    • Device boot
    • When CONTROLLER_CARD switchover occurs
  • Not expected to affect state-synced
    • BGP session flaps
    • Interface state change (up/down)
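
As a rough illustration of the classification above, the expected behaviour can be written as a toy model. This is only a sketch: the `Event` enum, the `RESETS_STATE_SYNCED` set, and `state_synced_after` are hypothetical names invented for this example, and the model covers only the four events listed in this issue, not actual device behaviour.

```python
from enum import Enum, auto


class Event(Enum):
    """Events named in this issue (hypothetical enum for illustration)."""
    DEVICE_BOOT = auto()
    CONTROLLER_CARD_SWITCHOVER = auto()
    BGP_SESSION_FLAP = auto()
    INTERFACE_STATE_CHANGE = auto()


# Assumption taken from the list above: only these events are expected
# to make state-synced go false until the device finishes resyncing.
RESETS_STATE_SYNCED = {Event.DEVICE_BOOT, Event.CONTROLLER_CARD_SWITCHOVER}


def state_synced_after(current: bool, event: Event) -> bool:
    """Return the modelled leaf value immediately after `event`."""
    return False if event in RESETS_STATE_SYNCED else current
```

In this model a BGP session flap or an interface state change leaves the leaf at whatever value it already had, while a boot or switchover always resets it to false.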

dplore avatar May 09 '25 17:05 dplore

Here is a potential suggestion: "This leaf indicates the device's internal AFT data consistency. 'True' means the device has completed full RIB-FIB reconciliation: routing protocols have converged, the RIB has programmed the FIB, and the FIB accurately reflects the RIB. This signals operational readiness and establishes an initial, consistent RIB-FIB state after disruptive events (e.g., boot or controller-card switchover). While the RIB and FIB are dynamic due to churn, this flag denotes completion of the initial RIB-FIB synchronization after such an event. This is a device-centric flag that reflects only internal forwarding integrity, not per-client state (gNMI's sync_response handles streaming state)."
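
To make the distinction between the two signals concrete, here is a minimal client-side sketch that tracks them separately. It makes several assumptions: the `AftReadiness` class and its method names are hypothetical, updates are modelled as plain `(path, value)` pairs rather than real gNMI messages, and the leaf names `ipv4-unicast` and `ipv6-unicast` under `afts/state-synced/state` should be checked against the openconfig-aft model revision in use.

```python
from dataclasses import dataclass, field

# Assumed path fragment for the state-synced leaves; verify against the model.
STATE_SYNCED_FRAGMENT = "/afts/state-synced/state/"


@dataclass
class AftReadiness:
    """Tracks per-client stream sync and device-side AFT sync separately."""
    stream_synced: bool = False  # set once gNMI sync_response is seen
    device_synced: dict = field(
        default_factory=lambda: {"ipv4-unicast": False, "ipv6-unicast": False}
    )

    def on_sync_response(self) -> None:
        # Per-client signal: the initial set of updates has been streamed.
        self.stream_synced = True

    def on_update(self, path: str, value: object) -> None:
        # Device-centric signal: record the state-synced leaves only.
        if STATE_SYNCED_FRAGMENT in path:
            leaf = path.rsplit("/", 1)[-1]
            if leaf in self.device_synced:
                self.device_synced[leaf] = bool(value)

    def ready(self, family: str) -> bool:
        # Both signals are needed before the streamed AFT can be trusted.
        return self.stream_synced and self.device_synced[family]
```

Under these assumptions, a client that has received sync_response but has not yet seen `ipv4-unicast` reported as true would still treat the IPv4 AFT as not ready.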

masood-shah avatar May 09 '25 19:05 masood-shah

Discussed in Operators meeting 2025-06-03:

  • Routing reconvergence should not impact this leaf.
  • The original discussion was around the case where a supervisor/routing-engine/control-plane restarts and has not yet populated its database of the entries that are already in hardware. There was an original document that covered this case, which should probably be shared and discussed further.

robshakir avatar Jun 03 '25 17:06 robshakir

Discussed in the meeting (June 17th, 2025). There is still some confusion about the exact use case here.

I suggested that we try to codify our requirements into a functional test, though @dplore thought that there would not be a good way to "force" the leaf to become false, so confirming that it is implemented properly could be difficult.

Could we review how it is used by system owners to get a better description of the use-case?

ElodinLaarz avatar Jun 17 '25 16:06 ElodinLaarz

Discussed in the OC Operators meeting on Oct 07: moving out of Ready-to-discuss. @masood-shah, can you (or someone else you find) add some info here on the intended use case of the leaf?

ElodinLaarz avatar Oct 07 '25 16:10 ElodinLaarz