public icon indicating copy to clipboard operation
public copied to clipboard

Adding support for operational state representing interface rates.

Open hyun-arista opened this issue 2 years ago • 7 comments

Change Scope

  • Adding support for operational state representing interface rates under /interfaces/interface. Given an interface, this operational state will expose ingress/egress rates in units of bits-per-second (bps) and packets-per-second (pps).
  • This change is backwards compatible.

pyang output:

module: openconfig-interfaces
  +--rw interfaces
     +--rw interface* [name]
        +--rw oc-if-rates:rates
           +--ro oc-if-rates:state
              +--ro oc-if-rates:out-bits-rate?   oc-types:ieeefloat32
              +--ro oc-if-rates:in-bits-rate?    oc-types:ieeefloat32
              +--ro oc-if-rates:out-pkts-rate?   oc-types:ieeefloat32
              +--ro oc-if-rates:in-pkts-rate?    oc-types:ieeefloat32

Platform Implementations

hyun-arista avatar Dec 05 '23 19:12 hyun-arista

No major YANG version changes in commit 48acc8e1eaed5ee6db4c6464fe1f9469e32fdcdb

OpenConfigBot avatar Dec 05 '23 19:12 OpenConfigBot

Should there also be an refresh interval which is configurable?

(example ref: Cisco IOS XR

dplore avatar Dec 12 '23 21:12 dplore

Perhaps the type should be uint64 (gauge64?) or uint32.

LimeHat avatar Dec 12 '23 21:12 LimeHat

@hyun-arista any response for us?

dplore avatar Mar 19 '24 16:03 dplore

Sorry for the delay on the response; lost track this.

Regarding the load-interval: Yes, having a load-interval to be configurable with a range of say 0-600 seconds with default of 300 seconds sounds reasonable.

Regarding changing the rate value type from float to int: I am not sure about this; I don't have a concrete example but wouldn't it be better to have it as accurate as possible? We can leave it up to the end-user to truncate the fractional part if needed.

hyun-arista avatar Mar 29 '24 00:03 hyun-arista

Sorry for the delay on the response; lost track this.

Regarding the load-interval: Yes, having a load-interval to be configurable with a range of say 0-600 seconds with default of 300 seconds sounds reasonable.

Regarding changing the rate value type from float to int: I am not sure about this; I don't have a concrete example but wouldn't it be better to have it as accurate as possible? We can leave it up to the end-user to truncate the fractional part if needed.

We'd like to avoid introducing more floats into OC unless the underlying data really requires it. Integer or decimal would be preferred. uint64 seems good here. I don't think we really need to see fractional packets and bits per second?

Here's a link to the latest thread on float in OC yang models: https://github.com/openconfig/public/issues/1042

dplore avatar Mar 29 '24 05:03 dplore

Sorry for the delay on the response; lost track this. Regarding the load-interval: Yes, having a load-interval to be configurable with a range of say 0-600 seconds with default of 300 seconds sounds reasonable. Regarding changing the rate value type from float to int: I am not sure about this; I don't have a concrete example but wouldn't it be better to have it as accurate as possible? We can leave it up to the end-user to truncate the fractional part if needed.

We'd like to avoid introducing more floats into OC unless the underlying data really requires it. Integer or decimal would be preferred. uint64 seems good here. I don't think we really need to see fractional packets and bits per second?

Here's a link to the latest thread on float in OC yang models: #1042

Thanks for letting me know. I wasn't aware of this issue. I'll submit an update of the PR early next week (adding a config container to the rates container and adding a load-interval leaf + changing the rate value type to uint64). Thanks!

hyun-arista avatar Mar 29 '24 15:03 hyun-arista

The proposal is updated as follows:

  • Added configuration for load-interval (type: uint16, units: "seconds", range: "0..600", and default: 300)
  • Changed the type of state leaves from float to uint64.

Thanks!

hyun-arista avatar Apr 02 '24 19:04 hyun-arista

/gcbrun

wenovus avatar Apr 03 '24 18:04 wenovus

/gcbrun

dplore avatar Apr 05 '24 01:04 dplore

I've updated the PR in the following way:

  • Changed the range of load-interval to 1..600.
  • Added a mirroring state leaf for the load-interval config leaf. Thank you!

hyun-arista avatar Apr 12 '24 20:04 hyun-arista

/gcbrun

dplore avatar Apr 15 '24 17:04 dplore

Just catching up on this PR now...

JunOS is at 180 max in 10 second increments.

Note: This is an invalid reference for current JUNOS/EVO platforms (very old EOL platform, different feature-set). Thx for digging up refs @dplore but ideally the submitter puts this in the PR submission that already states this vs. ignoring the template (all too common I see on submissions) or if ever unsure - feel free to tag other implementor github handles.

For reference - we do not currently support this feature in JUNOS/EVO as of today.

Other than that a few questions/comments:

  • It appears we are bringing a long standing feature that's use has primarily been on-box diagnosis (run a command to give the operator an idea of calculated rates over the last sample period) - useful but another view of data already present that can be calculated off sibling/child leafs correct? If that is the only use then why additional nodes in a data-model and not handle this in downstream renderers?
  • The categorization is flat leafs at the state container - at a minimum I'd recommend a container such as rates to group (but I'd like to understand the prior point first)

earies avatar Apr 16 '24 18:04 earies

  • It appears we are bringing a long standing feature that's use has primarily been on-box diagnosis (run a command to give the operator an idea of calculated rates over the last sample period) - useful but another view of data already present that can be calculated off sibling/child leafs correct? If that is the only use then why additional nodes in a data-model and not handle this in downstream renderers?

We discussed this point in the OC Operator's review. The consensus was that we see a strong need to support on-box debugging, including configuring a custom the interface counter rate interval (load-interval in this model).

We didn't have a deep discussion on the actual bits/pkt rate counters themselves, but there was no objection to including these. Agreed the values are typically derived off device by a NMS from the octet/pkt counts.

dplore avatar Apr 16 '24 22:04 dplore

  • It appears we are bringing a long standing feature that's use has primarily been on-box diagnosis (run a command to give the operator an idea of calculated rates over the last sample period) - useful but another view of data already present that can be calculated off sibling/child leafs correct? If that is the only use then why additional nodes in a data-model and not handle this in downstream renderers?

We discussed this point in the OC Operator's review. The consensus was that we see a strong need to support on-box debugging, including configuring a custom the interface counter rate interval (load-interval in this model).

We didn't have a deep discussion on the actual bits/pkt rate counters themselves, but there was no objection to including these. Agreed the values are typically derived off device by a NMS from the octet/pkt counts.

Understood for the on-box debuggability aspect but I see this as something that doesn't need to be expanded in the data-model directly as it is presentation of other nodes depending on some configured value and underlying feature (specific interface rate feature at that). It's a question of data-model node growth vs. handling this in presentation.

The underlying feature is essentially a cache of sampled data over a period of time on which calculations can be ran so if there is support on-box of such a cache, a configurable load-interval should not even be necessary but a query to the cache with a value of time could return such results - not limited to "interface rates"

Essentially the equivalent of a lightweight on-box TSDB that maybe can selectively cache certain datapoints over a limited period of time is likely the better solution here imo.

earies avatar Apr 17 '24 21:04 earies

I don't think you're wrong Ebben. But seeing as we have two references that have full support, this is enough to move this forward.

I noted that JunOS show interface extensive shows a bandwidth (bps) rate, but does not expose what the "load-interval" is.

dplore avatar Apr 25 '24 16:04 dplore

Discussed in April 30, 2024 OC operators meeting without objection.

dplore avatar Apr 30 '24 16:04 dplore