router icon indicating copy to clipboard operation
router copied to clipboard

At least the metric `apollo_router_session_count_total` is violating Prometheus naming best-practices

Open frittentheke opened this issue 1 year ago • 2 comments

Describe the bug

The metric apollo_router_session_count_total is reporting

Number of currently connected clients (https://www.apollographql.com/docs/router/configuration/telemetry/instrumentation/standard-instruments/#session)

so it's a gauge as it's increasing and decreasing with the number of connections.

According to the Prometheus metric naming best-practice (and thus consumer expectations) this should NOT have the total suffix as this is documented to be used for accumulating counts only:

Note that an accumulating count has total as a suffix, in addition to the unit if applicable.

frittentheke avatar Aug 30 '24 16:08 frittentheke

Thanks for opening this. We agree. Unfortunately, this is one of our oldest metrics (perhaps one of the first!).

We admit we made a mistake with its naming, but changing it at this point would be a breaking change. Overall, we're trying to replace them with new better metrics that meet the requirements of Prometheus and OpenTelemetry naming conventions and best practices, but that's one at a time. We do believe that we have moved to a "good" place. We have this tracked internally as part of a different initiative, but this particular metric won't change until a major version release as we don't want to just emit twice the metrics unnecessarily either.

Hope this helps explain where we are on it! Let us know if you have any other thoughts.

abernix avatar Aug 30 '24 18:08 abernix

For what it's worth, in all the dashboards I've ever seen this on, I don't actually personally get a ton of value out of this metric versus just general request-based metrics that the router provides. That's particularly true since other infrastructure usually provides that insight in a more uniform way across a wider variety of products (e.g., Istio/Envoy/Linkerd/proxies usually cover this).

abernix avatar Aug 30 '24 18:08 abernix