router
router copied to clipboard
Review `apollo_router_http_requests_total` metric usage
The behaviour of this metric is confusing right now. Looking at the telemetry plugin, it is supposed to be recorded on every response:
https://github.com/apollographql/router/blob/05a651df33e80b58c2727c06e4a1eb44d42cb3ec/apollo-router/src/plugins/telemetry/mod.rs#L853-L860
but is set in multiple places:
https://github.com/apollographql/router/blob/05a651df33e80b58c2727c06e4a1eb44d42cb3ec/apollo-router/src/plugins/telemetry/mod.rs#L1123-L1128
but it is also directly set in other places like https://github.com/apollographql/router/blob/05a651df33e80b58c2727c06e4a1eb44d42cb3ec/apollo-router/src/services/layers/query_analysis.rs#L75-L80
I looked into that because the content-negotiation layer does not set this metric when returning a HTTP 415: https://github.com/apollographql/router/blob/05a651df33e80b58c2727c06e4a1eb44d42cb3ec/apollo-router/src/services/layers/content_negotiation.rs#L55-L73
The apollo_router_operations_total metric, though, correctly record a 415 status
I think that this metric can largely be replaced once the work for custom instruments that @bnjjj is working on lands.
Will this be replaced by http.server.request.duration?
Yes this example of configuration should do the same job
telemetry:
instrumentation:
instruments:
router:
http.server.request.duration:
attributes:
error:
error: reason
operation_name:
response_context: operation_name
status:
response_status: code
This PR will help to add operation_name at the router service without playing with the response_context
Closing and we will look to review all metrics for the next major release.