opentelemetry-java-instrumentation icon indicating copy to clipboard operation
opentelemetry-java-instrumentation copied to clipboard

Spans produced by OkHttp integration can include the route as part of their name

Open dmedinag opened this issue 1 year ago • 6 comments

Background

First off, starting from the context of the OTel specification for HTTP spans, I must say that the current OkHttp spans do adhere to it. Quoting it,

HTTP spans MUST follow the overall guidelines for span names. HTTP server span names SHOULD be {method} {http.route} if there is a (low-cardinality) http.route available (see below for the exact definition of the {method} placeholder). If there is no (low-cardinality) http.route available, HTTP server span names SHOULD be {method}. HTTP client spans have no http.route attribute since client-side instrumentation is not generally aware of the “route”, and therefore HTTP client span names SHOULD be {method}. The {method} MUST be {http.request.method} if the method represents the original method known to the instrumentation. In other cases (when {http.request.method} is set to _OTHER), {method} MUST be HTTP. Instrumentation MUST NOT default to using URI path as span name, but MAY provide hooks to allow custom logic to override the default span name.

On the other hand, it's my understanding that the premise under which the standard is founded, i.e.

client-side instrumentation is not generally aware of the “route”

does not apply for OkHttp in particular, given that OkHttp interceptors provide direct access to the route before making the request.

Proposal

So my proposal here is: let's enrich the span name of client-side OkHttp spans so that they include the path group. Some examples:

Request Current span name Proposed span name
GET request to /banana GET GET /banana
POST request to /customer/123 POST POST /customer/{id}

Notes on adherence to the standard

  1. The specification uses the word “SHOULD” and not “MUST” when it comes to naming client HTTP spans
  2. The proposed solution does not involve a backwards incompatible change, but an enrichment of the current span names
  3. The span name remains as low cardinality by ensuring we use the path group and not full URI in case of path variables. This goes in line with the standard restriction above:

Instrumentation MUST NOT default to using URI path as span name, but MAY provide hooks to allow custom logic to override the default span name.

Describe alternatives you've considered

An obvious alternative to monitoring the client-side observed latency of a target API per endpoint is to issue the metrics directly as a histogram from the client side. This is what my company's doing at the moment as an alternative and it works fine, but requires app-level instrumentation that can't be centralised (at the scale of an organisation) as easily as the collector configuration.

Why?

Other than more complete span names, the proposed solution enables granular (per endpoint) client-side latency monitoring when combined with trace-derived metrics, as generated by connectors like spanmetrics or datadog.

Without this information in the span name, these metrics aggregate the data for all clients within an app, to all resources, which renders RED data useless.

dmedinag avatar Mar 25 '24 13:03 dmedinag

hi @dmedinag! can you add a bit more detail on how we can get /customer/{id} template from the full path /customer/123? thanks

trask avatar Mar 26 '24 00:03 trask

I'm not sure exactly how they do it, but it must be possible since Datadog does this with their java agent 🤔

dmedinag avatar Mar 26 '24 08:03 dmedinag

@tylerbenson do you have any knowledge about this. Does dd really do this in the agent or perhaps is it done in the backend?

laurit avatar Mar 26 '24 11:03 laurit

I'm not aware of this being done for OkHttp, only for server spans with an instrumented routing framework. Like Trask, I'd be curious to learn more about how this is expected to work. I think it would also be helpful to get a code snippet of how it looks like on the OkHttp side of things.

tylerbenson avatar Mar 26 '24 15:03 tylerbenson

I've received confirmation from a dev at Datadog that they don't do this. They do have simple URL normalizers that fill the sections in question with ?, but not based on any route. (SimpleHttpPathNormalizer and AntPatternHttpPathNormalizer)

In the example provided, it'll name the span POST /customer/?.

tylerbenson avatar Mar 27 '24 02:03 tylerbenson

I've received confirmation from a dev at Datadog that they don't do this. They do have simple URL normalizers that fill the sections in question with ?, but not based on any route. (SimpleHttpPathNormalizer and AntPatternHttpPathNormalizer)

Yes this is what I meant. It's certainly not as powerful as what we can do at the server side spans, but it still opens up an important feature set IMO. Having a span with name POST /customer/? is much more informative than just POST, specially when the server side spans are outside of your domain (so you don't have access to their observability in your trace)

dmedinag avatar Apr 05 '24 11:04 dmedinag