opentelemetry-java-instrumentation
opentelemetry-java-instrumentation copied to clipboard
Spans produced by OkHttp integration can include the route as part of their name
Background
First off, starting from the context of the OTel specification for HTTP spans, I must say that the current OkHttp spans do adhere to it. Quoting it,
HTTP spans MUST follow the overall guidelines for span names. HTTP server span names SHOULD be
{method} {http.route}if there is a (low-cardinality) http.route available (see below for the exact definition of the{method}placeholder). If there is no (low-cardinality)http.routeavailable, HTTP server span names SHOULD be{method}. HTTP client spans have nohttp.routeattribute since client-side instrumentation is not generally aware of the “route”, and therefore HTTP client span names SHOULD be{method}. The{method}MUST be{http.request.method}if the method represents the original method known to the instrumentation. In other cases (when{http.request.method}is set to_OTHER),{method}MUST beHTTP. Instrumentation MUST NOT default to using URI path as span name, but MAY provide hooks to allow custom logic to override the default span name.
On the other hand, it's my understanding that the premise under which the standard is founded, i.e.
client-side instrumentation is not generally aware of the “route”
does not apply for OkHttp in particular, given that OkHttp interceptors provide direct access to the route before making the request.
Proposal
So my proposal here is: let's enrich the span name of client-side OkHttp spans so that they include the path group. Some examples:
| Request | Current span name | Proposed span name |
|---|---|---|
GET request to /banana |
GET |
GET /banana |
POST request to /customer/123 |
POST |
POST /customer/{id} |
Notes on adherence to the standard
- The specification uses the word “SHOULD” and not “MUST” when it comes to naming client HTTP spans
- The proposed solution does not involve a backwards incompatible change, but an enrichment of the current span names
- The span name remains as low cardinality by ensuring we use the path group and not full URI in case of path variables. This goes in line with the standard restriction above:
Instrumentation MUST NOT default to using URI path as span name, but MAY provide hooks to allow custom logic to override the default span name.
Describe alternatives you've considered
An obvious alternative to monitoring the client-side observed latency of a target API per endpoint is to issue the metrics directly as a histogram from the client side. This is what my company's doing at the moment as an alternative and it works fine, but requires app-level instrumentation that can't be centralised (at the scale of an organisation) as easily as the collector configuration.
Why?
Other than more complete span names, the proposed solution enables granular (per endpoint) client-side latency monitoring when combined with trace-derived metrics, as generated by connectors like spanmetrics or datadog.
Without this information in the span name, these metrics aggregate the data for all clients within an app, to all resources, which renders RED data useless.
hi @dmedinag! can you add a bit more detail on how we can get /customer/{id} template from the full path /customer/123? thanks
I'm not sure exactly how they do it, but it must be possible since Datadog does this with their java agent 🤔
@tylerbenson do you have any knowledge about this. Does dd really do this in the agent or perhaps is it done in the backend?
I'm not aware of this being done for OkHttp, only for server spans with an instrumented routing framework. Like Trask, I'd be curious to learn more about how this is expected to work. I think it would also be helpful to get a code snippet of how it looks like on the OkHttp side of things.
I've received confirmation from a dev at Datadog that they don't do this. They do have simple URL normalizers that fill the sections in question with ?, but not based on any route. (SimpleHttpPathNormalizer and AntPatternHttpPathNormalizer)
In the example provided, it'll name the span
POST /customer/?.
I've received confirmation from a dev at Datadog that they don't do this. They do have simple URL normalizers that fill the sections in question with
?, but not based on any route. (SimpleHttpPathNormalizerandAntPatternHttpPathNormalizer)
Yes this is what I meant. It's certainly not as powerful as what we can do at the server side spans, but it still opens up an important feature set IMO. Having a span with name POST /customer/? is much more informative than just POST, specially when the server side spans are outside of your domain (so you don't have access to their observability in your trace)