opentelemetry-specification
opentelemetry-specification copied to clipboard
Add `Context` parameter to Enabled for metrics instruments
We should add Context as a parameter similarly to https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/logs/bridge-api.md#enabled.
Context is needed for allowing "context related" behavior. For instance, it enables to user to disable the instrument for some key/value in the context has been set. It can also used to see if the call is within a span. Related PR: https://github.com/open-telemetry/opentelemetry-specification/pull/4203
Created here.
Why?
Why?
Description updated. Thanks
There's no corresponding SDK behavior that would allow the enabled API to change its response based on context.
There's no corresponding SDK behavior that would allow the enabled API to change its response based on context.
A custom API implementation (e.g. SDK wrapper) could change its response based on context. The SDK can add such behavior in future.
I find that adding is important for Go that require passing Context explicitly. Adding it later to the API would be a burden. For reference: https://go.dev/wiki/CodeReviewComments#contexts.
CC @open-telemetry/go-maintainers, @open-telemetry/collector-maintainers
@open-telemetry/technical-committee since there is precedence for this already, we think that we can move forward with this, any objections? (cc @pellared @jpkrohling @danielgblanco)
Agree with @pellared that adding this is important for Go
IMO there is no need to modify the specification here.
To me, this is a Go-specific question. Golang uses explicit context, and all API calls should have context independent of whether OTel specifications explicitly mention context. There is no need to modify the OTel specification to add a Context parameter in Golang. Most OTel APIs rely on implicit context, like being able to get a thread-local context object. Golang does not have thread-local state, so we always expect a Context argument.
It shouldn't require a lot of imagination to see how context is useful for customized SDK purposes. A custom metric SDK could record metric events to the active span when the context is sampled, for example.
OTel APIs rely on implicit context
This is why in almost all places Context is added as required for languages requiring explicit context. I would prefer to have the specification consistent.
@pellared I can't find what you're referring to. In the metrics api.md the word "Context" appears very little, certainly not as a required parameter anywhere. For metrics, we see:
[Measurements](#measurement) recorded by synchronous instruments can be
associated with the [Context](../context/README.md).
this implies that Enabled() should accept a context.
In the trace api.md, Context is used but it refers to the OTel context concept, not the Golang Context object.
Moreover, if you look at the one operation that absolutely requires a context, tracer.Start(), we see it is permitted to be implicit:
The API MAY also have an option for implicitly using
the current Context as parent as a default behavior.
This seems to set a precedent that context can be implicit. The definition of Context itself also repeats this:
Depending on the language, its usage may be either explicit or implicit.
I still believe it is not necessary to modify the specification in this case.
@pellared I can't find what you're referring to.
@jmacd, I added all these in the description of https://github.com/open-telemetry/opentelemetry-specification/pull/4262
Context is used but it refers to the OTel context concept
Is there any reason that adding the OTel concept of context would be bad in this scenario? Especially given that in most languages it is always implicitly available.
I still believe it is not necessary to modify the specification in this case.
I prefer the specification to be less ambiguous. I would rather change the specification and call out that adding a Context parameter MAY be added to any operation if the language does not support implicit Context and requires passing Context explicitly.
+1 for making our specification explicit in its meaning.
+1 as well, for explicitness, but also consistency. When tracing mentions the context, we should also mention it for similar appropriate cases with other signals.
I would rather change the specification and call out that adding a Context parameter MAY be added to any operation if the language does not support implicit Context and requires passing Context explicitly.
Mentioned here, but will repeat again here: Including Context parameter to any operation doesn't seem right. Should languages with explicit context pass it when obtaining a tracer, meter, logger? Should all the span operations to add / update fields include a context argument? Including Context everywhere, even if the language requires explicit context propagation, seems like a bad API design.
I don't think there are actually many operations that benefit from Context. At least, not so many that its cumbersome to explicitly mention it as a parameter.
I do think Context is a critical foundation of OTEL. Any method which records telemetry should have it (at least optionally) included. This is the only way, e.g. to allow baggage to work, and future things like context-attributes.
However, this must also be balanced by the expense/cost of context. I think it's appropriate that we be super careful in terms of what we promote and push into context. For example, the reason "green threads" and "co-routines" have such a huge performance win over "raw threads" or "native threads" is due to the context passing - they are able to ignore MOST of the stack copying that would occur in raw threads. As OTEL adds back in things which must be passed, we need to understand the overhead of this and its usage.
To me this means we should assume the following:
- Things placed in context should be highly vetted, minimal and necessary horizontally for application o11y.
- Using context in should be relatively "cheap" as the cost has already been paid to propagate it.
- We should avoid death by a thousand cuts with context -> Due to its highly concurrent usage, if we can interact with any locks/concurrency primitives ONCE in instrumentation critical paths, that's better than many times.
To me this means a pattern of:
var myContext = doTheExpensiveConcurrentAccessThingToGetContext();
if (logger.isEnabled(myContext)) {
logger.log(myContext, ...expensive to compute things...)
}
Is fine under a few assumptions:
- Interacting with context does not need any future concurrency-safety-primitives.
- global "is enabled" check is less costly than performing the expensive operation (forcing cache clear to volatile read a boolean has non-zero cost in throughput).
- there is no way to pass the expensive computation to the
loggeritself, e.g. a thunk / function / callback.
Basically, I don't see Context adding significant burden here, given its design and use in OTEL. However, I also am concerned if we partition out enablement checks in very fine grained fashion that we may locally optimise a benchmark at the cost of overhead increasing across the application.
TL;DR; it would be nice to see some prototypes and benchmarks here to address concerns. I'm theoretically supportive, but I really want to understand what is being checked to understand if we're introducing a foot-gun here.
@jsuereth
Any method which records telemetry should have it (at least optionally) included.
Notice that Enabled is not recording telemetry. The question is whether we want to support cross-cutting concerns or any context related functionality for Enabled API.
In OTel Go, we might need the Context in the Enabled API even if the SDK doesn't use it right away.
For example, in future, the user may want the Logs SDK to filter out span events (emitted via Logs SDK) when they are not inside a recording span. This could be implemented in the SDK by adding disabled_not_recorded_spans to LoggerConfig. The SDK can check the context.Context to see if the Enabled call is inside a recording span. Then the caller would write:
if logger.Enabled(ctx, params) {
logger.Emit(ctx, createHeavyLogRecord(payload))
}
Alternatively, the caller could check if the span is being recorded by himself:
if trace.SpanFromContext(ctx).IsRecording() && logger.Enabled(params) {
logger.Emit(ctx, createHeavyLogRecord(payload))
}
This alternative might be better because the caller usually knows better whether the log record is related to tracing.
However, adding the Context later for any other reason would be a big burden for OTel Go users. We might have to add a Logger.EnabledContext method and deprecate Logger.Enabled, which would complicate backward compatibility. Starting with Context as a required parameter avoids this inconvenience.
But maybe we indeed do not want cross-cutting concerns for Enabled API?
On the other hand, there may be other API implementations than the SDK that want to support cross-cutting concerns.
I find the principle "Do not break users" very important. Adding Context parameter decreases this chance a lot (for OTel Go). In Go, it's common to pass a context.Context. This is why I still lean toward including it from the start: it minimizes disruptive API changes in future releases and supports custom API implementations and future SDK enhancements.
Most of the reasons above also apply to Enabled for metrics instruments.
Can you elaborate on
Interacting with context does not need any future concurrency-safety-primitives.
From https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/context/README.md#overview:
A
ContextMUST be immutable, and its write operations MUST result in the creation of a newContextcontaining the original values and the specified values updated.
We don't need synchronization when interacting with context because it is immutable.
global "is enabled" check is less costly than performing the expensive operation
I think this relates to the purpose of the Enabled API, not the Context parameter. Most logging libraries offer this feature to improve performance for expensive log records. I can add benchmarks to the OTEP.
there is no way to pass the expensive computation to the logger itself, e.g. a thunk / function / callback.
This might be possible for users of the Logs API when emitting events. But Enabled is crucial for bridging Enabled with other logging libraries such as https://pkg.go.dev/log/slog, https://pkg.go.dev/go.uber.org/zap, https://pkg.go.dev/github.com/go-logr/logr.
For example, in future, the user may want the Logs SDK to filter out span events (emitted via Logs SDK) when they are not inside a recording span. This could be implemented in the SDK by adding disabled_not_recorded_spans to LoggerConfig. The SDK can check the context.Context to see if the Enabled call is inside a recording span.
That's an interesting use case. And you've still retained the ability to have the SDK config be static (i.e. no user provided function that consumes context). I like it. 👍
We don't need synchronization when interacting with context because it is immutable.
Quick response to this: You need to get access to the current context. Context itself is immutable, but the notion of "which context is current" is not. You need to go through concurrency primitives to get access to it or (as in Go) you explicitly pass a Context downstream to methods.
From the specification SIG on 19-Nov, it sounds like there isn't a strong need to have the context parameter at this time. Will close this issue in 2 weeks if there's no one that brings a strong argument by then.
As the author of this issue I am OK on closing (for metrics) this issue as long as the @open-telemetry/technical-committee DOES NOT block having Context in Logger.Enabled. Let's give others 2 weeks to veto with my statement.
Thanks @pellared
I support closing this issue. I'll make an argument against using the context parameter to the metrics-enabled API.
In the Lightstep metrics SDK, we have added a MeasurementProcessor API which has been discussed in the OpenTelemetry specification as far back as the OpenCensus origin. I would like to see OTel add this feature!
MeasurementProcessor absolutely uses the context to derive metric attributes. However, to conditionally enable or disable a metric instrument based on context would introduce an undetectable bias in the data. For example, if you only enable a metric instrument when the context is traced, then the metric instrument will under-count depending on sampling rate and the consumer of the data will have no way of knowing this. Metric instruments should strive to measure everything--users should let the context be used to adjust which attributes are used, not to conditionalize.
While I typically want to future proof APIs, I support closing this issue since adding it later would be relatively easy and with a relatively low cost in terms of maintenance burden.
I shared this issue on the #otel-maintainers CNCF Slack channel: https://cloud-native.slack.com/archives/C01NJ7V1KRC/p1732034582753799 and I also added an item to the discussion topics in the Maintainers call to make sure other people see this. I am not able to join the maintainers call, so I would appreciate if somebody else raises this on my behalf :)
SIG meeting:
We agreed to close it. The languages need to still have a possibility to add parameters to Enabled without breaking the users. Everyone on the call was fine with the fact that OTel Go wants to still add context.Context to the API as it is also a language-specific thing (Go devs seemed even happy with it).