opentelemetry-specification
Stabilize Logger.Enabled
Stabilize Logger.Enabled API
Blockers:
- https://github.com/open-telemetry/opentelemetry-specification/pull/4203
- https://github.com/open-telemetry/opentelemetry-specification/issues/4207
- https://github.com/open-telemetry/opentelemetry-specification/pull/4221
- https://github.com/open-telemetry/opentelemetry-specification/issues/4220
Question to @open-telemetry/technical-committee: Do we want to stabilize the Logger.Enabled API sooner than we stabilize the spec defining how SDK implements it? Or do we want to stabilize Enabled for API and SDK at the same time?
Same for Metrics too: https://github.com/open-telemetry/opentelemetry-specification/pull/4219/files#r1767789558
The lack of stabilization of Logger.Enabled API blocks stabilization of OTel Go Logs.
Logger.Enabled API is required for bridging most popular Go logging libraries (including slog from the Go standard library).
From OTel Go perspective, the SDK support can be experimental. See: https://pkg.go.dev/go.opentelemetry.io/otel/sdk/log/internal/x.
This is currently the only known blocker for stabilizing the OTel Go Logs.
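To illustrate why a bridge needs an Enabled check, here is a minimal sketch of a slog handler that delegates its own `Enabled` to a logger. The `Severity`, `Logger`, `thresholdLogger`, and `bridgeHandler` types are hypothetical stand-ins defined only for this example; they are not the OTel Go API, and the real bridge handles attributes and groups that are ignored here.

```go
package main

import (
	"context"
	"fmt"
	"log/slog"
)

// Severity is a hypothetical stand-in for the Logs Bridge API severity number.
type Severity int

const (
	SeverityDebug Severity = 5
	SeverityInfo  Severity = 9
	SeverityWarn  Severity = 13
	SeverityError Severity = 17
)

// Logger is a hypothetical, trimmed-down logger with an Enabled check.
type Logger interface {
	Enabled(ctx context.Context, sev Severity) bool
	Emit(ctx context.Context, sev Severity, body string)
}

// thresholdLogger is a toy Logger that is only enabled at or above min.
type thresholdLogger struct {
	min     Severity
	emitted []string
}

func (l *thresholdLogger) Enabled(_ context.Context, sev Severity) bool {
	return sev >= l.min
}

func (l *thresholdLogger) Emit(_ context.Context, _ Severity, body string) {
	l.emitted = append(l.emitted, body)
}

// toSeverity maps slog levels onto the stand-in severity numbers.
func toSeverity(level slog.Level) Severity {
	switch {
	case level >= slog.LevelError:
		return SeverityError
	case level >= slog.LevelWarn:
		return SeverityWarn
	case level >= slog.LevelInfo:
		return SeverityInfo
	default:
		return SeverityDebug
	}
}

// bridgeHandler adapts a Logger to the slog.Handler interface.
type bridgeHandler struct{ logger Logger }

// Enabled lets slog skip building a record the backend would drop anyway.
func (h bridgeHandler) Enabled(ctx context.Context, level slog.Level) bool {
	return h.logger.Enabled(ctx, toSeverity(level))
}

func (h bridgeHandler) Handle(ctx context.Context, r slog.Record) error {
	h.logger.Emit(ctx, toSeverity(r.Level), r.Message)
	return nil
}

// Attributes and groups are ignored to keep the sketch short.
func (h bridgeHandler) WithAttrs([]slog.Attr) slog.Handler { return h }
func (h bridgeHandler) WithGroup(string) slog.Handler      { return h }

// run logs one debug and one info message and returns what was emitted.
func run() []string {
	backend := &thresholdLogger{min: SeverityInfo}
	log := slog.New(bridgeHandler{logger: backend})
	log.Debug("dropped before a record is built")
	log.Info("delivered")
	return backend.emitted
}

func main() {
	fmt.Println(run())
}
```

Because `slog.Handler` itself requires an `Enabled` method, a bridge without a corresponding check on the logger side would have to return true unconditionally and pay the cost of building every record.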
@open-telemetry/technical-committee, are you able to revalidate if the issues listed as blockers are still seen as blockers or if they can be addressed after stabilization of Logger.Enabled in Logs Bridge API?
Personally, I think the main blocker is to have at least 3 prototypes of the API in different languages.
To clarify the process: we expect 3 prototypes in 3 different languages that can be used by the end users, so that they can try the feature, provide feedback, submit bugs and issues about it. This is a necessary process before the spec section is marked "Stable".
From this perspective a PR does not count as a prototype since it is not easily usable by the end users. A PR is fine for proposing new experimental features and demonstrating how they would work, but it is not enough for stabilizing the spec.
The lack of stabilization of Logger.Enabled API blocks stabilization of OTel Go Logs. Logger.Enabled API is required for bridging most popular Go logging libraries (including slog from the Go standard library).
@pellared you either need to find a way to have unstable APIs in Go or wait until other languages implement the prototypes. Either way the ability to have unstable APIs is very valuable and this is likely to come up again as Otel evolves and we keep adding new experimental APIs to existing signals.
--
As a side note: I encourage using maturity levels between "Development" and "Stable" to signal increasing level of confidence in the capability (both in the spec and the SDK). For example if we have 1-2 prototypes then we can move the maturity level of the feature from "Development" to "Alpha" or "Beta" to signal it is moving closer to the "Stable" state.
OK. So we need 3 different languages to have it released as an experimental API.
Here is how the experimental Logger.Enabled API is currently defined in 3 languages:
- C++: https://github.com/open-telemetry/opentelemetry-cpp/blob/f69963f587b148d2eb20f7d30f8b4a1ccb184a6f/api/include/opentelemetry/logs/logger.h#L251-L276
- Rust: https://github.com/open-telemetry/opentelemetry-rust/blob/8d84a76b8f8aaa0d625ab67a16320680c276f6b5/opentelemetry/src/logs/logger.rs#L23-L25
- Go: https://github.com/open-telemetry/opentelemetry-go/blob/a2347542010c929ab30ff63529b4c7531af35334/log/logger.go#L33-L53
I will do my best to work on this with others to move this forward (as we have inconsistencies).
@pellared you either need to find a way to have unstable APIs in Go
All major log bridges need it, so it does not even make sense to stabilize the rest, as the Logs API would not be usable in the Go ecosystem. From https://github.com/open-telemetry/opentelemetry-specification/issues/3917:
Multiple logging libraries in Go provide this optimization. If the Go SIG is going to be able to support these critical logging systems we need this functionality in the Logs Bridge API.
or wait until other languages implement the prototypes.
We then need to wait for other languages to add it.
As a side note: I encourage using maturity levels between "Development" and "Stable" to signal increasing level of confidence in the capability (both in the spec and the SDK)
I am not sure if we can use Alpha and Beta in the spec documents as these are not defined in https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md#lifecycle-status.
I am not sure if we can use Alpha and Beta in the spec documents as these are not defined in https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/document-status.md#lifecycle-status.
We can bring as many levels from OTEP 0232 to the spec as we believe are useful. I started with 3 but we can bring more if we feel there is value. I personally think it can be valuable to have more granularity between Development (the most immature) and Stable (the most mature). It is an important signal, and I think a binary value for it is not nuanced enough. Stabilization is a process, often a long one at Otel. As you move along that process it is important to indicate the progress by updating the level labels.
From this perspective a PR does not count as a prototype since it is not easily usable by the end users. A PR is fine for proposing new experimental features and demonstrating how they would work, but it is not enough for stabilizing the spec.
FWIW this is a change in policy. Many features have been stabilized that relied on "Go implementations" which were just PRs.
I'm not sure it is fair to make this change in policy in such an ad hoc manner.
@MrAlias my post is a result of a discussion by a few TCs while triaging this issue, so it is not an official policy change yet.
I tried finding the current policy but couldn't. This document does not seem to have an opinion about what criteria must be met before spec stabilization.
If anyone is aware of where we state how many prototypes are needed, please post the link.
If the policy does not exist in written form or we need to modify it I will create an issue so that we can discuss and formalize it.
Let's keep this issue open for now so that we can apply consistent rules after we clarify what the rules are.
I am gonna move this back to TC inbox.
Removed from TC inbox. The prototype requirement is being separately tracked, and there are other blockers preventing stability.
Labeling with follow-up, as I understand we need the 3 implementations before this can be declared stable.
removing triage:followup since the automation will automatically add it back 2 weeks after additional comments / ref links
@jack-berg, I have a question on what is required regarding the SDK implementation specification/design to unblock the stabilization of Logger.Enabled API.
- Is an OTEP like this good enough: https://github.com/open-telemetry/opentelemetry-specification/pull/4290?
- Or maybe addressing https://github.com/open-telemetry/opentelemetry-specification/issues/4364 would be good enough?
- Or both of them are required (OTEP + some experimental feature in specification)?
- Something else?
From our OTel Go experience, it is better to stabilize the API first and then gather feedback for months before stabilizing anything in the SDK. In OTel Go all our logging bridges use Logger.Enabled and we have experimental support for adding hooks on the SDK side (as described in the OTEP). Nobody has reported any issue or negative feedback in the past few months.
The stabilization of Logger.Enabled on the API side is one of our main blockers for releasing a stable OTel Go Logs API and SDK.
I also want to call out that we already have 4 working prototypes of Logger.Enabled:
They have slight differences in the accepted parameters and I think this needs to be sorted out.
@open-telemetry/cpp-maintainers, @open-telemetry/php-maintainers, @open-telemetry/rust-maintainers, I need your help here.
@jack-berg, I have a question on what is required regarding the SDK implementation specification/design to unblock the stabilization of Logger.Enabled API.
Honestly, I've been struggling to keep up with / track all the open issues / PRs related to this.
The current state of the operation is that enabled accepts context and severity number parameters. My original critique was that we shouldn't let the API drift away, or stabilize, without corresponding features in the SDK. We could overlook this requirement, given that the usefulness of the parameters seems pretty obvious, but I think doing so would set a bad precedent so I would vote against it.
Your OTEP #4290 tries to establish this corresponding SDK behavior. I agree with some of the content in there, and in particular, think that adding LoggerConfig#min_severity and LoggerConfig#trace_based properties are no brainers. I wonder if anyone would disagree? If not, then adding those two properties (along with extending #4381 to reference them) seems like the easiest way forward to me.
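As a rough sketch of how the two proposed properties could feed an enabled decision: the `LoggerConfig` struct, its field names, and `SpanState` below are hypothetical and only mirror the `min_severity` and `trace_based` names from the comment above. The sketch assumes that records emitted with no active span are unaffected by `trace_based`; that choice is an assumption of this example, not something the thread settles.

```go
package main

import "fmt"

// Severity is a hypothetical stand-in for the severity number.
type Severity int

const (
	SeverityDebug Severity = 5
	SeverityInfo  Severity = 9
)

// LoggerConfig is a hypothetical config carrying the two proposed properties.
type LoggerConfig struct {
	MinSeverity Severity // drop records below this severity
	TraceBased  bool     // drop records whose active span is not sampled
}

// SpanState is a hypothetical summary of the span found in the context.
type SpanState struct {
	Active  bool // a span is present in the context
	Sampled bool // that span is sampled
}

// Enabled applies the severity threshold first, then the trace-based filter.
// Assumption: records without an active span pass the trace-based filter.
func (c LoggerConfig) Enabled(sev Severity, span SpanState) bool {
	if sev < c.MinSeverity {
		return false
	}
	if c.TraceBased && span.Active && !span.Sampled {
		return false
	}
	return true
}

func main() {
	cfg := LoggerConfig{MinSeverity: SeverityInfo, TraceBased: true}
	sampled := SpanState{Active: true, Sampled: true}
	unsampled := SpanState{Active: true, Sampled: false}

	fmt.Println(cfg.Enabled(SeverityDebug, sampled))  // below min severity
	fmt.Println(cfg.Enabled(SeverityInfo, unsampled)) // unsampled span
	fmt.Println(cfg.Enabled(SeverityInfo, sampled))   // passes both filters
}
```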
I also want to call out that we already have 4 working prototypes of Logger.Enabled:
Five - I originally proposed this experimental method with an accompanying java prototype 🙂
adding LoggerConfig#min_severity and LoggerConfig#trace_based properties are no brainers. I wonder if anyone would disagree? If not, then adding those two properties (along with extending #4381 to reference them) seems like the easiest way forward to me.
👍
I also want to call out that we already have 4 working prototypes of Logger.Enabled
PHP's implementation of enabled hasn't caught up to more recent spec changes, so is based on https://github.com/open-telemetry/opentelemetry-specification/blob/v1.34.0/specification/logs/bridge-api.md#enabled
When we do catch up, I'll follow the latest spec (although I might wait for this effort to complete first)
in particular, think that adding LoggerConfig#min_severity and LoggerConfig#trace_based properties are no brainers. I wonder if anyone would disagree?
I do not see them as no brainers. I am personally not convinced that these are good proposals.
Regarding LoggerConfig#min_severity: https://github.com/open-telemetry/opentelemetry-specification/pull/4290#discussion_r1927546170
Regarding LoggerConfig#trace_based: https://github.com/open-telemetry/opentelemetry-specification/pull/4290#discussion_r1898672657
I find adding opt-in Enabled to LogRecordProcessor more flexible, cohesive, and composable. This is the route I will try to pursue even though I am aware that it will be harder for me. I think this is simply a better design and the way to go.
I think this is simply a better design and the way to go.
I've said something to this effect in other comments, but extending LogRecordProcessor with enabled and the work to keep LogRecordProcessor mutations isolated from each other doesn't have precedent in the tracing or metrics signals. Generally, it's trying to bring a collector-style pipeline paradigm to the SDKs.
We could do this. Some users will benefit from it, preferring to do pipeline style work in the SDK vs. the collector. But most users will have a single batch log record processor paired with the OTLP exporter. And these users will want / expect easy ways to configure which logs make it into their pipeline. Filtering logs by severity is table stakes for any log system. For a system like OpenTelemetry that prioritizes correlation across signals, filtering logs based on whether the active span is sampled is obvious low hanging fruit.
So if we need some mechanism for filtering logs by severity and trace context, the next question is where does that configuration mechanism live. Options:
- At the logger provider level, analogous to how exemplars are configurable at the meter provider level. This is too broad a brush stroke for severity. Maybe trace context could get away with a global config like this - not sure.
- At the logger level (i.e. LoggerConfig). This gives users plenty of granularity, while also having the ability to paint broad brush strokes since scope config allows you to do pattern matching of the form "set min severity to info for all loggers matching 'foo*'". Downside is that logger config would apply to all processors: in the event that SDKs go in a pipeline direction, the logs that a pipeline would get would be the intersection of the records that pass LoggerConfig criteria and the LogProcessor criteria. Not ideal to have competing config concepts.
- At the processor level. This is your proposal - bringing pipelines to the SDK and letting each processor decide which log records it is interested in. Processor A wants all logs from the "foo" logger at severity "info" or higher; Processor B wants all logs from the "foo" logger at severity "debug" or higher.
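The interaction between the logger-level and processor-level options can be sketched as follows. All types here (`Processor`, `filterProcessor`, `Logger`) are hypothetical and written only for this illustration; they do not correspond to any SDK. The sketch assumes a logger is enabled when at least one processor wants the record, and that a logger-level threshold is applied before any processor is consulted, which produces the intersection behavior described in the second option.

```go
package main

import "fmt"

// Severity is a hypothetical stand-in for the severity number.
type Severity int

const (
	SeverityDebug Severity = 5
	SeverityInfo  Severity = 9
)

// Processor is a hypothetical processor with an opt-in Enabled hook.
type Processor interface {
	Enabled(loggerName string, sev Severity) bool
	OnEmit(body string)
}

// filterProcessor accepts records from one logger at or above min severity.
type filterProcessor struct {
	loggerName string
	min        Severity
	received   []string
}

func (p *filterProcessor) Enabled(name string, sev Severity) bool {
	return name == p.loggerName && sev >= p.min
}

func (p *filterProcessor) OnEmit(body string) {
	p.received = append(p.received, body)
}

// Logger combines a logger-level threshold with per-processor interest.
type Logger struct {
	name        string
	minSeverity Severity // logger-level config; zero means no threshold
	processors  []Processor
}

// Enabled reports whether at least one processor would receive the record.
func (l *Logger) Enabled(sev Severity) bool {
	if sev < l.minSeverity {
		return false
	}
	for _, p := range l.processors {
		if p.Enabled(l.name, sev) {
			return true
		}
	}
	return false
}

// Emit delivers the record only to processors that pass both filters.
func (l *Logger) Emit(sev Severity, body string) {
	if sev < l.minSeverity {
		return
	}
	for _, p := range l.processors {
		if p.Enabled(l.name, sev) {
			p.OnEmit(body)
		}
	}
}

// run emits one debug and one info record on logger "foo" and returns
// how many records processors A (info+) and B (debug+) received.
func run(loggerMin Severity) (a, b int) {
	procA := &filterProcessor{loggerName: "foo", min: SeverityInfo}
	procB := &filterProcessor{loggerName: "foo", min: SeverityDebug}
	logger := &Logger{name: "foo", minSeverity: loggerMin, processors: []Processor{procA, procB}}
	logger.Emit(SeverityDebug, "debug record")
	logger.Emit(SeverityInfo, "info record")
	return len(procA.received), len(procB.received)
}

func main() {
	fmt.Println(run(0))            // processor-level filtering only
	fmt.Println(run(SeverityInfo)) // logger-level threshold applied first
}
```

With no logger-level threshold, processor B receives both records and processor A only the info record. Once the logger-level threshold is set to info, processor B no longer sees the debug record it asked for, which is the "competing config concepts" downside noted above.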
I'm opposed to bringing proper pipeline support to SDKs (it's currently possible but you have to jump through hoops) because it's a large implementation burden that has to be paid 11 times (once for each language implementation), and it duplicates the capabilities of the general purpose and extremely powerful collector. From a prioritization standpoint, asking resource-constrained language maintainers to implement better pipeline tooling doesn't seem like a good use of time right now, given all the other project objectives - especially semconv and stable instrumentation.
Not bringing proper pipeline support to SDKs means the users who want to do collector-style things in SDKs have to jump through more hoops with worse ergonomics. This isn't ideal, but tradeoffs.
If the community decides that bringing proper pipeline support to SDKs is important, I do think it's important to solve it holistically, and look at how the concepts apply to traces and metrics as well as logs.
👍
According to the spec compliance matrix it is implemented in 3 languages:
https://github.com/open-telemetry/opentelemetry-specification/blob/d2035751f3fc89a1ef44eb883dac63facc913138/spec-compliance-matrix.md?plain=1#L202
I saw that Logger.Enabled is also added to PHP (which gives 4 languages):
https://github.com/open-telemetry/opentelemetry-php/blob/015da800aa005d69f1b54a54c28af9173e236c7f/src/API/Logs/LoggerInterface.php#L15-L20
@open-telemetry/technical-committee, are there any required steps to mark this as stable?
Did a quick peek at the go, rust, cpp, php implementations:
- Go - 👍
- Rust - has an event_enabled method which appears to accept severity and event name arguments. Not sure if this is supposed to represent Logger.Enabled
- cpp - 👍 also accepts an EventId which I don't quite understand - maybe a cpp specific construct?
- php - doesn't accept any parameters
- java - Java also has an enabled method, but I didn't fill out the compliance table because it's still experimental and that's a grey area. It doesn't accept any parameters yet, but they will be straightforward to add.
Personally, I'm comfortable stabilizing Logger#Enabled even if the prototype implementations don't currently all perfectly represent the spec. The parameters are intuitive and should be easy to add to implementations where they are missing.
Rust - has an event_enabled method which appears to accept severity and event name arguments. Not sure if this is supposed to represent Logger.Enabled
Otel Rust added more parameters than the spec has. The extra parameters are generated at compile time (static) in most Rust logging libraries and are readily available, so they cost nothing extra but allow more advanced filtering.
@jack-berg, thanks. I think it would be beneficial to have approval from at least one maintainer per language listed above. If nobody objects I can create a PR tomorrow.
cpp - 👍 also accepts an EventId which I don't quite understand - maybe a cpp specific construct?
Yes. Otel C++'s logging API was meant for end-user usage too, even before Spec made that relaxation recently. (From Rust side, we wish to have an EventId too, but none of the existing libraries in Rust has that concept)
According to https://github.com/open-telemetry/opentelemetry-specification/issues/4208#issuecomment-2608496302 PHP should also be fine. @brettmc, am I correct?
@pellared correct, that won't be an issue for PHP.