opentelemetry-specification Add an `Enabled` method to the `Logger`

trafficstars

Knowing if a LogRecord will be dropped prior to performing computationally expensive operations helps developers using the Logs Bridge API avoid wasted and expensive work. To facilitate this, a Logger should provide a function on a Logger that users can call to determine if a LogRecord will be recorded or dropped.

Proposal

Add an Enabled function to the Logger in the Logs Bridge API.
- This functionality will be a recommendation
  - If there are more idiomatic ways for a language to achieve this, they are recommended to do that.
- If a language provides this, the functionality will accept at a minimum:
  - context (optional): the Context associated with the LogRecord that will be logged.
    - This parameter is useful to determine if some tracing information will be included. Or, in languages with more expansive definitions of context, other dynamic values included in a context (i.e. HTTP header information).
  - severity number: The Severity Number of the LogRecord that will be logged.
  - Other parameters that a particular language deems necessary and helpful to its users may be accepted as well.
- If a language provides this, the functionality will respond with the following
  - A language idiomatic boolean (e.g. a built-in type or some other type representing one).
    - This returned boolean will be true when logging is enabled, but may be true or false, when logging is disabled (this gives room for memory optimization or indeterminable state).

Alternates

Provide a `GetSeverityLevel` function

Add a GetServerityLevel function to the Logger in the Logs Bridge API.
- Accepts no parameters
- Returns the minimum Severity Number a Logger will record a LogRecord for

This approach will lose a lot of easy value gained by including a context. Simple look-up of values within the context can often determine if a LogRecord will be dropped or not. This approach would miss that.

Allow language to define their own approach

Instead of prescribing something that all languages should do. Call out that languages may extend the Logger to provide some form of this functionality and that the specification will always leave this unspecified so as not to conflict with these implementations.

This approach is the opposite of having a specification driven project where all implementation look the same.

Additional context.

Multiple logging libraries in Go provide this optimization^1 ^3. If the Go SIG is going to be able to support these critical logging systems we need this functionality in the Logs Bridge API.

Feb 29 '24 22:02 MrAlias

cc @open-telemetry/specs-logs-approvers @pellared

Feb 29 '24 22:02 MrAlias

otel-rust added a similar method under feature flag - https://github.com/open-telemetry/opentelemetry-rust/pull/1147 otel-cpp also has a similar approach - https://github.com/open-telemetry/opentelemetry-cpp/blob/07f6cb54ece56691dbd2a94b0cbeec722ff6a631/api/include/opentelemetry/logs/logger.h#L259

Feb 29 '24 22:02 lalitb

When is a Logger from the bridge API disabled? I proposed adding the ability to disable tracer / meter / logger via SDK configuration in #3877, and extending that to APIs on tracer / meter / logger is a natural addition. It would also be natural to add a logger level severity threshold. But as of today, I don't believe there is any mechanism for a logger to be disabled, or for a severity threshold to be specified.

The idea is that these options are controlled in the log framework being bridged, and that only logs which are enabled make it to the point where the bridge API is called.

Feb 29 '24 22:02 jack-berg

The idea is that these options are controlled in the log framework being bridged, and that only logs which are enabled make it to the point where the bridge API is called.

Many (maybe all?) of the Go logging libraries take the other approach. They assume the downstream handlers will tell them if a log will actually be used or not. Having the bridge API respond appropriately is going to be critical here.

I still need to look at the links @lalitb posted, but I'm guessing the same is true in other language based on their addition of this.

Feb 29 '24 22:02 MrAlias

When is a Logger from the bridge API disabled? I proposed adding the ability to disable tracer / meter / logger via SDK configuration in #3877, and extending that to APIs on tracer / meter / logger is a natural addition. It would also be natural to add a logger level severity threshold. But as of today, I don't believe there is any mechanism for a logger to be disabled, or for a severity threshold to be specified.

Yeah, that is an interesting point. We don't really have any threshold configuration downstream in the SDK or even in the exporters. In fact, we don't seem to have any definition of the OTLP exporter at all (that I can find at least?).

Feb 29 '24 22:02 MrAlias

@jack-berg I think to your point we need to have a way for exporters to be configured for a severity threshold and pass that all the way back up to the API. Otherwise, we're setting users up with a situation where they are going to generate fire-hoses of logs data and have no relief valve.

Feb 29 '24 22:02 MrAlias

@jack-berg to the question of how do you disable logging, (outside of your proposal) the No-Op implementation should disable logging, right? Ideally users of an API can get informed about this backing prior to doing something computationally expensive.

Feb 29 '24 22:02 MrAlias

No other parameters will be allowed so as to minimize the complexity of this method (and keep it useful).

Certain implementations might want to have "Enabled(level, event_id)", and if they can implement it efficiently, we shouldn't stop them from doing so.

If a language provides this, the functionality will only be allowed to respond with a language idiomatic boolean (e.g. a built-in type or some other type representing one).

One related thinking - the boolean value MUST be true if the log is enabled. It can be true or false if the log is disabled (this gives room for memory optimization).

The returned value will be required to be static (passing the same arguments will always return the same value), meaning the response can be cached.

Would you elaborate a bit on this? This sounds like the backend cannot change the configuration on-the-fly, which seems to be an important feature (I think it is very common to turn on verbose log for a short period of time without having to restart the application).

Mar 01 '24 16:03 reyang

No other parameters will be allowed so as to minimize the complexity of this method (and keep it useful).

Certain implementations might want to have "Enabled(level, event_id)", and if they can implement it efficiently, we shouldn't stop them from doing so.

If a language provides this, the functionality will only be allowed to respond with a language idiomatic boolean (e.g. a built-in type or some other type representing one).

One related thinking - the boolean value MUST be true if the log is enabled. It can be true or false if the log is disabled (this gives room for memory optimization).

:+1:

Mar 01 '24 17:03 MrAlias

The returned value will be required to be static (passing the same arguments will always return the same value), meaning the response can be cached.

Would you elaborate a bit on this? This sounds like the backend cannot change the configuration on-the-fly, which seems to be an important feature (I think it is very common to turn on verbose log for a short period of time without having to restart the application).

Yeah, I've been thinking about this a bit as well. What we want is to support situations, like you say, where the value may change, but also situations where the bridge shouldn't waste extra time rechecking something that won't change.

Giving this some thought, I think what we should beak-up the information returned. We should require a bool return value signifying the enabled state, but also allow for an additional return value that states if the value can be cached or not. This second return value can be optional.

@reyang what do you think?

Mar 01 '24 17:03 MrAlias

@reyang updated the description. PTAL.

Mar 01 '24 17:03 MrAlias

Yeah, I've been thinking about this a bit as well. What we want is to support situations, like you say, where the value may change, but also situations where the bridge shouldn't waste extra time rechecking something that won't change.

Yeah. There are some tricks how to make it very efficient. For example, the check can be optimized to just testing a bit (or byte) in the RAM (so each check is just 1~3 CPU cycles), and the memory can be updated asynchronously by the control plane.

Mar 01 '24 20:03 reyang

We can remove the caching return value given the caching can live closer to where any dynamic behavior will live.

Mar 07 '24 18:03 MrAlias

We can remove the caching return value given the caching can live closer to where any dynamic behavior will live.

Updated.

Mar 07 '24 20:03 MrAlias

opentelemetry-specification opentelemetry-specification copied to clipboard

Add an `Enabled` method to the `Logger`

Proposal

Alternates

Provide a GetSeverityLevel function

Allow language to define their own approach

Additional context.

opentelemetry-specification
opentelemetry-specification copied to clipboard

Provide a `GetSeverityLevel` function