opentelemetry-specification
opentelemetry-specification copied to clipboard
Add an `Enabled` method to the `Logger`
Knowing if a LogRecord will be dropped prior to performing computationally expensive operations helps developers using the Logs Bridge API avoid wasted and expensive work. To facilitate this, a Logger should provide a function on a Logger that users can call to determine if a LogRecord will be recorded or dropped.
Proposal
- Add an
Enabledfunction to theLoggerin the Logs Bridge API.- This functionality will be a recommendation
- If there are more idiomatic ways for a language to achieve this, they are recommended to do that.
- If a language provides this, the functionality will accept at a minimum:
context(optional): the Context associated with theLogRecordthat will be logged.- This parameter is useful to determine if some tracing information will be included. Or, in languages with more expansive definitions of
context, other dynamic values included in a context (i.e. HTTP header information).
- This parameter is useful to determine if some tracing information will be included. Or, in languages with more expansive definitions of
severity number: The Severity Number of theLogRecordthat will be logged.- Other parameters that a particular language deems necessary and helpful to its users may be accepted as well.
- If a language provides this, the functionality will respond with the following
- A language idiomatic boolean (e.g. a built-in type or some other type representing one).
- This returned boolean will be
truewhen logging is enabled, but may betrueorfalse, when logging is disabled (this gives room for memory optimization or indeterminable state).
- This returned boolean will be
- A language idiomatic boolean (e.g. a built-in type or some other type representing one).
- This functionality will be a recommendation
Alternates
Provide a GetSeverityLevel function
- Add a
GetServerityLevelfunction to theLoggerin the Logs Bridge API.- Accepts no parameters
- Returns the minimum Severity Number a
Loggerwill record aLogRecordfor
This approach will lose a lot of easy value gained by including a context. Simple look-up of values within the context can often determine if a LogRecord will be dropped or not. This approach would miss that.
Allow language to define their own approach
Instead of prescribing something that all languages should do. Call out that languages may extend the Logger to provide some form of this functionality and that the specification will always leave this unspecified so as not to conflict with these implementations.
This approach is the opposite of having a specification driven project where all implementation look the same.
Additional context.
Multiple logging libraries in Go provide this optimization^1^3. If the Go SIG is going to be able to support these critical logging systems we need this functionality in the Logs Bridge API.
cc @open-telemetry/specs-logs-approvers @pellared
otel-rust added a similar method under feature flag - https://github.com/open-telemetry/opentelemetry-rust/pull/1147 otel-cpp also has a similar approach - https://github.com/open-telemetry/opentelemetry-cpp/blob/07f6cb54ece56691dbd2a94b0cbeec722ff6a631/api/include/opentelemetry/logs/logger.h#L259
When is a Logger from the bridge API disabled? I proposed adding the ability to disable tracer / meter / logger via SDK configuration in #3877, and extending that to APIs on tracer / meter / logger is a natural addition. It would also be natural to add a logger level severity threshold. But as of today, I don't believe there is any mechanism for a logger to be disabled, or for a severity threshold to be specified.
The idea is that these options are controlled in the log framework being bridged, and that only logs which are enabled make it to the point where the bridge API is called.
The idea is that these options are controlled in the log framework being bridged, and that only logs which are enabled make it to the point where the bridge API is called.
Many (maybe all?) of the Go logging libraries take the other approach. They assume the downstream handlers will tell them if a log will actually be used or not. Having the bridge API respond appropriately is going to be critical here.
I still need to look at the links @lalitb posted, but I'm guessing the same is true in other language based on their addition of this.
When is a
Loggerfrom the bridge API disabled? I proposed adding the ability to disable tracer / meter / logger via SDK configuration in #3877, and extending that to APIs on tracer / meter / logger is a natural addition. It would also be natural to add a logger level severity threshold. But as of today, I don't believe there is any mechanism for a logger to be disabled, or for a severity threshold to be specified.
Yeah, that is an interesting point. We don't really have any threshold configuration downstream in the SDK or even in the exporters. In fact, we don't seem to have any definition of the OTLP exporter at all (that I can find at least?).
@jack-berg I think to your point we need to have a way for exporters to be configured for a severity threshold and pass that all the way back up to the API. Otherwise, we're setting users up with a situation where they are going to generate fire-hoses of logs data and have no relief valve.
@jack-berg to the question of how do you disable logging, (outside of your proposal) the No-Op implementation should disable logging, right? Ideally users of an API can get informed about this backing prior to doing something computationally expensive.
No other parameters will be allowed so as to minimize the complexity of this method (and keep it useful).
Certain implementations might want to have "Enabled(level, event_id)", and if they can implement it efficiently, we shouldn't stop them from doing so.
If a language provides this, the functionality will only be allowed to respond with a language idiomatic boolean (e.g. a built-in type or some other type representing one).
One related thinking - the boolean value MUST be true if the log is enabled. It can be true or false if the log is disabled (this gives room for memory optimization).
The returned value will be required to be static (passing the same arguments will always return the same value), meaning the response can be cached.
Would you elaborate a bit on this? This sounds like the backend cannot change the configuration on-the-fly, which seems to be an important feature (I think it is very common to turn on verbose log for a short period of time without having to restart the application).
No other parameters will be allowed so as to minimize the complexity of this method (and keep it useful).
Certain implementations might want to have "Enabled(level, event_id)", and if they can implement it efficiently, we shouldn't stop them from doing so.
If a language provides this, the functionality will only be allowed to respond with a language idiomatic boolean (e.g. a built-in type or some other type representing one).
One related thinking - the boolean value MUST be true if the log is enabled. It can be true or false if the log is disabled (this gives room for memory optimization).
:+1:
The returned value will be required to be static (passing the same arguments will always return the same value), meaning the response can be cached.
Would you elaborate a bit on this? This sounds like the backend cannot change the configuration on-the-fly, which seems to be an important feature (I think it is very common to turn on verbose log for a short period of time without having to restart the application).
Yeah, I've been thinking about this a bit as well. What we want is to support situations, like you say, where the value may change, but also situations where the bridge shouldn't waste extra time rechecking something that won't change.
Giving this some thought, I think what we should beak-up the information returned. We should require a bool return value signifying the enabled state, but also allow for an additional return value that states if the value can be cached or not. This second return value can be optional.
@reyang what do you think?
@reyang updated the description. PTAL.
Yeah, I've been thinking about this a bit as well. What we want is to support situations, like you say, where the value may change, but also situations where the bridge shouldn't waste extra time rechecking something that won't change.
Yeah. There are some tricks how to make it very efficient. For example, the check can be optimized to just testing a bit (or byte) in the RAM (so each check is just 1~3 CPU cycles), and the memory can be updated asynchronously by the control plane.
We can remove the caching return value given the caching can live closer to where any dynamic behavior will live.
We can remove the caching return value given the caching can live closer to where any dynamic behavior will live.
Updated.