opentelemetry-python Record logger name as the instrumentation scope name

As spec'd here https://github.com/open-telemetry/opentelemetry-specification/pull/2359

Feb 26 '22 11:02 srikanthccv

A follow up question for this feature -

Currently LoggingHandler holds a single _logger object which uses a single scope name. Since the same handler may be used by multiple loggers with different names, it will require initiating new internal Logger objects on demand.

Is this a design change we are willing to make?

Oct 27 '22 10:10 ronyis

Correct, the broader agreement was to maintain a map of scope: logger and use them. It would be great if we could have benchmarking numbers and memory requirements.

Oct 27 '22 10:10 srikanthccv

Thanks @srikanthccv . Is there a reference for this decision? What kind of benchmarks do you mean?

Oct 27 '22 11:10 ronyis

I don't know if it's ever written somewhere in the issues, but there was a slack channel where this discussion happened. I can probably look it ups and link here.

Here are the benchmarks java did when it had to make this decision https://github.com/open-telemetry/opentelemetry-specification/pull/1236#issuecomment-984058622.

Oct 29 '22 07:10 srikanthccv

@tm0nk can you link the Java PR for this that you mentioned? I haven't been super involved in the Logging API/SDK but I'm a little surprised that a bridging API doesn't allow passing the instrumentation scope name directly in the LogRecord.

Would that be a possible implementation in our API instead of having to cache a Logger object that serve no purpose beyond holding the instrumentation scope? Any previous spec discussion?

Apr 11 '24 22:04 aabmass

@aabmass The analysis I mentioned for opentelemetry-java was done by Jack Berg, and he summarized it well in this comment: https://github.com/open-telemetry/opentelemetry-specification/pull/1236#issuecomment-984058622

This comment makes it sound like they chose the path of not using a cache at all.

However, the implementation of keeping one logger per instrumentation scope in the latest opentelemetry-java release (v1.37.0) does seem to be using a form of cache. It appears ComponentRegistry<SdkLogger> is being used for this purpose:

https://github.com/open-telemetry/opentelemetry-java/blob/v1.37.0/sdk/logs/src/main/java/io/opentelemetry/sdk/logs/SdkLoggerProvider.java#L63

Can you describe in more detail the approach you have in mind that implements this in a bridging API?

Apr 11 '24 23:04 tm0nk

Will bring this up during the SIG meeting.

Apr 23 '24 15:04 lzchen

@tm0nk I think this Java code is analogous to this issue. It looks like your PR, they just get a new logger for each log entry. I think it makes sense to have caching in the API to avoid creating a bunch of garbage. I guess I was just suprised there is no escape hatch for this kind of thing. We have MetricProducer in the Metrics SDK which allows emitting a batch of metrics including instrumentation scope directly.

Apr 26 '24 21:04 aabmass

Hi @lzchen I was planning to pick this issue up to add the requested cache tests and benchmarks to the initial PR from @tm0nk. I noticed that an attributes argument was recently added added to InstrumentationScope and thus get_logger. https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-sdk/src/opentelemetry/sdk/_logs/_internal/init.py#L655

attributes is an unhashable type (Mapping), so the original @lru_cache approach no longer works for caching. What is the expected behavior if a user has a Logger with the same instrumentation scope name but different attributes? Is it best to avoid caching entirely, or should we create a custom hash and caching logic that only uses some arguments (e.g. name, version, and schema_url).

Oct 01 '24 18:10 sfc-gh-jopel

opentelemetry-python opentelemetry-python copied to clipboard

Record logger name as the instrumentation scope name

opentelemetry-python
opentelemetry-python copied to clipboard