opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

log: Add design doc

Open pellared opened this issue 7 months ago • 7 comments

First part of https://github.com/open-telemetry/opentelemetry-go/issues/3827

Prototype: https://github.com/open-telemetry/opentelemetry-go/pull/4725 (use it for reference when reviewing).

pellared avatar Jan 02 '24 17:01 pellared

Changing to draft after getting some very constructive feedback from @MrAlias. Basically it is about making the design doc more readable for non-maintainers. E.g. calling out some design decisions more explicitly as are some of them look obvious just for me (or us - maintainers).

pellared avatar Jan 09 '24 18:01 pellared

From the SIG meeting today:

I am concerned about the sharing of resources potentially held in a sync.Pool across the logs bridge API into an SDK. This design for a logging library shim will help alleviate heap allocations, but opens a whole host of issues.

The enforcement of an SDK not holding a reference to the attribute data passed in a Record will only be done via documentation. There will not be any syntactic or programmatic enforcement. This will mean that all logging library shims that make this optimization will need to publish to their users a set of SDK that it is compatible with. Any distribution of OTel that does not abide by this documented reference model will not be compatible.

This concern motivates moving back to a getter/setter model for the Record type. For reference, @pellared pointed out the revert in the design would look like a revert of https://github.com/pellared/opentelemetry-go/pull/4. Please all, review this revert design and comment if you have objections to making this switch back.

MrAlias avatar Jan 11 '24 20:01 MrAlias

Please all, review this revert design and comment if you have objections to making this switch back.

I want to notice that even after reverting the API implementation SHOULD still NOT modify the log record and SHOULD still copy the record (Clone) in case of asynchronous processing. The user could still reuse the record passed to Emit after the method returns.

Reference: https://github.com/golang/example/blob/master/slog-handler-guide/README.md#copying-records

However, in case of improper API implementation, the benefit would be that the chances that a "dirty read / data race" of log records would be a lot less likely as bridge implementation would create a record, pass it and forget about it. Therefore this design would be safer (not encounter a data race) in most scenarios even when the API implementation would break the interface recommendations.

Another (IMO significant) benefit is that bridges and users would not have to create sync.Pool for attributes to reduce the number of heap allocation so it would be easier to use without sacrificing performance.

I totally support reverting https://github.com/pellared/opentelemetry-go/pull/4 as I think it is a safer, more user-friendly and has good-enough performance (trusting Go team and my benchmarks).

PS. I still think it was worth to explore the sync.Pool and I do not regret going back and forth. I appreciate your proactive feedback.

EDIT: Here is a "revert" PR (includes benchstat results): https://github.com/pellared/opentelemetry-go/pull/6

pellared avatar Jan 11 '24 20:01 pellared

@open-telemetry/go-approvers The design is ready for next (last?) review round. PTAL

pellared avatar Feb 02 '24 08:02 pellared

@jba Thanks a lot for your review 👍 Your (and others) feedback are welcome.

pellared avatar Feb 02 '24 20:02 pellared

Plan is to merge this tomorrow unless there is more comments/feedback.

MrAlias avatar Feb 08 '24 18:02 MrAlias

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (cfaf1f0) 82.6% compared to head (f72a6ed) 82.6%.

Additional details and impacted files

Impacted file tree graph

@@          Coverage Diff          @@
##            main   #4809   +/-   ##
=====================================
  Coverage   82.6%   82.6%           
=====================================
  Files        232     232           
  Lines      18870   18870           
=====================================
+ Hits       15603   15605    +2     
+ Misses      2977    2975    -2     
  Partials     290     290           

see 1 file with indirect coverage changes

codecov[bot] avatar Feb 08 '24 18:02 codecov[bot]

@peterbourgon @MrAlias @jack-berg @dashpole @carrbs @MadVikingGod @tigrannajaryan @jba thank you a lot for your help.

pellared avatar Feb 09 '24 07:02 pellared