opentelemetry-go icon indicating copy to clipboard operation
opentelemetry-go copied to clipboard

[PoC] sdk/log: Record is concurrent safe

Open pellared opened this issue 1 year ago • 3 comments

Only sdk/log module is changed.

Per https://github.com/open-telemetry/opentelemetry-go/pull/5469#issuecomment-2145588742

Pros:

  • compliant with the specification
  • the most similar design to span processors

Cons:

  • The worst performance (see benchmark results below), not possible to have zero-allocations
  • design different from slog
  • usage of WalkAttributes can lead to a deadlocks and panics
  • record cannot be copied because it stores a mutex. A user may forget to use Clone.

Benchmarks results compared to main:

benchstat old.txt new.txt
goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                    │    old.txt    │               new.txt                │
                                    │    sec/op     │    sec/op     vs base                │
BatchProcessorOnEmit-16                302.5n ±  7%   209.0n ±  6%  -30.91% (p=0.000 n=10)
Processor/Simple-16                    667.7n ± 12%   596.2n ±  4%  -10.72% (p=0.000 n=10)
Processor/Batch-16                    1235.5n ± 12%   787.0n ± 17%  -36.31% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16     645.4n ±  6%   824.2n ±  3%  +27.71% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     1154.5n ±  6%   964.5n ±  2%  -16.46% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16    646.8n ±  5%   896.2n ±  2%  +38.56% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16     1.067µ ±  2%   1.038µ ±  3%   -2.76% (p=0.006 n=10)
LoggerNewRecord/5_attributes-16        298.2n ± 12%   589.8n ±  5%  +97.77% (p=0.000 n=10)
LoggerNewRecord/10_attributes-16       1.453µ ±  6%   1.597µ ±  6%   +9.94% (p=0.007 n=10)
SetAddAttributes-16                    276.1n ± 12%   300.6n ±  9%   +8.89% (p=0.012 n=10)
SimpleProcessorOnEmit-16              176.35n ± 22%   12.47n ± 11%  -92.93% (p=0.000 n=10)
geomean                                584.2n         471.5n        -19.30%

                        │    old.txt    │               new.txt                │
                        │      B/s      │     B/s       vs base                │
BatchProcessorOnEmit-16   1337.0Mi ± 8%   146.0Mi ± 7%  -89.08% (p=0.000 n=10)

                                    │    old.txt     │                new.txt                 │
                                    │      B/op      │    B/op      vs base                   │
BatchProcessorOnEmit-16                 0.000 ± 0%      0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                     417.0 ± 0%      425.0 ± 0%    +1.92% (p=0.000 n=10)
Processor/Batch-16                      628.0 ± 3%      431.5 ± 0%   -31.29% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16      417.0 ± 0%      842.0 ± 0%  +101.92% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16       632.5 ± 2%      849.0 ± 0%   +34.23% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16     465.0 ± 0%      890.0 ± 0%   +91.40% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16      650.0 ± 2%      897.0 ± 0%   +38.00% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16           0.0 ± 0%      417.0 ± 0%         ? (p=0.000 n=10)
LoggerNewRecord/10_attributes-16        610.0 ± 0%     1027.0 ± 0%   +68.36% (p=0.000 n=10)
SetAddAttributes-16                     48.00 ± 0%      48.00 ± 0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16              416.000 ± 0%      8.000 ± 0%   -98.08% (p=0.000 n=10)
geomean                                            ²                ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │   old.txt    │                new.txt                │
                                    │  allocs/op   │ allocs/op   vs base                   │
BatchProcessorOnEmit-16               0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                   1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/Batch-16                    0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyTimestampSimple-16    1.000 ± 0%     3.000 ± 0%  +200.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     0.000 ± 0%     2.000 ± 0%         ? (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   2.000 ± 0%     4.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    1.000 ± 0%     3.000 ± 0%  +200.00% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16       0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
LoggerNewRecord/10_attributes-16      4.000 ± 0%     5.000 ± 0%   +25.00% (p=0.000 n=10)
SetAddAttributes-16                   1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
SimpleProcessorOnEmit-16              1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
geomean                                          ²               ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Benchmark results compared to #5469:

goos: linux
goarch: amd64
pkg: go.opentelemetry.io/otel/sdk/log
cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                                    │   5469.txt    │                new.txt                │
                                    │    sec/op     │    sec/op     vs base                 │
BatchProcessorOnEmit-16                260.1n ± 25%   209.0n ±  6%         ~ (p=0.050 n=10)
Processor/Simple-16                    905.4n ± 11%   596.2n ±  4%   -34.15% (p=0.000 n=10)
Processor/Batch-16                    1359.5n ± 28%   787.0n ± 17%   -42.11% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16     837.4n ±  3%   824.2n ±  3%         ~ (p=0.315 n=10)
Processor/ModifyTimestampBatch-16     1325.5n ±  2%   964.5n ±  2%   -27.24% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16    882.4n ±  2%   896.2n ±  2%    +1.57% (p=0.024 n=10)
Processor/ModifyAttributesBatch-16     1.346µ ±  6%   1.038µ ±  3%   -22.89% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16        290.9n ±  5%   589.8n ±  5%  +102.77% (p=0.000 n=10)
LoggerNewRecord/10_attributes-16       1.345µ ±  7%   1.597µ ±  6%   +18.77% (p=0.000 n=10)
SimpleProcessorOnEmit-16              176.75n ± 10%   12.47n ± 11%   -92.95% (p=0.000 n=10)
SetAddAttributes-16                                   300.6n ±  9%
geomean                                702.3n         471.5n         -29.77%

                        │   5469.txt    │            new.txt             │
                        │      B/s      │     B/s       vs base          │
BatchProcessorOnEmit-16   117.4Mi ± 33%   146.0Mi ± 7%  ~ (p=0.052 n=10)

                                    │    5469.txt    │                new.txt                │
                                    │      B/op      │    B/op      vs base                  │
BatchProcessorOnEmit-16                 0.000 ± 0%      0.000 ± 0%        ~ (p=1.000 n=10) ¹
Processor/Simple-16                     834.0 ± 0%      425.0 ± 0%  -49.04% (p=0.000 n=10)
Processor/Batch-16                     1057.0 ± 3%      431.5 ± 0%  -59.18% (p=0.000 n=10)
Processor/ModifyTimestampSimple-16      834.0 ± 0%      842.0 ± 0%   +0.96% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16      1058.0 ± 2%      849.0 ± 0%  -19.75% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16     882.0 ± 0%      890.0 ± 0%   +0.91% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16     1092.5 ± 1%      897.0 ± 0%  -17.89% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16           0.0 ± 0%      417.0 ± 0%        ? (p=0.000 n=10)
LoggerNewRecord/10_attributes-16        610.0 ± 0%     1027.0 ± 0%  +68.36% (p=0.000 n=10)
SimpleProcessorOnEmit-16              416.000 ± 0%      8.000 ± 0%  -98.08% (p=0.000 n=10)
SetAddAttributes-16                                     48.00 ± 0%
geomean                                            ²                ?                      ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                    │   5469.txt   │                new.txt                │
                                    │  allocs/op   │ allocs/op   vs base                   │
BatchProcessorOnEmit-16               0.000 ± 0%     0.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Simple-16                   2.000 ± 0%     2.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/Batch-16                    1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
Processor/ModifyTimestampSimple-16    2.000 ± 0%     3.000 ± 0%   +50.00% (p=0.000 n=10)
Processor/ModifyTimestampBatch-16     1.000 ± 0%     2.000 ± 0%  +100.00% (p=0.000 n=10)
Processor/ModifyAttributesSimple-16   3.000 ± 0%     4.000 ± 0%   +33.33% (p=0.000 n=10)
Processor/ModifyAttributesBatch-16    2.000 ± 0%     3.000 ± 0%   +50.00% (p=0.000 n=10)
LoggerNewRecord/5_attributes-16       0.000 ± 0%     1.000 ± 0%         ? (p=0.000 n=10)
LoggerNewRecord/10_attributes-16      4.000 ± 0%     5.000 ± 0%   +25.00% (p=0.000 n=10)
SimpleProcessorOnEmit-16              1.000 ± 0%     1.000 ± 0%         ~ (p=1.000 n=10) ¹
SetAddAttributes-16                                  1.000 ± 0%
geomean                                          ²               ?                       ²
¹ all samples are equal
² summaries must be >0 to compute geomean

pellared avatar Jun 04 '24 12:06 pellared

@MrAlias, PTAL

pellared avatar Jun 04 '24 12:06 pellared

Implementation PoC looks good.

The performance impact is going to be sever, as you mention. Ideally we don't go in this direction.

Another thing not included, the Record cannot be copied after it is first used now. This will cause issues with user code eventually.

MrAlias avatar Jun 04 '24 17:06 MrAlias

Another problem is that the usage of WalkAttributes can lead to a deadlock or even panic. E.g. if one to tries to use any of the record's method in the function passed to WalkAttributes it result in a panic (recursive lock).

pellared avatar Jun 05 '24 10:06 pellared