
REUSABLE_DATA memory mode creates duplicate aggregator handles in DELTA temporality

Open lenin-jaganathan opened this issue 3 months ago • 1 comment

https://github.com/open-telemetry/opentelemetry-java/blob/e6b9462f24410ae9d5b1a3717624122c7709640c/sdk/metrics/src/main/java/io/opentelemetry/sdk/metrics/internal/state/DefaultSynchronousMetricStorage.java#L219-L323

In DefaultSynchronousMetricStorage, when using MemoryMode.REUSABLE_DATA with AggregationTemporality.DELTA, the implementation maintains two aggregator handles per unique attribute set due to a ping-pong pattern between collection intervals.

During collection, a new aggregatorHolder is created and seeded with the handles from previousCollectionAggregatorHandles. A handle created for the first time in the current interval is not yet present in previousCollectionAggregatorHandles, so the next collection cycle allocates a second handle for the same attribute set. Both objects then coexist for the lifetime of the application, alternating between the two holders and increasing the storage's memory footprint. For instruments with many attribute combinations, this becomes a significant memory overhead.
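To make the ping-pong concrete, here is a minimal, self-contained sketch (plain Java, not the SDK code: `Handle` stands in for `AggregatorHandle`, attribute sets are plain strings, and value resetting and concurrency are omitted). It shows how a handle first created in one interval ends up duplicated in the other map:

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal model of the ping-pong reuse pattern; not the SDK implementation. */
class PingPongStorageSketch {
  static final class Handle {
    long sum;
  }

  // Handles currently written to by record().
  private Map<String, Handle> activeHandles = new HashMap<>();
  // Handles from the previous collection, kept around so they can be reused.
  private Map<String, Handle> previousCollectionHandles = new HashMap<>();

  void record(String attributes, long value) {
    activeHandles.computeIfAbsent(attributes, k -> new Handle()).sum += value;
  }

  Map<String, Handle> collect() {
    Map<String, Handle> collected = activeHandles;
    // The next interval reuses the handles from two collections ago. A handle first
    // created in the interval just collected is not in that map, so record() will
    // allocate a second Handle for the same attribute set during the next interval.
    activeHandles = previousCollectionHandles;
    previousCollectionHandles = collected;
    return collected;
  }

  public static void main(String[] args) {
    PingPongStorageSketch storage = new PingPongStorageSketch();
    storage.record("service=a", 1);
    Handle first = storage.collect().get("service=a");
    storage.record("service=a", 1); // lands in the other map -> second allocation
    Handle second = storage.collect().get("service=a");
    // Both handles now alternate between the two maps for the life of the process.
    System.out.println("duplicate handles: " + (first != second)); // prints true
  }
}
```

With N distinct attribute sets, the storage therefore settles at roughly 2×N handles instead of N.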

One approach I can think of is to hold the write lock for a few more instructions in the DELTA + REUSABLE_DATA case and reuse the existing handle, only resetting its values. We could also improve the attribute-reset behaviour (cardinality control) in this mode so that only attribute sets not recorded during the last interval are removed.
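A rough sketch of that reset-and-reuse idea, only to illustrate its shape (simplified stand-in types, not a concrete patch for DefaultSynchronousMetricStorage; how the write lock is actually scoped and how pruning interacts with cardinality limits would need real design work):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative reset-in-place storage: at most one handle per attribute set. */
class ResetInPlaceStorageSketch {
  static final class Handle {
    final LongAdder sum = new LongAdder();
    volatile boolean recorded; // set when the current interval touches this handle

    long collectAndReset() {
      recorded = false;
      return sum.sumThenReset();
    }
  }

  private final ConcurrentHashMap<String, Handle> handles = new ConcurrentHashMap<>();
  // record() takes the read lock, collect() the write lock, so resets don't race writes.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void record(String attributes, long value) {
    lock.readLock().lock();
    try {
      Handle handle = handles.computeIfAbsent(attributes, k -> new Handle());
      handle.sum.add(value);
      handle.recorded = true;
    } finally {
      lock.readLock().unlock();
    }
  }

  Map<String, Long> collect() {
    Map<String, Long> deltaPoints = new HashMap<>();
    lock.writeLock().lock(); // held a few instructions longer than a plain swap
    try {
      Iterator<Map.Entry<String, Handle>> it = handles.entrySet().iterator();
      while (it.hasNext()) {
        Map.Entry<String, Handle> entry = it.next();
        if (entry.getValue().recorded) {
          deltaPoints.put(entry.getKey(), entry.getValue().collectAndReset());
        } else {
          it.remove(); // prune only attribute sets idle during the last interval
        }
      }
    } finally {
      lock.writeLock().unlock();
    }
    return deltaPoints;
  }
}
```

The obvious trade-off is that resetting and pruning happen inside the write lock, lengthening every collection's critical section, so any footprint gain would have to be weighed against record/collect latency.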

lenin-jaganathan · Oct 07 '25 12:10

DefaultSynchronousMetricStorage tries to minimize locking and memory allocation while also maintaining a small memory footprint. A simple locking strategy isn't going to cut it, but I'm open to suggestions on how to improve. However, given the complexity of the code and the importance of performance in this area:

1. reviews would likely be slow and careful
2. a lot of consideration would need to be given to benchmarks, with potential new benchmarks written if the current ones are insufficient
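On the benchmarking point, a starting sketch might look like the following (assumptions: the class name and cardinality are made up for illustration; InMemoryMetricReader.createDelta() selects delta temporality, but explicitly forcing REUSABLE_DATA memory mode may need a reader or builder option in the version under test, and the metrics SDK's existing JMH benchmarks may already cover parts of this path):

```java
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.metrics.LongCounter;
import io.opentelemetry.api.metrics.Meter;
import io.opentelemetry.sdk.metrics.SdkMeterProvider;
import io.opentelemetry.sdk.testing.exporter.InMemoryMetricReader;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.TearDown;

@State(Scope.Benchmark)
public class DeltaHandleReuseBenchmark {
  private static final int CARDINALITY = 2_000;

  private InMemoryMetricReader reader;
  private SdkMeterProvider meterProvider;
  private LongCounter counter;
  private Attributes[] attributesList;

  @Setup
  public void setup() {
    // ASSUMPTION: createDelta() gives delta temporality; explicitly selecting
    // REUSABLE_DATA memory mode may require a reader option to be verified.
    reader = InMemoryMetricReader.createDelta();
    meterProvider = SdkMeterProvider.builder().registerMetricReader(reader).build();
    Meter meter = meterProvider.get("benchmark");
    counter = meter.counterBuilder("requests").build();
    attributesList = new Attributes[CARDINALITY];
    for (int i = 0; i < CARDINALITY; i++) {
      attributesList[i] = Attributes.of(AttributeKey.stringKey("series"), "series-" + i);
    }
  }

  @Benchmark
  public void recordThenCollect() {
    for (Attributes attributes : attributesList) {
      counter.add(1, attributes);
    }
    // Collecting every iteration exercises the handle swap / reuse path in question.
    reader.collectAllMetrics();
  }

  @TearDown
  public void tearDown() {
    meterProvider.shutdown();
  }
}
```

Running it with JMH's GC profiler (-prof gc) would surface allocation-rate differences between the current handle ping-pong and a reset-in-place variant; retained footprint would still need a heap dump or a dedicated measurement.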

jack-berg · Oct 23 '25 14:10