opentelemetry-rust icon indicating copy to clipboard operation
opentelemetry-rust copied to clipboard

Memory optimization for metric datapoints

Open cijothomas opened this issue 1 year ago • 8 comments

Metric datapoints currently store the starttime and endtime of the data point.

Since starttime and endtime will be same for each and every datapoint within a metric, it can be moved to the Sum struct itself, instead of repeating it for each datapoint. (Similar for Gauge/Histograms too).

cijothomas avatar Feb 22 '24 02:02 cijothomas

This will be a breaking change for folks authoring metric exporters, and must be done before SDK stable release.

cijothomas avatar Feb 22 '24 02:02 cijothomas

Is that the case ? Start time should be the first point in time something existed End time is usually when the datapoint was recorded.

And https://github.com/open-telemetry/opentelemetry-proto/blob/c5c8b28012583fda55b0cb16f73a820722171d49/opentelemetry/proto/metrics/v1/metrics.proto#L176

I would imagine starttime could be moved up, but that the time/endtime would need to stay under.

hdost avatar Feb 23 '24 03:02 hdost

Is that the case ?

Yes. Every datapoint/metricpoint(time-series) in the same metric has same start/end time for a collection.

cijothomas avatar Feb 23 '24 04:02 cijothomas

Is that the case ?

Yes. Every datapoint/metricpoint(time-series) in the same metric has same start/end time for a collection.

Start time isn't intended to be based on collection time, but observations time. If a datapoint was first observed for series A at T1 then it should be T1 until the process dies. If the last observation occurs at T3 then start should be T1 and "endtime" or really just "time" should be T3.

// StartTimeUnixNano in general allows detecting when a sequence of // observations is unbroken. This field indicates to consumers the // start time for points with cumulative and delta // AggregationTemporality, and it should be included whenever possible // to support correct rate calculation. Although it may be omitted // when the start time is truly unknown, setting StartTimeUnixNano is // strongly encouraged.

From the proto

hdost avatar Feb 23 '24 05:02 hdost

What does the proto tell? I am not able to follow your concern. Could you elaborate or send a unit test showing the issue?

cijothomas avatar Feb 23 '24 05:02 cijothomas

Perhaps for sum its not an issue since we are just adding all structures together, but if in a collection for a histogram you need to aggregate latencies. The observation time matters.

hdost avatar Feb 23 '24 05:02 hdost

Yes I can see your point, but the current implementation in this repo does not support that - it uses same start and end for all datapoints. Also, I don't think we ever record the precise time at which the measurement was reported (too much perf cost) - we only say it is between the start and end time. It is possible to go more precise, at the cost of more perf.

The data model has some pictures (Histograms and Sums), which is aligned with my proposal - the actual recordings occur somewhere between t1 and t2. https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#histogram

Anyway, I'll send a PR with the change.

cijothomas avatar Feb 23 '24 05:02 cijothomas

Just to be clear, start-time could be moved up, and end-time/aggregation-time would remain in data point - is it so?

https://github.com/open-telemetry/opentelemetry-proto/blob/9d139c87b52669a3e2825b835dd828b57a455a55/opentelemetry/proto/metrics/v1/metrics.proto#L156C1-L157C17

// Data points with the 0 value for TimeUnixNano SHOULD be rejected // by consumers.

lalitb avatar Feb 23 '24 08:02 lalitb