opentelemetry-specification
StartTime in Gauge is hard to understand/inconsistent
There are multiple places where we refer to the start-time in a gauge data model description:
- In the overview description of the gauge: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/datamodel.md#gauge. Here the start time is recommended to be set to the timestamp when a metric collection system started.
- In the gauge data model image. Here the start time is recommended to be "reset" after each report.
What I expect:
- [P0] Consistency between the two examples.
- [P1] I believe that we should ignore start time for Gauges at this moment.
The recommendation to use "the timestamp when a metric collection system started" was included as an option that would allow simultaneous producers of a Gauge to be detected by the system (i.e., that they are overlapping). Start time accomplishes this for all of the other points, so it was included for consistency. When a process restarts and there are briefly two producers of the same gauge, this allows consumers to tell which measurement is from a younger series.
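To make the overlap case concrete, here is a minimal sketch of how a consumer might use the start time to decide which of two overlapping gauge points belongs to the younger series. `GaugePoint` and `pick_younger` are hypothetical names used only for illustration (the field names are chosen to mirror OTLP naming); this is not part of any OTel SDK or backend.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GaugePoint:
    """Hypothetical, simplified stand-in for a Gauge data point."""
    start_time_unix_nano: Optional[int]  # optional, e.g. when the collection system started
    time_unix_nano: int                  # when the value was captured
    value: float


def pick_younger(a: GaugePoint, b: GaugePoint) -> Optional[GaugePoint]:
    """Return the point whose series started later, or None if that cannot be decided."""
    # Without a start time on both points there is nothing to compare.
    if a.start_time_unix_nano is None or b.start_time_unix_nano is None:
        return None
    # Equal start times suggest the same producer, so no series is "younger".
    if a.start_time_unix_nano == b.start_time_unix_nano:
        return None
    return a if a.start_time_unix_nano > b.start_time_unix_nano else b


# A process restarts: the old producer's last report overlaps the new producer's first.
old_point = GaugePoint(start_time_unix_nano=100, time_unix_nano=500, value=1.0)
new_point = GaugePoint(start_time_unix_nano=450, time_unix_nano=510, value=2.0)
younger = pick_younger(old_point, new_point)
```

If either point omits the start time, the sketch returns `None`, which reflects that the field is optional and the consumer then has no way to order the two producers.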
There is another way two timestamps might be used for gauges, where the start timestamp is the time the measurement was set (i.e., assuming a synchronous gauge instrument) and the "now" timestamp is the time the measurement was captured as the current one. Even if OTel adds a synchronous Gauge instrument, I do not prefer this interpretation.
These two ways of using the timestamps do not necessarily create confusion by co-existing. The StartTimestamp is always earlier in time, the Timestamp is always later in time. In both cases the metric had a value through Timestamp, and "when did this value take effect?" requires history to disambiguate, but since we do not have a synchronous Gauge the point is practically irrelevant.
I agree we should remove the default use of a start time for gauges from the image. I'm happy with the option to use the "timestamp when a metric collection system started", but if we think Gauge points shouldn't have the option of using two timestamps (I do not think so), then we should remove one of the timestamp fields.
I think the image is just wrong, so fixing the start time to match the specification is easy.
Regarding whether to ignore start_time for now, I'd suggest that most backends are already ignoring start time.
The real question is whether we find value in start_time on Gauge. While we don't have any users of it yet, we also don't have many (or any?) systems generating Gauges with start time prior to OTEL metric stability.
Personally, I'm fine keeping it and evaluating usage. Given start_time on gauge is (necessarily) optional (so we can interact with other metric systems), I don't think it's a hard ask either way to keep or remove. I suggest we re-evaluate Gauge start_time after metric stability and adoption to see if it's paying off / is in use.
Considering #2318, I think we should update the specification to state that start_timestamp is not used for Gauge data points. I also think we should add a statement that exporters MAY use a single timestamp for the entire set of exported data, which means that Gauge timestamps will be the time of collection.
IOW: We already state that all observations from a callback MUST use the same timestamp. I would like to add a statement that all observations from the SDK MAY use the same timestamp, and this pins down what the one timestamp of a Gauge data point means.
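A rough sketch of what "one timestamp for the whole collection" could look like, assuming a hypothetical `collect` helper and plain callables standing in for gauge callbacks (neither is a real OTel SDK API):

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


def collect(
    callbacks: Dict[str, Callable[[], float]],
    now_ns: Optional[int] = None,
) -> List[Tuple[str, int, float]]:
    """Observe every gauge and stamp all points with one collection timestamp."""
    # Read the clock once, so every point in this export shares the same timestamp.
    # now_ns is an assumption added here only to make the sketch easy to test.
    timestamp = time.time_ns() if now_ns is None else now_ns
    return [(name, timestamp, callback()) for name, callback in callbacks.items()]


points = collect({"queue.length": lambda: 7.0, "cpu.temperature": lambda: 61.5}, now_ns=42)
```

Under this reading, the single timestamp on each Gauge point means "the value as of this collection", and no start timestamp is needed to interpret it.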