
Cumulative data points should set `StartTimeUnixNano` per timeseries

Open ArthurSens opened this issue 1 year ago • 15 comments

Current behavior

As mentioned in the Metrics Data Model, the goal of StartTimeUnixNano is to help measure the increase rate of unbroken sequences of data points and to identify resets and gaps.

The current implementation of most OTel SDKs sets StartTimeUnixNano to the timestamp at which the instrumented process started. This works well when all data points start being pushed/exposed right at the beginning of the process lifetime, but not so well when new time series appear some time after the process starts.
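To make the current behavior concrete, here is a minimal sketch (simplified pseudo-SDK code, not any real SDK's API) of a cumulative exporter stamping every data point with the process start time, regardless of when the series first appeared:

```go
package main

import (
	"fmt"
	"time"
)

// processStart is captured once at process startup; in most SDKs today every
// cumulative data point reuses this single value as StartTimeUnixNano.
var processStart = time.Now()

// dataPoint is a simplified stand-in for an OTLP NumberDataPoint.
type dataPoint struct {
	attrs             string
	startTimeUnixNano int64
	timeUnixNano      int64
	value             int64
}

// export stamps a point the way current SDKs typically do: the start time is
// the process start, not the time this particular series was first observed.
func export(attrs string, value int64) dataPoint {
	return dataPoint{
		attrs:             attrs,
		startTimeUnixNano: processStart.UnixNano(),
		timeUnixNano:      time.Now().UnixNano(),
		value:             value,
	}
}

func main() {
	// A series that only comes into existence 30s into the process lifetime
	// still gets the process start as its StartTimeUnixNano.
	p := export(`code="404"`, 30)
	fmt.Println(p.startTimeUnixNano == processStart.UnixNano()) // true
}
```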

Problem statement

Let me try to explain with an example:

Let's say an application starts at time T = 0. Some HTTP requests are happening and they are all successful (i.e. status code 200). After 30s (T+30), the first request with status code 404 happens, and it continues to happen every 1s (therefore 1 request per second is the increase rate).

The increase rate can be measured with a formula like this:

$rate = \frac{\text{Cumulative Value}}{\text{Current Time} - \text{StartTimeUnixNano}}$

1 minute after T, HTTP requests with status code 404 would have happened 30 times, but let's see how different the measurement would be if we use T or T+30 as the start time.

  • If StartTimeUnixNano = T, $rate = \frac{30} {60 - 0} = 0.5$ req/sec
  • If StartTimeUnixNano = T+30, $rate = \frac{30} {60-30} = 1$ req/sec
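The same arithmetic in a tiny Go snippet, just to make the two results easy to reproduce:

```go
package main

import "fmt"

// rate computes the average per-second increase implied by a cumulative value
// and a start time, mirroring the formula above.
func rate(cumulative, currentTime, startTime float64) float64 {
	return cumulative / (currentTime - startTime)
}

func main() {
	fmt.Println(rate(30, 60, 0))  // 0.5 req/sec when StartTimeUnixNano = T
	fmt.Println(rate(30, 60, 30)) // 1.0 req/sec when StartTimeUnixNano = T+30
}
```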

As mentioned in Current Behavior, the measurement works well when the series is initialized alongside the process, but not so when the series is initialized after.

Requested change

The requested change is that StartTimeUnixNano is set separately per time series.

But of course, this comes with performance drawbacks. For that to be possible, SDKs will need to keep in memory some mapping from time series ID to StartTimeUnixNano for every time series ever created. This can be a huge drawback for processes wishing to expose high-cardinality metrics.
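As a rough illustration of what that bookkeeping looks like (and where the memory cost comes from), here is a hypothetical per-series start-time tracker; the names and structure are made up for this sketch and are not an actual SDK API:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// seriesStartTracker records the first time each series (identified by its
// attribute set) is observed, so an exporter could stamp StartTimeUnixNano
// per time series instead of per process.
type seriesStartTracker struct {
	mu     sync.Mutex
	starts map[string]int64 // series identity -> StartTimeUnixNano
}

func newSeriesStartTracker() *seriesStartTracker {
	return &seriesStartTracker{starts: make(map[string]int64)}
}

// startTimeFor returns the stored start time for a series, creating one on
// first sight. The map grows with series cardinality, which is the memory
// cost mentioned above.
func (t *seriesStartTracker) startTimeFor(seriesID string) int64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	if ts, ok := t.starts[seriesID]; ok {
		return ts
	}
	ts := time.Now().UnixNano()
	t.starts[seriesID] = ts
	return ts
}

func main() {
	t := newSeriesStartTracker()
	fmt.Println(t.startTimeFor(`http.status_code="404"`)) // first sight: records "now"
	fmt.Println(t.startTimeFor(`http.status_code="404"`)) // later: same start time returned
}
```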

I believe the appropriate approach is to allow users to configure the SDK, where they can opt-in to the current behavior or to the behavior I'm suggesting here.

Additional context

The discussion comes from an issue in OpenTelemetry-Collector-Contrib, where we're implementing support for Prometheus/OpenMetrics's created timestamp: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/32521

ArthurSens avatar Aug 13 '24 18:08 ArthurSens

This is important for compatibility between Prometheus and OpenTelemetry going forward. @ArthurSens it might be helpful for folks here to describe what issues this causes when pushing OTLP to Prometheus today.

dashpole avatar Aug 14 '24 15:08 dashpole

See also https://github.com/open-telemetry/opentelemetry-specification/issues/2232 CC/ @zeitlinger

jmacd avatar Aug 21 '24 15:08 jmacd

Is this really a specification issue? I read the spec as saying that implementations should set StartTime to match when the unbroken series of data points started.

To me this is an implementation issue.

bboreham avatar Sep 04 '24 17:09 bboreham

@bboreham I believe it is a spec issue (my initial analysis of the spec is here). The spec says:

The StartTimeUnixNano of each point matches the StartTimeUnixNano of the initial observation.

Which just means that the start time is repeated, not that it matches when the series of data points started.

dashpole avatar Sep 04 '24 17:09 dashpole

I am referring to this text in the spec:

The second, optional timestamp is used to indicate when a sequence of points is unbroken, and is referred to as StartTimeUnixNano.

EDIT:

After discussion with @jesusvazquez, I wish to point to this part:

When StartTimeUnixNano is less than TimeUnixNano, a new unbroken sequence of observations begins with a "true" reset at a known start time.

I agree the spec could be more explicit, but using a time hours or days earlier as the "known start time" seems perverse to me.

bboreham avatar Sep 05 '24 09:09 bboreham

I came here to create the same issue and found this one. Totally agree; the created timestamp basically doesn't work as expected because of this logic. Prometheus ingests the zero sample for the first time series, but if the same metric appears later with a different attribute, it carries the same start time as the metric's other time series, so zero ingestion does not work on the Prometheus side.

metinogurlu avatar Mar 14 '25 07:03 metinogurlu

Could you clarify what you mean?

austinlparker avatar Mar 18 '25 20:03 austinlparker

Sure, the thing is that StartTimeUnixNano indicates the starting time of the metric, not the time series. Right now, there's no way to know the starting time of each individual time series.

  • T1 - application started
  • T2 - metric is instrumented; the first counter increase happens at 2025-05-12T07:47:04 for http_request_count{path:"/"} - StartTimeUnixNano is 2025-05-12T07:47:04
  • T3 - a counter increase with another attribute/label value happens at 2025-05-12T11:00:00 for http_request_count{path:"/profile"} - StartTimeUnixNano is still 2025-05-12T07:47:04

What is the outcome?

In Prometheus, counters don't automatically begin at zero for time series that appear after startup. Instead, the first scrape captures whatever the counter's current value is, often the total increase since instrumentation began (e.g., over the first 15s scrape interval). As a result, Prometheus's built-in rate()-based functions effectively "skip" this initial jump, since they are designed to detect changes between data points. This makes counters appear to have increased less than they actually did.
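A rough numeric sketch of that under-count (just arithmetic over made-up scrape values, not Prometheus's actual implementation):

```go
package main

import "fmt"

func main() {
	// Hypothetical scraped values for a counter series that first appears
	// mid-process: by the first scrape it has already reached 12.
	samples := []float64{12, 14, 16, 18}

	// Without a zero sample at the series' true start time, a
	// difference-based calculation only sees the changes between the
	// observed samples...
	seen := samples[len(samples)-1] - samples[0] // 6

	// ...while the actual increase since the series started is the full
	// cumulative value.
	actual := samples[len(samples)-1] // 18

	fmt.Println(seen, actual) // 6 18
}
```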

This behavior is recognized by the Prometheus project, and a workaround exists behind the --enable-feature=created-timestamp-zero-ingestion flag. However, this logic depends on a correct StartTimeUnixNano value. Prometheus detects the initial instrumentation of a metric and adds a 0 sample before its first value, but currently this works as expected only for the first time series of a metric. Since StartTimeUnixNano remains the same for subsequent time series of the same metric, zero ingestion is considered already done for that metric and is skipped.

Here is an example of the side effect of this.

metinogurlu avatar May 12 '25 08:05 metinogurlu

I think it is related to:

  • https://github.com/open-telemetry/opentelemetry-specification/issues/3576

pellared avatar May 12 '25 11:05 pellared

@ArthurSens do you have any updates on this issue? Is this something on the Prometheus Interop SIG's radar?

danielgblanco avatar May 19 '25 09:05 danielgblanco

Hey Daniel, yes, it is still on our radar, but it's lower priority than updating the spec with the new features we got on Prometheus' side.

I'll probably come back here once we're done updating the spec with UTF-8, Native Histograms, and more flexible suffix additions.

ArthurSens avatar May 19 '25 18:05 ArthurSens

Thanks @ArthurSens! I'll mark this as sig-issue then, which effectively marks it as "triaged" from the spec perspective. Do you have a list/board for these so it doesn't get lost?

danielgblanco avatar May 22 '25 16:05 danielgblanco

We do not, but I was talking with @ywwg today exactly about this! We'll come up with something :)

Where would be the best place to create this board? We don't have many permissions in the OTel organization, so I'm not sure where it should be created

ArthurSens avatar May 22 '25 17:05 ArthurSens

@danielgblanco, we updated this issue to look like a board: https://github.com/open-telemetry/opentelemetry-specification/issues/4494

ArthurSens avatar May 22 '25 18:05 ArthurSens

Where would be the best place to create this board? We don't have many permissions in the OTel organization

hmmm any OTel member should be able to create a project, and then maintainers can "link" the project to any repo they maintain (see the community repo for linked projects for example). We do ask that projects have public visibility so that non-org members can see it.

danielgblanco avatar May 23 '25 12:05 danielgblanco