opentelemetry-demo
Prometheus out of order sample from remote write
Bug Report
Which version of the demo you are using? 1.10.0
Symptom
Prometheus logs indicate out-of-order sample errors and sometimes restarts.
What is the expected behavior?
No further log messages after the rule manager startup line below, and no Prometheus restarts:
ts=2024-06-22T13:59:41.447Z caller=manager.go:163 level=info component="rule manager" msg="Starting rule manager..."
What is the actual behavior?
After a few minutes, out-of-order sample errors appear in the Prometheus logs, and in some cases Prometheus restarts. For example:
ts=2024-06-22T14:06:42.763Z caller=write_handler.go:134 level=error component=web msg="Out of order sample from remote write" err="out of order sample" series="{__name__=\"target_info\", container_id=\"f2c9465e88d12e42c419403ef8aab2027b18337e74bf7d9610e9576420d2db10\", docker_cli_cobra_command_path=\"docker%20compose\", host_name=\"f2c9465e88d1\", job=\"cartservice\", telemetry_sdk_language=\"dotnet\", telemetry_sdk_name=\"opentelemetry\", telemetry_sdk_version=\"1.9.0\"}" timestamp=1719065202193
While cartservice / .NET appears to be the most common source, this is seen across other services and languages as well.
Reproduce
Download version 1.10.0 and run make start. Once the demo has started, follow the Prometheus container's logs (for example, docker logs -f <ID>) and wait a few minutes.
Additional Context
Consider enabling the out-of-order sample support available since Prometheus 2.39: https://github.com/prometheus/prometheus/pull/11075
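As a rough sketch of what that could look like, out-of-order ingestion is controlled by the out_of_order_time_window setting under storage.tsdb in the Prometheus configuration file. The 30m window below is an assumed value for illustration, not one tested against the demo, and I have not verified that it also prevents the restarts.

```yaml
# Fragment of the Prometheus configuration file.
# Accept samples that arrive up to 30 minutes older than the newest
# sample already stored for the same series. The 30m value is an
# assumption; the default of 0 disables out-of-order ingestion.
storage:
  tsdb:
    out_of_order_time_window: 30m
```

A longer window tolerates more delay from remote write senders; the right value for the demo would need to be confirmed by checking that the errors above no longer appear.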