Add system uptime metric
What are you trying to achieve?
I want to add a metric to the semantic conventions that will describe the system uptime. How about system.uptime?
Additional context.
Telegraf reports this as the uptime field of the system metric (in seconds).
Here's a related proposal on the hostmetrics receiver to add this metric: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/14130.
Is this a duplicate of https://github.com/open-telemetry/opentelemetry-specification/issues/1273?
Thanks ~~Dan~~ Tigran :facepalm:, didn't see this issue. It is closely related. It talks about process namespace and not system, but I think the discussion can be applied to system too. If I understand correctly, (at least from the perspective of this issue) it boils down to adding an attribute process.start_time and system.start_time.
In fact, I can see there's a process namespace for attributes, but I cannot see a system namespace for attributes - only an os namespace. Would the attribute become os.start_time then?
Also, when running the OpenTelemetry Collector with the hostmetrics receiver, I cannot see any attributes from the os. namespace being reported (this is, of course, out of scope for this issue and repository).
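To make the relation between a start-time attribute and an uptime value concrete, here is a small sketch: a start time is a fixed timestamp, while uptime is the difference between the current time and that timestamp. The function name is hypothetical, the timestamps are made up, and `system.start_time` is only the attribute name floated in this thread, not an existing convention.

```python
from datetime import datetime, timezone


def uptime_seconds(start_time: datetime, now: datetime) -> float:
    """Derive an uptime value in seconds from a start timestamp."""
    return (now - start_time).total_seconds()


# Illustrative values only: a host that booted 90 minutes before "now".
boot = datetime(2024, 6, 20, 8, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 6, 20, 9, 30, 0, tzinfo=timezone.utc)
print(uptime_seconds(boot, now))  # 5400.0
```

A start time only has to be reported once, whereas an uptime value changes with every collection interval.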
We discussed this briefly during today's SIG Spec call, let's see where the conversation in open-telemetry/opentelemetry-specification#1273 takes us.
I would support system.uptime as a metric that measures the uptime of the system. The process.uptime is a different concern.
On Linux, system.uptime would be read from /proc/uptime, and analogously from the equivalent source on other operating systems.
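As a rough sketch of the Linux case: `/proc/uptime` holds two space-separated numbers, the seconds since boot followed by the cumulative idle time. The helper names below are hypothetical and are not taken from Telegraf or the hostmetrics receiver.

```python
from pathlib import Path


def parse_proc_uptime(contents: str) -> float:
    """Return seconds since boot from the text of /proc/uptime."""
    # The first field is the uptime; the second (idle time summed
    # over all CPUs) is ignored here.
    fields = contents.split()
    if not fields:
        raise ValueError("empty /proc/uptime contents")
    return float(fields[0])


def read_system_uptime(path: str = "/proc/uptime") -> float:
    """Read the uptime in seconds from procfs (Linux only)."""
    return parse_proc_uptime(Path(path).read_text())
```

Keeping the parsing separate from the file read makes the logic testable on machines without procfs; other operating systems would need their own reader.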
I support both system.uptime and process.uptime semantic conventions.
These all make sense, but please pause for now, we are considering refactoring existing semantic conventions. Please come to ongoing discussions. See https://github.com/open-telemetry/opentelemetry-specification/issues/2753.
@jsuereth Can we transfer this to the semantic-conventions repository?
Q. Is there any plan to do it? I'm interested in it.
@kernelpanic77 offered to do it here https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31627#issuecomment-2133882411 :)
Looks like we have an agreement here. We just need someone to submit a PR.
Is there any way we can generalize this to be an "uptime" of any entity, not just of "system"? What if we make this an uptime metric with the Resource describing what it is about (e.g. Resource can have "host.name=foo" to indicate that it is an uptime of a host).
> Is there any way we can generalize this to be an "uptime" of any entity, not just of "system"? What if we make this an uptime metric with the Resource describing what it is about (e.g. Resource can have "host.name=foo" to indicate that it is an uptime of a host).
I suppose we could do it, I wonder what others think.
The uptime attribute name would not be namespaced, unlike system.uptime that is namespaced to system. Looking at the Attributes Registry, it doesn't look like we currently have any non-namespaced attributes in the semantic conventions. Is that true?
> Is there any way we can generalize this to be an "uptime" of any entity, not just of "system"?
I think this is part of a broader discussion taking place in https://github.com/open-telemetry/semantic-conventions/issues/1161 (system.uptime vs process.uptime vs container.uptime). The main benefits of using the metric without a namespace seem to be easier dashboard correlation and avoiding duplicated definitions. But it comes at the cost of requiring specific resource attributes on the corresponding metrics (not sure if this is possible in semconv); for example, the uptime metric would always have to be linked to one of host.name, process.pid or container.id.
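The cost mentioned above can be sketched as a check that a consumer of a generic `uptime` metric would need: without a namespace, the value is only meaningful if the resource says which entity it describes. This is a plain-Python illustration with hypothetical names, not an OpenTelemetry API.

```python
# Resource attributes named in this thread as possible anchors
# for a non-namespaced uptime metric.
IDENTIFYING_ATTRIBUTES = ("host.name", "process.pid", "container.id")


def describes_entity(resource: dict) -> bool:
    """True if the resource carries at least one identifying attribute."""
    return any(key in resource for key in IDENTIFYING_ATTRIBUTES)


print(describes_entity({"host.name": "foo"}))        # True
print(describes_entity({"service.name": "billing"}))  # False
```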
During the System Semantic Conventions SIG (20/06/2024) we agreed on keeping the metrics in namespaces (even if there are duplications) due to:
- The potential for minute differences between the meanings of seemingly identical metrics between the different contexts.
- The namespaces also semantically represent the reporting source, making query scenarios more clear (i.e. "I want all my operating system process metrics" or "I want all my jvm metrics" has a clear separation due to the metrics reported from each source all having their respective namespaces).
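The query scenario in the second point can be illustrated with a prefix filter over metric names. This is a minimal sketch with a hypothetical helper; the metric names are examples for illustration, and `process.uptime` is only a name proposed in this thread.

```python
def metrics_in_namespace(names, namespace):
    """Return the metric names that belong to the given namespace."""
    # Match on "namespace." so that "process" does not also match
    # a hypothetical "processor.*" namespace.
    prefix = namespace + "."
    return [name for name in names if name.startswith(prefix)]


reported = ["system.uptime", "process.uptime", "jvm.memory.used"]
print(metrics_in_namespace(reported, "process"))  # ['process.uptime']
```

With a single non-namespaced `uptime` metric, the same query would have to filter on resource attributes instead of the metric name.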