semantic-conventions icon indicating copy to clipboard operation
semantic-conventions copied to clipboard

Consider merging `*.cpu.state` metric attributes

Open ChrsMark opened this issue 1 year ago • 7 comments
trafficstars

Area(s)

area:container area:system area:process

Is your change request related to a problem? Please describe.

At the moment we have 3 different metric attributes representing the cpu.state:

Based on the global flat registry idea it would be nice if we had one single definition for this attribute and then be able to re-use it in the various namespaces.

Describe the solution you'd like

As it is explained at https://github.com/open-telemetry/semantic-conventions/pull/681#discussion_r1511310673 we could consider merging into one single one called cpu.state similar to the disk.io.direction and network.io.direction.

Describe alternatives you've considered

Maybe the embed concept would be sth to think of here: https://github.com/open-telemetry/build-tools/issues/240

Additional context

No response

cc @open-telemetry/semconv-system-approvers @open-telemetry/semconv-container-approvers @open-telemetry/semconv-k8s-approvers

ChrsMark avatar Mar 27 '24 09:03 ChrsMark

So to summarize the situation here, we have the following:

System Metrics

Values for system.cpu.state:

  • user
  • system
  • nice
  • idle
  • iowait
  • interrupt
  • steal

Process Metrics

Values for process.cpu.state:

  • user
  • system
  • wait

Container Metrics

Values for container.cpu.state:

  • user
  • system
  • kernel

Based on @braydonk's comment we can change the wait value of the process.cpu.state to idle and hence the process.cpu.state will be a subset of the system.cpu.state. Then we can use the superset for both.

Then the only difference with the container.cpu.state is the kernel value. Checking the dockerstatsreceiver implementation, this value comes directly from the Docker API. For now I guess we should use it as is unless we find way to properly correlate it with rest of set's values. @jamesmoessis feel free to comment here if you have more information.

My first proposal would be to unify the attribute to cpu.state with possible values being the following [user, system, nice, idle, iowait, interrupt, steal, kernel].

What do you @open-telemetry/semconv-system-approvers think about this?

ChrsMark avatar Apr 04 '24 10:04 ChrsMark

I like the proposal, it simplifies things.

@lmolkova Can we limit the valid values of an attribute when used in a concrete context to a subset of the overall allowed values through the tooling?

AlexanderWert avatar Apr 05 '24 07:04 AlexanderWert

@ChrsMark : Primarily for system metrics, we consider the states mentioned here. IThere's an additional state softirq, which I didn't find in the code you referenced. Just a thought, if we want to consider that as another member of the state as well.

ishleenk17 avatar Apr 08 '24 12:04 ishleenk17

Removing the triage label, as this sounds like a good plan. Unifying them makes sense to me.

joaopgrassi avatar Apr 17 '24 15:04 joaopgrassi

@lmolkova any thoughts on https://github.com/open-telemetry/semantic-conventions/issues/840#issuecomment-2039169682?

ChrsMark avatar Apr 18 '24 08:04 ChrsMark

I am in favor of this.

IThere's an additional state softirq, which I didn't find in the code you referenced.

@ishleenk17 I think this is Linux-specific, so it should go into its own separate metric like we did on open-telemetry/build-tools#323

mx-psi avatar Apr 26 '24 09:04 mx-psi

@ChrsMark it's not possible yet but we are discussinga few weeks ago this option, one similar reference would be https://github.com/open-telemetry/weaver/issues/479 to exend enum from some base one, but option to use specific fields will be most probably also implemented

trisch-me avatar May 10 '24 14:05 trisch-me