add k8s.container.status.state and k8s.container.status.reason metrics

Open povilasv opened this issue 10 months ago • 7 comments

Fixes https://github.com/open-telemetry/semantic-conventions/issues/1672

Changes

Adds the k8s.container.status.state metric, which would allow us to alert on and monitor containers that are not in a ready state.

I'm still not sure if this should be multiple different metrics or a single one :thinking:

The current problems with a single metric:

  • Running state doesn't have a reason, so we would have to set reason to an empty value.
  • Waiting state and Terminated state have different sets of reasons:

Waiting state - "ContainerCreating", "CrashLoopBackOff", "CreateContainerConfigError", "ErrImagePull", "ImagePullBackOff"

Terminated state - "OOMKilled", "Completed", "Error", "ContainerCannotRun"
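To make the single-metric problem concrete, here is a rough sketch of how the attributes for one data point might be validated. The attribute keys "state" and "reason" and the helper name single_metric_attributes are assumptions for illustration, not something this PR defines, and the reason sets are only the values listed above.

```python
from typing import Dict, Optional

# Reason values taken from the lists above; the real sets may be larger.
WAITING_REASONS = {
    "ContainerCreating",
    "CrashLoopBackOff",
    "CreateContainerConfigError",
    "ErrImagePull",
    "ImagePullBackOff",
}
TERMINATED_REASONS = {"OOMKilled", "Completed", "Error", "ContainerCannotRun"}

# Each state accepts a different set of reasons; running accepts none.
ALLOWED_REASONS = {
    "running": set(),
    "waiting": WAITING_REASONS,
    "terminated": TERMINATED_REASONS,
}


def single_metric_attributes(state: str, reason: Optional[str] = None) -> Dict[str, str]:
    """Build attributes for one data point of a single combined metric.

    Hypothetical helper: the attribute keys are placeholders.
    """
    if state not in ALLOWED_REASONS:
        raise ValueError(f"unknown state: {state!r}")
    if state == "running":
        if reason:
            raise ValueError("running state has no reason")
        # The awkward part: the reason attribute must be present but empty.
        return {"state": state, "reason": ""}
    if reason not in ALLOWED_REASONS[state]:
        raise ValueError(f"reason {reason!r} is not valid for state {state!r}")
    return {"state": state, "reason": reason}
```

The sketch shows both issues at once: running needs a placeholder empty reason, and the reason enum cannot be described as one flat list because its valid values depend on the state.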

An alternative approach would be to do what KSM (kube-state-metrics) does:

  • k8s.container.status.state metric without reason attribute.
  • k8s.container.status.waiting_reason metric for waiting reason enum.
  • k8s.container.status.terminated_reason metric for terminated reason enum.
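For comparison, a minimal sketch of the split approach. The metric names come from the list above; the attribute keys, the value 1, and the helper name split_metric_points are assumptions for illustration. As a simplification, it only emits a point for the active state and reason, which may differ from how KSM reports every enum value.

```python
from typing import Dict, List, Optional, Tuple

# (metric name, attributes, value)
Point = Tuple[str, Dict[str, str], int]


def split_metric_points(state: str, reason: Optional[str] = None) -> List[Point]:
    """Emit data points for the three-metric layout (hypothetical helper)."""
    # The state metric never carries a reason attribute.
    points: List[Point] = [("k8s.container.status.state", {"state": state}, 1)]
    if state == "waiting" and reason:
        points.append(("k8s.container.status.waiting_reason", {"reason": reason}, 1))
    elif state == "terminated" and reason:
        points.append(("k8s.container.status.terminated_reason", {"reason": reason}, 1))
    # Running produces no reason point at all, so no empty attribute is needed.
    return points
```

With this layout each reason metric can declare its own enum, at the cost of three metrics instead of one.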

This is not intended to be merged as-is; I would appreciate any feedback on which approach we want to take here.

Merge requirement checklist

povilasv, Jan 22 '25 07:01