Add Pressure Stall Information (PSI) metrics (reopened #2996)

Open alpineQ opened this issue 2 weeks ago • 0 comments

Closes #2995

Changes

This PR adds support for Linux Pressure Stall Information (PSI) metrics to the system semantic conventions.

PSI is a Linux kernel feature (available since kernel 4.20) that identifies and quantifies resource contention by measuring the time impact that CPU, memory, and I/O resource crunches have on workloads.

New Metrics

  • system.linux.psi.pressure (Gauge): Measures resource pressure as a percentage of time that tasks were stalled over a time window (10s, 60s, or 300s)
  • system.linux.psi.total_time (Counter): Tracks the total cumulative stall time in microseconds since system boot
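To illustrate how the two metrics relate, here is a minimal sketch (not part of this PR) of how a consumer might derive a stall percentage over an arbitrary interval from two samples of the cumulative counter. The function name `stall_percent` is hypothetical, and the result is a plain average over the interval, so it will not match the kernel's own 10s/60s/300s running averages exactly.

```python
def stall_percent(total_us_start: int, total_us_end: int, interval_s: float) -> float:
    """Percentage of wall-clock time spent stalled between two counter samples.

    Assumes both samples are cumulative stall times in microseconds,
    as described for system.linux.psi.total_time.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stalled_us = total_us_end - total_us_start
    if stalled_us < 0:
        # The counter is cumulative since boot; a decrease suggests a reboot.
        raise ValueError("counter went backwards")
    return 100.0 * stalled_us / (interval_s * 1_000_000)


# 250 ms of stall time accumulated over a 10 s interval.
print(stall_percent(1_000_000, 1_250_000, 10.0))  # 2.5
```

Exporting the raw counter alongside the gauge lets backends compute rates over whatever window suits them, rather than being limited to the three kernel-provided windows.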

New Attributes

  • system.psi.resource: The resource type (cpu, memory, io)
  • system.psi.stall_type: The stall severity (some for partial stalls, full for complete stalls where all non-idle tasks are blocked)
  • system.psi.window: The time window for pressure calculation (10s, 60s, 300s)
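As a rough sketch of how these attributes could map onto the kernel interface, the snippet below parses text in the `/proc/pressure/<resource>` format (`some avg10=... avg60=... avg300=... total=...`) into data points. The names `PsiPoint` and `parse_psi` are hypothetical and do not come from the linked implementations; the sample text is made up for illustration.

```python
from typing import Iterator, NamedTuple, Union


class PsiPoint(NamedTuple):
    metric: str
    value: Union[int, float]
    attributes: dict


def parse_psi(resource: str, text: str) -> Iterator[PsiPoint]:
    """Yield one gauge point per window and one counter point per stall type."""
    for line in text.strip().splitlines():
        # Each line looks like: "some avg10=0.00 avg60=0.00 avg300=0.00 total=0"
        stall_type, *fields = line.split()
        values = dict(field.split("=", 1) for field in fields)
        base = {
            "system.psi.resource": resource,
            "system.psi.stall_type": stall_type,
        }
        for window in ("10s", "60s", "300s"):
            key = "avg" + window[:-1]  # "10s" -> "avg10"
            yield PsiPoint(
                "system.linux.psi.pressure",
                float(values[key]),
                {**base, "system.psi.window": window},
            )
        # The cumulative counter carries no window attribute.
        yield PsiPoint("system.linux.psi.total_time", int(values["total"]), base)


# Illustrative sample; on a real host this would be read from /proc/pressure/memory.
SAMPLE = """some avg10=1.50 avg60=0.75 avg300=0.10 total=123456
full avg10=0.00 avg60=0.00 avg300=0.00 total=789
"""

for point in parse_psi("memory", SAMPLE):
    print(point)
```

Each line of the file yields three gauge points (one per window) and one counter point, so `system.psi.window` applies only to `system.linux.psi.pressure` in this sketch.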

Use Cases

PSI metrics enable:

  • Sizing workloads to hardware or provisioning hardware according to workload demand
  • Detecting productivity losses caused by resource scarcity
  • Dynamic system management (load shedding, job migration, strategic pausing)
  • Maximizing hardware utilization without sacrificing workload health

References

Relevant issues and PRs

There are issues on this matter in:

  • https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/42779
  • https://github.com/open-telemetry/opentelemetry-go-contrib/issues/8082

I am proposing two PRs to address these issues:

  • https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/43823
  • https://github.com/open-telemetry/opentelemetry-go-contrib/pull/8083

[!IMPORTANT] Pull request acceptance is subject to the triage process as described in Issue and PR Triage Management. PRs that do not follow the guidance above may be automatically rejected and closed.

Merge requirement checklist

  • [x] CONTRIBUTING.md guidelines followed.
  • [x] Change log entry added, according to the guidelines in When to add a changelog entry.
    • If your PR does not need a change log, start the PR title with [chore]
  • [x] Links to the prototypes or existing instrumentations (when adding or changing conventions)

Reopened #2996

alpineQ, Nov 11 '25 15:11