opentelemetry-java-instrumentation
JMX Metrics with yaml: need the ability to filter negative values
For some metrics, such as jvm.cpu.time and jvm.cpu.recent_utilization, the JVM can return a negative value when the underlying data is not available, as documented in the OperatingSystemMXBean.getCpuLoad() javadoc.
This corner case is properly handled in the runtime-telemetry implementation for JVM metrics as seen here, but not for metrics that are defined in YAML, which prevents them from being 100% compliant with semconv for JVM metrics.
Also, the same issue is likely to happen with other systems for which metrics are captured through a yaml definition because returning a negative value is a common pattern to indicate something is not available.
This was discovered while working on https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/13392 where we try to capture JVM metrics that are compliant with semantic conventions for java runtime.
Support for this could look like this in yaml:
  - bean: java.lang:type=OperatingSystem
    prefix: jvm.
    mapping:
      # jvm.cpu.recent_utilization
      ProcessCpuLoad:
        metric: cpu.recent_utilization
        type: gauge
        unit: '1'
        negativeValues: false
        desc: Recent CPU utilization for the process as reported by the JVM.
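
To make the intended semantics concrete, here is a minimal sketch of what a setting like negativeValues: false could do at recording time. It is not the existing JMX metrics implementation: the class NegativeValueFilter, the method record, and the parameter allowNegative are hypothetical names used only for illustration, and the sketch assumes a negative sample is simply skipped rather than replaced by another value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleConsumer;
import java.util.function.DoubleSupplier;

/** Hypothetical helper, not part of the existing JMX metrics code. */
public final class NegativeValueFilter {

  private NegativeValueFilter() {}

  /**
   * Reads one sample and forwards it to the recorder, unless negative values
   * are disallowed and the sample is negative.
   *
   * @param attribute reads the raw MBean attribute value
   * @param allowNegative mirrors the proposed "negativeValues" setting (assumed default: true)
   * @param recorder receives the values that are kept
   * @return true if the sample was recorded, false if it was dropped
   */
  public static boolean record(
      DoubleSupplier attribute, boolean allowNegative, DoubleConsumer recorder) {
    double value = attribute.getAsDouble();
    if (!allowNegative && value < 0) {
      // A negative value is treated as "data not available" and skipped.
      return false;
    }
    recorder.accept(value);
    return true;
  }

  public static void main(String[] args) {
    List<Double> recorded = new ArrayList<>();
    record(() -> -1.0, false, v -> recorded.add(v)); // dropped
    record(() -> 0.25, false, v -> recorded.add(v)); // kept
    record(() -> -1.0, true, v -> recorded.add(v)); // kept: negatives allowed
    System.out.println(recorded); // prints [0.25, -1.0]
  }
}
```

Dropping the sample, instead of reporting 0, keeps "unavailable" distinct from a real reading of zero, which seems to match the intent described above.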
This is currently blocking progress on #13392
Feedback and naming suggestions are welcome:
- discardNegativeValues: true to remove negative values
- dropNegative
- negativeValues: true (default), false would discard values
I tend to prefer negativeValues as it's short and it sounds a bit more positive (pun not intentional here).
I like "drop" over "discard" because it aligns with metric view terminology: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#drop-aggregation
I think I like "dropNegativeValues" overall, but curious what others think