opentelemetry-java-instrumentation icon indicating copy to clipboard operation
opentelemetry-java-instrumentation copied to clipboard

jmx add jvm metrics yaml

Open SylvainJuge opened this issue 9 months ago • 1 comments

  • adds jvm.yaml in the JMX library so it can be reused by jmx-scraper in contrib.
  • adds documentation on the captured metrics and links to semconv, also include limitations.

Part of https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/13238 with a scope limited to JVM metrics.

PR opened as draft, not ready for review nor being merged yet.

TODOs

  • [x] add integration tests to validate JVM metrics in YAML, this could for example rely on a properly packaged agent using the yaml files directly.
  • [x] add jvm.memory.type metric attribute once https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/13361 is implemented
  • [ ] add support for jvm.cpu.time once https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/13369 is implemented (optional, can be done as a follow-up PR).
  • [x] evaluate if we need to filter negative values on some metrics
    • answer: yes, we need that for some JVM metrics (for example jvm.cpu.time and jvm.cpu.recent_utilization) that might return -1 and would thus not fit their semconv definitions.
  • [x] evaluate if we need to refactor assertions to fit the ones provided by SDK testing module
    • answer: better do an SDK contribution here, for now it's simpler to leave it as-is (details in comment)
  • [ ] filter negative values for impacted metrics : https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/13461

SylvainJuge avatar Feb 25 '25 15:02 SylvainJuge

I tried to refactor the test assertions to only rely on the ones provided by the SDK (using io.opentelemetry.sdk.testing.assertj.MetricAssert), but this requires quite significant work which is probably not worth the investment time for now.

  • the protobuf parsing logic can be reused from io.opentelemetry.javaagent.testing.common.AgentTestingExporterAccess but needs to be moved to a separate class to prevent classloading issues.
  • the metric type checks like isCounter and isGauge needs to be adapted to fit both long and double variants, for example with isLongCounter and isDoubleCounter

Overall I think it would be more beneficial if we extend the SDK MetricAssert with similar "high level" assertions that are closer to the metric data model in order to help making test code more readable. For example, currently writing test code for an updowncounter requires to know that it is stored as a "non monotonic sum" and that a counter is translated to a "monotonic sum", both of which require to read the actual values as the monoticity is not preserved when parsing it from protobuf to MetricData.

SylvainJuge avatar Mar 05 '25 14:03 SylvainJuge

All the test infrastructure code changes have been copied to #13597 to make review easier and allow reusing it.

SylvainJuge avatar Mar 27 '25 10:03 SylvainJuge