opentelemetry-java-instrumentation

Some metrics dropped when dynamically attaching agent to running VM

Open james-harlow-10x-banking opened this issue 1 year ago • 2 comments

Describe the bug

When running with -javaagent, I see a complete set of metrics gathered from the instrumentation I have enabled. However, when I attach the agent to a running VM, the metrics produced by that instrumentation are not exported, even though I can see the agent instrumenting the classes.

Steps to reproduce

  1. Make a new Spring boot service from the initializer.

https://start.spring.io/#!type=gradle-project&language=java&platformVersion=3.2.6&packaging=jar&jvmVersion=17&groupId=com.example&artifactId=demo&name=demo&description=Demo%20project%20for%20Spring%20Boot&packageName=com.example.demo&dependencies=web

  2. Write a little code to load a Java agent dynamically:
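  // Needs java.io.IOException, the com.sun.tools.attach.* classes (module jdk.attach) and Spring's SpringApplication.
  public static void main(String[] args) throws IOException, AttachNotSupportedException, AgentLoadException, AgentInitializationException {
    if (args.length >= 2) {
      // Attach mode: args[0] is the pid of the target JVM, args[1] is the path to the agent jar.
      var pid = args[0];
      var agent = args[1];
      VirtualMachine jvm = VirtualMachine.attach(pid);
      try {
        jvm.loadAgent(agent);
      } finally {
        // Always release the attach connection, even if loading the agent fails.
        jvm.detach();
      }
    } else {
      // Normal mode: start the Spring Boot service.
      SpringApplication.run(DemoApplication.class, args);
    }
  }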
  3. Download the otel agent:
mkdir -p agent && curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v2.4.0/opentelemetry-javaagent.jar -o agent/otel-agent.jar
  4. Start an otel collector with the following otel-collector-config.yaml:
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  logging:
    loglevel: info
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 15
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [debug, logging]
    traces:
      receivers: [otlp]
      exporters: [logging]
    logs:
      receivers: [otlp]
      exporters: [logging]
docker run -v $PWD/otel-collector-config.yaml:/otel-collector-config.yaml -p 4318:4318 otel/opentelemetry-collector "--config=/otel-collector-config.yaml"
  5. Start the Spring service with the agent attached:
./gradlew --info clean build && java -javaagent:$PWD/agent/otel-agent.jar -Dotel.instrumentation.common.default-enabled=false -Dotel.instrumentation.micrometer.enabled=true -Dotel.exporter.otlp.endpoint='http://localhost:4318' -Dotel.metrics.exporter=otlp -Dotel.javaagent.enabled=true -Dotel.javaagent.debug=true -jar build/libs/demo-0.0.1-SNAPSHOT.jar

Make a request to http://localhost:8080/ and observe a metric line in the collector output, e.g. for http.server.requests.

  6. Restart the collector and start the Spring service without the agent attached:
./gradlew --info clean build && java -Dotel.instrumentation.common.default-enabled=false -Dotel.instrumentation.micrometer.enabled=true -Dotel.exporter.otlp.endpoint='http://localhost:4318' -Dotel.metrics.exporter=otlp -Dotel.javaagent.enabled=true -Dotel.javaagent.debug=true -jar build/libs/demo-0.0.1-SNAPSHOT.jar

Then load the agent:

java -cp build/libs/demo-0.0.1-SNAPSHOT-plain.jar com.example.demo.DemoApplication $(jcmd | grep build/libs/demo-0.0.1-SNAPSHOT.jar | cut -f1 -d' ') $PWD/agent/otel-agent.jar

and make a request to http://localhost:8080/. This time there is no line for http.server.requests in the collector output.
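The pid lookup in the attach command above (jcmd piped through grep and cut) could also be done in Java with the same Attach API that the loader code uses. The following is only a sketch: the class name FindJvmPid and the helper names are made up for illustration, and the default search string is simply the jar name from this reproduction.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.Optional;

// Hypothetical helper, not part of the original reproduction.
public class FindJvmPid {

  /** True when a JVM's display name (main class or jar, plus arguments) contains the needle. */
  static boolean matches(String displayName, String needle) {
    return displayName != null && displayName.contains(needle);
  }

  /** Returns the pid of the first local JVM whose display name contains the needle, if any. */
  static Optional<String> findPid(String needle) {
    // VirtualMachine.list() enumerates the JVMs visible to the Attach API on this host.
    return VirtualMachine.list().stream()
        .filter(descriptor -> matches(descriptor.displayName(), needle))
        .map(VirtualMachineDescriptor::id)
        .findFirst();
  }

  public static void main(String[] args) {
    // Assumed default: the jar name used in the steps above.
    String needle = args.length > 0 ? args[0] : "demo-0.0.1-SNAPSHOT.jar";
    System.out.println(findPid(needle).orElse("no matching JVM found"));
  }
}
```

Matching on the full jar name keeps the lookup from picking up the attaching process itself, which runs from the -plain jar.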

Expected behavior

Whether I attach the agent before the JVM starts, or while it's running, the observability gathered should be the same for all requests processed after the attach has finished.

Actual behavior

The metrics differ: it looks like all metrics that come from instrumentation are dropped, though some SDK-generated metrics (e.g. queueSize) remain.

Javaagent or library instrumentation version

2.3.0 and 2.4.0

Environment

JDK:

$ java -version
openjdk version "21.0.3" 2024-04-16 LTS
OpenJDK Runtime Environment Zulu21.34+19-CA (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM Zulu21.34+19-CA (build 21.0.3+9-LTS, mixed mode, sharing)

OS: macOS 14.5

Additional context

The reason for trying to attach to a running JVM is that, in our use case, there's a CRaC checkpoint and restore between JVM start and agent attach. We also need various bits of config (e.g. span attributes) to be available after attach that can't be known at JVM start-up (because we share one base image across multiple deployment locations). I'm aware of some other ways of doing this: I've looked into the demo extensions project and gotten pretty far with getting it to read from a config file when it detects a CRaC restore. Dynamic attach is still our preferred way if it's possible.

hi @james-harlow-10x-banking!

we don't currently support dynamic attach, check out #1932 for some background

since you are using Spring Boot, you may want to try out the OpenTelemetry Spring Boot Starter which doesn't use the Java agent

trask • Jun 13 '24 23:06

@james-harlow-10x-banking looking at your command line, the only instrumentation you are interested in seems to be micrometer. To understand why it isn't working, take a look at MetricsInstrumentation.java:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/micrometer/micrometer-1.5/javaagent/src/main/java/io/opentelemetry/javaagent/instrumentation/micrometer/v1_5/MetricsInstrumentation.java

The instrumentation adds a call to Metrics.addRegistry(MicrometerSingletons.meterRegistry()); to the static initializer of io.micrometer.core.instrument.Metrics. If the micrometer Metrics class is already initialized by the time the agent is attached, its static initializer has already run and will not run again, so the added call never executes and the instrumentation won't work as expected.

You are probably better off using the library instrumentation for micrometer:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/micrometer/micrometer-1.5/library/README.md

Or you could build a custom extension that uses a different integration point to install the micrometer instrumentation.

laurit • Jun 14 '24 11:06
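To illustrate laurit's point about static initializers, here is a small plain-Java analogy. It does not use the agent or micrometer: the injectedCall field is a hypothetical stand-in for the call the agent weaves into the Metrics static initializer, and the two nested classes stand in for a class first used before and after the agent is attached. It only shows the JVM rule involved, namely that a class's static initializer runs once.

```java
import java.util.ArrayList;
import java.util.List;

// Analogy only: all names here are made up for illustration.
public class StaticInitOnceDemo {
  // Stand-in for the code the agent adds to a static initializer; null means "agent not attached yet".
  static Runnable injectedCall = null;
  static final List<String> registered = new ArrayList<>();

  // Stand-in for a class that the application already used before the agent was attached.
  static class UsedBeforeAttach {
    static { if (injectedCall != null) injectedCall.run(); }
    static void use() {}
  }

  // Stand-in for a class that is first used only after the agent is in place.
  static class UsedAfterAttach {
    static { if (injectedCall != null) injectedCall.run(); }
    static void use() {}
  }

  private static int[] counts;

  /** Returns {registrations seen for the early class, registrations seen after the late class}. */
  static synchronized int[] run() {
    if (counts == null) {
      UsedBeforeAttach.use();                               // initializer runs now, nothing injected yet
      injectedCall = () -> registered.add("otel-registry"); // the "attach" happens here
      UsedBeforeAttach.use();                               // already initialized, initializer is not re-run
      int afterEarlyClass = registered.size();
      UsedAfterAttach.use();                                // first use after attach, initializer runs the call
      int afterLateClass = registered.size();
      counts = new int[] {afterEarlyClass, afterLateClass};
    }
    return counts.clone();
  }

  public static void main(String[] args) {
    int[] c = run();
    System.out.println("registrations via already-initialized class: " + c[0]);
    System.out.println("registrations via freshly initialized class: " + c[1]);
  }
}
```

In the reproduction above, Spring Boot has presumably initialized io.micrometer.core.instrument.Metrics long before the attach, which corresponds to the UsedBeforeAttach case.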

Closing as dynamic attach is not supported, see #1932 for more background

trask • Oct 03 '25 02:10