Some metrics dropped when dynamically attaching agent to running VM
Describe the bug
When running with -javaagent, I see a complete set of metrics gathered from the instrumentation I have enabled. However, when I attach the agent to a running VM, the metrics from that instrumentation are not exported, even though I can see the agent instrumenting the classes.
Steps to reproduce
- Make a new Spring Boot service from the initializer:
https://start.spring.io/#!type=gradle-project&language=java&platformVersion=3.2.6&packaging=jar&jvmVersion=17&groupId=com.example&artifactId=demo&name=demo&description=Demo%20project%20for%20Spring%20Boot&packageName=com.example.demo&dependencies=web
- Write a little code to load a Java agent dynamically:
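// Needs VirtualMachine and the three attach exceptions from com.sun.tools.attach, plus java.io.IOException.
public static void main(String[] args) throws IOException, AttachNotSupportedException, AgentLoadException, AgentInitializationException {
    if (args.length >= 2) {
        // Attach mode: args[0] is the target JVM's PID, args[1] is the path to the agent jar.
        var pid = args[0];
        var agent = args[1];
        VirtualMachine jvm = VirtualMachine.attach(pid);
        try {
            jvm.loadAgent(agent);
        } finally {
            // Always detach, even if loading the agent fails.
            jvm.detach();
        }
    } else {
        // Normal mode: start the Spring Boot service.
        SpringApplication.run(DemoApplication.class, args);
    }
}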
- Download the otel agent:
mkdir -p agent && curl -L https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/download/v2.4.0/opentelemetry-javaagent.jar -o agent/otel-agent.jar
- Start an otel collector:
receivers:
  otlp:
    protocols:
      grpc:
      http:
exporters:
  logging:
    loglevel: info
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 15
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [debug, logging]
    traces:
      receivers: [otlp]
      exporters: [logging]
    logs:
      receivers: [otlp]
      exporters: [logging]
docker run -v $PWD/otel-collector-config.yaml:/otel-collector-config.yaml -p 4318:4318 otel/opentelemetry-collector "--config=/otel-collector-config.yaml"
- Start the Spring service with the agent attached.
./gradlew --info clean build && java -javaagent:$PWD/agent/otel-agent.jar -Dotel.instrumentation.common.default-enabled=false -Dotel.instrumentation.micrometer.enabled=true -Dotel.exporter.otlp.endpoint='http://localhost:4318' -Dotel.metrics.exporter=otlp -Dotel.javaagent.enabled=true -Dotel.javaagent.debug=true -jar build/libs/demo-0.0.1-SNAPSHOT.jar
Then make a request to http://localhost:8080/ and observe a metric line in the collector output for, e.g., http.server.requests.
- Restart the collector and start the Spring service without the agent attached.
./gradlew --info clean build && java -Dotel.instrumentation.common.default-enabled=false -Dotel.instrumentation.micrometer.enabled=true -Dotel.exporter.otlp.endpoint='http://localhost:4318' -Dotel.metrics.exporter=otlp -Dotel.javaagent.enabled=true -Dotel.javaagent.debug=true -jar build/libs/demo-0.0.1-SNAPSHOT.jar
Then load the agent:
java -cp build/libs/demo-0.0.1-SNAPSHOT-plain.jar com.example.demo.DemoApplication $(jcmd | grep build/libs/demo-0.0.1-SNAPSHOT.jar | cut -f1 -d' ') $PWD/agent/otel-agent.jar
and make a request to http://localhost:8080/. There is no line for http.server.requests in the collector output.
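As an alternative to the jcmd | grep | cut pipeline in the last step, the target PID could be looked up through the Attach API itself. This is only a minimal sketch: the class name FindJvm and the helpers matches and findPid are hypothetical, and it assumes the service was started with -jar so that the jar path shows up in the JVM's display name.

```java
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.util.Optional;

public class FindJvm {
    // Hypothetical helper: true when a JVM's display name (main class or jar
    // path, followed by its arguments) mentions the given jar name.
    static boolean matches(String displayName, String jarName) {
        return displayName != null && displayName.contains(jarName);
    }

    // Hypothetical helper: PID of the first local JVM whose display name matches.
    static Optional<String> findPid(String jarName) {
        return VirtualMachine.list().stream()
                .filter(d -> matches(d.displayName(), jarName))
                .map(VirtualMachineDescriptor::id)
                .findFirst();
    }

    public static void main(String[] args) {
        // Defaults to the jar built in the steps above.
        String jarName = args.length > 0 ? args[0] : "demo-0.0.1-SNAPSHOT.jar";
        System.out.println(findPid(jarName).orElse("no matching JVM found"));
    }
}
```

The PID returned by findPid could then be passed straight to VirtualMachine.attach instead of being supplied on the command line.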
Expected behavior
Whether I attach the agent before the JVM starts, or while it's running, the observability gathered should be the same for all requests processed after the attach has finished.
Actual behavior
The metrics differ: all metrics that come from instrumentation appear to be dropped, though some SDK-generated metrics (e.g. queueSize) remain.
Javaagent or library instrumentation version
2.3.0 and 2.4.0
Environment
JDK:
$ java -version
openjdk version "21.0.3" 2024-04-16 LTS
OpenJDK Runtime Environment Zulu21.34+19-CA (build 21.0.3+9-LTS)
OpenJDK 64-Bit Server VM Zulu21.34+19-CA (build 21.0.3+9-LTS, mixed mode, sharing)
OS: macOS 14.5
Additional context
The reason for trying to attach to a running JVM is that, in our use case, there's a CRaC checkpoint and restore between JVM start and agent attach. We also need various bits of config (e.g. span attributes) to be available after attach that can't be known at JVM start-up (because we share one base image across multiple deployment locations). I'm aware of some other ways of doing this: I've looked into the demo extensions project, and got pretty far with getting that to read from a config file when it detects a CRaC restore. But dynamic attach is our preferred way if it's possible.
hi @james-harlow-10x-banking!
we don't currently support dynamic attach, check out #1932 for some background
since you are using Spring Boot, you may want to try out the OpenTelemetry Spring Boot Starter which doesn't use the Java agent
@james-harlow-10x-banking looking at your command line, the only instrumentation you are interested in seems to be micrometer. To understand why it isn't working, you should take a look at
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/micrometer/micrometer-1.5/javaagent/src/main/java/io/opentelemetry/javaagent/instrumentation/micrometer/v1_5/MetricsInstrumentation.java
The instrumentation adds a call to Metrics.addRegistry(MicrometerSingletons.meterRegistry()); to the static initializer of io.micrometer.core.instrument.Metrics. If the micrometer Metrics class is already initialized by the time the agent is attached, that static initializer has already run, so the added call never executes and the instrumentation won't work as expected.
You are probably better off using the library instrumentation for micrometer:
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/micrometer/micrometer-1.5/library/README.md
Or you could build a custom extension that uses a different integration point to install the micrometer instrumentation.
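To illustrate the timing issue, here is a small plain-Java sketch with no agent or micrometer involved. It is an analogy rather than the agent's actual mechanism: FakeMetrics, HOOKS and LOG are hypothetical stand-ins, with FakeMetrics playing the role of io.micrometer.core.instrument.Metrics and the hooks playing the role of the call the instrumentation adds to its static initializer. It shows that a static initializer runs once, so anything that arrives after the class is initialized is never picked up.

```java
import java.util.ArrayList;
import java.util.List;

public class StaticInitDemo {
    // Hypothetical stand-in for code that instrumentation wants run at class init.
    static final List<Runnable> HOOKS = new ArrayList<>();
    // Records which hooks actually ran.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical stand-in for io.micrometer.core.instrument.Metrics.
    static class FakeMetrics {
        static {
            // Runs exactly once, the first time the class is used.
            for (Runnable hook : HOOKS) {
                hook.run();
            }
        }

        static void touch() {
            // Calling any static method triggers class initialization if needed.
        }
    }

    public static void main(String[] args) {
        // Like -javaagent: the hook is in place before the class is initialized.
        HOOKS.add(() -> LOG.add("early hook"));
        FakeMetrics.touch(); // static initializer runs here

        // Like dynamic attach: the hook arrives after initialization.
        HOOKS.add(() -> LOG.add("late hook"));
        FakeMetrics.touch(); // static initializer does not run again

        System.out.println(LOG); // prints [early hook]
    }
}
```

The late hook is registered but never executed, which mirrors why the added Metrics.addRegistry call has no effect when Metrics was initialized before the agent attached.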
Closing as dynamic attach is not supported, see #1932 for more background