Possible JVM monitoring issue after updating to 1.23.19
No issues in the web interface of Coroot.
Coroot: 1.10.2
Coroot-node-agent: 1.23.19
Prometheus: 2.53.4
ClickHouse: 25.4.1.2934
Non-Docker installation
Linux 4.18.0-553.50.1.el8_10.x86_64 #1 SMP Wed Apr 16 11:36:26 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux (Rocky Linux 8.x)
Program: Elasticsearch on Java 15.0.1 (as seen in Coroot)
[root@montst bin]# java -version
openjdk version "1.8.0_452"
OpenJDK Runtime Environment (build 1.8.0_452-b09)
OpenJDK 64-Bit Server VM (build 25.452-b09, mixed mode)
(Graylog in #199 runs on a different, newer Java version.)
/var/log/messages
Apr 24 17:25:41 montst coroot-node-agent[13352]: I0424 17:25:41.491141 13352 profiling.go:245] JVM detected PID: 13561, perfmap dump supported: true
Apr 24 17:25:41 montst coroot-node-agent[13352]: W0424 17:25:41.512427 13352 profiling.go:255] failed to dump perfmap of JVM 13561: status:-
Apr 24 17:25:41 montst coroot-node-agent[13352]: I0424 17:25:41.639650 13352 profiling.go:139] collected 6 profiles in 149ms
Apr 24 17:25:41 montst coroot-node-agent[13352]: I0424 17:25:41.661209 13352 profiling.go:149] uploaded 6 profiles in 21ms
cat /proc/13561/status
Name: java
Umask: 0022
State: S (sleeping)
Tgid: 13561
Ngid: 0
Pid: 13561
PPid: 1
TracerPid: 0
Uid: 990 990 990 990
Gid: 986 986 986 986
FDSize: 1024
Groups: 986
NStgid: 13561
NSpid: 13561
NSpgid: 13561
NSsid: 13561
VmPeak: 84787288 kB
VmSize: 84760180 kB
VmLck: 7559844 kB
VmPin: 0 kB
VmHWM: 5446848 kB
VmRSS: 5446848 kB
RssAnon: 4886256 kB
RssFile: 560592 kB
RssShmem: 0 kB
VmData: 4989328 kB
VmStk: 132 kB
VmExe: 4 kB
VmLib: 27184 kB
VmPTE: 13420 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 84
SigQ: 0/46634
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 2000000181005ccf
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 1
Seccomp: 2
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ffffffff,ffffffff,ffffffff,ffffffff
Cpus_allowed_list: 0-127
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 26
nonvoluntary_ctxt_switches: 2
We forgot to mention in the docs that the agent relies on JVM mechanisms introduced in JDK 17 😕 (JDK-8254723). We should add a version check with more meaningful logging.
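A minimal sketch of what such a version check could look like, assuming the agent already has the JVM's version string at hand. The names `majorVersion`, `perfmapHint`, and `minPerfmapMajor` are hypothetical and not part of coroot-node-agent; the only fact taken from this thread is the JDK 17 threshold.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minPerfmapMajor is the threshold stated in this thread (JDK 17).
const minPerfmapMajor = 17

// majorVersion extracts the major release from a Java version string.
// It handles the legacy "1.x" scheme ("1.8.0_452" -> 8) and the
// modern scheme ("15.0.1" -> 15, "17" -> 17, "21-ea" -> 21).
func majorVersion(version string) (int, error) {
	v := strings.TrimSpace(version)
	// Cut at the first character that is neither a digit nor a dot.
	if i := strings.IndexFunc(v, func(r rune) bool {
		return r != '.' && (r < '0' || r > '9')
	}); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, fmt.Errorf("unrecognized Java version %q", version)
	}
	if major == 1 && len(parts) > 1 {
		// Legacy scheme: the second component is the real major version.
		return strconv.Atoi(parts[1])
	}
	return major, nil
}

// perfmapHint returns a log message explaining why a perfmap dump
// cannot work, or an empty string if the version is new enough.
func perfmapHint(pid int, version string) string {
	major, err := majorVersion(version)
	if err != nil {
		return fmt.Sprintf("JVM %d: cannot determine Java version: %v", pid, err)
	}
	if major < minPerfmapMajor {
		return fmt.Sprintf("JVM %d: Java %d detected, perfmap dump requires JDK %d or newer",
			pid, major, minPerfmapMajor)
	}
	return ""
}

func main() {
	// PID and version taken from the report above.
	fmt.Println(perfmapHint(13561, "15.0.1"))
}
```

A message of this kind in place of `status:-` would point directly at the version requirement.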
I already suspected it could be something like that, thank you.
Something to keep in mind.
After updating the system Java binary to 17, the node agent log initially indicates that scraping is possible, but it fails later because Elasticsearch uses its own Java (15) binary.
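Because the system `java` and the binary a process actually runs can differ, it can help to check the process itself. The sketch below is one way to do that on Linux, under two assumptions that may not hold everywhere: the JDK uses the conventional `<home>/bin/java` layout, and it ships a `release` file containing a `JAVA_VERSION` line. The helper names are hypothetical, and for containerized processes the path would have to be resolved relative to `/proc/<pid>/root`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// javaHomeFromExe derives the JDK root from a java binary path,
// assuming the conventional <home>/bin/java layout.
func javaHomeFromExe(exe string) string {
	return filepath.Dir(filepath.Dir(exe))
}

// versionFromRelease extracts the JAVA_VERSION value from the
// contents of a JDK "release" file.
func versionFromRelease(content string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "JAVA_VERSION=") {
			val := strings.TrimPrefix(line, "JAVA_VERSION=")
			return strings.Trim(val, `"`), true
		}
	}
	return "", false
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: jvmversion <pid>")
		return
	}
	pid := os.Args[1]
	// Reading another user's /proc/<pid>/exe usually requires root.
	exe, err := os.Readlink(filepath.Join("/proc", pid, "exe"))
	if err != nil {
		fmt.Println("cannot resolve binary:", err)
		return
	}
	data, err := os.ReadFile(filepath.Join(javaHomeFromExe(exe), "release"))
	if err != nil {
		fmt.Printf("PID %s runs %s, but no release file was readable: %v\n", pid, exe, err)
		return
	}
	if v, ok := versionFromRelease(string(data)); ok {
		fmt.Printf("PID %s runs %s (Java %s)\n", pid, exe, v)
	} else {
		fmt.Printf("PID %s runs %s, JAVA_VERSION not found in release file\n", pid, exe)
	}
}
```

Run as root with the PID from the agent log (for example `13561`) to see which JDK the process was started with.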
Additional log information would be very useful.
I encountered the following error:
failed to dump perfmap of JVM 18768: status:-
I went through several files to understand why many profiles contained [unknown] entries. The actual root cause wasn’t obvious until I read your post, @def.
The error occurs here: https://github.com/coroot/coroot-node-agent/blob/34373d23fde80ab03a6b19d79e1af06e8f6f78b3/jvm/perfmap.go#L15
Dial returns *JVM and calls DumpPerfmap: https://github.com/coroot/coroot-node-agent/blob/34373d23fde80ab03a6b19d79e1af06e8f6f78b3/jvm/jattach.go#L70
It took me several hours of digging to discover that another prerequisite for using Coroot eBPF profiling is JDK >= 17. More verbose log output would save others from going down that rabbit hole.
And don't forget:
"Coroot automates this step by periodically calling jcmd in the background. However, the JVM must be started with the -XX:+PreserveFramePointer option. This allows for accurate stack traces and proper symbolization of JIT-compiled code, with only a small performance overhead (typically around 1-3%)."
https://coroot.com/blog/troubleshooting-java-applications-with-coroot/
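Whether a running JVM was actually started with the flag from the quote above can be checked from its command line. The following is a rough sketch, assuming Linux and a flag passed as a regular argument; a flag supplied through the `JAVA_TOOL_OPTIONS` environment variable would not show up here. `hasArg` is a hypothetical helper, not an agent function.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// framePointerFlag is the JVM option named in the Coroot blog post.
const framePointerFlag = "-XX:+PreserveFramePointer"

// hasArg reports whether the NUL-separated contents of
// /proc/<pid>/cmdline contain arg as a complete argument.
func hasArg(cmdline []byte, arg string) bool {
	for _, a := range bytes.Split(cmdline, []byte{0}) {
		if string(a) == arg {
			return true
		}
	}
	return false
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: checkfp <pid>")
		return
	}
	pid := os.Args[1]
	cmdline, err := os.ReadFile(filepath.Join("/proc", pid, "cmdline"))
	if err != nil {
		fmt.Println("cannot read command line:", err)
		return
	}
	if hasArg(cmdline, framePointerFlag) {
		fmt.Printf("PID %s: %s is set\n", pid, framePointerFlag)
	} else {
		fmt.Printf("PID %s: %s not found on the command line\n", pid, framePointerFlag)
	}
}
```

In a pod with several Java containers, running this once per JVM PID shows quickly which process is missing the flag.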
Yes, Coroot has a clear statement about preserving frame pointers at https://docs.coroot.com/profiling/ebpf-based-profiling
We have nearly 2k independent apps in a k8s environment, but this was the first time I had added the preserve-frame-pointer flag and the profiles were still not displayed properly for a Java app.
My first thought was that the Java sidecar deployed next to the main Java app container in the pod did not have the flag. I worked with the developers to enable it, but that was not the cause, so I started digging into the code and searching the GitHub issues, and finally found this one.
I have submitted a change request for the JVM monitoring page.