JVM Profiling Follow-ups
Description
We released JVM Profiling in https://github.com/getsentry/sentry-java/releases/tag/8.23.0
There are still tasks and improvements we could work on:
- [X] Docs (https://github.com/getsentry/sentry-java/issues/4909)
- [X] Release registry entries
- [X] Add module to `.craft.yml`
- [X] Verify caching behaviour of `ServiceLoader` is not an issue (i.e. does not cause a large overhead on each chunk) - otherwise cache (obsolete with getsentry/sentry-java#4815)
- [X] Remove / reduce vendored code and replace what's possible with JFR converter, also see https://github.com/async-profiler/async-profiler/issues/1525 #4852
- [X] Upgrade async-profiler to 4.1+ to support newer Java Versions (23+); there are issues with a Converter dictionary (#4853)
- [ ] Log a specific message why Profiling isn't enabled if the wrong config property for setting sample rate is used (`profiles-sample-rate`) on JVM
- [ ] Enable / showcase Profiling in (more of) our samples #4941
- [ ] Allow configuring `profilingTracesHz` via `sentry.properties` (`ExternalOptions`)
- [ ] Add a `.cursorrules` file to explain how Profiling works
- [ ] We could reduce profile chunk size / duration based on thread count of the application. This could help in case profiles are dropped due to being too large. `Thread.activeCount()` or `ThreadMXBean` (more accurate)
- [ ] https://github.com/getsentry/sentry-java/issues/4768 (we should wait for feedback first, maybe we don't need this)
- [ ] https://github.com/getsentry/sentry-java/issues/4779
- [ ] Detect that `async-profiler` is the (major) version we expect on startup, crash otherwise
- [X] Fix an issue where the profilerId would not be propagated to SentryTracer correctly when used with OTEL Agent (#4854)
- [X] Fix an issue with profiler initialization when using spring or spring-boot in OTEL Agent auto-init mode (#4855)
- [ ] Add profiling option to build plugins:
  - [ ] Gradle
  - [ ] Maven
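
The thread-count item in the list above could be sketched roughly as follows. This is only an illustration of the idea, not SDK code: the class name, the base and minimum durations, and the halving heuristic are all hypothetical. It shows the two ways of counting threads mentioned in the item, using only JDK APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch: none of these names or constants come from sentry-java.
final class ChunkDurationSketch {
    // Illustrative values only; the SDK's real chunk duration is not assumed here.
    static final long BASE_CHUNK_MILLIS = 10_000L;
    static final long MIN_CHUNK_MILLIS = 1_000L;
    static final int THREADS_PER_STEP = 50;

    /** Estimate of active threads in the current thread's group and its subgroups. */
    static int estimateViaThreadGroup() {
        return Thread.activeCount();
    }

    /** JVM-wide count of live threads (daemon and non-daemon) as reported by the MXBean. */
    static int countViaMxBean() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.getThreadCount();
    }

    /** Halves the chunk duration for every THREADS_PER_STEP threads, never going below the minimum. */
    static long chunkMillisFor(int threadCount) {
        int steps = Math.max(0, threadCount) / THREADS_PER_STEP;
        long shortened = BASE_CHUNK_MILLIS >> Math.min(steps, 20);
        return Math.max(MIN_CHUNK_MILLIS, shortened);
    }

    public static void main(String[] args) {
        int threads = countViaMxBean();
        System.out.println("threads=" + threads + " chunkMillis=" + chunkMillisFor(threads));
    }
}
```

Whether a step function like this is the right shape would depend on how chunk size actually grows with thread count, which would need measuring first.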
@adinauer great news on this being released!
Appreciate it's early days for this feature, but I wanted to ask if there are any environments where profiling is not expected to work yet?
Haven't had a chance to do much troubleshooting yet (would need to spin it up again and grab the log file/core dump), but tried enabling profiling and it caused a crash.
Running on ECS Fargate on an arm64 instance. It's a Spring Boot app written in Kotlin.
Maybe it's not compatible with arm64? Or Kotlin?
Relevant startup logs:
DEBUG: Started Profiler.
A fatal error has been detected by the Java Runtime Environment:
SIGSEGV (0xb) at pc=0x0000000000000000, pid=1, tid=203
JRE version: OpenJDK Runtime Environment Corretto-24.0.2.12.1 (24.0.2+12) (build 24.0.2+12-FR)
Java VM: OpenJDK 64-Bit Server VM Corretto-24.0.2.12.1 (24.0.2+12-FR, mixed mode, sharing, tiered, compressed class ptrs, z gc, linux-aarch64)
Problematic frame:
C [libasyncProfiler-1255257512175285949.so+0x1f990]
Core dump will be written. Default location: //core.1
An error report file with more information is saved as:
//hs_err_pid1.log
The crash happened outside the Java Virtual Machine in native code.
See problematic frame for where to report the bug.
If you would like to submit a bug report, please visit:
https://github.com/corretto/corretto-24/issues/
Thanks in advance!
Hey @joaocanaverde-blue! Thanks for reaching out and wanting to try profiling.
It's known that the async-profiler library on version 3.0 (which is the one we're using at the moment for our profiling feature) segfaults on JDK 23 and later, so that's what's happening in your case.
As you can see in the list of follow-ups, we have an item to upgrade to the newest version of the library.
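
Until that upgrade lands, an application could guard against the crash itself. The following is a minimal sketch, assuming the JDK 23 boundary stated above; the class and method names are hypothetical and not part of the SDK, and `Runtime.version().feature()` requires Java 10 or later.

```java
// Hypothetical application-side guard; not part of sentry-java.
final class JdkGuardSketch {
    // Per the comment above: async-profiler 3.0 is known to segfault on JDK 23 and later.
    static final int FIRST_AFFECTED_FEATURE_VERSION = 23;

    /** Returns true if the given JDK feature version predates the known-affected range. */
    static boolean isProfilingSafe(int jdkFeatureVersion) {
        return jdkFeatureVersion < FIRST_AFFECTED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature(); // e.g. 21 or 24
        if (isProfilingSafe(feature)) {
            System.out.println("JDK " + feature + ": profiling can be enabled");
        } else {
            System.out.println("JDK " + feature + ": skip profiling with async-profiler 3.0");
        }
    }
}
```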
@lcian I see, missed that one. Thanks!
Hey! Thanks for adding support for JVM profiling - really appreciate it. I’ve been trying to enable it for a Spring Boot application running on Java 24. In the logs I can see:
DEBUG: AsyncProfiler initialized successfully. Version: 4.1
DEBUG: Started Profiler.
DEBUG: Profile chunk finished.
Continuous and UI profiling are both enabled in Sentry, and async-profiler is updated to 4.1. However, I still don’t see any data appearing in the Profiling tab in the Sentry UI. Could you please advise what might be missing or misconfigured?
Hey @reszkapiotr! Thanks for giving it a try and reporting back.
To make profiling work, you'll need to add the sentry-async-profiler dependency.
It seems you've added that one, in addition to async-profiler version 4.1. Is that right?
If that's the case, note that this won't work.
sentry-async-profiler brings in and requires async-profiler version 3.0.
Unfortunately, async-profiler version 3.x is known not to work on JDK 23 and later, so at the moment you won't be able to use profiling unless you downgrade your JDK.
We're working on upgrading async-profiler to version 4.1.
Simply adding a dependency on async-profiler 4.1 to override ours doesn't work because versions 3.x and 4.x use different formats for the profiling chunks, and our parser expects the format emitted by version 3.x.
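
A startup check for this kind of mismatch (the "detect the expected major version" follow-up) might look something like the sketch below. It assumes the version is available as a plain string such as "3.0" or "4.1", as in the log line quoted earlier; how the string is obtained from async-profiler is left out, and all names here are hypothetical.

```java
// Hypothetical sketch of a fail-fast major-version check; not SDK code.
final class ProfilerVersionCheckSketch {
    /** Extracts the leading major number from a version string such as "4.1". */
    static int majorOf(String version) {
        String trimmed = version.trim();
        int dot = trimmed.indexOf('.');
        String major = dot >= 0 ? trimmed.substring(0, dot) : trimmed;
        return Integer.parseInt(major);
    }

    /** Throws if the reported version does not have the major version the parser expects. */
    static void requireMajor(String reportedVersion, int expectedMajor) {
        int actual;
        try {
            actual = majorOf(reportedVersion);
        } catch (NumberFormatException e) {
            throw new IllegalStateException(
                "Unrecognized async-profiler version: " + reportedVersion, e);
        }
        if (actual != expectedMajor) {
            throw new IllegalStateException(
                "Expected async-profiler " + expectedMajor + ".x but found " + reportedVersion);
        }
    }

    public static void main(String[] args) {
        requireMajor("3.0", 3); // passes silently
        try {
            requireMajor("4.1", 3);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast like this would surface an overridden dependency at startup instead of producing chunks the parser cannot read.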
Hi @lcian! Yes, I tried using sentry-async-profiler 8.23.0 with the async-profiler dependency excluded, and then added async-profiler 4.1 separately.
Okay - looking forward to the update. Thanks for getting back to me!
@reszkapiotr Thank you for your patience. With our newest release 8.26.0 we updated to version 4.2 of async-profiler, which now supports the newest Java versions. Please give it a try.
@lbloder Thank you for the update! I tested the change and it works great! I was able to make it work for:
- Slowest functions
- Aggregate flamegraph.
Can't see flamegraphs for specific transactions. Any idea why? Thanks again
Hi @reszkapiotr,
> Can't see flamegraphs for specific transactions. Any idea why?
Which setting do you have for sentry.profile-lifecycle? By default it is set to MANUAL but can also be set to TRACE which automatically starts the profiler when a sampled transaction starts and stops it when no transaction is active anymore.
In case of a MANUAL profile lifecycle the profiler needs to be started before the transaction starts, as this is the point at which they are linked together.
So this might have to do with the profiler not being started when the transaction starts.
Can you have a look if that is the case?
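
To illustrate why the ordering matters in MANUAL mode, here is a toy model. It is not the SDK's implementation, and every class in it is hypothetical; it only assumes, as described above, that a transaction picks up the profiler link once, at the moment it starts.

```java
import java.util.UUID;

// Toy model of the ordering constraint; none of these types exist in sentry-java.
final class LifecycleLinkSketch {
    /** Toy profiler that only has an ID while it is running. */
    static final class ToyProfiler {
        private String id; // null while stopped

        void start() {
            if (id == null) {
                id = UUID.randomUUID().toString();
            }
        }

        void stop() {
            id = null;
        }

        String currentId() {
            return id;
        }
    }

    /** Toy transaction that snapshots the profiler ID exactly once, when it is created. */
    static final class ToyTransaction {
        final String profilerId;

        ToyTransaction(ToyProfiler profiler) {
            this.profilerId = profiler.currentId();
        }

        boolean isLinked() {
            return profilerId != null;
        }
    }

    public static void main(String[] args) {
        ToyProfiler profiler = new ToyProfiler();
        ToyTransaction early = new ToyTransaction(profiler); // profiler not running yet
        profiler.start();
        ToyTransaction late = new ToyTransaction(profiler);  // profiler already running
        System.out.println("early linked: " + early.isLinked());
        System.out.println("late linked: " + late.isLinked());
    }
}
```

In this model, starting the profiler after a transaction has begun never links the two, which matches the symptom of seeing aggregate data but no per-transaction flamegraphs.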
@lbloder Got it set to TRACE. Interestingly, in another project that uses Java 21 it works - there are flamegraphs for transactions. In the one running on Java 24, it doesn’t seem to collect transaction profiles specifically (although profiles for the slowest functions are available).
@reszkapiotr Thanks for the additional info, we will give it a try locally to confirm.
Hi @reszkapiotr, sorry for the long wait. Unfortunately, we have not been able to reproduce this problem. We used Spring Boot 3 and tested with both Java 24 and 25.
Did you, by any chance, use a different Spring Boot version? Also, we've been testing on macOS. What system are you running this on?
Hello @lbloder We've used Spring Boot version 3 as well, on both macOS and Linux. I guess it's something application-specific that's causing this behaviour. In the meantime, we are testing the integration with OTel. Anyway, it's great to have profiling available for Java 24. Thank you for the help, and regards!