JFR test crosses RSS set threshold with Mandrel 25.0.0.1

Open Karm opened this issue 2 months ago • 10 comments

JFRPerfTest crosses the RSS memory threshold. The number might seem trivial at first sight, but it's consistent over many runs, CI and local, RHEL 8-like and RHEL 9-like Linux systems, amd64. Mandrel 21 or 24 does not cross that threshold, i.e. JFR overhead is smaller there. The increase might be warranted by more work being done, though.

Mandrel Integration Testsuite

With Mandrel 25.0.0.1:

$ mvn clean verify \
    -Ptestsuite -DincludeTags=reproducers,perfcheck,runtimes \
    -Dtest=JFRTest#jfrPerfTest \
    -Dquarkus.version=3.28.1 \
    -Dquarkus.native.container-runtime=podman -Drootless.container-runtime=true \
    -Dpodman.with.sudo=false 2>&1 | tee log.log
Quarkus 3.28.1
[ERROR]   JFRTest.jfrPerfTest:185->jfrPerfTestRun:229->startComparisonForBenchmark:342
Application JFR_PERFORMANCE in mode diff_native consumed 40 kB of RSS memory more ,
which is over 35 kB threshold by 14%.

Quarkus 3.20.3
[ERROR] JFRTest.jfrPerfTest:185->jfrPerfTestRun:229->startComparisonForBenchmark:342
Application JFR_PERFORMANCE in mode diff_native consumed 38 kB of RSS memory more , 
which is over 35 kB threshold by 9%.

Quarkus 3.15.7
[ERROR] JFRTest.jfrPerfTest:185->jfrPerfTestRun:229->startComparisonForBenchmark:342
Application JFR_PERFORMANCE in mode diff_native consumed 39 kB of RSS memory more , 
which is over 35 kB threshold by 11%.

Regardless of the number of iterations, one does not hit this with an older Mandrel, i.e. major version 24 or 21.
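
For reference, the "over threshold by N%" figures in the logs above are consistent with taking the excess over the threshold relative to the threshold itself and rounding to a whole percent. A minimal sketch of that arithmetic, assuming this is how the message is derived (the class and method names are hypothetical, not taken from the testsuite):

```java
public class ThresholdExcess {
    // Percentage by which a measured value exceeds a threshold,
    // rounded to the nearest whole percent.
    // Assumption: excess is expressed relative to the threshold.
    public static long percentOver(double measured, double threshold) {
        return Math.round((measured - threshold) / threshold * 100.0);
    }

    public static void main(String[] args) {
        double threshold = 35;
        // Measured values reported for Quarkus 3.28.1, 3.20.3 and 3.15.7.
        for (double measured : new double[] {40, 38, 39}) {
            System.out.printf("%.0f vs %.0f: over by %d%%%n",
                    measured, threshold, percentOver(measured, threshold));
        }
    }
}
```

With the values from the three logs (40, 38, 39 against a threshold of 35) this gives 14, 9 and 11, matching the reported percentages.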

Karm avatar Sep 28 '25 18:09 Karm

@roberttoyonaga FYI, perhaps it could be explained by an increased coverage, added instrumentation? Not sure atm. Worth a note though as it's a consistent result, not a flaky test.

Karm avatar Sep 28 '25 18:09 Karm

It would be worth noting the new JFR features in Mandrel 25 vs Mandrel 24 and below.

jerboaa avatar Sep 29 '25 08:09 jerboaa

There haven't been any major JFR features added between Mandrel 24 and 25. At the SubstrateVM level there have been bug fixes and minor internal improvements, but nothing that should cause a significant increase in RSS. There have been 3 major JFR features added to OpenJDK in 25, but they are all implemented in HotSpot (not much in jdk.jfr). We do not support them yet, so they won't have an effect in Native Image. I'll dig deeper and try to figure out what could be causing this.

roberttoyonaga avatar Sep 29 '25 14:09 roberttoyonaga

@Karm Has the overall RSS consumption changed between 24 and 25?

Side note: It's very strange that it's consistently 38-40 % larger. I think there's usually more variation.

roberttoyonaga avatar Sep 29 '25 15:09 roberttoyonaga

~~@Karm do you know if Quarkus is running with more threads than before? I searched the repository for quarkus.vertx.event-loops-pool-size but could not find anything. If more threads are running, then there will be 2 additional JFR buffers for each platform thread (500kB each by default).~~

Update 1: I dumped JFR snapshots and compared the thread count at startup between 24 and 25. They are both 21. So no difference there.

Running the jfrPerfTest on my computer, I get this for RSS:

Mandrel 24: (app with JFR) 54919 vs (no JFR) 41537 kB
Mandrel 25: (app with JFR) 54232 vs (no JFR) 39084 kB

The thresholds have actually been misnamed: they are % differences, not absolute differences in kB (PR fixing this). This makes the JFR tests different from the other tests that compare against thresholds. So, since the RSS without JFR decreased in Mandrel 25, the threshold is crossed because the relative contribution of JFR to RSS is higher, even though the RSS with JFR remained roughly the same. It seems like whatever optimization reduced the RSS in Mandrel 25 does not happen when JFR is included in the build. I'll try and figure out what the change was and why.
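
To make the relative-difference reading concrete, here is a rough sketch using the RSS numbers above. It assumes the percentage is the JFR overhead relative to the run without JFR; the exact formula used by the testsuite is not shown in this thread, and the class and method names are hypothetical.

```java
public class JfrRssOverhead {
    // Relative RSS overhead of enabling JFR, in percent of the no-JFR baseline.
    // Assumption: the baseline is the RSS of the app built without JFR.
    public static double relativeOverheadPercent(long rssWithJfrKb, long rssWithoutJfrKb) {
        return (rssWithJfrKb - rssWithoutJfrKb) * 100.0 / rssWithoutJfrKb;
    }

    public static void main(String[] args) {
        double thresholdPercent = 35.0;

        // RSS values in kB from the local jfrPerfTest run quoted above.
        double mandrel24 = relativeOverheadPercent(54919, 41537);
        double mandrel25 = relativeOverheadPercent(54232, 39084);

        System.out.printf("Mandrel 24: %.1f%% (over threshold: %b)%n",
                mandrel24, mandrel24 > thresholdPercent);
        System.out.printf("Mandrel 25: %.1f%% (over threshold: %b)%n",
                mandrel25, mandrel25 > thresholdPercent);
    }
}
```

Under that assumption Mandrel 24 lands at roughly 32% and Mandrel 25 at roughly 39%, which lines up with the 38-40 values in the failing logs: the numerator barely moves, while the smaller no-JFR baseline pushes the ratio over 35.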

Update 2: I ran a Quarkus getting-started quickstart with native memory tracking enabled. The native memory sizes shown below are sampled at startup and shutdown. It looks like JFR is consuming the same amount of native memory in Mandrel 24 and 25. I also noticed that the Java heap committed size at startup is ~2MB less in Mandrel 25. This matches the decrease in RSS reported by the jfrPerfTest logs.

In Mandrel 24 with Quarkus 3.18.4: [image]

In Mandrel 25 with Quarkus 3.18.4: [image]

It's still unclear why executables built with JFR do not experience the smaller heap usage in 25.

roberttoyonaga avatar Sep 29 '25 16:09 roberttoyonaga

This issue appears to be stale because it has been open 30 days with no activity. This issue will be closed in 7 days unless Stale label is removed, a new comment is made, or not-Stale label is added.

github-actions[bot] avatar Oct 30 '25 00:10 github-actions[bot]

I haven't forgotten about this. It's still on my todo list. I plan to revisit it next week.

roberttoyonaga avatar Oct 30 '25 13:10 roberttoyonaga

Adding the not-Stale label helps silence the stale bot :)

jerboaa avatar Oct 30 '25 18:10 jerboaa

Just an update: After some more digging, it seems like at least some of the JFR RSS increase is due to new features in JDK 25. Some of those features we support, some we do not. For the features we do not support, I made a PR https://github.com/oracle/graal/pull/12500 to make the feature code unreachable. This reduces code area and image heap size.

I'm also working on a PR to reduce JFR native memory consumption.

roberttoyonaga avatar Nov 07 '25 15:11 roberttoyonaga

I opened another PR to help reduce JFR native memory consumption https://github.com/oracle/graal/pull/12502

roberttoyonaga avatar Nov 07 '25 21:11 roberttoyonaga