jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8337511: Implement JEP 404: Generational Shenandoah (Experimental)

Open earthling-amzn opened this issue 1 year ago • 13 comments

This PR merges JEP 404, a generational mode for the Shenandoah garbage collector. The JEP can be viewed here: https://openjdk.org/jeps/404. We would like to target JDK24 with this PR.


Progress

  • [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [ ] Change requires CSR request JDK-8307343 to be approved
  • [ ] Change requires a JEP request to be targeted

Issues

  • JDK-8337511: Implement JEP 404: Generational Shenandoah (Experimental) (Sub-task - P4)
  • JDK-8260865: JEP 404: Generational Shenandoah (Experimental) (JEP)
  • JDK-8307343: Implementation: JEP 404: Generational Shenandoah (Experimental) (CSR)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21273/head:pull/21273
$ git checkout pull/21273

Update a local copy of the PR:
$ git checkout pull/21273
$ git pull https://git.openjdk.org/jdk.git pull/21273/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21273

View PR using the GUI difftool:
$ git pr show -t 21273

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21273.diff

Webrev

Link to Webrev Comment

earthling-amzn avatar Sep 30 '24 22:09 earthling-amzn

:wave: Welcome back wkemper! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Sep 30 '24 22:09 bridgekeeper[bot]

@earthling-amzn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8337511: Implement JEP 404: Generational Shenandoah (Experimental)

Co-authored-by: Kelvin Nilsen <[email protected]>
Co-authored-by: Y. Srinivas Ramakrishna <[email protected]>
Co-authored-by: Bernd Mathiske <[email protected]>
Co-authored-by: Martin Doerr <[email protected]>
Co-authored-by: Fei Yang <[email protected]>
Reviewed-by: rkennke, shade, phh

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 3 new commits pushed to the master branch:

  • 2beb2b602bf20f1ec36e6244eca1a2eb50baccb4: 8345234: Build system erroneously treats 32-bit x86 Zero as deprecated
  • ed03f0d9d10518242a3dc6e3685f1bdb0550c723: 8345145: Display javap LineNumberTable and LocalVariableTable iff disassembled code output with -c or -v
  • e9136b5e08abc20038c7b2089ab8fe320e4faef0: 8345223: Remove stray doPrivileged in java.base java.net and sun.net classes after JEP 486 integration

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch. As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Sep 30 '24 22:09 openjdk[bot]

@earthling-amzn The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Sep 30 '24 22:09 openjdk[bot]

Webrevs

mlbridge[bot] avatar Sep 30 '24 22:09 mlbridge[bot]

⚠️ @earthling-amzn This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

openjdk[bot] avatar Oct 04 '24 18:10 openjdk[bot]

There seems to be something missing:

/home/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp: In member function ‘oop ShenandoahHeap::try_evacuate_object(oop, Thread*, ShenandoahHeapRegion*, ShenandoahAffiliation)’:
/home/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:1276:3: error: ‘_evac_tracker’ was not declared in this scope; did you mean ‘_mmu_tracker’?
 1276 |   _evac_tracker->begin_evacuation(thread, size * HeapWordSize);
      |   ^~~~~~~~~~~~~
      |   _mmu_tracker
/home/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp: In member function ‘virtual void ShenandoahHeap::print_tracing_info() const’:
/home/runner/work/jdk/jdk/src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:1534:5: error: ‘evac_tracker’ was not declared in this scope; did you mean ‘mmu_tracker’?
 1534 |     evac_tracker()->print_global_on(&ls);
      |     ^~~~~~~~~~~~
      |     mmu_tracker

rkennke avatar Oct 07 '24 13:10 rkennke

Congratulations! Good to see the great progress. I just built this PR for some testing and found something unexpected. I ran the genshen VS shenandoah(default) with jbb2015 on aarch64 N2 with 8 cores and Xmx8g. The critical-jOPS of genshen(5373) is behind shenandoah(6027). Do I miss something on the options? java -Xmx8g -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCHeuristics=adaptive -XX:ShenandoahGCMode=generational -Xlog:gc* -XX:MetaspaceSize=1g -jar specjbb2015.jar -m COMPOSITE

mmyxym avatar Oct 09 '24 03:10 mmyxym

/jep jep-404

liach avatar Oct 11 '24 01:10 liach

@liach This pull request will not be integrated until the JEP-404 has been targeted.

openjdk[bot] avatar Oct 11 '24 01:10 openjdk[bot]

@earthling-amzn this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout great-genshen-pr-redux
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

openjdk[bot] avatar Oct 18 '24 18:10 openjdk[bot]

@mmyxym , Thank you for testing this out! I apologize for not responding to your comment sooner. We run specjbb2015 regularly in our integration pipeline. We see a slight improvement with the generational mode; certainly no regression:

Shen: RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3620, critical-jOPS = 2375
Gen:  RUN RESULT: hbIR (max attempted) = 3934, hbIR (settled) = 3295, max-jOPS = 4013, critical-jOPS = 2470
Shen: RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3667, critical-jOPS = 2397
Gen:  RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3996, critical-jOPS = 2414

These results were produced with these arguments:

-XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticV
MOptions -XX:-TieredCompilation -XX:-ShenandoahPacing -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xmx10g -Xms10g -XX:ShenandoahGCMode=generational

These runs executed on a Graviton2 host with 4 cores, 16G. I'll run again on a host with more cores and with your exact command line parameters.

earthling-amzn avatar Oct 22 '24 15:10 earthling-amzn

@mmyxym , Thank you for testing this out! I apologize for not responding to your comment sooner. We run specjbb2015 regularly in our integration pipeline. We see a slight improvement with the generational mode; certainly no regression:

Shen: RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3620, critical-jOPS = 2375
Gen:  RUN RESULT: hbIR (max attempted) = 3934, hbIR (settled) = 3295, max-jOPS = 4013, critical-jOPS = 2470
Shen: RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3667, critical-jOPS = 2397
Gen:  RUN RESULT: hbIR (max attempted) = 4701, hbIR (settled) = 3934, max-jOPS = 3996, critical-jOPS = 2414

These results were produced with these arguments:

-XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticV
MOptions -XX:-TieredCompilation -XX:-ShenandoahPacing -XX:+AlwaysPreTouch -XX:+DisableExplicitGC -Xmx10g -Xms10g -XX:ShenandoahGCMode=generational

These runs executed on a Graviton2 host with 4 cores, 16G. I'll run again on a host with more cores and with your exact command line parameters.

What's the reason to disable tiered compilation?

zhengyu123 avatar Oct 22 '24 17:10 zhengyu123

We disabled tiered compilation to force everything to compile through C2 to get more consistent results.

earthling-amzn avatar Oct 22 '24 18:10 earthling-amzn

@earthling-amzn ,thanks for providing your options. Looks like disabling pacing would help the score.

-XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:+UnlockDiagnosticVMOptions -XX:-TieredCompilation -XX:-ShenandoahPacing -XX:+AlwaysPreTouch -XX:+DisableExplicitGC

I guess 4 cores may not be a target configuration for pauseless GC since generational pauseless GC requires at least 2 concurrent GC threads that already occupy half of CPU resource. I did a rough test on 8 cores and 16 cores according to your options. GenShen doesn't have obvious advantages with 8 cores and Shenandoah seems to outperform genshen significantly with 16 cores. Would you please help verify the result in your environment?

8 cores -Xmx8g -Xms8g:

GenShen:
RUN RESULT: hbIR (max attempted) = 9640, hbIR (settled) = 8050, max-jOPS = 8098, critical-jOPS = 6132
RUN RESULT: hbIR (max attempted) = 8944, hbIR (settled) = 8574, max-jOPS = 7781, critical-jOPS = 6130

Shenandoah:
RUN RESULT: hbIR (max attempted) = 9640, hbIR (settled) = 8050, max-jOPS = 7616, critical-jOPS = 6194
RUN RESULT: hbIR (max attempted) = 9640, hbIR (settled) = 8050, max-jOPS = 7712, critical-jOPS = 6262

16 cores -Xmx20g -Xms20g:

GenShen:
RUN RESULT: hbIR (max attempted) = 19881, hbIR (settled) = 16584, max-jOPS = 17694, critical-jOPS = 13274
RUN RESULT: hbIR (max attempted) = 19881, hbIR (settled) = 18724, max-jOPS = 17694, critical-jOPS = 13445
Shenandoah:
RUN RESULT: hbIR (max attempted) = 23838, hbIR (settled) = 20446, max-jOPS = 18594, critical-jOPS = 15441
RUN RESULT: hbIR (max attempted) = 20138, hbIR (settled) = 19989, max-jOPS = 18728, critical-jOPS = 14967

mmyxym avatar Oct 30 '24 02:10 mmyxym

@mmyxym - I disabled Shenandoah's pacer, and find that the generational mode out performs the non-generational mode:

Shen: RUN RESULT: hbIR (max attempted) = 16584, hbIR (settled) = 13837, max-jOPS = 11609, critical-jOPS = 11062
Shen: RUN RESULT: hbIR (max attempted) = 16584, hbIR (settled) = 13837, max-jOPS = 11609, critical-jOPS = 10425

Gen:  RUN RESULT: hbIR (max attempted) = 16584, hbIR (settled) = 14144, max-jOPS = 13267, critical-jOPS = 12087
Gen:  RUN RESULT: hbIR (max attempted) = 16584, hbIR (settled) = 13837, max-jOPS = 13267, critical-jOPS = 12151

The shared options for these runs were:

"-Xmx8g -Xms8g -XX:ActiveProcessorCount=8 -XX:+UseShenandoahGC -XX:+UnlockExperimentalVMOptions -XX:MetaspaceSize=1g -XX:-ShenandoahPacing

This is an aarch64 host with 16 cores and 64G of memory.

earthling-amzn avatar Oct 30 '24 19:10 earthling-amzn

@zhengyu123 and @shipilev : could you please take another look and leave your review feedback and/or approval? We'd like to land the changes this week if possible. Currently internal testing is in progress after merging with the latest changes from tip. Thanks!

ysramakrishna avatar Nov 12 '24 02:11 ysramakrishna

The new test TestAllocOutOfMemory.java#large failed on some PPC64 machines:

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"

java.lang.RuntimeException: 'java.lang.OutOfMemoryError: Java heap space' missing from stdout/stderr

Probably not a show-stopper. Should the test be adapted or disabled for this platform?

TheRealMDoerr avatar Nov 14 '24 23:11 TheRealMDoerr

@TheRealMDoerr - do you have the logs from the test failure? If so, could you open a ticket with them in JBS? Thank you!

earthling-amzn avatar Nov 15 '24 00:11 earthling-amzn

@TheRealMDoerr - do you have the logs from the test failure? If so, could you open a ticket with them in JBS? Thank you!

Filed JDK-8344312.

In addition, we see the following tests failing with "java.lang.OutOfMemoryError: Java heap space" on all Shenandoah platforms: gc/stress/jfr/TestStressBigAllocationGCEventsWithShenandoah.java#default gc/stress/jfr/TestStressBigAllocationGCEventsWithShenandoah.java#generational Can you reproduce those?

TheRealMDoerr avatar Nov 15 '24 15:11 TheRealMDoerr

@TheRealMDoerr - Yes, those tests failed for me. I'll look into them. I'm also working to get my hands on a PPC64 machine to investigate JDK-8344312. We appreciate your help here!

earthling-amzn avatar Nov 15 '24 23:11 earthling-amzn

@TheRealMDoerr , it looks like @ysramakrishna has filed a ticket already for the TestStressBigAllocationGCEventsWithShenandoah failures.

earthling-amzn avatar Nov 18 '24 22:11 earthling-amzn

/contributor add @kdnilsen /contributor add @ysramakrishna

earthling-amzn avatar Nov 26 '24 17:11 earthling-amzn

@earthling-amzn Contributor Kelvin Nilsen <[email protected]> successfully added.

openjdk[bot] avatar Nov 26 '24 17:11 openjdk[bot]

@earthling-amzn Contributor Y. Srinivas Ramakrishna <[email protected]> successfully added.

openjdk[bot] avatar Nov 26 '24 17:11 openjdk[bot]

Can you also add the platform code contributors to the Contributors, please?

TheRealMDoerr avatar Nov 26 '24 17:11 TheRealMDoerr

Thank you for getting us started: /contributor add bmathiske

Thank you for the PPC64 port: /contributor add @TheRealMDoerr

Thank you for the RISCV port: /contributor add @RealFYang

earthling-amzn avatar Nov 26 '24 19:11 earthling-amzn

@earthling-amzn Contributor Bernd Mathiske <[email protected]> successfully added.

openjdk[bot] avatar Nov 26 '24 19:11 openjdk[bot]

@earthling-amzn Contributor Martin Doerr <[email protected]> successfully added.

openjdk[bot] avatar Nov 26 '24 19:11 openjdk[bot]

@earthling-amzn Contributor Fei Yang <[email protected]> successfully added.

openjdk[bot] avatar Nov 26 '24 19:11 openjdk[bot]

Thanks! I've run another round of tests on all our platforms and the only failure I have seen is JDK-8344312. I can do more stress testing by making GenShen the default GC. But that can still be done after integration. Could you please add the failing test to the problem list to avoid failures in our CI? gc/shenandoah/oom/TestAllocOutOfMemory.java 8344312 linux-ppc64le

TheRealMDoerr avatar Nov 27 '24 09:11 TheRealMDoerr