jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8335493: check_gc_overhead_limit should reset SoftRefPolicy::_should_clear_all_soft_refs

Open mmyxym opened this issue 1 year ago • 14 comments
trafficstars

It's for fixing the issue should_clear_all_soft_refs is not reset after Parallel Full GC.

Tested test/hotspot/jtreg/gc with -XX:+UseParallelGC


Progress

  • [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue

Issue

  • JDK-8335493: check_gc_overhead_limit should reset SoftRefPolicy::_should_clear_all_soft_refs (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19982/head:pull/19982
$ git checkout pull/19982

Update a local copy of the PR:
$ git checkout pull/19982
$ git pull https://git.openjdk.org/jdk.git pull/19982/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19982

View PR using the GUI difftool:
$ git pr show -t 19982

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19982.diff

Webrev

Link to Webrev Comment

mmyxym avatar Jul 02 '24 06:07 mmyxym

:wave: Welcome back lmao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Jul 02 '24 06:07 bridgekeeper[bot]

@mmyxym This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8335493: check_gc_overhead_limit should reset SoftRefPolicy::_should_clear_all_soft_refs

Reviewed-by: ayang

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 53 new commits pushed to the master branch:

  • f4fa35e28b9881729ac47c8518e758bba676fdec: 8330954: since-checker - Fix remaining @ since tags in java.base
  • 3050ba017687ac13e1bbccdd1544d25f8eb2a747: 8335654: Remove stale hyperlink in divnode.cpp
  • da0ffa8b7ff04eb5cbc0fcbe4b858f20d7e46405: 8334031: Generated JfrNativeSettings seems off
  • b0efd7740243916ba22178524ab2ede9e5436d94: 8314653: Metaspace: remove allocation guard feature
  • 6a472797a410a6fa27f50371b255054af0cd3c99: 8332072: Convert package.html files in java.naming to package-info.java
  • 7e378fccd8a4601c8b8e86aa2862c61e469c3a04: 8335667: Fix simple -Wzero-as-null-pointer-constant warnings in compiler code
  • ced99066354fc6a32c587b9e3c35b07e26d3452e: 8334371: [AIX] Beginning with AIX 7.3 TL1 mmap() supports 64K memory pages
  • 916db07e533cdc0fca2010751f7ebe54e6ada7b9: 8335532: [JVMCI] Export VM_Version::L1_line_size in JVMCI
  • c0604fb823d9f3b2e347a9857b11606b223ad8ec: 8334890: Missing unconditional cross modifying fence in nmethod entry barriers
  • cf1be87279ddfb2a9fd272e0b245fccd7ec10972: 8335663: Fix simple -Wzero-as-null-pointer-constant warnings in C2 code
  • ... and 43 more: https://git.openjdk.org/jdk/compare/5fe07b36d9eb296661692d903ed0b9b5afefba0f...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@albertnetymk) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

openjdk[bot] avatar Jul 02 '24 06:07 openjdk[bot]

@mmyxym The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Jul 02 '24 06:07 openjdk[bot]

Webrevs

mlbridge[bot] avatar Jul 02 '24 07:07 mlbridge[bot]

Inspecting the existing call chains of set_should_clear_all_soft_refs, I can see that Parallel full-gc already invokes it via adaptive-size-policy.

albertnetymk avatar Jul 02 '24 08:07 albertnetymk

Inspecting the existing call chains of set_should_clear_all_soft_refs, I can see that Parallel full-gc already invokes it via adaptive-size-policy.

Yes, set_should_clear_all_soft_refs will be set to true adaptively but never reset to false.

mmyxym avatar Jul 02 '24 08:07 mmyxym

set_should_clear_all_soft_refs will be set to true adaptively but never reset to false.

I see; does it make sense to reset to false inside check_gc_overhead_limit so that callers of set_should_clear_all_soft_refs are in the same scope? (Other callers of set_should_clear_all_soft_refs always invokes it in pairs.)

If this reset-to-false occurs in the destructor as in your PR, who sets it to true for Parallel or other GCs?

albertnetymk avatar Jul 02 '24 09:07 albertnetymk

set_should_clear_all_soft_refs will be set to true adaptively but never reset to false.

I see; does it make sense to reset to false inside check_gc_overhead_limit so that callers of set_should_clear_all_soft_refs are in the same scope? (Other callers of set_should_clear_all_soft_refs always invokes it in pairs.)

If this reset-to-false occurs in the destructor as in your PR, who sets it to true for Parallel or other GCs?

I guess the GC cannot be guranteed to happen because of GC Locker for Parallel GC so resetting in ~ClearedAllSoftRefs might be better. As you mentioned, check_gc_overhead_limit sets it for Parallel GC. Shenandoah explicitly sets true/false since it doesn't have the gc lock blocking issue.

mmyxym avatar Jul 02 '24 10:07 mmyxym

set_should_clear_all_soft_refs should match in pairs. If it's reset in the destructor, where is corresponding set-to-true?

(In the Parallel case, the destructor is invoked after check_gc_overhead_limit so the reset would overwrite the set-to-true there.)

albertnetymk avatar Jul 03 '24 10:07 albertnetymk

set_should_clear_all_soft_refs should match in pairs. If it's reset in the destructor, where is corresponding set-to-true?

(In the Parallel case, the destructor is invoked after check_gc_overhead_limit so the reset would overwrite the set-to-true there.)

check_gc_overhead_limit sets should_clear_all_soft_refs to true and the sucessful full gc will reset should_clear_all_soft_refs. I didn't the get the point...

mmyxym avatar Jul 03 '24 10:07 mmyxym

In PSParallelCompact::invoke_no_policy, the structure is essentially:

{
  ClearedAllSoftRefs ...
  check_gc_overhead_limit; // set-to-true
}

The destructor (as in this PR) would cancel out the clear-soft-ref in the upcoming GCs.

the sucessful full gc

When does "unsuccessful" full-gc occur?

albertnetymk avatar Jul 03 '24 10:07 albertnetymk

In PSParallelCompact::invoke_no_policy, the structure is essentially:

{
  ClearedAllSoftRefs ...
  check_gc_overhead_limit; // set-to-true
}

The destructor (as in this PR) would cancel out the clear-soft-ref in the upcoming GCs.

the sucessful full gc

When does "unsuccessful" full-gc occur?

Thanks, I see. Do you think reset should_clear_all_soft_refs in constructor of ClearedAllSoftRefs is a better way?

   ClearedAllSoftRefs(bool clear_all_soft_refs, SoftRefPolicy* soft_ref_policy) :
     _clear_all_soft_refs(clear_all_soft_refs),
-    _soft_ref_policy(soft_ref_policy) {}
+    _soft_ref_policy(soft_ref_policy) {
+    if (_clear_all_soft_refs) {
+      _soft_ref_policy->set_should_clear_all_soft_refs(false);
+    }
+  }

mmyxym avatar Jul 03 '24 10:07 mmyxym

Personally, I believe ClearedAllSoftRefs will not be necessary in the long run, as it is only used by the full GC of Parallel and G1.

To address the missing set_should_clear_all_soft_refs(false) in Parallel, a quick fix might be to modify the code around check_gc_overhead_limit so that both setters (true and false) are visible within the same scope. However, a more comprehensive improvement, in my opinion, would be to enhance the GC overhead checking in Parallel. This could be done by invoking the check at the beginning of the GC pause. This way, the decision to clear soft references can be made at the start of the GC pause and applied to the current GC pause, instead of deferring it to future GCs. While I believe this approach is viable, it would require further performance testing to confirm its effectiveness once a prototype is ready.

albertnetymk avatar Jul 03 '24 10:07 albertnetymk

Personally, I believe ClearedAllSoftRefs will not be necessary in the long run, as it is only used by the full GC of Parallel and G1.

To address the missing set_should_clear_all_soft_refs(false) in Parallel, a quick fix might be to modify the code around check_gc_overhead_limit so that both setters (true and false) are visible within the same scope. However, a more comprehensive improvement, in my opinion, would be to enhance the GC overhead checking in Parallel. This could be done by invoking the check at the beginning of the GC pause. This way, the decision to clear soft references can be made at the start of the GC pause and applied to the current GC pause, instead of deferring it to future GCs. While I believe this approach is viable, it would require further performance testing to confirm its effectiveness once a prototype is ready.

I have made the reset in check_gc_overhead_limit.

mmyxym avatar Jul 04 '24 05:07 mmyxym

/integrate

mmyxym avatar Jul 05 '24 02:07 mmyxym

@mmyxym Your change (at version 8dc625dcfce5af82e8d75ee86a24acc1266e8d92) is now ready to be sponsored by a Committer.

openjdk[bot] avatar Jul 05 '24 02:07 openjdk[bot]

/sponsor

D-D-H avatar Jul 05 '24 02:07 D-D-H

Going to push as commit cff9e246cc2fbd3914f40bb71daa85dcf7731396. Since your change was applied there have been 53 commits pushed to the master branch:

  • f4fa35e28b9881729ac47c8518e758bba676fdec: 8330954: since-checker - Fix remaining @ since tags in java.base
  • 3050ba017687ac13e1bbccdd1544d25f8eb2a747: 8335654: Remove stale hyperlink in divnode.cpp
  • da0ffa8b7ff04eb5cbc0fcbe4b858f20d7e46405: 8334031: Generated JfrNativeSettings seems off
  • b0efd7740243916ba22178524ab2ede9e5436d94: 8314653: Metaspace: remove allocation guard feature
  • 6a472797a410a6fa27f50371b255054af0cd3c99: 8332072: Convert package.html files in java.naming to package-info.java
  • 7e378fccd8a4601c8b8e86aa2862c61e469c3a04: 8335667: Fix simple -Wzero-as-null-pointer-constant warnings in compiler code
  • ced99066354fc6a32c587b9e3c35b07e26d3452e: 8334371: [AIX] Beginning with AIX 7.3 TL1 mmap() supports 64K memory pages
  • 916db07e533cdc0fca2010751f7ebe54e6ada7b9: 8335532: [JVMCI] Export VM_Version::L1_line_size in JVMCI
  • c0604fb823d9f3b2e347a9857b11606b223ad8ec: 8334890: Missing unconditional cross modifying fence in nmethod entry barriers
  • cf1be87279ddfb2a9fd272e0b245fccd7ec10972: 8335663: Fix simple -Wzero-as-null-pointer-constant warnings in C2 code
  • ... and 43 more: https://git.openjdk.org/jdk/compare/5fe07b36d9eb296661692d903ed0b9b5afefba0f...master

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Jul 05 '24 02:07 openjdk[bot]

@D-D-H @mmyxym Pushed as commit cff9e246cc2fbd3914f40bb71daa85dcf7731396.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Jul 05 '24 02:07 openjdk[bot]

@D-D-H - Did you review this change before sponsoring? HotSpot changes generally require 2 reviewers.

kimbarrett avatar Jul 05 '24 07:07 kimbarrett

@D-D-H - Did you review this change before sponsoring? HotSpot changes generally require 2 reviewers.

@kimbarrett Sorry, I think it's not critical and kind of trivial so integrate without another reviewer. Do you agree the fix makes sense?

mmyxym avatar Jul 05 '24 08:07 mmyxym

At a quick glance it looks okay, but I've not looked at it carefully enough to say I've reviewed it. Given the amount of discussion I can't agree with it being "trivial". Small, yes, but with potentially significant impact. And HotSpot rules around trivial changes and bypassing some of the review process requires agreement by the Reviewer, of which there is none recorded in the discussion here.

kimbarrett avatar Jul 05 '24 08:07 kimbarrett

@kimbarrett

Sorry about that. I overlooked the rule that hotspot non-trivial changes require reviews from at least two reviewers. I'll be more careful in the future to follow the rules strictly. 

D-D-H avatar Jul 05 '24 08:07 D-D-H

At a quick glance it looks okay, but I've not looked at it carefully enough to say I've reviewed it. Given the amount of discussion I can't agree with it being "trivial". Small, yes, but with potentially significant impact. And HotSpot rules around trivial changes and bypassing some of the review process requires agreement by the Reviewer, of which there is none recorded in the discussion here.

@kimbarrett Thanks for the clarification. I would follow the exact progress next time.

mmyxym avatar Jul 05 '24 08:07 mmyxym

Hi @D-D-H / @mmyxym / @albertnetymk / @kimbarrett

My name is Leela Mohan Venati. I am part of Service Now JDK team (@zhengyu123 is also part of it) . I found this issue few weeks back. We were trying to file this bug upstream to fix it. In the meantime, Good to see this fix coming. Thanks.

As part of fixing the bug, I did some archeology to understand how this issue occurred. This bug was introduced as part of 8198515: Extract SoftReferencePolicy code out of CollectorPolicy . As the name suggests, this change introduced new classes SoftRefPolicy and SoftRefGenPolicy. These classes factored out CollectorPolicy classes. These new classes importantly hold two important attributes - a) "_all_soft_refs_clear" - Boolean value enabled after the successful completion of "clear all" soft ref policy enabled Full GC.
b) " _should_clear_all_soft_refs" - Boolean value which is enabled ergonomically by overhead checker. Note that Overhead checker is only applicable for parallelGC and only after full GC and when adaptive sizing policy is enabled only. This value is used for future FullGC.

Note that first flag is generally be applicable for any GC algorithm irrespective of overhead checker is available or not. Reason: "Allocation failure" GC requests can ask for "clear all" soft ref policy Full GCs. It is internally called "max compaction" Full GC.
However, second flag is only applicable if there is overhead checker which ergonomically enables "clear all" policy.

GC algorithms which don't have the overhead checker are expected to use SoftRefPolicy type for "_soft_ref_policy" field. Using SoftRefPolicy, there is no need of "resetting" of "_should_clear_all_soft_refs" after Full GC is completed. Check the SoftRefPolicy::cleared_all_soft_refs(). GenCollectedHeap (which is for Serial GC) should have used SoftRefPolicy instead of SoftRefGenPolicy here

GC algorithms which have the overhead checker are expected to use SoftRefGenPolicy for "_soft_ref_policy" field. This allows "resetting" of "_should_clear_all_soft_refs" after Full GC is completed. Check the SoftRefGenPolicy::cleared_all_soft_refs . Also note that , it handles "near overhead limit" to reset for not. ParallelScavengeHeap (which is for Parallel GC) should have used SoftRefGenPolicy instead of SoftRefPolicy here. That's the bug.

Except for the introduction of the bug, change itself is nice cleanup change.

Now why am i explaining this back story ?

I would have liked this change to be fixed using SoftRefGenPolicy type for "_soft_ref_policy" field in ParallelScavengeHeap. However, i do notice that SoftRefGenPolicy is removed in the master that means, we need add the file which was removed.

On jdk17u branch, i made a patch (see the attachment) which renames SoftGenRefPolicy as "PSSoftRefPolicy" (following the convention of parallel GC specific class) and used PSSoftRefPolicy for "soft_ref_policy" field . patch.txt

leelamv avatar Jul 11 '24 01:07 leelamv

Hi @leelamv , I just want to fix the unpaired set/reset of clear-soft-ref for Parallel GC. You can still refactor the policy on Parallel GC and @albertnetymk had some suggestions in review comments. I think that would be quite welcome.

mmyxym avatar Jul 12 '24 05:07 mmyxym

/backport jdk21u-dev

mmyxym avatar Jul 22 '24 05:07 mmyxym

@mmyxym the backport was successfully created on the branch backport-mmyxym-cff9e246-master in my personal fork of openjdk/jdk21u-dev. To create a pull request with this backport targeting openjdk/jdk21u-dev:master, just click the following link:

:arrow_right: Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit cff9e246 from the openjdk/jdk repository.

The commit being backported was authored by Liang Mao on 5 Jul 2024 and was reviewed by Albert Mingkun Yang.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u-dev:

$ git fetch https://github.com/openjdk-bots/jdk21u-dev.git backport-mmyxym-cff9e246-master:backport-mmyxym-cff9e246-master
$ git checkout backport-mmyxym-cff9e246-master
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk21u-dev.git backport-mmyxym-cff9e246-master

openjdk[bot] avatar Jul 22 '24 05:07 openjdk[bot]

/backport jdk23u

mmyxym avatar Jul 22 '24 08:07 mmyxym

@mmyxym the backport was successfully created on the branch backport-mmyxym-cff9e246-master in my personal fork of openjdk/jdk23u. To create a pull request with this backport targeting openjdk/jdk23u:master, just click the following link:

:arrow_right: Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit cff9e246 from the openjdk/jdk repository.

The commit being backported was authored by Liang Mao on 5 Jul 2024 and was reviewed by Albert Mingkun Yang.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk23u:

$ git fetch https://github.com/openjdk-bots/jdk23u.git backport-mmyxym-cff9e246-master:backport-mmyxym-cff9e246-master
$ git checkout backport-mmyxym-cff9e246-master
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk23u.git backport-mmyxym-cff9e246-master

⚠️ @mmyxym You are not yet a collaborator in my fork openjdk-bots/jdk23u. An invite will be sent out and you need to accept it before you can proceed.

openjdk[bot] avatar Jul 22 '24 08:07 openjdk[bot]