8370479: [lworld] OOP-related crashes via mvn surefire

Open Arraying opened this issue 2 weeks ago • 3 comments

Hi all,

This PR fixes JDK-8370479, as we currently don't emit barriers correctly. Tested tiers 1-3.

Context

Consider a value class/record with a single field that is a reference to an identity object:

public static value record Element(Identity underlying) {}
public static class Identity {}

We can create a flattened array via the JDK-internal ValueClass API:

Object[] array = ValueClass.newNullableAtomicArray(Element.class, 16);

This will indeed be flattened when running with compressed oops. The reference to underlying will be four bytes, and the null-marker an additional byte. Hence, we are below the 64-bit limit. Copying this array via Arrays.copyOf will trigger Valhalla-specific copying.

When running with G1, there are various crashes and verification errors. This should not impact ZGC, as the pointers are too large to be flattened in a nullable array.
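The size argument above can be checked with a little arithmetic. The following is only a sketch of that reasoning, not HotSpot's actual layout logic: the class name, method names, and the constants are hypothetical, and real layouts may add padding or alignment that this ignores.

```java
// Sketch only: models the size arithmetic from the description,
// not the VM's real field layout code. All names are hypothetical.
final class FlatLayoutSketch {
    // Assumed limit for atomically accessed flat payloads ("64-bit limit" above).
    static final int ATOMIC_LIMIT_BITS = 64;
    // The null-marker is described above as one additional byte.
    static final int NULL_MARKER_BYTES = 1;

    /** Payload size in bits for a nullable element holding a single reference. */
    static int payloadBits(int referenceBytes) {
        return (referenceBytes + NULL_MARKER_BYTES) * 8;
    }

    /** True if the payload stays within the assumed atomic limit. */
    static boolean fitsAtomically(int referenceBytes) {
        return payloadBits(referenceBytes) <= ATOMIC_LIMIT_BITS;
    }

    public static void main(String[] args) {
        // Compressed oops: 4-byte reference plus null-marker.
        System.out.println("4-byte reference: " + payloadBits(4)
                + " bits, fits=" + fitsAtomically(4));
        // Uncompressed 8-byte reference (as with ZGC) plus null-marker.
        System.out.println("8-byte reference: " + payloadBits(8)
                + " bits, fits=" + fitsAtomically(8));
    }
}
```

With a 4-byte compressed reference the payload is 40 bits, which fits. With an 8-byte reference it is 72 bits, which does not, matching the expectation that ZGC is unaffected.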

New Barrier Emission

Currently, we do not emit a post-write barrier when copying to an uninitialized memory destination. The tables below summarize which barriers, if any, are emitted in both the old and new versions of the copy implementation. Note that Serial, Parallel and G1 have the notion of post-write barriers to track intergenerational references. G1 is the only GC requiring pre-write barriers.

Old G1 barrier emission during flat array copying:

|               | oopless | contains oops |
|---------------|---------|---------------|
| uninitialized | none    | none          |
| initialized   | none    | pre, post     |

New G1 barrier emission during flat array copying:

|               | oopless | contains oops |
|---------------|---------|---------------|
| uninitialized | none    | post          |
| initialized   | none    | pre, post     |

As mentioned, when copying to uninitialized memory, a cross-generational reference could be "lost" due to the lack of a post-write barrier. We should not use a pre-write barrier when copying to uninitialized memory when running with G1. Doing so means we may get garbage in our SATB buffers.
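To make the new G1 rules concrete, here is a small decision function written in Java for illustration. It is a sketch of the table's logic only: the real change lives in the VM's C++ copy code, and the class, enum, and method names below are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative model of the *new* G1 barrier choice for flat array copies.
// Not the VM implementation; all names here are hypothetical.
final class G1FlatCopyBarriers {
    enum Barrier { PRE_WRITE, POST_WRITE }

    static Set<Barrier> barriersFor(boolean containsOops, boolean destInitialized) {
        EnumSet<Barrier> barriers = EnumSet.noneOf(Barrier.class);
        if (!containsOops) {
            return barriers; // oopless payloads never need barriers
        }
        if (destInitialized) {
            // Only initialized memory holds previous values worth recording;
            // uninitialized memory would put garbage in the SATB buffers.
            barriers.add(Barrier.PRE_WRITE);
        }
        // Always track the newly written references, so that
        // cross-generational references are not lost.
        barriers.add(Barrier.POST_WRITE);
        return barriers;
    }

    public static void main(String[] args) {
        System.out.println("oops, uninitialized: " + barriersFor(true, false));
        System.out.println("oops, initialized:   " + barriersFor(true, true));
        System.out.println("oopless:             " + barriersFor(false, true));
    }
}
```

The old behaviour differs only in the uninitialized, oop-containing case, where no barrier at all was emitted.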

New Test Cases

I introduce a test scenario where we grow a flat array similar to how one would grow an ArrayList. This should generate plenty of garbage, and triggers this crash even without the whitebox GC. I test the three GCs that use ModRefBarrierSet: Serial, Parallel and G1. These are tweaked to be less concurrent/parallel to aid with reproducibility in case of crashes.
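The growth pattern can be sketched roughly as follows. This is not the PR's test: it uses a plain Object[] and an ordinary record so that it compiles on a standard JDK, whereas the real scenario would allocate the array through ValueClass.newNullableAtomicArray on a Valhalla build. The class name, the initial capacity, and the 1.5x growth factor (borrowed from ArrayList) are assumptions.

```java
import java.util.Arrays;

// Rough analogue of growing an array the way ArrayList does.
// Hypothetical names; a plain Object[] stands in for the flat array.
final class GrowLikeArrayList {
    static final class Identity {}
    record Element(Identity underlying) {}

    static Object[] fill(int count) {
        Object[] array = new Object[16];
        int size = 0;
        for (int i = 0; i < count; i++) {
            if (size == array.length) {
                // Copy into a larger, freshly allocated destination;
                // the old array becomes garbage.
                array = Arrays.copyOf(array, array.length + (array.length >> 1));
            }
            array[size++] = new Element(new Identity());
        }
        // Trim to the number of elements actually stored.
        return Arrays.copyOf(array, size);
    }

    public static void main(String[] args) {
        Object[] result = fill(100_000);
        System.out.println("stored elements: " + result.length);
    }
}
```

Each call to Arrays.copyOf writes references into a newly allocated destination, which is the uninitialized-destination case discussed above.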

Fixed Oop Printing

When G1 verification fails, it tries to print diagnostic information. This will eventually end up printing oops. We handle the case of String oops specially, and for that we need to check the klass. However, in this failed verification state, we can't guarantee that the klass isn't garbage (either through a race or literal garbage). While debugging this issue, I ran into a scenario where the klass does not pass its assertions. Consequently, we crash before the helpful diagnostic error messages finish printing. I've introduced a klass_without_asserts version of the string check, intended to be used only for diagnostics, which will perform the String check even if the VM is metaphorically on fire after a failed GC. That way, G1 is able to finish printing what it wants to print.


Progress

  • [x] Change must not contain extraneous whitespace

Issue

  • JDK-8370479: [lworld] OOP-related crashes via mvn surefire (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/valhalla.git pull/1713/head:pull/1713
$ git checkout pull/1713

Update a local copy of the PR:
$ git checkout pull/1713
$ git pull https://git.openjdk.org/valhalla.git pull/1713/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1713

View PR using the GUI difftool:
$ git pr show -t 1713

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/valhalla/pull/1713.diff

Using Webrev

Link to Webrev Comment

Arraying avatar Nov 03 '25 13:11 Arraying

:wave: Welcome back phubner! A progress list of the required criteria for merging this PR into lworld will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Nov 03 '25 13:11 bridgekeeper[bot]

@Arraying This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8370479: [lworld] OOP-related crashes via mvn surefire

Reviewed-by: fparain, coleenp

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 8 new commits pushed to the lworld branch:

  • 047e2327071e7f5a78f09f3a19980df2a82c00ba: 8370450: [lworld] Alternate implementation of the substitutability test method
  • f638741248d57152bd9f07f338db36798d9a2697: 8370951: [lworld] Value record ClassCircularityError
  • 7b5f1056363a4e636d21467ae28f99cf48d8d1f4: 8370484: [lworld] PhaseOutput::FillLocArray asserts with Unexpected type: anyptr
  • ... and 5 more: https://git.openjdk.org/valhalla/compare/a60678c089387f1a72f176008f9bfef9fcad947b...lworld

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@fparain, @coleenp) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

openjdk[bot] avatar Nov 03 '25 13:11 openjdk[bot]

Webrevs

mlbridge[bot] avatar Nov 04 '25 08:11 mlbridge[bot]

Thanks for your reviews @fparain @coleenp! /integrate

Arraying avatar Nov 05 '25 09:11 Arraying

@Arraying Your change (at version 18472fd274267d167635ddf89cb0b5d118086efb) is now ready to be sponsored by a Committer.

openjdk[bot] avatar Nov 05 '25 09:11 openjdk[bot]

/sponsor

MrSimms avatar Nov 05 '25 09:11 MrSimms

Going to push as commit 6f5c72ecd048ca9d980248a2f4af87eab0a9b996. Since your change was applied there have been 8 commits pushed to the lworld branch:

  • 047e2327071e7f5a78f09f3a19980df2a82c00ba: 8370450: [lworld] Alternate implementation of the substitutability test method
  • f638741248d57152bd9f07f338db36798d9a2697: 8370951: [lworld] Value record ClassCircularityError
  • 7b5f1056363a4e636d21467ae28f99cf48d8d1f4: 8370484: [lworld] PhaseOutput::FillLocArray asserts with Unexpected type: anyptr
  • ... and 5 more: https://git.openjdk.org/valhalla/compare/a60678c089387f1a72f176008f9bfef9fcad947b...lworld

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Nov 05 '25 09:11 openjdk[bot]

@MrSimms @Arraying Pushed as commit 6f5c72ecd048ca9d980248a2f4af87eab0a9b996.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Nov 05 '25 09:11 openjdk[bot]