jdk 8331865: Consolidate size and alignment checks in LayoutPath

When creating a nested memory access var handle, we ensure that the segment is accessed at the correct alignment for the root layout being accessed. But we do not ensure that the segment has at least the size of the accessed root layout. Example:

MemoryLayout LAYOUT = sequenceLayout(2, structLayout(JAVA_INT.withName("x"), JAVA_INT.withName("y")));
VarHandle X_HANDLE = LAYOUT.varHandle(PathElement.sequenceElement(), PathElement.groupElement("x"));

If I have a memory segment whose size is 12, I can successfully call X_HANDLE on it with index 1, like so:

MemorySegment segment = Arena.ofAuto().allocate(12);
int x = (int)X_HANDLE.get(segment, 0, 1);

This seems incorrect: after all, the provided segment doesn't have enough space to fit two elements of the nested structs.

This PR adds checks to detect this kind of scenario earlier, thus improving usability. To achieve this we could, in principle, just tweak LayoutPath::checkEnclosingLayout and add the required size check.

But doing so will add yet another redundant check on top of the other already added by pull/19124. Instead, this PR rethinks how memory segment var handles are created, in order to eliminate redundant checks.

The main observation is that size and alignment checks depend on an enclosing layout. This layout might be the very same value layout being accessed (this is the case when e.g. using JAVA_INT.varHandle()). But, in the general case, the accessed value layout and the enclosing layout might differ (e.g. think about accessing a struct field).

Furthermore, the enclosing layout check depends on the base offset at which the segment is accessed, prior to any index computation that occurs if the accessed layout path has any open elements. It is important to notice that this base offset is only available when looking at the var handle that is returned to the user. For instance, an indexed var handle with coordinates:

(MemorySegment, long /* base */, long /* index 1 */, long /* index 2 */, long /* index 3 */)

Is obtained through adaptation, by taking a more basic var handle of the form:

(MemorySegment, long /* offset */)

And then injecting the result of the index multiplication into offset. As such, we can't add an enclosing layout check inside the var handle returned by VarHandles::memorySegmentViewHandle, as doing so will end up seeing the wrong offset (e.g. an offset in which the index part has already been mixed in).

The only possibility then, is to remove size and alignment checks from the raw var handles returned by VarHandles::memorySegmentViewHandle, and perform such checks outside (e.g. in LayoutPath::dereferenceHandle). The only checks left in the raw var handles are:

an alignment check to make sure that the access mode selected by the client is really available - this alignment check is against the alignment of the value layout being selected, and not the enclosing layout alignment (which might be stricter)
a read-only check, to make sure that write access modes are blocked if the accessed segment is read-only
liveness/confinement checks, as mandated by ScopedMemoryAccess

Since these check depends on the particular access mode selected by the client, we can't move these checks away from the raw var handle.

These changes come with some consequences:

Now it is always necessary to adapt a raw var handle, and to insert appropriate size and alignment checks (otherwise OOB access might be possible). As such, ValueLayouts also need to call the path layout variant of MemoryLayout::varHandle, to make sure that the raw var handle is adapted accordingly, before it is cached in its stable field.
The var handle cache has been moved from Utils to ValueLayouts::varHandle. The cache is used (a) to reduce the number of var handle instances created and (b) to make sure that the cached var handle in the ValueLayout has stable identity. With respect to (a), while it makes sense to cache "toplevel" var handles (e.g. JAVA_INT.varHandle()) the cost/benefit ratio for caching nested var handles seem more unfavourable, as the same var handle can be reused with different enclosing layouts, leading to different checks. Ultimately, all nested var handles will require some kind of adaptation, so it doesn't seem too crucial to have a deeper level of caching here.
The order in which exceptions are thrown might be slightly different. This is because the size/alignment checks now take place before the raw var handle is even called. This caused a bunch of small updates in code and tests.
It used to be possible to create a sequence layout with maximal size, like MemoryLayout.sequenceLayout(Long.MAX_VALUE, JAVA_LONG), and derive an indexed var handle from that layout. Since there used to be no enclosing layout size check, access to the sequence element was allowed, as long as the index computation did not result in an offset outside the boundary of the accessed memory segment. This is now no longer the case: when selecting an element from the above layout, the implementation will make sure that the accessed segment indeed has the size of that sequence layout (which will probably lead to a IndexOutOfBoundException). To do indexed access on an unbounded sequence, the MemoryLayout::arrayElementVarHandle should be used instead (but this is the norm anyway for such cases).

Progress

[ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
[x] Change must not contain extraneous whitespace
[x] Commit message must refer to an issue
[ ] Change requires a CSR request matching fixVersion 23 to be approved (needs to be created)

Issue

JDK-8331865: Consolidate size and alignment checks in LayoutPath (Enhancement - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19251/head:pull/19251
$ git checkout pull/19251

Update a local copy of the PR:
$ git checkout pull/19251
$ git pull https://git.openjdk.org/jdk.git pull/19251/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19251

View PR using the GUI difftool:
$ git pr show -t 19251

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19251.diff

Webrev

Link to Webrev Comment

May 15 '24 15:05 mcimadamore

:wave: Welcome back mcimadamore! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

May 15 '24 15:05 bridgekeeper[bot]

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8331865: Consolidate size and alignment checks in LayoutPath

Reviewed-by: psandoz, jvernee

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 162 new commits pushed to the master branch:

da6aa2a86c86ba5fce747b36dcb2d6001cfcc44e: 8332849: Update doc/testing.{md,html} (spelling and stale information)
b8f2ec9091f9f7e5f4611991d04dd8aa113b94fd: 8195675: Call to insertText with single character from custom Input Method ignored
0f3e2cc334e5926d53bbbce22e4a6bfeb2752140: 8331670: Deprecate the Memory-Access Methods in sun.misc.Unsafe for Removal
51ae08f72b879bc611177ea643cd88e36185d9e8: 8333093: Incorrect comment in zAddress_aarch64.cpp
4754f059f99a426cc8c5d94b0809e79d563ffc2e: 8333035: Parallel: Remove ParMarkBitMap::IterationStatus
87a06b6ce41f8623d9111b4e41c72f0ddf842acd: 8325805: Compiler Implementation for Flexible Constructor Bodies (Second Preview)
e708d135e3af7e0652cdbb680388a0735582ba74: 8332064: Implementation of Structured Concurrency (Third Preview)
7b52d0acfc7d6083b407efa0877c139e9837f86b: 8332265: RISC-V: Materialize pointers faster by using a temp register
aa4c83a5bfe146714a46fb454aafc7393d2d8453: 8332505: JEP 457: ClassRemapper forgets to remap bootstrap method references
cabe337400a0bd61d73bf3ca66e16266267299c7: 8331921: Hotspot assembler files should use common logic to setup exported functions
... and 152 more: https://git.openjdk.org/jdk/compare/61aff6db15d5bdda77427af5ce34d0fe43373197...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

May 15 '24 15:05 openjdk[bot]

@mcimadamore The core-libs label was successfully added.

May 15 '24 15:05 openjdk[bot]

/csr

May 16 '24 10:05 mcimadamore

@mcimadamore has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@mcimadamore please create a CSR request for issue JDK-8331865 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

May 16 '24 10:05 openjdk[bot]

Webrevs

May 16 '24 11:05 mlbridge[bot]

Do we have any performance tests available to see if there are any potential impacts?

May 16 '24 11:05 minborg

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

May 16 '24 13:05 mcimadamore

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

Results here: https://cr.openjdk.org/~mcimadamore/jdk/8331865/ Unfortunately I can't seem to be able to upload the results in the visualizer (which would be handy, so I could share a link to the results). Not sure if it's an issue in the visualizer, or in the cr server itself. But, it is possible to at least download the above results locally, and then upload them to the visualizer tool.

May 16 '24 14:05 mcimadamore

Separately, we might be missing a few long argument accepting guard methods for simpler cases as I suspect they are still focused more on int index types.

Not sure I understand what guard methods you are referring to here?

May 20 '24 09:05 mcimadamore

Separately, we might be missing a few long argument accepting guard methods for simpler cases as I suspect they are still focused more on int index types.

Not sure I understand what guard methods you are referring to here?

Ah, got it. You mean more support in VarHandleGuards. Yes, I have a separate patch for that (actually had that for quite a while), but we're not super sure how to evaluate what impact it has :-)

May 20 '24 16:05 mcimadamore

Ah, got it. You mean more support in VarHandleGuards. Yes, I have a separate patch for that (actually had that for quite a while), but we're not super sure how to evaluate what impact it has :-)

Ah, i did not realize that. Yes its tricky, it was designed for VHs to fields/arrays, to really minimize their overhead. With segments there is already some additional overhead e.g.,

        if (derefAdapters.length == 0) {
            // insert align check for the root layout on the initial MS + offset
            List<Class<?>> coordinateTypes = handle.coordinateTypes();
            MethodHandle alignCheck = MethodHandles.insertArguments(MH_CHECK_ENCL_LAYOUT, 2, rootLayout());
            handle = MethodHandles.collectCoordinates(handle, 0, alignCheck);
            int[] reorder = IntStream.concat(IntStream.of(0, 1), IntStream.range(0, coordinateTypes.size())).toArray();
            handle = MethodHandles.permuteCoordinates(handle, coordinateTypes, reorder);
        }

So perhaps it does not make much difference.

May 20 '24 16:05 PaulSandoz

some additional overhead e.g.,

        if (derefAdapters.length == 0) {
            // insert align check for the root layout on the initial MS + offset
            List<Class<?>> coordinateTypes = handle.coordinateTypes();
            MethodHandle alignCheck = MethodHandles.insertArguments(MH_CHECK_ENCL_LAYOUT, 2, rootLayout());
            handle = MethodHandles.collectCoordinates(handle, 0, alignCheck);
            int[] reorder = IntStream.concat(IntStream.of(0, 1), IntStream.range(0, coordinateTypes.size())).toArray();
            handle = MethodHandles.permuteCoordinates(handle, coordinateTypes, reorder);
        }

True, the chain for segment var handle is quite long (and that is not a result of this patch, it has always been more or less like that). Funnily, I was staring at this piece of code the other day, and I think this can be dealt with morally with a foldArguments, but we don't have the equivalent of that in the VH/coordinates world. Maybe we should add that (or at least implement internally) as that would simplify the adaptation, avoiding the permute step (which is typically rather heavy).

May 20 '24 18:05 mcimadamore

/integrate

May 29 '24 11:05 mcimadamore

Going to push as commit c003c1207fae07bcfe5a6f642a9c05e6c591e7a6. Since your change was applied there have been 173 commits pushed to the master branch:

6d718ae51aeb7143ebfa561501b87fe1ba48039a: 8324341: Remove redundant preprocessor #if's checks
9b64ece514cf941ebc727991d97c43453d8a488d: 8332904: ubsan ppc64le: c1_LIRGenerator_ppc.cpp:581:21: runtime error: signed integer overflow: 9223372036854775807 + 1 cannot be represented in type 'long int'
3d4eb159e6d597f37081faf21b7e3f0f1af299e5: 8302744: Refactor Hotspot container detection code
2cca83bc82eb6b090ae96b8c072b986b93d9244a: 8332880: JFR GCHelper class recognizes "Archive" regions as valid
b8ae11e99b99866888ad090c98c96e6d0c33a3c9: 8332960: ubsan: classListParser.hpp:159:12: runtime error: load of value 2101478704, which is not a valid value for type 'ParseMode'
9a83dfee14f4cd9cda476d11a027294a810953cb: 8332431: NullPointerException in JTable of SwingSet2
01060ad4ab18581aa46bc16e64c7f12a591a682b: 8325083: jdk/incubator/vector/Double512VectorTests.java crashes in Assembler::vex_prefix_and_encode
673f767dadc8f3a784b9c31c406422846df3279b: 8285506: Unify os::vsnprintf implementations
91ab088d5e64e068bafcda8d08f1769c39ba10d6: 8333116: test/jdk/tools/jpackage/share/ServiceTest.java test fails
9ac8d05a2567fbf65b944660739e5f8ad1fc2020: 8332228: TypePollution.java: Unrecognized VM option 'UseSecondarySuperCache'
... and 163 more: https://git.openjdk.org/jdk/compare/61aff6db15d5bdda77427af5ce34d0fe43373197...master

Your commit was automatically rebased without conflicts.

May 29 '24 11:05 openjdk[bot]

@mcimadamore Pushed as commit c003c1207fae07bcfe5a6f642a9c05e6c591e7a6.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

May 29 '24 11:05 openjdk[bot]

Do we have any performance tests available to see if there are any potential impacts?

I've run all micro benchmarks whose name match LoopOver*. No regression was found. Few benchmarks seems a tad faster, but the percentages are so tiny that I'm sure it has nothing to do with the patch. I compared results before/after using JMH visualizer.

Results here: https://cr.openjdk.org/~mcimadamore/jdk/8331865/ Unfortunately I can't seem to be able to upload the results in the visualizer (which would be handy, so I could share a link to the results). Not sure if it's an issue in the visualizer, or in the cr server itself. But, it is possible to at least download the above results locally, and then upload them to the visualizer tool.

Here's a link to the results: https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json

May 29 '24 15:05 mcimadamore

jdk jdk copied to clipboard

8331865: Consolidate size and alignment checks in LayoutPath

Progress

Issue

Reviewing

Webrev

Webrevs

jdk
jdk copied to clipboard