jdk
jdk copied to clipboard
8335392: C2 MergeStores: enhanced pointer parsing
Background
I am introducing the MemPointer
, for enhanced pointer parsing. For now, it replaces the much more limited ArrayPointer
in MergeStores
(see https://github.com/openjdk/jdk/pull/16245), but eventually it is supposed to be used widely in optimizations for pointer analysis: adjacency, aliasing, etc. I also plan to refactor the VPointer
from auto-vectorization with it, and unlock more pointer patterns that way - possibly including scatter/gather.
Details
The MemPointer
decomposes a pointer into the form pointer = con + sum_i(scale_i * variable_i)
- a linear form with a sum of variables and scale-coefficients, plus some constant offset.
This form allows us to perform aliasing checks - basically we can check if two pointers are always at a constant offset. This allows us to answer many questions, including if two pointers are adjacent. MergeStores
needs to know if two stores are adjacent, so that we can safely merge them.
More details can be found in the description in mempointer.hpp
. Please read them when reviewing!
MemPointer
is more powerful than the previous ArrayPointer
: the latter only allows arrays, the former also allows native memory accesses, Unsafe
and MemorySegement
.
What this change enables
Before this change, we only allowed merging stores to arrays, where the store had to have the same type as the array element (StoreB
on byte[]
, StoreI
on int[]
).
Now we can do:
- Merging
Unsafe
stores to array. Including "mismatched size": e.g.putChar
tobyte[]
. - Merging
Unsafe
stores to native memory. - Merging
MemorySegment
: with array, native, ByteBuffer backing types.- However: there is still some problem with RangeCheck smearing (a type of RC elimination) for the examples I have tried. Without RC's smeared, we can only ever merge 2 neighbouring stores. I hope we can improve this with better RangeCheck smearing.
MemorySegment
introducecheckIndexL
, the long-variant of the RangeCheck. Normal array accesses only use the equivalent ofcheckIndex
, the int-variant that we already optimize away much better.
- However: there is still some problem with RangeCheck smearing (a type of RC elimination) for the examples I have tried. Without RC's smeared, we can only ever merge 2 neighbouring stores. I hope we can improve this with better RangeCheck smearing.
Dealing with Overflows
We have to be very careful with overflows when dealing with pointers. For this, I introduced a NoOverflowInt
. It allows us to do "normal" int operations on it, and tracks if there was ever an overflow. This way, we can do all overflow checks implicitly, and do not clutter the code with overflow-checks or - God forbid - forget overflow-checks.
Benchmarks
I added a few new benchmarks, to show the merging of Unsafe
and native
stores. We an see that 8 byte stores are now merged, and have the same performance as a long store. The same for 4 char stores that are merged into a single long store.
And here the whole MergeStores
benchmark, master
and with patch
:
All old benchmarks remain in the level of noise. A few have heavy errors - the benchmark ran on my laptop and maybe something slowed down for a bit. We can see the significant speedup with all the unsafe
benchmarks at the bottom.
This is great, now we can perform putChar
even on byte[]
. This was an issue we discovered in this PR https://github.com/openjdk/jdk/pull/19626.
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8335392: C2 MergeStores: enhanced pointer parsing (Enhancement - P4)
Reviewers
- Vladimir Kozlov (@vnkozlov - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19970/head:pull/19970
$ git checkout pull/19970
Update a local copy of the PR:
$ git checkout pull/19970
$ git pull https://git.openjdk.org/jdk.git pull/19970/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19970
View PR using the GUI difftool:
$ git pr show -t 19970
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19970.diff