jdk
jdk copied to clipboard
8342540: InterfaceCalls micro-benchmark gives misleading results
InterfaceCalls.java makes highly predictable memory accesses, which leads to a gross time underestimate of the case where a megamorphic access is unpredictable.
Here's one example, with and without randomization. The unpredictable megamorphic call takes more than 4* as long as the benchmark.
Benchmark (randomized) Mode Cnt Score Error Units
InterfaceCalls.test2ndInt3Types false avgt 4 5.013 ± 0.081 ns/op
InterfaceCalls.test2ndInt3Types true avgt 4 23.421 ± 0.102 ns/op
This patch adds the "randomized" parameter, which allows the measurement of predictable and unpredictable megamorphic calls.
Progress
- [x] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8342540: InterfaceCalls micro-benchmark gives misleading results (Enhancement - P4)
Reviewers
- Aleksey Shipilev (@shipilev - Reviewer)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21581/head:pull/21581
$ git checkout pull/21581
Update a local copy of the PR:
$ git checkout pull/21581
$ git pull https://git.openjdk.org/jdk.git pull/21581/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 21581
View PR using the GUI difftool:
$ git pr show -t 21581
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21581.diff
Webrev
:wave: Welcome back aph! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.
@theRealAph This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
8342540: InterfaceCalls micro-benchmark gives misleading results
Reviewed-by: shade, kvn
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 221 new commits pushed to the master branch:
- 9cfb0f7f7ad31081c917be1eb0e39e2552e45382: 8341527: AVX-512 intrinsic for SHA3
- 4ce19ca110b6e1eeed7483a1ec7c75fbc1d1b773: 8343190: GHA: Try building JTReg several times
- 7c800e6bae388dd87986f366787398fe99b4e2ee: 8343026: JFR: Index into fields in the topFrame
- d8b3685d36873904248e9701f66459e074a4a8ab: 8342607: Enhance register printing on x86_64 platforms
- d8430efb5e159b8e08d2cac66b46cb4ff1112927: 8339573: Update CodeCacheSegmentSize and CodeEntryAlignment for ARM
- 6332e258f91789cf50d07a6929f32ff3aaef1a92: 8343183: [s390x]: Problemlist runtime/Monitor/SyncOnValueBasedClassTest.java Failure
- 79a07ad726f4e4b0502a22a55832960aa1561911: 8343149: Cleanup os::print_tos_pc on AIX
- beff8bfe2a5334823b67cb748bc8652dc6a3f3d4: 8342823: Ubsan: ciEnv.cpp:1614:65: runtime error: member call on null pointer of type 'struct CompileTask'
- e389f82b1b2365a43fef744936b222328d71494b: 8343137: C2: VerifyLoopOptimizations fails with "Was reachable in only one"
- 0abfa3ba8f72538f62be838c1ebac8cfbdd14cdf: 8304824: NMT should not use ThreadCritical
- ... and 211 more: https://git.openjdk.org/jdk/compare/1581508988141bfb420d97759138203f30926b35...master
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.
@theRealAph The following label will be automatically applied to this pull request:
hotspot-compiler
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
Is there even a point to do non-randomized test then?
That's a very interesting question, and one that has been occupying me since I discovered this problem a few days ago.
I have no idea how often well-predicted megamorphic calls occur. I can speculate that the "typical" megamorphic case lies somewhere between these extremes, but that is all. It may well be that normal behaviour is chaotic, but I strongly suspect that the cases are unlikely to be equally probable, as they are here. So, it is possible that an utterly unpredictable access pattern is just as unrealistic as a perfectly predictable one.
I have no idea how often well-predicted megamorphic calls occur. I can speculate that the "typical" megamorphic case lies somewhere between these extremes, but that is all. It may well be that normal behaviour is chaotic, but I strongly suspect that the cases are unlikely to be equally probable, as they are here. So, it is possible that an utterly unpredictable access pattern is just as unrealistic as a perfectly predictable one.
Yeah, all right. We can have a test that brackets the real-world performance between "best" and "worst" cases.
If you want to make "best" and "worst" only differ in the actual payload, maybe we should compute both scrambled and non-scrambled indexes, feed both to Blackhole, and only then select the index based on randomized. This way we would always compute both indexes.
Also stats question: Do you know if scrambling actually produces the full period in requested range, and whether the frequency for individual cases is roughly the same? We kinda assume this xor-step is well distributed and has good enthropy in lower bits, but is it in practice?
I have no idea how often well-predicted megamorphic calls occur. I can speculate that the "typical" megamorphic case lies somewhere between these extremes, but that is all. It may well be that normal behaviour is chaotic, but I strongly suspect that the cases are unlikely to be equally probable, as they are here. So, it is possible that an utterly unpredictable access pattern is just as unrealistic as a perfectly predictable one.
Yeah, all right. We can have a test that brackets the real-world performance between "best" and "worst" cases.
If you want to make "best" and "worst" only differ in the actual payload, maybe we should compute both scrambled and non-scrambled indexes, feed both to
Blackhole, and only then select the index based onrandomized. This way we would always compute both indexes.
Sure, I can try that, but I doubt it'll affect much. The great virtue of the xorshift generator is to be so lightweight, just a few clocks, that it barely affects anything.
Also stats question: Do you know if scrambling actually produces the full period in requested
range, and whether the frequency for individual cases is roughly the same? We kinda assume this xor-step is well distributed and has good enthropy in lower bits, but is it in practice?
I don't think we assume it: the analysis is in Marsaglia's Xorshift RNGs paper. The full period (except 0) is proven there.
The generator I've used here is a compromise: it has to be just good enough to throw off a branch predictor, while still being of very low overhead. There are several elaborations of Xorshift with better statistical properties (see "xoroshiro*") but they are less successful at just getting out of the way.
I don't think we assume it: the analysis is in Marsaglia's Xorshift RNGs paper. The full period (except 0) is proven there.
Yeah, all right. Seeing is believing, and indeed the distributions look okay:
public class XorShiftHisto {
static int l;
static int step(int range) {
l = scramble(l);
return (l & Integer.MAX_VALUE) % range;
}
static int scramble(int n) {
int x = n;
x ^= x << 13;
x ^= x >>> 17;
x ^= x << 5;
return x == 0 ? 1 : x;
}
public static void main(String... args) {
for (int range = 1; range <= 10; range++) {
l = 0;
int[] histo = new int[range];
for (int c = 0; c < 1000000; c++) {
histo[step(range)]++;
}
System.out.println("Histo " + range + ": " + Arrays.toString(histo));
}
}
}
$ java XorShiftHisto.java
Histo 1: [1000000]
Histo 2: [500093, 499907]
Histo 3: [333036, 333888, 333076]
Histo 4: [250116, 249824, 249977, 250083]
Histo 5: [200577, 199994, 199940, 199945, 199544]
Histo 6: [166273, 166493, 166425, 166763, 167395, 166651]
Histo 7: [142746, 142358, 142815, 143556, 143004, 142908, 142613]
Histo 8: [124841, 124510, 125431, 124673, 125275, 125314, 124546, 125410]
Histo 9: [110846, 111088, 110885, 110834, 111145, 111201, 111356, 111655, 110990]
Histo 10: [100197, 100061, 99901, 99953, 100070, 100380, 99933, 100039, 99992, 99474]
/integrate
Going to push as commit 78b378ad03d0f6c85468ac208e84fabea79fc7de.
Since your change was applied there have been 341 commits pushed to the master branch:
- c0e6c3b93c0d21debc538e0135805c2957053108: 8343214: Fix encoding errors in APX New Data Destination Instructions Support
- 0be7118b2f761b416ebf8cbb11473d51e80be409: 8279016: JFR Leak Profiler is broken with Shenandoah
- 6811a11e278118b8b2781f1eaf45d363a3d2db49: 8341408: Implement JEP 488: Primitive Types in Patterns, instanceof, and switch (Second Preview)
- 72a45ddbad9c343200197348ccfcf74105e6fefa: 8341834: C2 compilation fails with "bad AD file" due to Replicate
- 57c3bb6091f8ba0caced6f5ecf21dc998ffeee9f: 8343068: C2: CastX2P Ideal transformation not always applied
- 83f3d42d6bcefac80449987f4d951f8280eeee3a: 8339303: C2: dead node after failing to match cloned address expression
- ead0116f2624e0e34529e47e4f509142d588b994: 8331341: secondary_super_cache does not scale well: C1 and interpreter
- 06d8216a4ef6b883119459da7e52b37d16cd2f03: 8318442: java/net/httpclient/ManyRequests2.java fails intermittently on Linux
- bdd68163df4d9b63694bfc0900e4b5ddb2475834: 8343502: RISC-V: SIGBUS in updateBytesCRC32 after JDK-8339738
- 4431852a880b06241231d346311170331c20ab2d: 8342943: Replace predicate walking and cloning code for main/post loops with a predicate visitor
- ... and 331 more: https://git.openjdk.org/jdk/compare/1581508988141bfb420d97759138203f30926b35...master
Your commit was automatically rebased without conflicts.
@theRealAph Pushed as commit 78b378ad03d0f6c85468ac208e84fabea79fc7de.
:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.