8284493: Fix rounding error in computeNextExponential
This PR improves both the performance of `nextExponential` and `nextGaussian` and the distribution of their output at the tails. It fixes the following imperfections:
- Repeatedly adding `DoubleZigguratTables.exponentialX0` to `extra` causes a rounding error to accumulate at the tail of the distribution (probably starting around `2*exponentialX0 == 0x1.e46eff20739afp3` ~ 15.1); this PR fixes that by tracking the multiple of `exponentialX0` as a `long`. (This distortion is worst when `x > 0x1.0p56`, since in that case a rounding error means `extra + x == extra`.) A minimal sketch of this pattern follows the list.
- Rewrites several expressions using `Math.fma`. (This will almost certainly improve performance, and may or may not improve the output distribution.)
- Uses the newly extracted `computeWinsorizedNextExponential` function to greatly reduce the probability that `nextGaussian` suffers from two tail cases of `nextExponential`.
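To make the first point concrete, here is a minimal sketch of the pattern (assumed constant and a simplified sampler, not the actual `RandomSupport` ziggurat code): the number of tail re-entries is counted in a `long` and folded in once with `Math.fma`, so no rounding error accumulates per iteration.

```java
import java.util.SplittableRandom;

// Simplified stand-in for the ziggurat tail loop, NOT the actual RandomSupport code.
// By the memoryless property, an exponential sample that exceeds x0 can be redrawn
// with the offset x0 remembered; counting those offsets in a long and applying them
// with one fused multiply-add avoids the per-iteration rounding error described above.
public class ExponentialTailSketch {
    // Assumed here to be half of the 0x1.e46eff20739afp3 value quoted above (~7.57).
    static final double EXPONENTIAL_X0 = 0x1.e46eff20739afp2;

    static double nextExponential(SplittableRandom rng) {
        long x0Count = 0;
        double x;
        while ((x = -Math.log1p(-rng.nextDouble())) > EXPONENTIAL_X0) {
            x0Count++; // old approach: extra += EXPONENTIAL_X0, which accumulates rounding error
        }
        // New approach: a single Math.fma rounds only once, however often the tail was re-entered.
        return Math.fma((double) x0Count, EXPONENTIAL_X0, x);
    }

    public static void main(String[] args) {
        SplittableRandom rng = new SplittableRandom(8284493L);
        System.out.println(nextExponential(rng));
    }
}
```

Because `Math.fma` rounds only once, the combined `x0Count * EXPONENTIAL_X0 + x` is correctly rounded, whereas repeated additions can lose low-order bits on every trip around the loop.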
Progress
- [ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
Issue
- JDK-8284493: Fix rounding error in computeNextExponential
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/8131/head:pull/8131
$ git checkout pull/8131
Update a local copy of the PR:
$ git checkout pull/8131
$ git pull https://git.openjdk.org/jdk pull/8131/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 8131
View PR using the GUI difftool:
$ git pr show -t 8131
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/8131.diff
Hi @Pr0methean, welcome to this OpenJDK project and thanks for contributing!
We do not recognize you as a Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.
If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user Pr0methean" as summary for the issue.
If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.
@Pr0methean The following label will be automatically applied to this pull request:
core-libs
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
/covered
@Pr0methean is an employee of Amazon.
/covered
Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!
Hi @JesperIRL, could you help us with OCA coverage?
Should we extract a computeWinsorizedNextExponential that can compute a smaller value for line 1216 to compare the output to, so that line 1218 can realistically be covered in a unit test? Such a method might even be worth exposing in the RandomGenerator interface, in case an approximately exponential distribution is ever needed in a hard real-time system.
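For concreteness, one hypothetical shape for such a helper is sketched below. The method name follows the proposal in this comment, but the body is only an illustrative clamp over a plain exponential sampler, not the ziggurat-based code in RandomSupport; the `maxValue` parameter is likewise an assumption.

```java
import java.util.random.RandomGenerator;

// Hypothetical illustration only: "winsorizing" clamps extreme tail values to a
// fixed maximum instead of exploring the tail further, which bounds both the
// returned value and the work done per call.
public class WinsorizedExponentialSketch {
    static double computeWinsorizedNextExponential(RandomGenerator rng, double maxValue) {
        double x = -Math.log1p(-rng.nextDouble()); // stand-in for the real ziggurat sampler
        return Math.min(x, maxValue);              // clamp rather than keep sampling the tail
    }

    public static void main(String[] args) {
        RandomGenerator rng = RandomGenerator.of("L64X128MixRandom");
        System.out.println(computeWinsorizedNextExponential(rng, 20.0));
    }
}
```

Bounding both the result and the per-call work is what would make an approximation like this attractive for a hard real-time system, at the cost of a (tunable) distortion above `maxValue`.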
/covered
@robilad, can you please verify Chris Hennick's (@Pr0methean) OCA status? He's an Amazon employee and covered by Amazon's OCA.
Webrevs
- 21: Full (d5ade1c7)
- 20: Full (1b5656d7)
- 19: Full - Incremental (e4714ec9)
- 18: Full (2470c00a)
- 17: Full (63706c8b)
- 16: Full - Incremental (cc7c425a)
- 15: Full - Incremental (b75139c9)
- 14: Full - Incremental (c94a09e5)
- 13: Full - Incremental (d1c9bafd)
- 12: Full - Incremental (e3f55542)
- 11: Full - Incremental (5919faeb)
- 10: Full - Incremental (cab6b5bb)
- 09: Full - Incremental (db9ff01d)
- 08: Full - Incremental (f74b6872)
- 07: Full - Incremental (d903efe5)
- 06: Full - Incremental (f84e0e03)
- 05: Full - Incremental (13c6d22c)
- 04: Full - Incremental (2c873856)
- 03: Full - Incremental (1daaadac)
- 02: Full - Incremental (c5a28f98)
- 01: Full - Incremental (fa340fb4)
- 00: Full (1fd959cc)
In addition to the changes discussed so far, I've also changed line 1382 to eliminate unneeded tail exploration; this should make nextGaussian faster at high percentiles (probably measurable at the 99%ile, and definitely measurable at the 99.99%ile).
@Pr0methean Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information.
@Pr0methean This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep open; I'll submit some benchmarks when I have a chance.
@Pr0methean This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep open
@Pr0methean This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep open
Update: just waiting for approval from work to obtain c6a.metal, c6i.metal and c6g.metal instances for long enough to run the benchmarks before and after.
@Pr0methean Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information.
@Pr0methean This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep open
@Pr0methean This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep open. Testing is in progress right now.
JMH benchmark results (`make test TEST="micro:java.util.random"`) with fixedSeed = true on EC2 are below. Average time improves only for nextExponential() and only on non-Intel CPUs, but p100 improves more consistently: 7 of the 12 tests improved by at least 1000 ns, and only 2 deteriorated by that amount.
c6i.metal
| Benchmark | Metric (ns/op) | Before | After |
| --- | --- | --- | --- |
| L64X128MixRandom.nextExponential() | average | 26.984 ± 0.015 | 34.914 ± 0.021 |
| L64X128MixRandom.nextExponential() | p100 | 8224 | 5624 |
| L64X1024MixRandom.nextExponential() | average | 30.026 ± 0.020 | 36.730 ± 0.024 |
| L64X1024MixRandom.nextExponential() | p100 | 8736 | 7464 |
| L64X128MixRandom.nextGaussian() | average | 29.995 ± 0.017 | 30.016 ± 0.019 |
| L64X128MixRandom.nextGaussian() | p100 | 5392 | 3024 |
| L64X1024MixRandom.nextGaussian() | average | 31.394 ± 0.020 | 31.660 ± 0.021 |
| L64X1024MixRandom.nextGaussian() | p100 | 5984 | 6072 |
c7g.16xlarge
| Benchmark | Metric (ns/op) | Before | After |
| --- | --- | --- | --- |
| L64X128MixRandom.nextExponential() | average | 41.006 ± 0.056 | 40.288 ± 0.055 |
| L64X128MixRandom.nextExponential() | p100 | 6384 | 6416 |
| L64X1024MixRandom.nextExponential() | average | 45.218 ± 0.071 | 40.549 ± 0.062 |
| L64X1024MixRandom.nextExponential() | p100 | 560 | 3760 |
| L64X128MixRandom.nextGaussian() | average | 39.643 ± 0.052 | 40.473 ± 0.070 |
| L64X128MixRandom.nextGaussian() | p100 | 7728 | 1264 |
| L64X1024MixRandom.nextGaussian() | average | 40.393 ± 0.061 | 40.274 ± 0.061 |
| L64X1024MixRandom.nextGaussian() | p100 | 9184 | 1280 |
m6a.metal
| Benchmark | Metric (ns/op) | Before | After |
| --- | --- | --- | --- |
| L64X128MixRandom.nextExponential() | average | 31.241 ± 0.039 | 29.004 ± 0.018 |
| L64X128MixRandom.nextExponential() | p100 | 4224 | 7224 |
| L64X1024MixRandom.nextExponential() | average | 31.903 ± 0.017 | 31.146 ± 0.021 |
| L64X1024MixRandom.nextExponential() | p100 | 9744 | 8688 |
| L64X128MixRandom.nextGaussian() | average | 29.164 ± 0.017 | 29.073 ± 0.021 |
| L64X128MixRandom.nextGaussian() | p100 | 5920 | 1504 |
| L64X1024MixRandom.nextGaussian() | average | 29.503 ± 0.022 | 29.639 ± 0.018 |
| L64X1024MixRandom.nextGaussian() | p100 | 4256 | 4288 |
@simonis @navyxliu @turbanoff @rgiulietti This is ready for review with the benchmark results.
hi, @Pr0methean
I notice that there is a big regression in L64X1024MixRandom.nextExponential p100 on c7g.16xlarge
L64X1024MixRandom.nextExponential() p100 ns/op: 560 before, 3760 after
but the average ns/op improves. Can you explain the discrepancy?
btw: I updated the JBS issue to ensure the subject is aligned with your PR.
Thanks, --lx
It's probably just a statistical fluke involving a very large return value, but to know for sure I'd need a way to gather statistics about the return values and how they correlate with running-time outliers.
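One throwaway way to gather that evidence (a hedged sketch, not part of this PR or the JDK micro suite; the algorithm name and iteration count are arbitrary) is to time each call and remember the value produced by the slowest one:

```java
import java.util.random.RandomGenerator;

// Ad-hoc probe: if the single slowest nextExponential() call also produced an
// unusually large value, the p100 outlier is plausibly a rare deep-tail sample
// rather than JIT or GC noise.
public class ExponentialOutlierProbe {
    public static void main(String[] args) {
        RandomGenerator rng = RandomGenerator.of("L64X1024MixRandom");
        long worstNanos = 0;
        double worstValue = Double.NaN;
        for (int i = 0; i < 10_000_000; i++) {
            long start = System.nanoTime();
            double value = rng.nextExponential();
            long elapsed = System.nanoTime() - start;
            if (elapsed > worstNanos) {   // keep only the slowest call and its output
                worstNanos = elapsed;
                worstValue = value;
            }
        }
        System.out.printf("slowest call: %d ns, value %.6f%n", worstNanos, worstValue);
    }
}
```

Per-call `System.nanoTime()` measurements are noisy, so this only hints at whether running-time outliers coincide with deep-tail return values; it is not a substitute for the JMH numbers above.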