8315585: Optimization for decimal to string
This continues PR #16006 and PR #21593, improving the performance of BigDecimal::toString and BigDecimal::toPlainString and reducing duplicate code.
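For readers less familiar with the conversions being optimized, here is a minimal sketch of what the three string forms look like for one value. The class name `DecimalStringForms`, the helper `forms`, and the sample value are illustrative assumptions and are not part of this PR. Only the `java.math.BigDecimal` methods are real API.

```java
import java.math.BigDecimal;

public class DecimalStringForms {
    // Hypothetical helper: returns the forms in the order
    // toString, toEngineeringString, toPlainString.
    static String[] forms(BigDecimal value) {
        return new String[] {
            value.toString(),             // scientific notation when an exponent is needed
            value.toEngineeringString(),  // exponent adjusted to a multiple of three
            value.toPlainString()         // never uses an exponent
        };
    }

    public static void main(String[] args) {
        // Unscaled value 12345 with scale -3, chosen so the three forms differ.
        BigDecimal v = new BigDecimal("1.2345E+7");
        for (String s : forms(v)) {
            System.out.println(s);
        }
        // prints:
        // 1.2345E+7
        // 12.345E+6
        // 12345000
    }
}
```

For a value with a small positive scale, such as `new BigDecimal("123.45")`, all three methods return the same text, which is the kind of input the `Scale2` and `Scale3` benchmarks in this thread appear to exercise.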
Progress
- [x] Change must not contain extraneous whitespace
- [x] Commit message must refer to an issue
- [ ] Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)
Issue
- JDK-8315585: Optimization for decimal to string (Enhancement - P4)
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23310/head:pull/23310
$ git checkout pull/23310
Update a local copy of the PR:
$ git checkout pull/23310
$ git pull https://git.openjdk.org/jdk.git pull/23310/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 23310
View PR using the GUI difftool:
$ git pr show -t 23310
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23310.diff
@wenshao The following label will be automatically applied to this pull request:
- core-libs
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.
A great cleanup that consolidates scale2, unscaledAbsString, and unscaledString.
The performance numbers are as follows. In the smallScale2EngineeringToString and smallScale2LayoutCharsToString scenarios, performance is essentially unchanged, but the other scenarios improve significantly.
1. Script
git remote add wenshao [email protected]:wenshao/jdk.git
git fetch wenshao
# baseline
git checkout 4040d766e9cecf781be1d790e5e7836368acc7bd
make test TEST="micro:java.math.BigDecimals.huge"
make test TEST="micro:java.math.BigDecimals.large"
make test TEST="micro:java.math.BigDecimals.small"
# current
git checkout cb88e0efe6cc3e241b49951b076dccb1c690f7cb
make test TEST="micro:java.math.BigDecimals.huge"
make test TEST="micro:java.math.BigDecimals.large"
make test TEST="micro:java.math.BigDecimals.small"
2. aliyun_ecs_c8a_x64 (CPU AMD EPYC™ Genoa)
-# baseline
-Benchmark Mode Cnt Score Error Units (4040d766e9c)
-BigDecimals.hugeEngineeringToString avgt 15 170.621 ± 0.834 ns/op
-BigDecimals.hugeLayoutCharsToString avgt 15 169.597 ± 0.373 ns/op
-BigDecimals.hugePlainToString avgt 15 156.598 ± 1.141 ns/op
-BigDecimals.largeScale2EngineeringToString avgt 15 34.733 ± 2.767 ns/op
-BigDecimals.largeScale2LayoutCharsToString avgt 15 33.044 ± 0.153 ns/op
-BigDecimals.largeScale2PlainToString avgt 15 30.302 ± 0.042 ns/op
-BigDecimals.largeScale3EngineeringToString avgt 15 39.054 ± 0.084 ns/op
-BigDecimals.largeScale3LayoutCharsToString avgt 15 39.180 ± 0.065 ns/op
-BigDecimals.largeScale3PlainToString avgt 15 31.261 ± 0.081 ns/op
-BigDecimals.smallScale2EngineeringToString avgt 15 11.364 ± 0.019 ns/op
-BigDecimals.smallScale2LayoutCharsToString avgt 15 11.387 ± 0.038 ns/op
-BigDecimals.smallScale2PlainToString avgt 15 29.513 ± 0.045 ns/op
-BigDecimals.smallScale3EngineeringToString avgt 15 33.208 ± 0.166 ns/op
-BigDecimals.smallScale3LayoutCharsToString avgt 15 33.171 ± 0.084 ns/op
-BigDecimals.smallScale3PlainToString avgt 15 28.995 ± 0.037 ns/op
+# current
+Benchmark Mode Cnt Score Error Units (cb88e0efe6c)
+BigDecimals.hugeEngineeringToString avgt 15 158.130 ± 2.980 ns/op
+BigDecimals.hugeLayoutCharsToString avgt 15 156.163 ± 0.591 ns/op
+BigDecimals.hugePlainToString avgt 15 157.425 ± 1.655 ns/op
+BigDecimals.largeScale2EngineeringToString avgt 15 18.381 ± 0.037 ns/op
+BigDecimals.largeScale2LayoutCharsToString avgt 15 18.158 ± 0.048 ns/op
+BigDecimals.largeScale2PlainToString avgt 15 18.416 ± 0.034 ns/op
+BigDecimals.largeScale3EngineeringToString avgt 15 19.319 ± 0.035 ns/op
+BigDecimals.largeScale3LayoutCharsToString avgt 15 19.081 ± 0.021 ns/op
+BigDecimals.largeScale3PlainToString avgt 15 19.771 ± 0.063 ns/op
+BigDecimals.smallScale2EngineeringToString avgt 15 11.122 ± 0.031 ns/op
+BigDecimals.smallScale2LayoutCharsToString avgt 15 11.105 ± 0.037 ns/op
+BigDecimals.smallScale2PlainToString avgt 15 11.531 ± 0.030 ns/op
+BigDecimals.smallScale3EngineeringToString avgt 15 15.830 ± 0.032 ns/op
+BigDecimals.smallScale3LayoutCharsToString avgt 15 15.821 ± 0.047 ns/op
+BigDecimals.smallScale3PlainToString avgt 15 15.430 ± 0.176 ns/op
| benchmark | baseline (ns/op) | current (ns/op) | delta |
|---|---|---|---|
| BigDecimals.hugeEngineeringToString | 170.621 | 158.130 | 7.90% |
| BigDecimals.hugeLayoutCharsToString | 169.597 | 156.163 | 8.60% |
| BigDecimals.hugePlainToString | 156.598 | 157.425 | -0.53% |
| BigDecimals.largeScale2EngineeringToString | 34.733 | 18.381 | 88.96% |
| BigDecimals.largeScale2LayoutCharsToString | 33.044 | 18.158 | 81.98% |
| BigDecimals.largeScale2PlainToString | 30.302 | 18.416 | 64.54% |
| BigDecimals.largeScale3EngineeringToString | 39.054 | 19.319 | 102.15% |
| BigDecimals.largeScale3LayoutCharsToString | 39.180 | 19.081 | 105.34% |
| BigDecimals.largeScale3PlainToString | 31.261 | 19.771 | 58.12% |
| BigDecimals.smallScale2EngineeringToString | 11.364 | 11.122 | 2.18% |
| BigDecimals.smallScale2LayoutCharsToString | 11.387 | 11.105 | 2.54% |
| BigDecimals.smallScale2PlainToString | 29.513 | 11.531 | 155.94% |
| BigDecimals.smallScale3EngineeringToString | 33.208 | 15.830 | 109.78% |
| BigDecimals.smallScale3LayoutCharsToString | 33.171 | 15.821 | 109.66% |
| BigDecimals.smallScale3PlainToString | 28.995 | 15.430 | 87.91% |
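The delta column appears to be computed as baseline / current - 1, that is, the improvement relative to the new score. This is inferred from the numbers in the table and is not stated in the PR. A minimal sketch under that assumption follows; the class and method names are hypothetical.

```java
import java.util.Locale;

public class BenchmarkDelta {
    // Assumed formula: (baseline / current - 1) * 100.
    // Positive means the current build is faster (lower ns/op).
    static double deltaPercent(double baselineNsPerOp, double currentNsPerOp) {
        return (baselineNsPerOp / currentNsPerOp - 1.0) * 100.0;
    }

    // Formats with two decimals, as in the table; Locale.ROOT keeps '.' as the separator.
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.2f%%", percent);
    }

    public static void main(String[] args) {
        // hugeEngineeringToString on the x64 machine
        System.out.println(format(deltaPercent(170.621, 158.130))); // prints 7.90%
        // hugePlainToString on the x64 machine (slightly slower)
        System.out.println(format(deltaPercent(156.598, 157.425))); // prints -0.53%
    }
}
```

Under this convention a delta of 100% means the operation takes half the time, so the largeScale3 rows at roughly 102% to 105% correspond to about a 2x speedup.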
3. The performance numbers on a MacBook M1 Pro
-# baseline
-Benchmark Mode Cnt Score Error Units (4040d766e9c)
-BigDecimals.hugeEngineeringToString avgt 15 193.054 ± 26.472 ns/op
-BigDecimals.hugeLayoutCharsToString avgt 15 212.770 ± 6.918 ns/op
-BigDecimals.hugePlainToString avgt 15 230.857 ± 4.276 ns/op
-BigDecimals.largeScale2EngineeringToString avgt 15 45.413 ± 1.318 ns/op
-BigDecimals.largeScale2LayoutCharsToString avgt 15 46.862 ± 0.878 ns/op
-BigDecimals.largeScale2PlainToString avgt 15 33.184 ± 2.787 ns/op
-BigDecimals.largeScale3EngineeringToString avgt 15 71.579 ± 3.913 ns/op
-BigDecimals.largeScale3LayoutCharsToString avgt 15 70.623 ± 4.559 ns/op
-BigDecimals.largeScale3PlainToString avgt 15 30.200 ± 1.164 ns/op
-BigDecimals.smallScale2EngineeringToString avgt 15 9.788 ± 0.097 ns/op
-BigDecimals.smallScale2LayoutCharsToString avgt 15 9.741 ± 0.046 ns/op
-BigDecimals.smallScale2PlainToString avgt 15 35.357 ± 1.161 ns/op
-BigDecimals.smallScale3EngineeringToString avgt 15 53.001 ± 2.682 ns/op
-BigDecimals.smallScale3LayoutCharsToString avgt 15 52.704 ± 2.706 ns/op
-BigDecimals.smallScale3PlainToString avgt 15 35.690 ± 2.847 ns/op
+# current
+Benchmark Mode Cnt Score Error Units (cb88e0efe6c)
+BigDecimals.hugeEngineeringToString avgt 15 194.490 ± 39.908 ns/op
+BigDecimals.hugeLayoutCharsToString avgt 15 170.158 ± 39.788 ns/op
+BigDecimals.hugePlainToString avgt 15 139.038 ± 0.640 ns/op
+BigDecimals.largeScale2EngineeringToString avgt 15 15.172 ± 0.186 ns/op
+BigDecimals.largeScale2LayoutCharsToString avgt 15 15.118 ± 0.082 ns/op
+BigDecimals.largeScale2PlainToString avgt 15 15.247 ± 0.125 ns/op
+BigDecimals.largeScale3EngineeringToString avgt 15 16.643 ± 0.085 ns/op
+BigDecimals.largeScale3LayoutCharsToString avgt 15 16.653 ± 0.229 ns/op
+BigDecimals.largeScale3PlainToString avgt 15 16.970 ± 0.115 ns/op
+BigDecimals.smallScale2EngineeringToString avgt 15 9.893 ± 0.051 ns/op
+BigDecimals.smallScale2LayoutCharsToString avgt 15 9.952 ± 0.149 ns/op
+BigDecimals.smallScale2PlainToString avgt 15 10.058 ± 0.023 ns/op
+BigDecimals.smallScale3EngineeringToString avgt 15 14.146 ± 0.198 ns/op
+BigDecimals.smallScale3LayoutCharsToString avgt 15 14.147 ± 0.035 ns/op
+BigDecimals.smallScale3PlainToString avgt 15 14.068 ± 0.029 ns/op
| benchmark | baseline (ns/op) | current (ns/op) | delta |
|---|---|---|---|
| BigDecimals.hugeEngineeringToString | 193.054 | 194.490 | -0.74% |
| BigDecimals.hugeLayoutCharsToString | 212.770 | 170.158 | 25.04% |
| BigDecimals.hugePlainToString | 230.857 | 139.038 | 66.04% |
| BigDecimals.largeScale2EngineeringToString | 45.413 | 15.172 | 199.32% |
| BigDecimals.largeScale2LayoutCharsToString | 46.862 | 15.118 | 209.97% |
| BigDecimals.largeScale2PlainToString | 33.184 | 15.247 | 117.64% |
| BigDecimals.largeScale3EngineeringToString | 71.579 | 16.643 | 330.08% |
| BigDecimals.largeScale3LayoutCharsToString | 70.623 | 16.653 | 324.09% |
| BigDecimals.largeScale3PlainToString | 30.200 | 16.970 | 77.96% |
| BigDecimals.smallScale2EngineeringToString | 9.788 | 9.893 | -1.06% |
| BigDecimals.smallScale2LayoutCharsToString | 9.741 | 9.952 | -2.12% |
| BigDecimals.smallScale2PlainToString | 35.357 | 10.058 | 251.53% |
| BigDecimals.smallScale3EngineeringToString | 53.001 | 14.146 | 274.67% |
| BigDecimals.smallScale3LayoutCharsToString | 52.704 | 14.147 | 272.55% |
| BigDecimals.smallScale3PlainToString | 35.690 | 14.068 | 153.70% |
Can we please have a pause on the sequence of "make XYZ toString faster" PRs until there is some wider discussion of goals, etc.? Thanks.
Mailing list message from Archie Cobbs on core-libs-dev:
On Tue, Feb 4, 2025 at 2:40 PM Joe Darcy <darcy at openjdk.org> wrote:
> Can we please have a pause on the sequence of "make XYZ toString faster" PRs until there is some wider discussion of goals, etc.? Thanks.
I agree with this sentiment... It was surprising to see how easily a VM crash can sneak in.
There is always a trade-off between A and B, where:
A = Code clarity, robustness vs. future changes, friendliness to new developers, minimizing obscure bugs (and security holes), etc.
B = Performance
Where should the line be drawn? Personally (as a Java user) I'd accept 1% slower vs. 1% less likely to crash any day...
Performance is important but there should be some general guidelines and maybe some specific policies. E.g. should there be a higher number of reviews required whenever Unsafe is used purely for performance reasons?
It's also worth pondering what's implied by the Java team evangelizing to the rest of the world to stop using Unsafe, while at the same time adding it more and more ourselves (when not strictly required). In theory we should instead be eating our own dog food (or better yet, improving its quality).
Also: when does it become more appropriate to address a performance issue in Hotspot instead of in Java source? If some optimization eliminates an array range check that is clearly not needed, it might be feasible (and much more widely beneficial) to teach Hotspot how to figure that out itself, etc.
Just some random thoughts...
-Archie
-- Archie L. Cobbs
I think you are referring to the problem in PR #23420, which is caused by using the non-thread-safe StringBuilder in multi-threaded scenarios. The problem is very obscure, and I had not considered it before. I have started working on it and have submitted PR #23427. Once that is completed, I will submit a follow-up PR that redoes PR #19626 in a thread-safe way.
The above problem does not affect toString, because it only occurs when StringBuilder is used in a multi-threaded scenario.
/reviewers 2 reviewer
@AlanBateman The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).
Mailing list message from Archie Cobbs on core-libs-dev:
On Tue, Feb 4, 2025 at 5:26 PM Shaojin Wen <swen at openjdk.org> wrote:
> I think you are talking about the problem of PR #23420, which is caused by the use of thread-unsafe StringBuilder in multi-threaded scenarios. This problem is very obscure and I didn't consider it before. I have started to solve this problem and have submitted PR #23427. After it is completed, I will continue to submit PR to redo PR #19626 in a thread-safe way.
Yes - apologies if it sounded like I was trying to single you out. The optimizations you've been doing are looking great. It's just that this example is a good data point in the larger discussion about what the general policy should be, etc.
> The above problem does not affect toString, because it only occurs when StringBuilder is used in a multi-threaded scenario.

Good point, but frankly, an irrelevant one. The key issue here is that if plain, ordinary, non-native-invoking Java bytecode can corrupt memory and/or crash the JVM, then that's a Big Problem™. It doesn't matter how contrived the code that makes it happen is.
-Archie
-- Archie L. Cobbs
Mailing list message from Paul Sandoz on core-libs-dev:
I would like to amplify this point: undermining Java's integrity is a big deal. Every time we use unsafe mechanisms within the JDK we risk doing that. The more complex such code is, the harder it is to reason about whether overall it is safe [*]. We need to balance reasoning about code, quality, and maintenance against narrowly measured performance benefits that increase the risk of some integrity violation.
Paul.
[*] And even if it is not so complex, others may not be aware of the subtleties when refactoring. Unsafe allocation that does not zero memory is particularly worrisome in this regard.
On Feb 7, 2025, at 7:42 AM, Archie Cobbs <archie.cobbs at gmail.com> wrote:
> Good point, but frankly, an irrelevant one. The key issue here is that if plain, ordinary, non-native-invoking Java bytecode can corrupt memory and/or crash the JVM, then that's a Big Problem™. It doesn't matter how contrived the code that makes it happen is.
> -Archie
> -- Archie L. Cobbs
@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
Keep it alive.
@wenshao This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.
/open
@wenshao This pull request is now open
@wenshao This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!
/touch
@wenshao The pull request is being re-evaluated and the inactivity timeout has been reset.
@wenshao this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:
git checkout dec_to_str_202501
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push