8315585: Optimization for decimal to string

Open wenshao opened this issue 11 months ago • 29 comments

This PR continues the work of PR #16006 and PR #21593: it improves the performance of BigDecimal::toString and BigDecimal::toPlainString and reduces duplicated code.
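For readers less familiar with the methods being optimized, the following is a minimal sketch of how the three public string conversions of `java.math.BigDecimal` differ. The class name `DecimalStringForms` and the helper `forms` are hypothetical names chosen for illustration; the snippet only calls the public API and does not reflect the internals changed by this PR.

```java
import java.math.BigDecimal;

final class DecimalStringForms {
    // Returns the three string forms in the order:
    // toString, toEngineeringString, toPlainString.
    static String[] forms(BigDecimal value) {
        return new String[] {
            value.toString(),            // scientific notation when an exponent is needed
            value.toEngineeringString(), // exponent adjusted to a multiple of three
            value.toPlainString()        // never uses an exponent
        };
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123.45", "1E+4", "0.00000012"}) {
            System.out.println(s + " -> " + String.join(" | ", forms(new BigDecimal(s))));
        }
    }
}
```

For a value such as `123.45` all three forms agree; they only diverge once an exponent is involved, as with `1E+4`, where the plain form expands to `10000`.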


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [ ] Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8315585: Optimization for decimal to string (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23310/head:pull/23310
$ git checkout pull/23310

Update a local copy of the PR:
$ git checkout pull/23310
$ git pull https://git.openjdk.org/jdk.git pull/23310/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23310

View PR using the GUI difftool:
$ git pr show -t 23310

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23310.diff

Using Webrev

Link to Webrev Comment

wenshao avatar Jan 25 '25 07:01 wenshao

:wave: Welcome back swen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Jan 25 '25 07:01 bridgekeeper[bot]

❗ This change is not yet ready to be integrated. See the Progress checklist in the description for automated requirements.

openjdk[bot] avatar Jan 25 '25 07:01 openjdk[bot]

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar Jan 25 '25 07:01 openjdk[bot]

A great cleanup that consolidates scale2, unscaledAbsString, and unscaledString.

liach avatar Jan 26 '25 18:01 liach

The performance numbers are as follows. In the smallScale2EngineeringToString and smallScale2LayoutCharsToString scenarios, performance is the same as before; most other scenarios show very significant improvements.

1. Script

git remote add wenshao [email protected]:wenshao/jdk.git
git fetch wenshao

# baseline
git checkout 4040d766e9cecf781be1d790e5e7836368acc7bd
make test TEST="micro:java.math.BigDecimals.huge"
make test TEST="micro:java.math.BigDecimals.large"
make test TEST="micro:java.math.BigDecimals.small"

# current
git checkout cb88e0efe6cc3e241b49951b076dccb1c690f7cb
make test TEST="micro:java.math.BigDecimals.huge"
make test TEST="micro:java.math.BigDecimals.large"
make test TEST="micro:java.math.BigDecimals.small"

2. aliyun_ecs_c8a_x64 (CPU AMD EPYC™ Genoa)

-# baseline
-Benchmark                                   Mode  Cnt     Score   Error  Units (4040d766e9c)
-BigDecimals.hugeEngineeringToString         avgt   15   170.621 ± 0.834  ns/op
-BigDecimals.hugeLayoutCharsToString         avgt   15   169.597 ± 0.373  ns/op
-BigDecimals.hugePlainToString               avgt   15   156.598 ± 1.141  ns/op
-BigDecimals.largeScale2EngineeringToString  avgt   15    34.733 ± 2.767  ns/op
-BigDecimals.largeScale2LayoutCharsToString  avgt   15    33.044 ± 0.153  ns/op
-BigDecimals.largeScale2PlainToString        avgt   15    30.302 ± 0.042  ns/op
-BigDecimals.largeScale3EngineeringToString  avgt   15    39.054 ± 0.084  ns/op
-BigDecimals.largeScale3LayoutCharsToString  avgt   15    39.180 ± 0.065  ns/op
-BigDecimals.largeScale3PlainToString        avgt   15    31.261 ± 0.081  ns/op
-BigDecimals.smallScale2EngineeringToString  avgt   15    11.364 ± 0.019  ns/op
-BigDecimals.smallScale2LayoutCharsToString  avgt   15    11.387 ± 0.038  ns/op
-BigDecimals.smallScale2PlainToString        avgt   15    29.513 ± 0.045  ns/op
-BigDecimals.smallScale3EngineeringToString  avgt   15    33.208 ± 0.166  ns/op
-BigDecimals.smallScale3LayoutCharsToString  avgt   15    33.171 ± 0.084  ns/op
-BigDecimals.smallScale3PlainToString        avgt   15    28.995 ± 0.037  ns/op


+# current
+Benchmark                                   Mode  Cnt     Score    Error  Units (cb88e0efe6c)
+BigDecimals.hugeEngineeringToString         avgt   15   158.130 ±  2.980  ns/op
+BigDecimals.hugeLayoutCharsToString         avgt   15   156.163 ±  0.591  ns/op
+BigDecimals.hugePlainToString               avgt   15   157.425 ±  1.655  ns/op
+BigDecimals.largeScale2EngineeringToString  avgt   15    18.381 ±  0.037  ns/op
+BigDecimals.largeScale2LayoutCharsToString  avgt   15    18.158 ±  0.048  ns/op
+BigDecimals.largeScale2PlainToString        avgt   15    18.416 ±  0.034  ns/op
+BigDecimals.largeScale3EngineeringToString  avgt   15    19.319 ±  0.035  ns/op
+BigDecimals.largeScale3LayoutCharsToString  avgt   15    19.081 ±  0.021  ns/op
+BigDecimals.largeScale3PlainToString        avgt   15    19.771 ±  0.063  ns/op
+BigDecimals.smallScale2EngineeringToString  avgt   15    11.122 ±  0.031  ns/op
+BigDecimals.smallScale2LayoutCharsToString  avgt   15    11.105 ±  0.037  ns/op
+BigDecimals.smallScale2PlainToString        avgt   15    11.531 ±  0.030  ns/op
+BigDecimals.smallScale3EngineeringToString  avgt   15    15.830 ±  0.032  ns/op
+BigDecimals.smallScale3LayoutCharsToString  avgt   15    15.821 ±  0.047  ns/op
+BigDecimals.smallScale3PlainToString        avgt   15    15.430 ±  0.176  ns/op

Benchmark                                   baseline (ns/op)  current (ns/op)     delta
BigDecimals.hugeEngineeringToString                  170.621          158.130     7.90%
BigDecimals.hugeLayoutCharsToString                  169.597          156.163     8.60%
BigDecimals.hugePlainToString                        156.598          157.425    -0.53%
BigDecimals.largeScale2EngineeringToString            34.733           18.381    88.96%
BigDecimals.largeScale2LayoutCharsToString            33.044           18.158    81.98%
BigDecimals.largeScale2PlainToString                  30.302           18.416    64.54%
BigDecimals.largeScale3EngineeringToString            39.054           19.319   102.15%
BigDecimals.largeScale3LayoutCharsToString            39.180           19.081   105.34%
BigDecimals.largeScale3PlainToString                  31.261           19.771    58.12%
BigDecimals.smallScale2EngineeringToString            11.364           11.122     2.18%
BigDecimals.smallScale2LayoutCharsToString            11.387           11.105     2.54%
BigDecimals.smallScale2PlainToString                  29.513           11.531   155.94%
BigDecimals.smallScale3EngineeringToString            33.208           15.830   109.78%
BigDecimals.smallScale3LayoutCharsToString            33.171           15.821   109.66%
BigDecimals.smallScale3PlainToString                  28.995           15.430    87.91%
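
The delta column is not defined in the thread, but the figures appear consistent with `baseline / current - 1`, so a positive value means the current build needs less time per operation. The sketch below recomputes a few rows under that assumption; the class and method names (`BenchmarkDelta`, `deltaPercent`, `format`) are hypothetical.

```java
import java.util.Locale;

final class BenchmarkDelta {
    // Assumed definition: how much longer the baseline takes per operation
    // than the current build, expressed as a percentage.
    static double deltaPercent(double baselineNsPerOp, double currentNsPerOp) {
        return (baselineNsPerOp / currentNsPerOp - 1.0) * 100.0;
    }

    // Formats with two decimals, matching the precision used in the table.
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.2f%%", percent);
    }

    public static void main(String[] args) {
        // Scores taken from the x64 results above.
        System.out.println(format(deltaPercent(170.621, 158.130))); // hugeEngineeringToString
        System.out.println(format(deltaPercent(34.733, 18.381)));   // largeScale2EngineeringToString
        System.out.println(format(deltaPercent(156.598, 157.425))); // hugePlainToString
    }
}
```

Read this way, a delta of about 100% corresponds to roughly halving the time per operation.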

3. MacBook M1 Pro

-# baseline
-Benchmark                                   Mode  Cnt     Score    Error  Units (4040d766e9c)
-BigDecimals.hugeEngineeringToString         avgt   15   193.054 ± 26.472  ns/op
-BigDecimals.hugeLayoutCharsToString         avgt   15   212.770 ±  6.918  ns/op
-BigDecimals.hugePlainToString               avgt   15   230.857 ±  4.276  ns/op
-BigDecimals.largeScale2EngineeringToString  avgt   15    45.413 ±  1.318  ns/op
-BigDecimals.largeScale2LayoutCharsToString  avgt   15    46.862 ±  0.878  ns/op
-BigDecimals.largeScale2PlainToString        avgt   15    33.184 ±  2.787  ns/op
-BigDecimals.largeScale3EngineeringToString  avgt   15    71.579 ±  3.913  ns/op
-BigDecimals.largeScale3LayoutCharsToString  avgt   15    70.623 ±  4.559  ns/op
-BigDecimals.largeScale3PlainToString        avgt   15    30.200 ±  1.164  ns/op
-BigDecimals.smallScale2EngineeringToString  avgt   15     9.788 ±  0.097  ns/op
-BigDecimals.smallScale2LayoutCharsToString  avgt   15     9.741 ±  0.046  ns/op
-BigDecimals.smallScale2PlainToString        avgt   15    35.357 ±  1.161  ns/op
-BigDecimals.smallScale3EngineeringToString  avgt   15    53.001 ±  2.682  ns/op
-BigDecimals.smallScale3LayoutCharsToString  avgt   15    52.704 ±  2.706  ns/op
-BigDecimals.smallScale3PlainToString        avgt   15    35.690 ±  2.847  ns/op


+# current
+Benchmark                                   Mode  Cnt     Score    Error  Units (cb88e0efe6c)
+BigDecimals.hugeEngineeringToString         avgt   15   194.490 ± 39.908  ns/op
+BigDecimals.hugeLayoutCharsToString         avgt   15   170.158 ± 39.788  ns/op
+BigDecimals.hugePlainToString               avgt   15   139.038 ±  0.640  ns/op
+BigDecimals.largeScale2EngineeringToString  avgt   15    15.172 ±  0.186  ns/op
+BigDecimals.largeScale2LayoutCharsToString  avgt   15    15.118 ±  0.082  ns/op
+BigDecimals.largeScale2PlainToString        avgt   15    15.247 ±  0.125  ns/op
+BigDecimals.largeScale3EngineeringToString  avgt   15    16.643 ±  0.085  ns/op
+BigDecimals.largeScale3LayoutCharsToString  avgt   15    16.653 ±  0.229  ns/op
+BigDecimals.largeScale3PlainToString        avgt   15    16.970 ±  0.115  ns/op
+BigDecimals.smallScale2EngineeringToString  avgt   15     9.893 ±  0.051  ns/op
+BigDecimals.smallScale2LayoutCharsToString  avgt   15     9.952 ±  0.149  ns/op
+BigDecimals.smallScale2PlainToString        avgt   15    10.058 ±  0.023  ns/op
+BigDecimals.smallScale3EngineeringToString  avgt   15    14.146 ±  0.198  ns/op
+BigDecimals.smallScale3LayoutCharsToString  avgt   15    14.147 ±  0.035  ns/op
+BigDecimals.smallScale3PlainToString        avgt   15    14.068 ±  0.029  ns/op

Benchmark                                   baseline (ns/op)  current (ns/op)     delta
BigDecimals.hugeEngineeringToString                  193.054          194.490    -0.74%
BigDecimals.hugeLayoutCharsToString                  212.770          170.158    25.04%
BigDecimals.hugePlainToString                        230.857          139.038    66.04%
BigDecimals.largeScale2EngineeringToString            45.413           15.172   199.32%
BigDecimals.largeScale2LayoutCharsToString            46.862           15.118   209.97%
BigDecimals.largeScale2PlainToString                  33.184           15.247   117.64%
BigDecimals.largeScale3EngineeringToString            71.579           16.643   330.08%
BigDecimals.largeScale3LayoutCharsToString            70.623           16.653   324.09%
BigDecimals.largeScale3PlainToString                  30.200           16.970    77.96%
BigDecimals.smallScale2EngineeringToString             9.788            9.893    -1.06%
BigDecimals.smallScale2LayoutCharsToString             9.741            9.952    -2.12%
BigDecimals.smallScale2PlainToString                  35.357           10.058   251.53%
BigDecimals.smallScale3EngineeringToString            53.001           14.146   274.67%
BigDecimals.smallScale3LayoutCharsToString            52.704           14.147   272.55%
BigDecimals.smallScale3PlainToString                  35.690           14.068   153.70%

wenshao avatar Jan 28 '25 07:01 wenshao

Webrevs

mlbridge[bot] avatar Jan 28 '25 08:01 mlbridge[bot]

Can we please have a pause on the sequence of "make XYZ toString faster" PRs until there is some wider discussion of goals, etc.? Thanks.

jddarcy avatar Feb 04 '25 20:02 jddarcy

Mailing list message from Archie Cobbs on core-libs-dev:

On Tue, Feb 4, 2025 at 2:40 PM Joe Darcy <darcy at openjdk.org> wrote:

Can we please have a pause on the sequence of "make XYZ toString faster" PRs until there is some wider discussion of goals, etc.? Thanks.

I agree with this sentiment... It was surprising to see how easily a VM crash can sneak in.

There is always a trade-off between A and B, where:

A = Code clarity, robustness vs. future changes, friendliness to new developers, minimizing obscure bugs (and security holes), etc.
B = Performance

Where should the line be drawn? Personally (as a Java user) I'd accept 1% slower vs. 1% less likely to crash any day...

Performance is important but there should be some general guidelines and maybe some specific policies. E.g. should there be a higher number of reviews required whenever Unsafe is used purely for performance reasons?

It's also worth pondering what's implied by the Java team evangelizing to the rest of the world to stop using Unsafe, while at the same time adding it more and more ourselves (when not strictly required). In theory we should instead be eating our own dog food (or better yet, improving its quality).

Also: when does it become more appropriate to address a performance issue in HotSpot instead of in Java source? If some optimization eliminates an array range check that is clearly not needed, it might be feasible (and much more widely beneficial) to teach HotSpot how to figure that out itself, etc.

Just some random thoughts...

-Archie

-- Archie L. Cobbs

mlbridge[bot] avatar Feb 04 '25 21:02 mlbridge[bot]

I think you are talking about the problem discussed in PR #23420, which is caused by using the non-thread-safe StringBuilder in multi-threaded scenarios. This problem is very obscure and I had not considered it before. I have started working on it and have submitted PR #23427. Once that is completed, I will submit a PR to redo PR #19626 in a thread-safe way.

The above problem does not affect toString, because it only occurs when StringBuilder is used in a multi-threaded scenario.
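
As general background for the point above: the `StringBuilder` Javadoc states that instances are not safe for use by multiple threads. The sketch below is not the code from PR #23420 or PR #23427; it is a hypothetical illustration (the class name `BuilderConfinement` and method `buildConfined` are invented here) of the usual way to stay out of that scenario, which is to confine each builder to one thread and combine the results only after the workers have finished.

```java
final class BuilderConfinement {
    // Each worker appends only to its own StringBuilder; the pieces are
    // concatenated after join(), so no builder is ever shared between
    // running threads. Assumes threads <= 26 so every worker gets a letter.
    static String buildConfined(int threads, int appendsPerThread) {
        StringBuilder[] parts = new StringBuilder[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            final StringBuilder own = new StringBuilder();
            parts[id] = own;
            workers[id] = new Thread(() -> {
                for (int i = 0; i < appendsPerThread; i++) {
                    own.append((char) ('a' + id));
                }
            });
            workers[id].start();
        }
        StringBuilder result = new StringBuilder();
        try {
            for (int t = 0; t < threads; t++) {
                workers[t].join(); // makes the worker's writes visible to this thread
                result.append(parts[t]);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining workers", e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildConfined(3, 4)); // prints "aaaabbbbcccc"
    }
}
```

A single `toString` call builds its result within one thread, which fits the claim that the multi-threaded `StringBuilder` problem does not arise there.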

wenshao avatar Feb 04 '25 23:02 wenshao

/reviewers 2 reviewer

AlanBateman avatar Feb 05 '25 12:02 AlanBateman

@AlanBateman The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

openjdk[bot] avatar Feb 05 '25 12:02 openjdk[bot]

Mailing list message from Archie Cobbs on core-libs-dev:

On Tue, Feb 4, 2025 at 5:26 PM Shaojin Wen <swen at openjdk.org> wrote:

I think you are talking about the problem discussed in PR #23420, which is caused by using the non-thread-safe StringBuilder in multi-threaded scenarios. This problem is very obscure and I had not considered it before. I have started working on it and have submitted PR #23427. Once that is completed, I will submit a PR to redo PR #19626 in a thread-safe way.

Yes - apologies if it sounded like I was trying to single you out. The optimizations you've been doing are looking great. It's just that this example is a good data point in the larger discussion about what the general policy should be, etc.

The above problem does not affect toString, because it only occurs when StringBuilder is used in a multi-threaded scenario.

Good point, but frankly, an irrelevant one. The key issue here is that if plain, ordinary, non-native-invoking Java bytecode can corrupt memory and/or crash the JVM, then that's a Big Problem. It doesn't matter how contrived the code that makes it happen is.

-Archie

-- Archie L. Cobbs

mlbridge[bot] avatar Feb 07 '25 15:02 mlbridge[bot]

Mailing list message from Paul Sandoz on core-libs-dev:

I would like to amplify this point: undermining Java's integrity is a big deal. Every time we use unsafe mechanisms within the JDK we risk doing that. The more complex such code is, the harder it is to reason about whether overall it is safe [*]. We need to balance reasoning about code, quality, and maintenance against narrowly measured performance benefits that increase the risk of some integrity violation.

Paul.

[*] And even if it is not so complex, others may not be aware of the subtleties when refactoring. Unsafe allocation that does not zero memory is particularly worrisome in this regard.

On Feb 7, 2025, at 7:42 AM, Archie Cobbs <archie.cobbs at gmail.com> wrote:

Good point, but frankly, an irrelevant one. The key issue here is that if plain, ordinary, non-native-invoking Java bytecode can corrupt memory and/or crash the JVM, then that's a Big Problem. It doesn't matter how contrived the code that makes it happen is.

-Archie

-- Archie L. Cobbs

mlbridge[bot] avatar Feb 08 '25 00:02 mlbridge[bot]

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Mar 05 '25 16:03 bridgekeeper[bot]

Keep it alive.

wenshao avatar Mar 09 '25 15:03 wenshao

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Apr 06 '25 17:04 bridgekeeper[bot]

@wenshao This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

bridgekeeper[bot] avatar May 04 '25 19:05 bridgekeeper[bot]

/open

wenshao avatar May 25 '25 09:05 wenshao

@wenshao This pull request is now open

openjdk[bot] avatar May 25 '25 09:05 openjdk[bot]

@wenshao This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Jul 20 '25 14:07 bridgekeeper[bot]

/touch

wenshao avatar Jul 20 '25 23:07 wenshao

@wenshao The pull request is being re-evaluated and the inactivity timeout has been reset.

openjdk[bot] avatar Jul 20 '25 23:07 openjdk[bot]

@wenshao this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout dec_to_str_202501
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

openjdk[bot] avatar Aug 20 '25 22:08 openjdk[bot]

@wenshao This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Oct 19 '25 04:10 bridgekeeper[bot]

/touch

wenshao avatar Oct 20 '25 00:10 wenshao

@wenshao The pull request is being re-evaluated and the inactivity timeout has been reset.

openjdk[bot] avatar Oct 20 '25 00:10 openjdk[bot]

@wenshao This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Dec 15 '25 02:12 bridgekeeper[bot]

/touch

wenshao avatar Dec 15 '25 05:12 wenshao

@wenshao The pull request is being re-evaluated and the inactivity timeout has been reset.

openjdk[bot] avatar Dec 15 '25 05:12 openjdk[bot]