jdk 8333396: Performance regression of DecimalFormat.format

Performance regression of DecimalFormat.format

From the output of perf, we can see the hottest regions contain atomic instructions. But when run with JDK 11, there is no such problem. The reason is the removed biased locking.
The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself contains many synchronized methods. So I added support for some new methods that accept StringBuilder which is lock-free.

Benchmark testcase

@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class JmhDecimalFormat {

    private DecimalFormat format;

    @Setup(Level.Trial)
    public void setup() {
        format = new DecimalFormat("#0.00000");
    }

    @Benchmark
    public void testNewAndFormat() throws InterruptedException {
        new DecimalFormat("#0.00000").format(9524234.1236457);
    }

    @Benchmark
    public void testNewOnly() throws InterruptedException {
        new DecimalFormat("#0.00000");
    }

    @Benchmark
    public void testFormatOnly() throws InterruptedException {
        format.format(9524234.1236457);
    }
}

Test result

Current JDK before optimize

 Benchmark                             Mode  Cnt    Score   Error  Units
JmhDecimalFormat.testFormatOnly       avgt   50  642.099 ? 1.253  ns/op
JmhDecimalFormat.testNewAndFormat     avgt   50  989.307 ? 3.676  ns/op
JmhDecimalFormat.testNewOnly          avgt   50  303.381 ? 5.252  ns/op

Current JDK after optimize

Benchmark                          Mode  Cnt    Score   Error  Units
JmhDecimalFormat.testFormatOnly    avgt   50  351.499 ? 0.761  ns/op
JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
JmhDecimalFormat.testNewOnly       avgt   50  209.874 ? 9.951  ns/op

JDK 11

Benchmark                          Mode  Cnt    Score   Error  Units
JmhDecimalFormat.testFormatOnly    avgt   50  364.214 ? 1.191  ns/op
JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
JmhDecimalFormat.testNewOnly       avgt   50  248.300 ? 5.158  ns/op

Progress

[ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
[x] Change must not contain extraneous whitespace
[x] Commit message must refer to an issue

Issue

JDK-8333396: Performance regression of DecimalFormat.format (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19513/head:pull/19513
$ git checkout pull/19513

Update a local copy of the PR:
$ git checkout pull/19513
$ git pull https://git.openjdk.org/jdk.git pull/19513/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19513

View PR using the GUI difftool:
$ git pr show -t 19513

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19513.diff

Webrev

Link to Webrev Comment

Jun 03 '24 04:06 lingjun-cg

:wave: Welcome back lingjun-cg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Jun 03 '24 04:06 bridgekeeper[bot]

@lingjun-cg This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8333396: Use StringBuilder internally for java.text.Format.* formatting

Reviewed-by: naoto, liach, jlu

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 385 new commits pushed to the master branch:

fd741a88e8bc73a9db6d4283bb54daab1760b442: 8327538: The SSLExtension class specifies incorrect values for heartbeat per RFC 6520 and post_handshake_auth per RFC 8446
ad498f57fcead174306c6e6e3b2d1f9916821b84: 8335896: Source launcher should set TCCL
b21cb44e38ee8ea75e3a1c51e3a28388056a492d: 8329398: Links in InetAddress class description show "#format"
c5b7af73d07f7458e970f5752eb75640562ddc7b: 8336692: Redo fix for JDK-8284620
491b9f5efc01fa36fb3c174e130b46bc69c8d707: 8336706: Optimize LocalDate.toString with StringBuilder.repeat
e3acf4c627c3c75f9a2ef29647daa6f4746fdc62: 8336792: DateTimeFormatterBuilder append zeros based on StringBuilder.repeat
e7e48a780a34007994f830869fdb74ba1cb5b3fe: 8248609: [Graal] vmTestbase/nsk/jdi/VoidValue/toString/tostring001/TestDescription.java failed with Unexpected com.sun.jdi.ObjectCollectedException
939fe000a96bc7c92c7b8814eb6ee66856718e4e: 8336786: VerifyError with lambda capture and enclosing instance references
3ade2b6114bbe81eb03e3a49c08b5401f70a2367: 8336777: BufferedMethodBuilder not initialized with static flag
c25c4896ad9ef031e3cddec493aef66ff87c48a7: 8333812: ClassFile.verify() can throw exceptions instead of returning VerifyErrors
... and 375 more: https://git.openjdk.org/jdk/compare/e95f092862307c248bbd93e7026cbd92053fb4c9...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@justin-curtis-lu, @naotoj, @liach) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

Jun 03 '24 04:06 openjdk[bot]

@lingjun-cg The following labels will be automatically applied to this pull request:

core-libs
i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

Jun 03 '24 04:06 openjdk[bot]

Webrevs

Jun 03 '24 04:06 mlbridge[bot]

Hi,

Can you please provide more details? As to StringBuffer, I think it is being used since those classes in java.text package have been created. I am not sure why that contributes to what you described as the "performance regression".

Separately, please split this PR into two, as combining two different issues into a single JBS issue/PR is not right. The second issue is likely due to loading stream classes for the first time at JVM startup.

Jun 03 '24 22:06 naotoj

Hi naotaoj, What I mean "performance regression" is compare to JDK 11. We have an server side application that use DecimalFormat.format API seriously. When migrate it from JDK 11 to JDK 21, we found a performance degradation. So I write the JMH test case "JmhDecimalFormat". It show that there a performance regression since JDK 21.

These are the perfasm output for running JMH on both JDK 11 and JDK21. There are some hot regions around the atomic instructions in JDK 21, but no such problem in JDK 11. jdk11.txt jdk21.txt

Maybe the JEP 374: Deprecate and Disable Biased Locking is the reason? So I run the benchmark on JDK 11 again but with option '-XX:-UseBiasedLocking', there only a minor gap between jdk11 and jdk 21.

OK, return to my patch. java.text use StringBuffer internally, but nearly all methods in StringBuffer are synchronized:

    @Override
    public synchronized StringBuffer append(Object obj) {
        ....
    }

    @Override
    @IntrinsicCandidate
    public synchronized StringBuffer append(String str) {
        ...
    }

From the above analysis, the atomic instructions slow down DecimalFormat.format, and StringBuffer's synchronized methods generate there atomic instructions. So If remove these synchronized methods, it will get a better performance. So I replace StringBuffer with StringBuilder in java.text.NumberFormat.

    public final String format(double number) {
        // Use fast-path for double result if that works
        String result = fastFormat(number);
        if (result != null)
            return result;

        -return format(number, new StringBuffer(),
        +return format(number, new StringBuilder(),
                      DontCareFieldPosition.INSTANCE).toString();
    }

@@ -367,7 +367,7 @@ public final String format(double number) {
     * @see java.text.Format#format
     */
    public final String format(long number) {
       - return format(number, new StringBuffer(),
       + return format(number, new StringBuilder(),
                      DontCareFieldPosition.INSTANCE).toString();
    }

Separately, please split this PR into two, as combining two different issues into a single JBS issue/PR is not right. The second issue is likely due to loading stream classes for the first time at JVM startup. OK. I will create a separate issue.

Jun 04 '24 01:06 lingjun-cg

Hi, I think you can add the jmh test case.

Jun 04 '24 07:06 kuaiwei

Hi, I think you can add the jmh test case.

https://github.com/openjdk/jdk/pull/19513#issue-2330131051 already contains the test case.

Jun 04 '24 07:06 lingjun-cg

Hi, I think you can add the jmh test case.

#19513 (comment) already contains the test case.

I mean it can be added as a test case in test/micro and be evaluated in future build.

Jun 04 '24 07:06 kuaiwei

Hi, I think you can add the jmh test case.

#19513 (comment) already contains the test case.

I mean it can be added as a test case in test/micro and be evaluated in future build.

Thanks for your remainder. There exist a JMH test case which named 'test/micro/org/openjdk/bench/java/text/DefFormatterBench.java'.

This JMH test case run in throughput mode. If run without the patch, the output is:

Benchmark                                  Mode  Cnt     Score    Error   Units
DefFormatterBench.testDefNumberFormatter  thrpt    5  1842.666 ? 83.694  ops/ms

If run with after apply the patch, the output is:

Benchmark                                  Mode  Cnt     Score     Error   Units
DefFormatterBench.testDefNumberFormatter  thrpt    5  3988.274 ? 513.434  ops/ms

The score is increased by (3988.274-1842.666)/1842.666 = 116%.

Jun 04 '24 08:06 lingjun-cg

Hi, I think you can add the jmh test case.

#19513 (comment) already contains the test case.

I mean it can be added as a test case in test/micro and be evaluated in future build.

Thanks for your remainder. There exist a JMH test case which named 'test/micro/org/openjdk/bench/java/text/DefFormatterBench.java'.

This JMH test case run in throughput mode. If run without the patch, the output is:
Benchmark                                  Mode  Cnt     Score    Error   Units
DefFormatterBench.testDefNumberFormatter  thrpt    5  1842.666 ? 83.694  ops/ms
If run with after apply the patch, the output is:
Benchmark                                  Mode  Cnt     Score     Error   Units
DefFormatterBench.testDefNumberFormatter  thrpt    5  3988.274 ? 513.434  ops/ms
The score is increased by (3988.274-1842.666)/1842.666 = 116%.

Yes, it's what I expected.

Jun 04 '24 09:06 kuaiwei

Thanks for splitting up the PR/JBS. For this one, I am still not sure why this StringBuffer/Builder locking became an issue, as the code is basically the same in JDK11 (meaning StringBuffer has always been used). Would it be possible if the regression was caused by some other changes?

Jun 04 '24 18:06 naotoj

Also it would be helpful to compare the performance without biased locking in JDK11.

Jun 04 '24 19:06 naotoj

(Same comment as on the other PR.)

Would it be better if the benchmarks returned the created DecimalFormats ? Just thinking about dead code elimination: https://github.com/openjdk/jmh/blob/master/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_08_DeadCode.java

Jun 04 '24 23:06 bchristi-git

Also it would be helpful to compare the performance without biased locking in JDK11.

If run in JDK11 without biased locking, the performance is as same as run in current JDK.

Run benchmark with average time mode.

JDK 11 with disabled biased locking:

~/jdk-11.0.2/bin/java -XX:-UseBiasedLocking -cp "./JmhDecimalFormat-Dist/*" org.openjdk.jmh.Main jmhtest.JmhDecimalFormat
Benchmark                          Mode  Cnt    Score   Error  Units
JmhDecimalFormat.testFormatOnly    avgt   50  659.601 ? 1.882  ns/op

JDK 11 with enabled biased locking:

 ~/jdk-11.0.2/bin/java -cp "./JmhDecimalFormat-Dist/*" org.openjdk.jmh.Main jmhtest.JmhDecimalFormat
Benchmark                          Mode  Cnt    Score   Error  Units
JmhDecimalFormat.testFormatOnly    avgt   50  366.064 ? 1.747  ns/op

Current JDK:

~/jdk23-bin/jdk/bin/java -cp "./JmhDecimalFormat-Dist/*" org.openjdk.jmh.Main jmhtest.JmhDecimalFormat
Benchmark                          Mode  Cnt    Score    Error  Units
JmhDecimalFormat.testFormatOnly    avgt   50  641.010 ?  1.230  ns/op

@naotoj

Jun 05 '24 01:06 lingjun-cg

@naotoj The above contains the result of JDK11 with disabled and enabled biased locking. It seems the key is the biased locking. Now our application use DecimalFormat.format() seriously, the performance is declining when migrate from JDK 11 to JDK 21. Could you give some suggestion?Thanks.

Jun 12 '24 11:06 lingjun-cg

@naotoj Has there been any discussion about adding overloads that take a StringBuilder?

Jun 12 '24 12:06 AlanBateman

@AlanBateman yes, we've discussed this over time, but it wasn't a priority. I still have one issue open (https://bugs.openjdk.org/browse/JDK-8313205).

@lingjun-cg Sorry for the late reply. Since it involves a lot of classes, I would want to make sure it does not include any incompatibilities.

Jun 12 '24 17:06 naotoj

I second Justin's suggestion here. The change should benefit every implementation in the JDK, not only NumberFormat. Also, I might suggest renaming the new class, as it is kind of long/redundant. How about using something as simple as "StringBuf"?

Jun 13 '24 19:06 naotoj

I second Justin's suggestion here. The change should benefit every implementation in the JDK, not only NumberFormat. Also, I might suggest renaming the new class, as it is kind of long/redundant. How about using something as simple as "StringBuf"?

Thanks for your comment. The long name bother me for a while. I have changed it to ""StringBuf".

Jun 14 '24 03:06 lingjun-cg

Hi @lingjun-cg

Thanks for your work here.

While this would benefit the performance of NumberFormat subclasses, we are hoping that if we are going to make the internal switch to StringBuilder, we can also make similar changes to other Format subclasses as well where applicable. So, we would want isInternalSubclass() to be defined at the Format level. Some methods that I can immediately think of would be String DateFormat.format(Date date), String ListFormat.format(List<String> input) and, String MessageFormat.format(Object obj).

I believe @naotoj shares the same sentiment.

Jun 17 '24 18:06 justin-curtis-lu

Hi @lingjun-cg

Thanks for your work here.

While this would benefit the performance of NumberFormat subclasses, we are hoping that if we are going to make the internal switch to StringBuilder, we can also make similar changes to other Format subclasses as well where applicable. So, we would want isInternalSubclass() to be defined at the Format level. Some methods that I can immediately think of would be String DateFormat.format(Date date), String ListFormat.format(List<String> input) and, String MessageFormat.format(Object obj).

I believe @naotoj shares the same sentiment.

Yes, that is what I would expect in this change.

Jun 17 '24 18:06 naotoj

For StringBuf proxy, is it acceptible for us to introduce a new jdk.internal public interface (accessible only within java.base module) to expose common public methods in AbstractStringBuilder? We have public types extending or implementing non-public-types in the JDK (AbstractStringBuilder, NamedPackage) so I guess having a new module-specific superinterface would be fine? Need verification from API experts.

Jun 17 '24 18:06 liach

Hi @liach Do you know any other places within java.base where we would need the same proxy for StringBuffer?

Jun 17 '24 19:06 naotoj

Hi @liach Do you know any other places within java.base where we would need the same proxy for StringBuffer?

Good question! I looked at https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/lang/class-use/StringBuffer.html. I think such a new interface indeed is of limited usefulness, as we don't really have that many non-java.lang APIs closely tied to StringBuffer. Matcher is like one, but it lives mostly fine without the shadowing because it is using Appendable. And this has enlightened me.

In fact, we can use Appendable too, as we just need 2 append from Appendable and subSequence (replacing substring) and length from CharSequence. We can declare method like:

<T extends Appendable & CharSequence> T format(double number, T toAppendTo,
                         FieldPosition status) {

This signature accepts both StringBuilder and StringBuffer; all use sites can be according updated. The only thing need to change is that substring should now become subSequence, but it's just used in CharacterIteratorFieldDelegate so the impact is small.

Jun 17 '24 20:06 liach

Hi @liach Do you know any other places within java.base where we would need the same proxy for StringBuffer?

Good question! I looked at https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/lang/class-use/StringBuffer.html. I think such a new interface indeed is of limited usefulness, as we don't really have that many non-java.lang APIs closely tied to StringBuffer. Matcher is like one, but it lives mostly fine without the shadowing because it is using Appendable. And this has enlightened me.

In fact, we can use Appendable too, as we just need 2 append from Appendable and subSequence (replacing substring) and length from CharSequence. We can declare method like:
<T extends Appendable & CharSequence> T format(double number, T toAppendTo,
                         FieldPosition status) {
This signature accepts both StringBuilder and StringBuffer; all use sites can be according updated. The only thing need to change is that substring should now become subSequence, but it's just used in CharacterIteratorFieldDelegate so the impact is small.

That looks promising! Much cleaner than having dual methods

Jun 17 '24 22:06 naotoj

@lingjun-cg Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

Jun 18 '24 09:06 openjdk[bot]

@justin-curtis-lu The method isInternalSubclass in DecimalFormat already pulls up to Format. @liach @naotoj Using Appendable & CharSequence instead of an interface with a limited method is a great idea. Following this idea, I have updated the PR. But there are some notes when implementing the idea.

Rename format to formatWithGeneric in DecimalFormat that accepts StringBuilder. If not that, throw a StackOverflowException when run with DecimalFormat.format(double). If not rename the format, there are 2 methods in DecimalFormat:

   @Override
    public StringBuffer format(double number, StringBuffer result,
                               FieldPosition fieldPosition) {
        return format(number, result, fieldPosition);
    }
    @Override
    <T extends Appendable & CharSequence> T format(double number, T result,
                                                   FieldPosition fieldPosition) {
   }

Because the second parameter type StringBuffer of the 1st method is more concrete than type T of the 2nd method, so the 1st method will call itself recursively.

The Appendable#append(java.lang.CharSequence) method can throws IOException, so there is some messy code handling IOException.

Jun 18 '24 10:06 lingjun-cg

Using <T extends Appendable & CharSequence> instead of StringBuf seems a good idea, but it has a disadvantage. The Appendable defines only 3 append methods which cannot cover some cases. For example, this method SimpleDateFormat#format(Date, StringBuffer, FieldDelegate) contains the following code:

            case TAG_QUOTE_CHARS:
                toAppendTo.append(compiledPattern, i, count);
                i += count;
                break;

It requires append(char[], int, int), but the Appendable has no such method. So it's difficult to use <T extends Appendable & CharSequence> in methods String DateFormat.format(Date date), String ListFormat.format(List<String> input) and, String MessageFormat.format(Object obj).

The method in java.text.MessageFormat#subformat contains the following code:

            int argumentNumber = argumentNumbers[i];
            if (arguments == null || argumentNumber >= arguments.length) {
                result.append('{').append(argumentNumber).append('}');
                continue;
            }

It requires append(int), but the Appendable has no such method.

To sum up

<T extends Appendable & CharSequence> is clear, but lack of extensibility
StringBuf is not so clear, but it's more extensibility. We can add append(char[], int, int) to StringBuf to support SimpleDateFormat.
New interface StringBuf no need to handle IOException, but use <T extends Appendable & CharSequence> need to add a lot of try...catch statements around the append method.

So it seems StringBuf is better if we want to fix the problem in DateFormat,ListFormat,MessageFormat in unique way. I look forward to your reply. @naotoj @liach @justin-curtis-lu

By the way, we can let the new classes StringBuf and StringBufFactory as the inner class of Format,because StringBuf is only used by the Format`'s subclasses.

Jun 19 '24 02:06 lingjun-cg

If change the visibility of java.lang.AbstractStringBuilder from package-private to public, it seems more simple. After that user code cannot extend AbstractStringBuilder because it's sealed class.

Jun 20 '24 01:06 lingjun-cg