jdk icon indicating copy to clipboard operation
jdk copied to clipboard

8357289: Break down the String constructor into smaller methods

Open wenshao opened this issue 8 months ago • 16 comments

Through JVM Option +PrintInlining, we found that String has a constructor codeSize of 852, which is too large. This caused failed to inline.

The following is the output information of PrintInlining:

                @ 9   java.lang.String::<init> (12 bytes)   inline (hot)
!m                 @ 1   java.nio.charset.Charset::defaultCharset (52 bytes)   inline (hot)
!                  @ 8   java.lang.String::<init> (852 bytes)   failed to inline: hot method too big

In Java code, the big method that cannot be inlined is the following constructor

String(Charset charset, byte[] bytes, int offset, int length) {}

The above String constructor is too large; break it down into smaller methods with a codeSize under 325 to allow them to be inlined by the C2.


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [x] Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8357289: Break down the String constructor into smaller methods (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25290/head:pull/25290
$ git checkout pull/25290

Update a local copy of the PR:
$ git checkout pull/25290
$ git pull https://git.openjdk.org/jdk.git pull/25290/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25290

View PR using the GUI difftool:
$ git pr show -t 25290

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25290.diff

Using Webrev

Link to Webrev Comment

wenshao avatar May 18 '25 12:05 wenshao

:wave: Welcome back swen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar May 18 '25 12:05 bridgekeeper[bot]

@wenshao This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8357289: Break down the String constructor into smaller methods

Reviewed-by: liach, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 609 new commits pushed to the master branch:

  • 5a1301df19553c7ba04c746b4002164f3b833e70: 8360548: Parallel: Remove outdated comments in MutableNUMASpace::bias_region
  • 5039b42de170769797312969185ee9d67f34cf24: 8359437: Make users and test suite not able to set LockingMode flag
  • 1ca008fd02496dc33e2707c102560cae1690fba5: 8360255: runtime/jni/checked/TestLargeUTF8Length.java fails with -XX:-CompactStrings
  • ... and 606 more: https://git.openjdk.org/jdk/compare/6c42856b8d5039c14ba04a48c60d09039d5030fe...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar May 18 '25 12:05 openjdk[bot]

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk[bot] avatar May 18 '25 12:05 openjdk[bot]

Below are the performance numbers on the MacBook M1 Max. The numbers show that after the small split method, the performance has been significantly improved.

-# master 6c42856b8d5
-Benchmark                                                  (size)  Mode  Cnt   Score   Error  Units
-StringConstructor.newStringFromBytes                            7  avgt   15   6.628 ± 0.048  ns/op
-StringConstructor.newStringFromBytes                           64  avgt   15  10.182 ± 0.079  ns/op
-StringConstructor.newStringFromBytesRanged                      7  avgt   15  10.187 ± 0.871  ns/op
-StringConstructor.newStringFromBytesRanged                     64  avgt   15  11.304 ± 0.111  ns/op
-StringConstructor.newStringFromBytesRangedWithCharsetUTF8       7  avgt   15  10.869 ± 0.753  ns/op
-StringConstructor.newStringFromBytesRangedWithCharsetUTF8      64  avgt   15  11.348 ± 0.134  ns/op
-StringConstructor.newStringFromBytesWithCharsetNameUTF8         7  avgt   15   9.483 ± 0.119  ns/op
-StringConstructor.newStringFromBytesWithCharsetNameUTF8        64  avgt   15  12.755 ± 0.089  ns/op
-StringConstructor.newStringFromBytesWithCharsetUTF8             7  avgt   15   6.721 ± 0.107  ns/op
-StringConstructor.newStringFromBytesWithCharsetUTF8            64  avgt   15  10.208 ± 0.065  ns/op


+# current 4ebac0ddc64
+Benchmark                                                  (size)  Mode  Cnt   Score   Error  Units
+StringConstructor.newStringFromBytes                            7  avgt   15   4.715 ± 0.029  ns/op +40.57%
+StringConstructor.newStringFromBytes                           64  avgt   15   8.019 ± 0.152  ns/op +26.97%
+StringConstructor.newStringFromBytesRanged                      7  avgt   15   5.563 ± 0.059  ns/op +83.12%
+StringConstructor.newStringFromBytesRanged                     64  avgt   15   9.549 ± 0.217  ns/op +18.37%
+StringConstructor.newStringFromBytesRangedWithCharsetUTF8       7  avgt   15   5.579 ± 0.076  ns/op +94.81%
+StringConstructor.newStringFromBytesRangedWithCharsetUTF8      64  avgt   15   9.407 ± 0.047  ns/op +20.63%
+StringConstructor.newStringFromBytesWithCharsetNameUTF8         7  avgt   15   8.168 ± 0.084  ns/op +16.09%
+StringConstructor.newStringFromBytesWithCharsetNameUTF8        64  avgt   15  12.574 ± 0.268  ns/op + 1.43%
+StringConstructor.newStringFromBytesWithCharsetUTF8             7  avgt   15   4.722 ± 0.043  ns/op +42.33%
+StringConstructor.newStringFromBytesWithCharsetUTF8            64  avgt   15   8.077 ± 0.144  ns/op +26.38%

wenshao avatar May 18 '25 15:05 wenshao

I wonder if it would be better to first check COMPACT_STRINGS in a first-level if and then branch off to separate support methods? Looking at the comments near the declaration of COMPACT_STRINGS, this might provide additional benefits.

minborg avatar May 20 '25 08:05 minborg

I wonder if it would be better to first check COMPACT_STRINGS in a first-level if and then branch off to separate support methods? Looking at the comments near the declaration of COMPACT_STRINGS, this might provide additional benefits.

Now all the broken methods have CodeSize < 325 and can be inlined by C2. If we check COMPACT_STRINGS at the first level, we need to change more. I want to achieve the goal with minimal changes.

wenshao avatar May 20 '25 08:05 wenshao

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Jun 23 '25 16:06 bridgekeeper[bot]

@cl4es @minborg Could you help me take a look again?

wenshao avatar Jun 24 '25 08:06 wenshao

/reviewers 2 reviewer

RogerRiggs avatar Jun 24 '25 19:06 RogerRiggs

@RogerRiggs The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

openjdk[bot] avatar Jun 24 '25 19:06 openjdk[bot]

We may have another case of optimizing for the JMH microbenchmark, and it would be more valuable to see improvements in higher level benchmarks. Such as DaCapo. BioJava. Where is the most benefit of inlining? Inlinlining increases the code size of the caller, but only results in a benefit if the resulting code can avoid some code paths, accesses, or branches. Smaller methods do give C2 more inlining options and can make the code more readable or composable.

RogerRiggs avatar Jun 24 '25 19:06 RogerRiggs

I think the main advantage here is that with smaller, static methods, it allows us to merge these replacement-decode methods with the no-replacement-decode methods. This is already how the encoding methods are arranged, and can simplify maintenance in the future.

liach avatar Jun 24 '25 19:06 liach

For startup time, I believe this should have little impact: String itself, without allocation of its storage array, should be as expensive as generic record objects. This should have similar impact of declaring new records (minimal), and in addition, this doesn't introduce new classes, so there will be even less of an effect.

liach avatar Jun 25 '25 22:06 liach

For startup time, I believe this should have little impact: String itself, without allocation of its storage array, should be as expensive as generic record objects. This should have similar impact of declaring new records (minimal), and in addition, this doesn't introduce new classes, so there will be even less of an effect.

Yes, but the comparison should be against the non-allocating (current implementation) not the initial implementation in the PR. Startup measurement are done periodically and if there is any impact it will show up.

RogerRiggs avatar Jun 26 '25 13:06 RogerRiggs

Scalar  293  CheckCastPP  === 290 288  [[ 430 367 367 430 357 357 ]]  #java/lang/String (java/io/Serializable,java/lang/Comparable,java/lang/CharSequence,java/lang/constant/Constable,java/lang/constant/ConstantDesc):NotNull:exact *,iid=276  Oop:java/lang/String (java/io/Serializable,java/lang/Comparable,java/lang/CharSequence,java/lang/constant/Constable,java/lang/constant/ConstantDesc):NotNull:exact *,iid=276 !jvms: String::iso88591 @ bci:6 (line 624) String::<init> @ bci:42 (line 562) String::<init> @ bci:12 (line 1463) StringConstructor::newStringFromBytesCharSet @ bci:12 (line 19) StringConstructorTest::newStringFromBytesCharSet @ bci:25 (line 27)
++++ Eliminated: 276 Allocate

By configuring the JVM startup parameters -XX:+UnlockDiagnosticVMOptions -XX:+PrintEliminateAllocations to print the log, the allocated String temporary object escapes successfully

wenshao avatar Jun 27 '25 04:06 wenshao

/integrate

wenshao avatar Jun 27 '25 14:06 wenshao

Going to push as commit 839cede1a46b05d27abeaffbbd82c241910035cd. Since your change was applied there have been 627 commits pushed to the master branch:

  • ecd2d83096a1fea7d5086736306770bcffa4fdb6: 8359435: AArch64: add support for SB instruction to MacroAssembler::spin_wait
  • d8f9b188fa488c9c6e343c62a148cfe9fc8a563b: 8268406: Deallocate jmethodID native memory
  • aa26cede635011f5cc075cd528934ce8d8e8eef9: 8360474: Add missing include guards for some HotSpot headers
  • ... and 624 more: https://git.openjdk.org/jdk/compare/6c42856b8d5039c14ba04a48c60d09039d5030fe...master

Your commit was automatically rebased without conflicts.

openjdk[bot] avatar Jun 27 '25 14:06 openjdk[bot]

@wenshao Pushed as commit 839cede1a46b05d27abeaffbbd82c241910035cd.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk[bot] avatar Jun 27 '25 14:06 openjdk[bot]