ICU-23061 LowercaseTransliterator has poor scalability due to synchronized state
LowercaseTransliterator tries to reuse buffers to avoid allocation. Historically, this may have been an important optimization. However on modern machines you are much more likely to use multiple CPUs, and modern JVMs are very efficient at GC, especially for local temporary data. Contending on a monitor is a big deal nowadays, especially for something so low level and hidden.
I've prepared a benchmark using JMH to show the change in scalability:
Linux 6.12.13-200.fc41.x86_64 AMD Ryzen 7 3700X 8-Core Processor
JMH version: 1.37
JDK 23.0.2, OpenJDK 64-Bit Server VM, 23.0.2+7
Original `synchronized`:
1 Thread Benchmark Mode Cnt Score Error Units
LowercaseTransliteratorPerf.testSentence thrpt 5 789.169 ± 2.779 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 5217.250 ± 35.501 ops/ms
8 Thread Benchmark Mode Cnt Score Error Units
LowercaseTransliteratorPerf.testSentence thrpt 5 759.934 ± 4.880 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 3916.430 ± 70.545 ops/ms
New `stateless`:
1 Thread Benchmark Mode Cnt Score Error Units
LowercaseTransliteratorPerf.testSentence thrpt 5 779.504 ± 6.507 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 5641.104 ± 29.747 ops/ms
8 Thread Benchmark Mode Cnt Score Error Units
LowercaseTransliteratorPerf.testSentence thrpt 5 6340.495 ± 240.436 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 39132.015 ± 7102.311 ops/ms
The current implementation exhibits negative scalability: adding more threads hurts throughput (~5% loss going from 1T to 8T)
By removing the synchronized state, the implementation now scales perfectly (~8x increase going from 1T to 8T for the longer test case)
There is a very minor (~1%) penalty for pure single-threaded throughput since we no longer reuse state.
I believe this is acceptable in the context of allowing multiple threads to work without coordination.
Checklist
- [x] Required: Issue filed: ICU-23061
- [x] Required: The PR title must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
- [x] Required: Each commit message must be prefixed with a JIRA Issue number. Example: "ICU-1234 Fix xyz"
- [x] Issue accepted (done by Technical Committee after discussion)
- [x] Tests included, if applicable
- [x] API docs and/or User Guide docs changed or added, if applicable
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java is different
~ Your Friendly Jira-GitHub PR Checker Bot
Notice: the branch changed across the force-push!
- icu4j/perf-tests/README.txt is different
~ Your Friendly Jira-GitHub PR Checker Bot
Hooray! The files in the branch are the same across the force-push. 😃
~ Your Friendly Jira-GitHub PR Checker Bot
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java is different
~ Your Friendly Jira-GitHub PR Checker Bot
I needed to add the license header to the performance test.
Notice: the branch changed across the force-push!
- icu4j/perf-tests/pom.xml is different
~ Your Friendly Jira-GitHub PR Checker Bot
I removed the : from my commit messages, I think it was tripping up the checker. Also rebased on top of main.
The change itself looks reasonable, assuming the performance advantages are as billed. I can't speak to the POM changes or the performance test, as that's not stuff I'm really familiar with.
Usually, bug fixes in ICU need to be done in both C++ and Java when appropriate. Do we have a similar problem on the C++ side? If so, I'd like to see a fix for that here too.
These changes remind me of something. I think this Java lowercase transliterator still uses the locale. The ICU4C implementation removed that long ago because of how the registry works with sharing the singleton objects. Though the implementation in Java hard codes it to be US English casing only. So it's not much different.
I thought the transliterator casing functionality in ICU4C was doing something funny with thread safety long ago, but I can't seem to find the concerning code anymore. Perhaps it was removed, or I missed it.
I can't speak to the POM changes or the performance test, as that's not stuff I'm really familiar with.
The introduction of JMH is optional, and if the project maintainers don't want to introduce it, I can remove these changes. I see that ICU already has its own benchmark harness. JMH is nice because it handles a lot of the boilerplate for you - help messages, warming up the JVM, timing runs - in a way that the "JVM experts" have thought long and hard about. The benchmark could be converted to the ICU runner, or not committed, if that's preferred. But I think you'll find that more and more outside contributors would already come in with JMH familiarity.
We really need one or more people from the ICU4J side of things to comment here...
Notice: the branch changed across the force-push!
- icu4j/perf-tests/pom.xml is different
~ Your Friendly Jira-GitHub PR Checker Bot
I rebased and resolved conflicts.
Anything I should do to move this PR forward? If the JMH benchmark is acceptable, I can squash the commits together. I didn't want to do that until we confirmed it would be accepted.
Notice: the branch changed across the force-push!
- icu4j/main/translit/src/main/java/com/ibm/icu/text/BreakTransliterator.java is now changed in the branch
- icu4j/main/translit/src/main/java/com/ibm/icu/text/CaseFoldTransliterator.java is now changed in the branch
- icu4j/main/translit/src/main/java/com/ibm/icu/text/TitlecaseTransliterator.java is now changed in the branch
- icu4j/main/translit/src/main/java/com/ibm/icu/text/UppercaseTransliterator.java is now changed in the branch
- icu4j/perf-tests/pom.xml is different
- icu4j/perf-tests/README.txt is different
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/CaseFoldTransliteratorPerfTest.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/TitlecaseTransliteratorPerf.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/UppercaseTransliteratorPerf.java is now changed in the branch
- icu4j/pom.xml is now changed in the branch
~ Your Friendly Jira-GitHub PR Checker Bot
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/CaseFoldTransliteratorPerfTest.java is different
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java is different
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/TitlecaseTransliteratorPerf.java is different
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/UppercaseTransliteratorPerf.java is different
~ Your Friendly Jira-GitHub PR Checker Bot
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/BreakTransliteratorPerf.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java is now changed in the branch
~ Your Friendly Jira-GitHub PR Checker Bot
I re-ran the benchmarks with the latest changes.
synchronized, 1T:
Benchmark Mode Cnt Score Error Units
BreakTransliteratorPerf.testSentence thrpt 5 419.108 ± 26.005 ops/ms
BreakTransliteratorPerf.testShort thrpt 5 4022.897 ± 124.318 ops/ms
CaseFoldTransliteratorPerfTest.testSentence thrpt 5 802.483 ± 29.266 ops/ms
CaseFoldTransliteratorPerfTest.testShort thrpt 5 5383.847 ± 277.556 ops/ms
LowercaseTransliteratorPerf.testSentence thrpt 5 820.756 ± 54.370 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 5015.698 ± 216.265 ops/ms
TitlecaseTransliteratorPerf.testSentence thrpt 5 820.914 ± 54.945 ops/ms
TitlecaseTransliteratorPerf.testShort thrpt 5 4716.300 ± 276.341 ops/ms
UppercaseTransliteratorPerf.testSentence thrpt 5 706.661 ± 29.485 ops/ms
UppercaseTransliteratorPerf.testShort thrpt 5 4634.731 ± 368.809 ops/ms
synchronized, 8T:
Benchmark Mode Cnt Score Error Units
BreakTransliteratorPerf.testSentence thrpt 5 594.238 ± 15.677 ops/ms
BreakTransliteratorPerf.testShort thrpt 5 3328.109 ± 58.666 ops/ms
CaseFoldTransliteratorPerfTest.testSentence thrpt 5 797.488 ± 19.831 ops/ms
CaseFoldTransliteratorPerfTest.testShort thrpt 5 4302.125 ± 36.354 ops/ms
LowercaseTransliteratorPerf.testSentence thrpt 5 801.179 ± 21.477 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 4366.354 ± 21.683 ops/ms
TitlecaseTransliteratorPerf.testSentence thrpt 5 802.491 ± 5.279 ops/ms
TitlecaseTransliteratorPerf.testShort thrpt 5 3835.846 ± 205.297 ops/ms
UppercaseTransliteratorPerf.testSentence thrpt 5 598.344 ± 16.620 ops/ms
UppercaseTransliteratorPerf.testShort thrpt 5 3612.776 ± 10.186 ops/ms
local-alloc, 1T:
Benchmark Mode Cnt Score Error Units
BreakTransliteratorPerf.testSentence thrpt 5 378.975 ± 8.528 ops/ms
BreakTransliteratorPerf.testShort thrpt 5 2176.368 ± 66.654 ops/ms
CaseFoldTransliteratorPerfTest.testSentence thrpt 5 817.171 ± 40.881 ops/ms
CaseFoldTransliteratorPerfTest.testShort thrpt 5 5887.798 ± 463.620 ops/ms
LowercaseTransliteratorPerf.testSentence thrpt 5 847.376 ± 53.358 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 5730.719 ± 377.115 ops/ms
TitlecaseTransliteratorPerf.testSentence thrpt 5 835.202 ± 55.587 ops/ms
TitlecaseTransliteratorPerf.testShort thrpt 5 5244.637 ± 246.148 ops/ms
UppercaseTransliteratorPerf.testSentence thrpt 5 617.542 ± 43.544 ops/ms
UppercaseTransliteratorPerf.testShort thrpt 5 5048.793 ± 404.342 ops/ms
local-alloc, 8T:
Benchmark Mode Cnt Score Error Units
BreakTransliteratorPerf.testSentence thrpt 5 2805.141 ± 33.812 ops/ms
BreakTransliteratorPerf.testShort thrpt 5 4805.065 ± 171.575 ops/ms
CaseFoldTransliteratorPerfTest.testSentence thrpt 5 6810.324 ± 56.592 ops/ms
CaseFoldTransliteratorPerfTest.testShort thrpt 5 44420.295 ± 890.433 ops/ms
LowercaseTransliteratorPerf.testSentence thrpt 5 6655.300 ± 20.258 ops/ms
LowercaseTransliteratorPerf.testShort thrpt 5 42945.707 ± 116.837 ops/ms
TitlecaseTransliteratorPerf.testSentence thrpt 5 5288.342 ± 97.473 ops/ms
TitlecaseTransliteratorPerf.testShort thrpt 5 32089.013 ± 633.706 ops/ms
UppercaseTransliteratorPerf.testSentence thrpt 5 4585.569 ± 66.147 ops/ms
UppercaseTransliteratorPerf.testShort thrpt 5 36655.642 ± 421.983 ops/ms
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/BreakTransliteratorPerf.java is no longer changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/BreakTransliteratorPerfTest.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerf.java is no longer changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/LowercaseTransliteratorPerfTest.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/TitlecaseTransliteratorPerf.java is no longer changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/TitlecaseTransliteratorPerfTest.java is now changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/UppercaseTransliteratorPerf.java is no longer changed in the branch
- icu4j/perf-tests/src/main/java/com/ibm/icu/dev/test/perf/UppercaseTransliteratorPerfTest.java is now changed in the branch
~ Your Friendly Jira-GitHub PR Checker Bot
I made the suggested renamings. Please let me know what you would like to do with the BreakTransliterator non-public access case.
Notice: the branch changed across the force-push!
- icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java is different
~ Your Friendly Jira-GitHub PR Checker Bot
It looks like all of the issues have been addressed on this PR and we have been ready to merge, we just overlooked it. I did some minor bookkeeping details to push it over the finish line and finally merge it.
FYI on bookkeeping details needed before the merge: CI needed to be manually enabled again (not sure why, once has been good enough for new contributors). The CI copyright check job failed because we're adding a new file in this PR, and it didn't have that verbiage, so I amended the PR with the diff below to add it, and then did a force push back to the PR branch (and then reapproved and merged).
diff --git a/icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java b/icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java
index 5afbe1e92bd..a80b5da2fa7 100644
--- a/icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java
+++ b/icu4j/perf-tests/src/main/java/com/ibm/icu/text/BreakTransliteratorAccess.java
@@ -1,3 +1,5 @@
+// © 2025 and later: Unicode, Inc. and others.
+// License & terms of use: http://www.unicode.org/copyright.html
package com.ibm.icu.text;
public class BreakTransliteratorAccess {