openj9
openj9 copied to clipboard
AArch64: Inline StringUTF16.compress([CI[BII)I
This commit recognizes StringUTF16.compress([CI[BII)I, and generates inlined code for AArch64.
Jenkins test sanity alinux64,amac jdk17
Jenkins test sanity alinux64,amac jdk11
Jenkins test sanity alinux64,amac jdk17
Jenkins test sanity amac jdk11
This method is used only in Java 17+.
Should I add #if JAVA_SPEC_VERSION >= 17 to the code?
The Java method copies elements of a char[] array to a byte[] array.
If any one of the array elements is larger than 255, the method fails and returns 0.
https://github.com/ibmruntimes/openj9-openjdk-jdk17/blob/764b7257fe90012db56000a556f2b835542a6063/src/java.base/share/classes/java/lang/StringUTF16.java#L176-L190
The loop in the vectorized code loads 16 elements from the char[] array at once, using two SIMD registers. It then collects the upper bytes of the char elements using the uzp2 instruction, and checks the result to see if there is any element larger than 255 or not. If no element exceeds 255, the code collects the lower bytes using the uzp1 instruction, and stores the result to the byte[] array.
When there are 15 elements or less remaining, the code copies the array by 8-, 4-, 2-, and 1-elements using a single SIMD register. The instructions for collecting the upper and lower bytes are trn2 and xtn, respectively, instead of uzp2 and uzp1.
For Z we can borrow some logic / refactor code from [1] with some changes to handle return value, which was used to accelerate implEncode* intrinsics [2].
@knn-k Let me know if you want to take a crack at this / need some help from Z team.
[1]. https://github.com/eclipse/omr/blob/9b3e65e264a66efca05c8ab283a81686ef03c876/compiler/z/codegen/OMRTreeEvaluator.cpp#L15390 [2]. https://github.com/ibmruntimes/openj9-openjdk-jdk17/blob/764b7257fe90012db56000a556f2b835542a6063/src/java.base/share/classes/java/lang/StringCoding.java#L47-L58
Jenkins test sanity alinux64,amac jdk11,jdk17
I updated the code:
- Improved the performance by changing the checks of the failure condition
- Added comments
- Added
#if JAVA_SPEC_VERSION >= 11-#endif:StringUTF16class is available in Java 11+, but it is not in Java 8.
Jenkins compile alinux64 jdk8
Because StringUTF16.compress() is a public static method, it can be called from anywhere with any inputs. The Java implementation will throw NPEs and AIOBs if insufficiently sized source/dest arrays are passed in. Unless VP is able to prove otherwise, I think you will need to null and bound check the arrays before entering your intrinsified code. What we've done in other situations like this is to do those checks in the mainline code, and if they fail then branch to an OOL section that calls the original, unintrinsified Java method where the appropriate exception would be thrown.
My other comment is around testing and how you ensured the different sections (i.e., source array lengths) of your intrinsic were exercised. It's worth creating a unit test if one doesn't already exist.
Otherwise, the actual logic itself looks fine.
I wrote a program and used it for making sure that the StringUTF16.compress() works as expected for various offsets and lengths. I need to rewrite its code for running it as a unit test.
This commit is superseded by #20406 and https://github.com/eclipse/omr/pull/7499 that implement arraytranslateTRTO255.