Inconsistent execution results after System.arraycopy
Java -version output
openjdk version "11.0.22" 2024-01-16
IBM Semeru Runtime Open Edition 11.0.22.0 (build 11.0.22+7)
Eclipse OpenJ9 VM 11.0.22.0 (build openj9-0.43.0, JRE 11 Mac OS X amd64-64-Bit Compressed References 20240131_864 (JIT enabled, AOT enabled)
OpenJ9 - 2c3d78b48
OMR - ea8124dbc
JCL - 7876cac747 based on jdk-11.0.22+7)
openjdk version "17.0.10" 2024-01-16
IBM Semeru Runtime Open Edition 17.0.10.0 (build 17.0.10+7)
Eclipse OpenJ9 VM 17.0.10.0 (build openj9-0.43.0, JRE 17 Mac OS X amd64-64-Bit Compressed References 20240116_636 (JIT enabled, AOT enabled)
OpenJ9 - 2c3d78b48
OMR - ea8124dbc
JCL - 2aad089841f based on jdk-17.0.10+7)
openjdk version "21.0.2" 2024-01-16 LTS
IBM Semeru Runtime Open Edition 21.0.2.0 (build 21.0.2+13-LTS)
Eclipse OpenJ9 VM 21.0.2.0 (build openj9-0.43.0, JRE 21 Mac OS X amd64-64-Bit Compressed References 20240116_95 (JIT enabled, AOT enabled)
OpenJ9 - 2c3d78b48
OMR - ea8124dbc
JCL - 78c4500a434 based on jdk-21.0.2+13)
openjdk version "1.8.0_402"
IBM Semeru Runtime Open Edition (build 1.8.0_402-b06)
Eclipse OpenJ9 VM (build openj9-0.43.0, JRE 1.8.0 Mac OS X amd64-64-Bit Compressed References 20240131_892 (JIT enabled, AOT enabled)
OpenJ9 - 2c3d78b48
OMR - ea8124dbc
JCL - 0fa9d9c532 based on jdk8u402-b06)
Summary of problem
public class ArrayCopyTest {
    public static int[] src1 = new int[1];
    public static int[] dst1 = new int[536870913];
    public static int sum = 0;

    public static void main(String[] var0) {
        for (int var1 = 0; var1 < 11000; var1++) {
            m1();
            m1();
        }
        System.out.println(sum);
    }

    public static void m1() {
        int var1 = 536870912;
        System.arraycopy(src1, 0, dst1, var1, 1);
        sum = checksum(sum, var1);
    }

    public static int checksum(int var0, int var1) {
        var0 += Integer.hashCode(var1);
        return Integer.hashCode(var0);
    }
}
The test program above produces different execution results across different runs. This seems to be a JIT bug since the inconsistent execution results do not occur when the -Xnojit option is specified. The test program and the reproduction script can be found at JITIssues-1.zip. Please replace the Java cmd path in test.sh with your Java cmd. This bug affects OpenJDK 8, OpenJDK 11, OpenJDK 17, and OpenJDK 21.
@hzongaro fyi
@Brad0309, what differences in behaviour have you seen? I'm consistently seeing an OutOfMemoryError being thrown in my attempts to run the test.
jdk8u402-b06/bin/java -Xmx8g ArrayCopyTest
536870912
jdk8u402-b06/bin/java -Xmx8g ArrayCopyTest
-1073741824
jdk8u402-b06/bin/java -Xmx8g ArrayCopyTest
0
With -Xint the result is 0.
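Since Integer.hashCode(int) simply returns its argument, the interpreter result can be checked by hand: each m1() call effectively adds 536870912 (2^29) to sum with int wraparound, and 22000 such additions wrap to exactly 0 (22000 is a multiple of 8, and 8 * 2^29 = 2^32). A minimal sketch of that arithmetic:

```java
public class ExpectedSum {
    public static void main(String[] args) {
        int sum = 0;
        // 11000 loop iterations * 2 calls to m1() = 22000 checksum updates.
        for (int i = 0; i < 22000; i++) {
            // Integer.hashCode(x) == x, so checksum() reduces to wrapping addition.
            sum += 536870912;
        }
        System.out.println(sum); // 22000 * 2^29 mod 2^32 == 0
    }
}
```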
Thanks, @pshipton!
@jmesyou, may I ask you to look at this problem?
I cannot reproduce the problem with -Xjit:limit=m1,optlevel=noOpt, but I can with -Xjit:limit=m1,optlevel=cold.
The trace file at the noOpt level has the node istore ArrayCopyTest.sum I, but at the cold level the node disappears in the local value propagation phase, as the following log shows.
Performing 24: localValuePropagation
[ 17] O^O VALUE PROPAGATION: Changing call call [000000011F544500] to arraycopy
[ 18] O^O VALUE PROPAGATION: Removing rest of block after ArrayCopyBNDCHK [000000011F640230]
[ 19] O^O VALUE PROPAGATION: Removing unreachable block_3 at [000000011F6006A0]
Thanks for taking a look @knn-k! I am also able to reproduce the problem using the same parameters on HEAD. It is, however, quite intermittent, sometimes taking > 100 tries to get the incorrect behavior.
@knn-k Do you happen to have a log from when you reproduced the issue? I've been able to reproduce the issue but when I turn logging on, it goes away!
@jmesyou Attached please find my JIT trace files for noOpt and cold. 19247.zip
There is no store to ArrayCopyTest.sum after localValuePropagation in m1-cold.txt.
I used the following command lines:
java -Xnoaot -Xjit:limit=m1,optlevel=noOpt,traceFull,log=m1-noOpt.txt,count=0 ArrayCopyTest
java -Xnoaot -Xjit:limit=m1,optlevel=cold,traceFull,log=m1-cold.txt,count=0 ArrayCopyTest
openjdk version "11.0.23-internal" 2024-04-16
OpenJDK Runtime Environment (build 11.0.23-internal+0-adhoc.kaz.openj9-openjdk-jdk11)
Eclipse OpenJ9 VM (build master-4a7b54d13, JRE 11 Mac OS X aarch64-64-Bit 20240405_000000 (JIT enabled, AOT enabled)
OpenJ9 - 4a7b54d13
OMR - 974bec3a2
JCL - 0f9c69e64d based on jdk-11.0.23+7)
@dylanjtuttle, may I ask you to look at this problem?
The trace message
[ 18] O^O VALUE PROPAGATION: Removing rest of block after ArrayCopyBNDCHK [000000011F640230]
comes from OMR::ValuePropagation::mustTakeException(). This function calls OMR::ValuePropagation::removeRestOfBlock(), which actually performs the (seemingly) erroneous removal of all trees after the current tree top.
constrainArrayCopyBndChk() calls mustTakeException when it has determined that "the exception will always be taken". From the context of the surrounding code, I am pretty sure that this means the compiler has placed value constraints on the ArrayCopyBNDCHK and its children, and those constraints indicate to the compiler that the check is guaranteed to fail, so it calls mustTakeException to remove what it thinks is dead code.
Some potential reasons this might be going wrong:
- The value numbering of the nodes is incorrect
- The constraints have been placed on the nodes incorrectly in some way
- The value numbers and constraints are correct, but the compiler is incorrectly interpreting them
- Some other reason that I haven't thought of
I'll look into these possibilities and see what I can figure out!
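As a rough illustration of the reasoning above, value propagation can be thought of as keeping an integer range [low, high] for each node, and a bound check is provably doomed only when even the smallest possible index exceeds the largest possible limit. The sketch below is my own simplification, not OpenJ9 code (Range and mustThrow are hypothetical names):

```java
// Simplified model of the "exception will always be taken" decision.
// Range and mustThrow are illustrative names, not OpenJ9 APIs.
final class Range {
    final int low, high;
    Range(int low, int high) { this.low = low; this.high = high; }
}

final class BoundCheckSketch {
    // An ArrayCopyBNDCHK-style check passes when index <= limit, so it is
    // guaranteed to throw only if index.low > limit.high.
    static boolean mustThrow(Range index, Range limit) {
        return index.low > limit.high;
    }
}
```

For instance, an index constrained to [5, 5] against a limit of [0, 3] must throw, while the same index against [0, 10] need not. If either the constraints or their interpretation are wrong, correct code downstream gets removed.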
n1n BBStart <block_2> (freq 10000) [ 0x11fb50190] bci=[-1,0,17] rc=0 vc=34 vn=- li=- udi=- nc=0
n4n istore <auto slot 0>[#352 Auto] [flags 0x3 0x0 ] [ 0x11fb50280] bci=[-1,2,17] rc=0 vc=34 vn=- li=- udi=- nc=1
n3n   iconst 0x20000000 (X!=0 X>=0 ) [ 0x11fb50230] bci=[-1,0,17] rc=2 vc=34 vn=- li=- udi=- nc=0 flg=0x104
n6n ResolveCHK [#281] [ 0x11fb50320] bci=[-1,3,18] rc=0 vc=34 vn=- li=- udi=- nc=1
n5n   aload ArrayCopyTest.src1 [I[#353 unresolved notAccessed volatile Static] [flags 0x2307 0x0 ] [ 0x11fb502d0] bci=[-1,3,18] rc=2 vc=34 vn=- li=- udi=- nc=0
n9n ResolveCHK [#281] [ 0x11fb50410] bci=[-1,7,18] rc=0 vc=34 vn=- li=- udi=- nc=1
n8n   aload ArrayCopyTest.dst1 [I[#354 unresolved notAccessed volatile Static] [flags 0x2307 0x0 ] [ 0x11fb503c0] bci=[-1,7,18] rc=2 vc=34 vn=- li=- udi=- nc=0
n53n NULLCHK on n5n [#32] [ 0x11fc4c050] bci=[-1,3,18] rc=0 vc=34 vn=- li=- udi=- nc=1
n52n   arraylength (stride 4) (X>=0 cannotOverflow ) [ 0x11fc4c000] bci=[-1,3,18] rc=2 vc=34 vn=- li=- udi=- nc=1 flg=0x1100
n5n     ==>aload
n55n NULLCHK on n8n [#32] [ 0x11fc4c0f0] bci=[-1,7,18] rc=0 vc=34 vn=- li=- udi=- nc=1
n54n   arraylength (stride 4) (X>=0 cannotOverflow ) [ 0x11fc4c0a0] bci=[-1,7,18] rc=2 vc=34 vn=- li=- udi=- nc=1 flg=0x1100
n8n     ==>aload
n57n ArrayCopyBNDCHK [#1] [ 0x11fc4c190] bci=[-1,3,18] rc=0 vc=34 vn=- li=- udi=- nc=2
n56n   isub (cannotOverflow ) [ 0x11fc4c140] bci=[-1,3,18] rc=1 vc=34 vn=- li=- udi=- nc=2 flg=0x1000
n52n     ==>arraylength
n11n     iconst 1 (X!=0 X>=0 ) [ 0x11fb504b0] bci=[-1,11,18] rc=2 vc=34 vn=- li=- udi=- nc=0 flg=0x104
n7n   iconst 0 (X==0 X>=0 X<=0 ) [ 0x11fb50370] bci=[-1,6,18] rc=1 vc=34 vn=- li=- udi=- nc=0 flg=0x302
n59n ArrayCopyBNDCHK [#1] [ 0x11fc4c230] bci=[-1,7,18] rc=0 vc=34 vn=- li=- udi=- nc=2
n58n   isub (cannotOverflow ) [ 0x11fc4c1e0] bci=[-1,7,18] rc=1 vc=34 vn=- li=- udi=- nc=2 flg=0x1000
n54n     ==>arraylength
n11n     ==>iconst 1
n3n   ==>iconst 0x20000000
n60n return [ 0x11fc4c280] bci=[-1,7,18] rc=0 vc=34 vn=- li=- udi=- nc=0
n37n BBEnd </block_2> [ 0x11fb50cd0] bci=[0,2,23] rc=0 vc=33 vn=- li=- udi=- nc=0
From this block, we can tell that two ArrayCopyBNDCHKs are performed: one on ArrayCopyTest.src1 and one on ArrayCopyTest.dst1. The first compares 0 with src1.length - 1, while the second compares 0x20000000 (or 536870912 in decimal) with dst1.length - 1.
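Assuming that reading of the trees is right (each isub computes arraylength - copyLength, which is compared against the starting position), each ArrayCopyBNDCHK can be restated in plain Java roughly as follows; the class and method names here are my own:

```java
// Rough restatement of the condition each ArrayCopyBNDCHK above encodes:
// throw unless position <= arrayLength - copyLength.
final class ArrayCopyBndChkSketch {
    static void check(int arrayLength, int copyLength, int position) {
        if (position > arrayLength - copyLength) {
            throw new ArrayIndexOutOfBoundsException(position);
        }
    }

    public static void main(String[] args) {
        check(1, 1, 0);                 // src check: 0 <= 1 - 1
        check(536870913, 1, 536870912); // dst check: 536870912 <= 536870913 - 1
        System.out.println("both checks pass");
    }
}
```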
As we can see from the test case, src1.length = 1 and dst1.length = 536870913, so both of our ArrayCopyBNDCHKs should pass without issue. However, when printing out the values of low and high in constrainArrayCopyBndChk(), I saw the following output:
low (0) <= high (536870910)
low (536870912) <= high (536870910)
It appears that the value of dst1.length - 1 is occasionally 536870910 instead of 536870912. I'm not entirely sure why the first ArrayCopyBNDCHK is comparing 0 to dst1.length - 1 instead of src1.length - 1, but I think the important takeaway is that for some reason, the length of dst1 is occasionally computed as two elements shorter than it should be, causing our second ArrayCopyBNDCHK to fail and the optimizer to remove the trees that follow it.
This leads me to believe that the problem may actually be somewhere like constrainArraylength() instead. I will continue to investigate!
I believe I have found the source of the error! On line 1957 of constrainAload():
if (elementSize != 0)
   {
   constraint = TR::VPClass::create(vp, (TR::VPClassType*)constraint, NULL, NULL,
                   TR::VPArrayInfo::create(vp, 0, elementSize == 0 ? TR::getMaxSigned<TR::Int32>() : TR::getMaxSigned<TR::Int32>()/elementSize, elementSize),
                   TR::VPObjectLocation::create(vp, TR::VPObjectLocation::NotClassObject));
   }
When we create the VPArrayInfo object, we pass a value of
elementSize == 0 ? TR::getMaxSigned<TR::Int32>() : TR::getMaxSigned<TR::Int32>()/elementSize
as our highBound. The ternary operator is redundant here because we already know that elementSize != 0, meaning our highBound is really just
TR::getMaxSigned<TR::Int32>()/elementSize
TR::getMaxSigned<TR::Int32>() = 2147483647 and elementSize = 4 here, so when you integer divide them, you obtain 536870911, which is our two-elements-too-short incorrect array length.
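That arithmetic is easy to confirm in isolation: Integer.MAX_VALUE / 4 underestimates the maximum int[] element count here, so a destination index of 536870912 into an array of 536870913 elements looks out of range. A minimal check:

```java
public class BoundArithmetic {
    public static void main(String[] args) {
        int elementSize = 4; // stride of an int[]
        int highBoundElements = Integer.MAX_VALUE / elementSize;
        System.out.println(highBoundElements); // 536870911

        // dst1 really has 536870913 elements, so index 536870912 is valid,
        // yet it exceeds the incorrect bound of highBoundElements - 1.
        System.out.println(536870912 > highBoundElements - 1); // true
    }
}
```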
@hzongaro, do you know why we're doing this? Dividing the total size of the array by the element size seems like a reasonable way to obtain the length of an array, but why might we be using the maximum value of an Int32? Do you have any thoughts about what we should be doing instead to obtain a more accurate length?
Do you have any thoughts about what we should be doing instead to obtain a more accurate length?
This was probably an historical limitation. As @jdmpapin mentioned off-line, there are other places in VP that take the heap size into account in determining the maximum number of elements in an array. For instance, constrainArraylength uses TR::Compiler->om.maxArraySizeInElements.
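The general idea of a heap-aware bound can be sketched as follows. This is illustrative only, with hypothetical names: maxHeapBytes and headerBytes stand in for whatever heap geometry and object layout values the real maxArraySizeInElements consults.

```java
// Illustrative only: not the OpenJ9 implementation of
// TR::Compiler->om.maxArraySizeInElements.
final class MaxElemsSketch {
    static long maxArrayElements(long maxHeapBytes, long headerBytes, int elementSize) {
        // An array can hold at most what fits in the heap after its header...
        long byLayout = (maxHeapBytes - headerBytes) / elementSize;
        // ...and can never exceed what an int index can address.
        return Math.min(byLayout, Integer.MAX_VALUE);
    }
}
```

The point is only that the element bound should come from heap capacity and object layout, not from dividing Integer.MAX_VALUE by the element size.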
I believe I have found the source of the error!
Good work! :-)
It looks like replacing the old method of calculating the length with
TR::Compiler->om.maxArraySizeInElements(elementSize, vp->comp())
as in constrainArraylength() does fix the problem. Printing the values of low and high in constrainArrayCopyBndChk() as before, we can see that the high bound is now more than large enough:
low (0) <= high (1048330239)
low (536870912) <= high (1048330239)
I have not been able to reproduce the bug since making this change, but since it was sometimes difficult to reproduce before I don't want to say that it is gone for certain.
Reopening until the PR is added to 0.48.
Issue Number: 19247
Status: Closed
Actual Components: comp:jit, userRaised
Actual Assignees: No one :(
PR Assignees: No one :(