openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

OpenJDK NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null

Open pshipton opened this issue 3 years ago • 38 comments

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_aarch64_linux_Nightly/207 jdk_lang_1 -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:-UseCompressedOops java/lang/String/UnicodeCasingTest.java

01:48:57  Testing on ti_ER locale....
01:48:57  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
01:48:57  	at java.base/java.lang.Character.digit(Character.java:10661)
01:48:57  	at java.base/java.lang.Character.digit(Character.java:10609)
01:48:57  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
01:48:57  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
01:48:57  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
01:48:57  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
01:48:57  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)
01:48:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:48:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
01:48:57  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:48:57  	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
01:48:57  	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
01:48:57  	at java.base/java.lang.Thread.run(Thread.java:862)

java/lang/StringBuilder/Comparison.java

01:48:57  STDERR:
01:48:57  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
01:48:57  	at java.base/java.lang.Character.digit(Character.java:10661)
01:48:57  	at java.base/java.lang.Character.digit(Character.java:10609)
01:48:57  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
01:48:57  	at java.base/java.lang.Integer.parseInt(Integer.java:786)
01:48:57  	at org.testng.reporters.XMLReporterConfig$StackTraceLevels.parse(XMLReporterConfig.java:250)
01:48:57  	at org.testng.reporters.XMLReporterConfig.<init>(XMLReporterConfig.java:96)
01:48:57  	at org.testng.reporters.XMLReporter.<init>(XMLReporter.java:26)
01:48:57  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:92)
01:48:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:48:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
01:48:57  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:48:57  	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
01:48:57  	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
01:48:57  	at java.base/java.lang.Thread.run(Thread.java:862)

pshipton avatar Jun 06 '22 12:06 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_sanity.openjdk_aarch64_linux_OpenJDK19/4 jdk_lang_1 java/lang/String/UnicodeCasingTest.java

15:23:46  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
15:23:46  	at java.base/java.lang.Character.digit(Character.java:10935)
15:23:46  	at java.base/java.lang.Character.digit(Character.java:10883)
15:23:46  	at java.base/java.lang.Integer.parseInt(Integer.java:663)
15:23:46  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
15:23:46  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
15:23:46  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
15:23:46  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

java/lang/StringBuilder/Comparison.java

15:23:46  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
15:23:46  	at java.base/java.lang.Character.digit(Character.java:10935)
15:23:46  	at java.base/java.lang.Character.digit(Character.java:10883)
15:23:46  	at java.base/java.lang.Integer.parseInt(Integer.java:663)
15:23:46  	at java.base/java.lang.Integer.parseInt(Integer.java:781)
15:23:46  	at org.testng.reporters.XMLReporterConfig$StackTraceLevels.parse(XMLReporterConfig.java:250)
15:23:46  	at org.testng.reporters.XMLReporterConfig.<init>(XMLReporterConfig.java:96)
15:23:46  	at org.testng.reporters.XMLReporter.<init>(XMLReporter.java:26)
15:23:46  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:92)
15:23:46  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
15:23:46  	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
15:23:46  	at com.sun.javatest.regtest.agent.MainActionHelper$AgentVMRunnable.run(MainActionHelper.java:312)
15:23:46  	at java.base/java.lang.Thread.run(Thread.java:1573)
15:23:46  
15:23:46  JavaTest Message: Test threw exception: java.lang.NullPointerException
15:23:46  JavaTest Message: shutting down test

pshipton avatar Jun 30 '22 15:06 pshipton

I tried to reproduce the failure locally with JDK 17, but no luck in 50 runs.

knn-k avatar Jul 01 '22 01:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_aarch64_linux_Nightly/226 jdk_lang_1 java/lang/String/UnicodeCasingTest.java

19:54:35  Testing on lrc_IR_#Arab locale....
19:54:35  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
19:54:35  	at java.base/java.lang.Character.digit(Character.java:10661)
19:54:35  	at java.base/java.lang.Character.digit(Character.java:10609)
19:54:35  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
19:54:35  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
19:54:35  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
19:54:35  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
19:54:35  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

pshipton avatar Jul 05 '22 18:07 pshipton

30x Grinder jobs with UnicodeCasingTest.java:

  • https://openj9-jenkins.osuosl.org/job/Grinder/1106/ (jdk_custom_0) No failures
  • https://openj9-jenkins.osuosl.org/job/Grinder/1107/ (jdk_custom_1) No failures

knn-k avatar Jul 07 '22 11:07 knn-k

Two more Grinder jobs (30x each) with UnicodeCasingTest.java:

  • https://openj9-jenkins.osuosl.org/job/Grinder/1109/ (jdk_custom_0) No failures
  • https://openj9-jenkins.osuosl.org/job/Grinder/1110/ (jdk_custom_1) No failures

knn-k avatar Jul 10 '22 23:07 knn-k

Reproduced in a 30x Grinder job: https://openj9-jenkins.osuosl.org/job/Grinder/1131/ (jdk_lang_1)

This failure seems to happen only in the large heap mode. All the reported failures above are from jdk_lang_1 jobs. I ran the test in the compressed refs mode (jdk_lang_0) as https://openj9-jenkins.osuosl.org/job/Grinder/1134/, and it did not fail.

I tried to get a JIT trace file for java.lang.Character.digit(), but my attempt has been unsuccessful.

knn-k avatar Jul 15 '22 07:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_sanity.openjdk_aarch64_linux_Nightly/105 jdk_lang_0 -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:-JITServerTechPreviewMessage -XX:+UseCompressedOops java/lang/String/UnicodeCasingTest.java

21:08:08  Testing on el_CY locale....
21:08:08  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
21:08:08  	at java.base/java.lang.Character.digit(Character.java:10658)
21:08:08  	at java.base/java.lang.Character.digit(Character.java:10606)
21:08:08  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
21:08:08  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
21:08:08  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
21:08:08  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
21:08:08  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

pshipton avatar Jul 18 '22 14:07 pshipton

This latest failure is with -XX:+UseCompressedOops

pshipton avatar Jul 18 '22 14:07 pshipton

I tried running with -Xdump option (dump upon NullPointerException), but the failure was not reproduced. https://openj9-jenkins.osuosl.org/job/Grinder/1140/ https://openj9-jenkins.osuosl.org/job/Grinder/1141/

knn-k avatar Jul 19 '22 23:07 knn-k

Failure reproduced in https://openj9-jenkins.osuosl.org/job/Grinder/1167/:

[2022-07-21T07:08:58.595Z] JVMDUMP039I Processing dump event "systhrow", detail "java/lang/NullPointerException" at 2022/07/21 02:51:58 - please wait.
[2022-07-21T07:08:58.595Z] JVMDUMP032I JVM requested System dump using '/home/jenkins/workspace/Grinder/aqa-tests/TKG/output_16583858471448/jdk_lang_1/work/scratch/2/core.20220721.025158.3097098.0001.dmp' in response to an event
[2022-07-21T07:08:58.595Z] JVMDUMP010I System dump written to /home/jenkins/workspace/Grinder/aqa-tests/TKG/output_16583858471448/jdk_lang_1/work/scratch/2/core.20220721.025158.3097098.0001.dmp
[2022-07-21T07:08:58.595Z] JVMDUMP032I JVM requested Java dump using '/home/jenkins/workspace/Grinder/aqa-tests/TKG/output_16583858471448/jdk_lang_1/work/scratch/2/javacore.20220721.025158.3097098.0002.txt' in response to an event
[2022-07-21T07:08:58.595Z] JVMDUMP010I Java dump written to /home/jenkins/workspace/Grinder/aqa-tests/TKG/output_16583858471448/jdk_lang_1/work/scratch/2/javacore.20220721.025158.3097098.0002.txt
[2022-07-21T07:08:58.595Z] JVMDUMP032I JVM requested JIT dump using '/home/jenkins/workspace/Grinder/aqa-tests/TKG/output_16583858471448/jdk_lang_1/work/scratch/2/jitdump.20220721.025158.3097098.0003.dmp' in response to an event
[2022-07-21T07:08:58.595Z] JVMDUMP051I JIT dump occurred in 'AgentVMThread' thread 0x0000FFFF7C713700
[2022-07-21T07:08:58.595Z] JVMDUMP053I JIT dump is recompiling java/lang/Integer.parseInt(Ljava/lang/String;I)I

Diag files are available at https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder/1167/

knn-k avatar Jul 21 '22 23:07 knn-k

Findings from the Grinder job above:

  • The NPE was caused by an instruction in the JITed code for java/lang/Integer::parseInt(Ljava/lang/String;I)I.
  • The instruction is ldr x0, [x27] at address 0xffff4285dcb4, where x27 is zero at this point.
  • x27 is loaded from a stack slot [x27, #128]. The stack slot is initialized as zero by str xzr, [x20, #128] at the entry of the parseInt() code.
  • It is not clear what the stack slot is for. It looks like a local variable introduced by the JIT compilation.
  • I have been trying to get a JIT trace file for parseInt(), but it has been unsuccessful so far.

knn-k avatar Jul 25 '22 11:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_sanity.openjdk_aarch64_linux_Release/19 java/lang/String/UnicodeCasingTest.java

07:11:49  Testing on ga_IE locale....
07:11:49  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
07:11:49  	at java.base/java.lang.Character.digit(Character.java:10658)
07:11:49  	at java.base/java.lang.Character.digit(Character.java:10606)
07:11:49  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
07:11:49  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
07:11:49  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
07:11:49  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
07:11:49  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

pshipton avatar Jul 25 '22 21:07 pshipton

I finally got a JIT trace file of parseInt() in https://openj9-jenkins.osuosl.org/job/Grinder/1179/.

knn-k avatar Jul 25 '22 23:07 knn-k

I got a JIT trace file of Integer.parseInt() from https://openj9-jenkins.osuosl.org/job/Grinder/1179/. This failure seems to occur only when the method is compiled at the scorching level. That should be the reason of its intermittent behavior.

knn-k avatar Jul 27 '22 02:07 knn-k

I got more JIT trace files in https://openj9-jenkins.osuosl.org/job/Grinder/1198/.

knn-k avatar Jul 27 '22 22:07 knn-k

This looks like an issue with the JIT optimization. Who can look at it?

Take a look at the following files in ./aqa-tests/TKG/output_16589343036023/jdk_lang_1/work/java/lang/String/UnicodeCasingTest/ in the artifacts of https://openj9-jenkins.osuosl.org/job/Grinder/1198/.

  • parseInt.txt.1.2730531.34891.20220727.151451.2730531
  • core.20220727.151500.2730531.0001.dmp

The JIT trace file contains the output for three compilations, at levels "warm", "hot", and "scorching".

The NPE is thrown at the ldrimmx instruction below in the "scorching" compilation, trying to load VFT with the register w27 being a NULL:

 \\ java/lang/Integer.parseInt(Ljava/lang/String;I)I
 \\  190 JBinvokestatic 56 java/lang/Character.digit(CI)I
 \\       2 JBinvokestatic 88 java/lang/Character.digit(II)I
 \\          6 JBinvokevirtual 89 java/lang/CharacterData.digit(II)I

    0xffff5f9fa548 00000144 [    0xfffedf2782b0] f9400360 6 	ldrimmx 	w0, [w27, 0]		# SymRef  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ]

knn-k avatar Jul 28 '22 05:07 knn-k

The ldrimmx instruction above is for node n4141n in the following IL tree:

n4143n    ifacmpne --> block_423 BBStart at n4451n (inlineHierarchyGuard )                    [    0xffff24dafeb0] bci=[19,0,197] rc=0 vc=10 vn=- li=192 udi=- nc=3 flg=0x4020
n4141n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ]                           [    0xffff24dafe10] bci=[17,6,10658] rc=4 vc=10 vn=- li=192 udi=- nc=1
n4400n        ==>aRegLoad
n4142n      aconst 0xffff982ff700 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff24dafe60] bci=[19,0,197] rc=4 vc=10 vn=- li=192 udi=- nc=0 flg=0x10004

Node n4400n loads the reference from <temp slot 5>[#566 Auto]. The stack slot is initialized as NULL at the method entry of the generated code. The value in the slot is not changed before the NPE is thrown at n4141n.

n4400n        aRegLoad x27   <temp slot 5>[#566  Auto] [flags 0x20000007 0x0 ] (X>=0 SeenRealReference sharedMemory )  [    0xffff250e4f10] bci=[17,6,10658] rc=3 vc=10 vn=- li=19 udi=- nc=0 flg=0x8100

Note that x27 and w27 represent the same register here. It is confusing, but please ignore it now (See Issue https://github.com/eclipse/omr/issues/3137).

The localValuePropagation phase (id=179) changes the following IL node n4030n into n4143n above.

n4030n    ifacmpne --> block_225 BBStart at n2782n (inlineProfiledGuard )                     [    0xffff24dadb60] bci=[19,0,197] rc=0 vc=4070 vn=- li=- udi=- nc=2 flg=0x1020
n4028n      ==>aloadi
n4031n      aconst 0xffff982ff700 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff24dadbb0] bci=[19,0,197] rc=1 vc=4070 vn=- li=- udi=- nc=0 flg=0x10004

n4029n, the child node of n4028n, loads the object reference from <temp slot 3>[#561 Auto] instead of <temp slot 5>[#566 Auto]. Is the transformation of the ifacmpne node as expected?

Another question: Why is this failure observed only on AArch64 Linux?

knn-k avatar Jul 28 '22 05:07 knn-k

The transformation of the guard node above seems to be related to the following message in localValuePropagation.

[  4893] E^AReplacing the old guard 0000FFFF24DAFEB0 with the shiny new overridden guard 0000FFFF24F7ED98 at treetop 0000FFFF9DCA26F4

By the way, the addresses in this message is wrong. I opened PR https://github.com/eclipse/omr/pull/6628 to fix it.

knn-k avatar Jul 28 '22 06:07 knn-k

@0xdaryl there is a question in an earlier comment that seems to be directed to you.

This looks like an issue with the JIT optimization. Who can look at it?

pshipton avatar Jul 28 '22 12:07 pshipton

I ran a 50x Grinder job with Java 11 in https://openj9-jenkins.osuosl.org/job/Grinder/1200/, and it reported no failures. It is reproducible only with Java 17 and newer.

knn-k avatar Jul 29 '22 00:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_aarch64_linux_Nightly/245 jdk_lang_1 java/lang/String/UnicodeCasingTest.java

19:50:51  Testing on bs__#Latn locale....
19:50:51  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
19:50:51  	at java.base/java.lang.Character.digit(Character.java:10661)
19:50:51  	at java.base/java.lang.Character.digit(Character.java:10609)
19:50:51  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
19:50:51  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
19:50:51  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
19:50:51  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
19:50:51  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_aarch64_linux_Nightly/246 jdk_lang_0 java/lang/String/UnicodeCasingTest.java

19:44:58  Testing on en_AT locale....
19:44:58  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
19:44:58  	at java.base/java.lang.Character.digit(Character.java:10661)
19:44:58  	at java.base/java.lang.Character.digit(Character.java:10609)
19:44:58  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
19:44:58  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
19:44:58  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
19:44:58  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
19:44:58  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_sanity.openjdk_aarch64_linux_Nightly/115 test jdk_lang_0 java/lang/String/UnicodeCasingTest.java

21:11:19  Testing on fi_FI locale....
21:11:19  java.lang.NullPointerException: Cannot invoke "java.lang.CharacterData.of(int)" because "codePoint" is null
21:11:19  	at java.base/java.lang.Character.digit(Character.java:10658)
21:11:19  	at java.base/java.lang.Character.digit(Character.java:10606)
21:11:19  	at java.base/java.lang.Integer.parseInt(Integer.java:666)
21:11:19  	at UnicodeCasingTest.convert(UnicodeCasingTest.java:210)
21:11:19  	at UnicodeCasingTest.test(UnicodeCasingTest.java:150)
21:11:19  	at UnicodeCasingTest.test(UnicodeCasingTest.java:96)
21:11:19  	at UnicodeCasingTest.main(UnicodeCasingTest.java:64)

pshipton avatar Aug 03 '22 16:08 pshipton

@hzongaro : could you investigate this from the LVP angle please?

0xdaryl avatar Aug 03 '22 19:08 0xdaryl

@knn-k, I haven't been able to reproduce the failure in "internal" Grinder runs, and I don't have permission to submit Grinder runs on openj9-jenkins.osuosl.org. When you have a few minutes, may I ask you to submit another Grinder run adding the traceLocalVP option — -Xjit:{java/lang/Integer.parseInt(*}(traceFull,traceLocalVP,log=parseInt.txt)?

hzongaro avatar Aug 05 '22 15:08 hzongaro

Started a grinder Build Status

Again Build Status

pshipton avatar Aug 05 '22 15:08 pshipton

There are failures in the job above. See aqa-tests/TKG/output_16597288137941/jdk_lang_1/work/java/lang/String/UnicodeCasingTest/parseInt.txt.1.1185776.29255.20220805.155415.1185776 in the artifact of https://openj9-jenkins.osuosl.org/job/Grinder_iteration_3/108/ for the trace file.

Before local value propagation (id=180):

n3928n    NULLCHK on n1702n [#32]                                                             [    0xffff77aabb80] bci=[15,1,10661] rc=0 vc=4126 vn=- li=- udi=- nc=1
n3929n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )      [    0xffff77aabbd0] bci=[15,1,10661] rc=2 vc=4126 vn=- li=- udi=- nc=1 flg=0x4
n1702n        ==>aload
n3931n    ifacmpne --> block_208 BBStart at n2759n (inlineProfiledGuard )                     [    0xffff77aabc70] bci=[17,0,197] rc=0 vc=4126 vn=- li=- udi=- nc=2 flg=0x1020
n3929n      ==>aloadi
n3932n      aconst 0xffffb8314400 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff77aabcc0] bci=[17,0,197] rc=1 vc=4126 vn=- li=- udi=- nc=0 flg=0x10004

After local value propagation:

n3928n    NULLCHK on n1702n [#32]                                                             [    0xffff77aabb80] bci=[15,1,10661] rc=0 vc=4180 vn=- li=- udi=- nc=1
n3929n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )      [    0xffff77aabbd0] bci=[15,1,10661] rc=1 vc=4180 vn=- li=- udi=- nc=1 flg=0x4
n1702n        ==>aload
n4043n    ifacmpne --> block_208 BBStart at n2759n (inlineHierarchyGuard )                    [    0xffff77aadf70] bci=[17,0,197] rc=0 vc=0 vn=- li=- udi=- nc=2 flg=0x4020
n4041n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ]                           [    0xffff77aaded0] bci=[15,6,10661] rc=1 vc=0 vn=- li=- udi=- nc=1
n4040n        aload  <temp slot 5>[#560  Auto] [flags 0x20000007 0x0 ]                        [    0xffff77aade80] bci=[15,6,10661] rc=1 vc=0 vn=- li=- udi=- nc=0
n4042n      aconst 0xffffb8314400 (java/lang/CharacterDataLatin1.class) (classPointerConstant sharedMemory )  [    0xffff77aadf20] bci=[17,0,197] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x10000

n4040n from <temp slot 5> is a NULL here.

I see the following message with the address 0xffff77aadf70 (n4043n) in the trace file.

P2O: Considering a candidate at 0000FFFF77AABC70. Object Constraint : class Ljava/lang/CharacterDataLatin1; (non-NULL) {J9ClassObject} , Guard constraint : fixed class Ljava/lang/CharacterDataLatin1; (non-NULL) {J9ClassObject}
P2O: oldGuard 0000FFFF7726D520 newGuard 0000FFFF77AADF70 currentTree 0000FFFF77BB4508

knn-k avatar Aug 07 '22 23:08 knn-k

The IL trees above are from the "scorching" compilation.

The "hot" compilation in the same trace file does not change the inlineProfiledGuard into an inlineHierarchyGuard like the "scorching" compilation does. The inlineProfiledGuard node reuses the VFT loaded by the previous NULLCHK:

n1833n    NULLCHK on n3625n [#32]                                                             [    0xffff77272cb0] bci=[15,1,10661] rc=0 vc=7257 vn=- li=132 udi=- nc=1
n1816n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )      [    0xffff77272760] bci=[15,1,10661] rc=2 vc=7257 vn=- li=132 udi=- nc=1 flg=0x4
n3625n        ==>aRegLoad
n1818n    ifacmpne --> block_307 BBStart at n3638n (inlineProfiledGuard )                     [    0xffff77272800] bci=[17,0,197] rc=0 vc=7257 vn=- li=132 udi=- nc=3 flg=0x1020
n1816n      ==>aloadi
n1817n      aconst 0xffffb8314400 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff772727b0] bci=[17,0,197] rc=1 vc=7257 vn=- li=132 udi=- nc=0 flg=0x10004

knn-k avatar Aug 08 '22 02:08 knn-k

Another "scorching" trace file is available in the artifact of https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/139/ as aqa-tests/TKG/output_16597331655943/jdk_lang_1/work/java/lang/String/UnicodeCasingTest/parseInt.txt.1.1426511.33813.20220805.211013.1426511 .

knn-k avatar Aug 08 '22 06:08 knn-k

Thank you for all of your analysis of the failure, Konno-san @knn-k! It looks like constrainIfcmpeqne is replacing the inlineProfiledGuard with an inlineHierarchyGuard.

In the process of constructing the new guard, addDelayedConvertedGuard loads from the symbol that is used as the second operand of the call node - but the definition for that symbol only appears in the call block that's the target of the branch.

n2759n    BBStart <block_208> (freq 1) (cold) (in loop 368)                                   [    0xffff775e4e20] bci=[15,6,10661] rc=0 vc=4112 vn=- li=- udi=- nc=0
n2761n    astore  <temp slot 5>[#560  Auto] [flags 0x20000007 0x0 ] (X!=0 privatizedInlinerArg sharedMemory )  [    0xffff775e4ec0] bci=[15,1,10661] rc=0 vc=4112 vn=- li=- udi=47 nc=1 flg=0x2004
n2762n      aload  <temp slot 3>[#555  Auto] [flags 0x7 0x0 ] (X!=0 sharedMemory )            [    0xffff775e4f10] bci=[15,1,10661] rc=1 vc=4112 vn=- li=- udi=159 nc=0 flg=0x4
n2763n    treetop ()                                                                          [    0xffff775e4f60] bci=[15,6,10661] rc=0 vc=4112 vn=- li=- udi=- nc=1 flg=0x8
n2764n      icalli  java/lang/CharacterData.digit(II)I[#539  virtual Method -184] (Abstract class) [flags 0x500 0x0 ] (virtualCallNodeForAGuardedInlinedCall )  [    0xffff775e4fb0] bci=[15,6,10661] rc=2 vc=4112 vn=- li=- udi=- nc=4 flg=0x828
n2765n        aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )    [    0xffff775e5000] bci=[15,6,10661] rc=1 vc=4112 vn=- li=- udi=- nc=1 flg=0x4
n2766n          aload  <temp slot 5>[#560  Auto] [flags 0x20000007 0x0 ] (X!=0 X>=0 sharedMemory )  [    0xffff775e5050] bci=[15,6,10661] rc=1 vc=4112 vn=- li=- udi=160 nc=0 flg=0x104
n2767n        aload  <temp slot 5>[#560  Auto] [flags 0x20000007 0x0 ] (X!=0 sharedMemory )   [    0xffff775e50a0] bci=[15,6,10661] rc=1 vc=4112 vn=- li=- udi=161 nc=0 flg=0x4
n2768n        iload  <temp slot 13>[#526  Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow )       [    0xffff775e50f0] bci=[15,6,10661] rc=1 vc=4112 vn=- li=- udi=162 nc=0 flg=0x1100
n2769n        iload  radix<parm 1 I>[#336  Parm] [flags 0x40000103 0x0 ] (X>=0 cannotOverflow )  [    0xffff775e5140] bci=[15,6,10661] rc=1 vc=4112 vn=- li=- udi=163 nc=0 flg=0x1100
n2770n    istore  <temp slot 4>[#562  Auto] [flags 0x3 0x0 ] ()                               [    0xffff775e5190] bci=[15,6,10661] rc=0 vc=4112 vn=- li=- udi=48 nc=1 flg=0x8
n2764n      ==>icalli
n2760n    BBEnd </block_208> (cold) =====                                                     [    0xffff775e4e70] bci=[15,6,10661] rc=0 vc=4112 vn=- li=- udi=- nc=0

hzongaro avatar Aug 08 '22 13:08 hzongaro

The "hot" compilation in the same trace file does not change the inlineProfiledGuard into an inlineHierarchyGuard like the "scorching" compilation does.

Hmmm. It's not clear why the problem only appears at scorching. In the scorching log file, this output appears while processing block_142:

P2O: Considering a candidate at 0000FFFF77AABC70. Object Constraint : class Ljava/lang/CharacterDataLatin1; (non-NULL) {J9ClassObject} , Guard constraint : fixed class Ljava/lang/CharacterDataLatin1; (non-NULL) {J9ClassObject}
P2O: oldGuard 0000FFFF7726D520 newGuard 0000FFFF77AADF70 currentTree 0000FFFF77BB4508

P2O stands for "Profiled to Overridden". The second trace message is emitted by addDelayedConvertedGuard when the new guard is set up for a delayed transformation.

In the hot compile, the second trace message doesn't appear when processing the various inlineProfiledGuards. There's a long sequence of conditions considered by constrainIfcmpeqne in deciding whether to change the guard. I'll try to understand what might prevent the transformation at hot.

hzongaro avatar Aug 08 '22 14:08 hzongaro

Hmmm. It's not clear why the problem only appears at scorching

Now I see the reason for the difference between hot and scorching. At hot, one of the blocks that contains a profiled guard looks like this:

n1741n    BBStart <block_132> (freq 8601) (in loop 19)                                        [    0xffff77270ff0] bci=[15,1,10661] rc=0 vc=5209 vn=- li=-2 udi=- nc=0
n1833n    NULLCHK on n1745n [#32]                                                             [    0xffff77272cb0] bci=[15,1,10661] rc=0 vc=5209 vn=- li=-2 udi=- nc=1
n1816n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )      [    0xffff77272760] bci=[15,1,10661] rc=2 vc=5209 vn=- li=-2 udi=- nc=1 flg=0x4
n1745n        aload  <temp slot 3>[#555  Auto] [flags 0x7 0x0 ] (X>=0 sharedMemory )          [    0xffff77271130] bci=[16,135,75] rc=1 vc=5209 vn=- li=-1 udi=78 nc=0 flg=0x100
n1818n    ifacmpne --> block_207 BBStart at n2752n (inlineProfiledGuard )                     [    0xffff77272800] bci=[17,0,197] rc=0 vc=5209 vn=- li=-2 udi=- nc=2 flg=0x1020
n1816n      ==>aloadi
n1817n      aconst 0xffffb8314400 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff772727b0] bci=[17,0,197] rc=1 vc=5209 vn=- li=-1 udi=- nc=0 flg=0x10004
n1811n    BBEnd </block_132>                                                                  [    0xffff772725d0] bci=[17,4,197] rc=0 vc=5208 vn=- li=-2 udi=- nc=0

The constraint for the aloadi looks like this:

   n1816n aloadi gets new constraint: value 1816 is  {HeapObject,JavaLangClassObject,J9ClassObject}

At scorching it looks like this

n1699n    BBStart <block_142> (freq 8526) (extension of previous block) (in loop 19)          [    0xffff772702d0] bci=[15,1,10661] rc=0 vc=4126 vn=- li=- udi=- nc=0
n1740n    astore  <temp slot 3>[#555  Auto] [flags 0x7 0x0 ]                                  [    0xffff77270fa0] bci=[16,7,73] rc=0 vc=4126 vn=- li=- udi=16 nc=1
n1702n      aload  java/lang/CharacterDataLatin1.instance Ljava/lang/CharacterDataLatin1;[#546  final Static] [flags 0x20307 0x0 ]  [    0xffff772703c0] bci=[16,7,73] rc=2 vc=4126 vn=- li=- udi=- nc=0
n3928n    NULLCHK on n1702n [#32]                                                             [    0xffff77aabb80] bci=[15,1,10661] rc=0 vc=4126 vn=- li=- udi=- nc=1
n3929n      aloadi  <vft-symbol>[#263  Shadow] [flags 0x18607 0x0 ] (X!=0 sharedMemory )      [    0xffff77aabbd0] bci=[15,1,10661] rc=2 vc=4126 vn=- li=- udi=- nc=1 flg=0x4
n1702n        ==>aload
n3931n    ifacmpne --> block_208 BBStart at n2759n (inlineProfiledGuard )                     [    0xffff77aabc70] bci=[17,0,197] rc=0 vc=4126 vn=- li=- udi=- nc=2 flg=0x1020
n3929n      ==>aloadi
n3932n      aconst 0xffffb8314400 (java/lang/CharacterDataLatin1.class) (classPointerConstant X!=0 sharedMemory )  [    0xffff77aabcc0] bci=[17,0,197] rc=1 vc=4126 vn=- li=- udi=- nc=0 flg=0x10004
n3933n    BBEnd </block_142>                                                                  

and the contraint for the aloadi is

   n3929n aloadi gets new constraint: value 3929 is class Ljava/lang/CharacterDataLatin1; (non-NULL) {J9ClassObject}

Notice that the store to #555 appears within block_142 at scorching. My understanding is that Local Value Propagation only considers value constraints within blocks, so at hot, it's unaware that #555 contains an instance of CharacterDataLatin1, preventing the guard from being rewritten.

hzongaro avatar Aug 08 '22 14:08 hzongaro