aarch64 mac SharedClasses.SCM23.MultiCL_0 hang, recursive in ClassLoader.loadClass

Open pshipton opened this issue 3 years ago • 25 comments

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/79 SharedClasses.SCM23.MultiCL_0 -Xjit -Xgcpolicy:gencon -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/79/system_test_output.tar.gz

MCL3 12:06:38 >> Loaded 20000 classes...
MCL3 12:06:38 >> Total classes loaded = 20001
MCL5 12:06:39 >> Loaded 20000 classes...
MCL5 12:06:39 >> Total classes loaded = 20001
STF 12:07:34.277 - Heartbeat: Process MCL3 is still running
STF 12:12:34.110 - Heartbeat: Process MCL3 is still running
STF 12:17:34.467 - Heartbeat: Process MCL3 is still running
STF 12:22:34.389 - Heartbeat: Process MCL3 is still running
...
STF 13:57:34.034 - Heartbeat: Process MCL3 is still running
STF 14:02:34.423 - Heartbeat: Process MCL3 is still running
STF 14:02:36.464 - **FAILED** Process MCL3 has timed out
STF 14:02:36.464 - Collecting dumps for: MCL3

5.MCL3.stderr

JVMDUMP039I Processing dump event "abort", detail "" at 2022/06/16 14:04:06 - please wait.
JVMDUMP032I JVM requested Java dump using '/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16553435102162/SharedClasses.SCM23.MultiCL_0/20220616-120232-SharedClasses/results/javacore.20220616.140406.16534.0002.txt' in response to an event
JVMDUMP012E Error in Java dump: /Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16553435102162/SharedClasses.SCM23.MultiCL_0/20220616-120232-SharedClasses/results/javacore.20220616.140406.16534.0002.txt
JVMDUMP032I JVM requested Snap dump using '/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16553435102162/SharedClasses.SCM23.MultiCL_0/20220616-120232-SharedClasses/results/Snap.20220616.140406.16534.0003.trc' in response to an event
JVMDUMP032I JVM requested JIT dump using '/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16553435102162/SharedClasses.SCM23.MultiCL_0/20220616-120232-SharedClasses/results/jitdump.20220616.140406.16534.0004.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'SIGABRT Thread' thread 0x0000000140062100
JVMDUMP013I Processed dump event "abort", detail "".

I found only a javacore in the diagnostics.

3XMTHREADINFO      "<name locked>" J9VMThread:0x0000000147811900, omrthread_t:0x0000000137817E60, java/lang/Thread:0x00000002802B8BE0, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x6, isDaemon:true)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000028021C5A8)
3XMTHREADINFO1            (native thread ID:0x8B153B6, native priority:0xB, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000040)
3XMTHREADINFO2            (native stack address range from:0x000000016DB94000, to:0x000000016DC97000, size:0x103000)
3XMCPUTIME               CPU usage total: 42.126572000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=61384 (0xEFC8)
1INTERNAL                    Unable to obtain lock context information
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
...
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at (Missing Method)
3XMTHREADINFO3           No native callstack available for this thread

pshipton avatar Jun 16 '22 18:06 pshipton

@knn-k fyi

pshipton avatar Jun 16 '22 18:06 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/80 SharedClasses.SCM23.MultiCL_0

No diagnostics produced, despite the kill -3. https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/80/system_test_output.tar.gz

MCL1 10:28:49 >> Total classes loaded = 20001
...
STF 10:32:21.486 - Heartbeat: Process MCL1 is still running
STF 10:37:21.259 - Heartbeat: Process MCL1 is still running
...
STF 12:22:21.459 - Heartbeat: Process MCL1 is still running
STF 12:25:51.435 - **FAILED** Process MCL1 has timed out
STF 12:25:51.435 - Collecting dumps for: MCL1
STF 12:25:51.435 - Sending SIG 3 to the java process to generate a javacore
STF 12:25:51.435 - Running command: kill -3 83264
STF 12:25:51.435 - Redirecting stderr to /Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16554238815122/SharedClasses.SCM23.MultiCL_0/20220617-102219-SharedClasses/results/6.MCL1.kill_3.stderr
STF 12:25:51.435 - Redirecting stdout to /Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16554238815122/SharedClasses.SCM23.MultiCL_0/20220617-102219-SharedClasses/results/6.MCL1.kill_3.stdout
STF 12:25:51.438 - Pausing for 30 seconds
...
etc.

pshipton avatar Jun 17 '22 15:06 pshipton

We weren't seeing this problem before. Are there recent changes that may have caused it?

Changes in the build where the failure first occurred.

https://github.com/eclipse-openj9/openj9/compare/920c3682ec9...e5cd3ba221d https://github.com/eclipse-openj9/openj9-omr/compare/cc14b3de1a4...26b89f9f93a

pshipton avatar Jun 17 '22 16:06 pshipton

I ran SharedClasses.SCM23.MultiCL_0 with the June 14 nightly build (https://openj9-jenkins.osuosl.org/job/Build_JDK18_aarch64_mac_Nightly/78/). 3 failures in 20 runs: https://openj9-jenkins.osuosl.org/job/Grinder/960/

OpenJ9   - 920c3682ec9
OMR      - cc14b3de1a4

knn-k avatar Jun 20 '22 06:06 knn-k

I also ran the testcase with the June 13 nightly build (https://openj9-jenkins.osuosl.org/job/Build_JDK18_aarch64_mac_Nightly/77/). 1 failure in 20 runs: https://openj9-jenkins.osuosl.org/job/Grinder/961/

OpenJ9   - 7c5c4148460
OMR      - cf8ddbd1adc

knn-k avatar Jun 20 '22 08:06 knn-k

5 failures in a x20 Grinder job with the June 10 nightly build (https://openj9-jenkins.osuosl.org/job/Build_JDK18_aarch64_mac_Nightly/76/): https://openj9-jenkins.osuosl.org/job/Grinder_testList_0/130/consoleText

OpenJ9   - 3d06b2f9c2c
OMR      - cf8ddbd1adc

There are two different failure modes:

[1st and 3rd failures]

[2022-06-21T07:51:23.723Z] MCL2 stderr Assertion failed at /Users/jenkins/workspace/Build_JDK18_aarch64_mac_Nightly/openj9/runtime/compiler/control/CompilationThread.cpp:4412: isDiagnosticThread()
[2022-06-21T07:51:23.723Z] MCL2 stderr 	A compilation thread has finished a compilation but does not hold VM access

[2nd, 4th, and 5th failures]

[2022-06-21T08:20:33.265Z] MCL2 stderr Type=Segmentation error vmState=0x00000000
[2022-06-21T08:20:33.265Z] MCL2 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2022-06-21T08:20:33.265Z] MCL2 stderr Handler1=00000001045FF9E8 Handler2=00000001047C6DFC InaccessibleAddress=0000000000010008
...
[2022-06-21T08:20:33.266Z] MCL2 stderr Module=/Users/jenkins/workspace/Grinder_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
[2022-06-21T08:20:33.266Z] MCL2 stderr Module_base_address=0000000109800000 Symbol=_ZN2J97Monitor4exitEv
[2022-06-21T08:20:33.266Z] MCL2 stderr Symbol_address=00000001098D9FD8

knn-k avatar Jun 21 '22 10:06 knn-k

I started another Grinder job (https://openj9-jenkins.osuosl.org/job/Grinder/982/) using the June 9 nightly build.

knn-k avatar Jun 21 '22 10:06 knn-k

4 failures in the x20 run above.
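The Grinder results above (3, 1, 5, and 4 failures per 20 runs) suggest the failure is intermittent in every build tried, so a clean x20 run on an older build would not by itself show that the build is free of the problem. A rough sketch of that arithmetic follows. It assumes each run fails independently with a fixed rate, which is a simplification, and the helper names are made up for illustration.

```python
import math

def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Chance that every one of `runs` independent runs passes."""
    return (1.0 - failure_rate) ** runs

def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest run count for which at least one failure is expected
    with the given confidence, if the bug is present at `failure_rate`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))

# Failure rates observed in the x20 Grinder jobs above.
for failures in (1, 3, 4, 5):
    rate = failures / 20
    print(f"rate={rate:.2f}  P(20 clean runs)={prob_all_pass(rate, 20):.3f}  "
          f"runs for 95% confidence={runs_needed(rate)}")
```

At the lowest observed rate (1 in 20), a x20 job would come back clean roughly a third of the time even with the bug present, so a larger run count is needed before treating a build as good.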

knn-k avatar Jun 22 '22 02:06 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/84 SharedClasses.SCM23.MultiCL_0 STF 13:05:40.218 - **FAILED** Process MCL3 has timed out - no diagnostics

pshipton avatar Jun 23 '22 13:06 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/88/ SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/88/system_test_output.tar.gz No diagnostics

MCL5 11:55:49 >> Total classes loaded = 20001
MCL5 stderr Unhandled exception
MCL5 stderr Type=Segmentation error vmState=0x00000000
MCL5 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL5 stderr Handler1=00000001049638D4 Handler2=0000000104B2AE14 InaccessibleAddress=006100760061006A
MCL5 stderr x0=006100760061006A x1=000000016BE630D0 x2=0000000152817E20 x3=0000000104A66C18
MCL5 stderr x4=000000016BE63020 x5=0000000000000000 x6=0000000109E59570 x7=000000016BE63308
MCL5 stderr x8=000000016BE630D0 x9=006100760061006A x10=0000000000000001 x11=0000000132863468
MCL5 stderr x12=0000000000000072 x13=0000000152817E20 x14=000000037FC83EDC x15=000000016BE636E0
MCL5 stderr x16=000000018D7A2654 x17=00000001FC0FEF28 x18=000000037FC83E50 x19=000000016BE630D0
MCL5 stderr x20=0000000153804370 x21=000000037FC83EB0 x22=0000000104F8CB2A x23=0000000000013100
MCL5 stderr x24=0000000104A87648 x25=0000000154015108 x26=0000000000000000 x27=000000010490DE28
MCL5 stderr x28=0000000104A8630C x29(FP)=000000016BE62FE0 x30(LR)=0000000104A0B9BC x31(SP)=000000016BE62FE0
MCL5 stderr PC=0000000104952F18 SP=000000016BE62FE0
MCL5 stderr v0 000000000000efa8 (f: 61352.000000, d: 3.031192e-319)
MCL5 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL5 stderr v3 3fc9525b1cf456f4 (f: 485775104.000000, d: 1.978258e-01)
MCL5 stderr v4 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL5 stderr v5 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL5 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL5 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL5 stderr v17 3fd545d21d305555 (f: 489706848.000000, d: 3.323865e-01)
MCL5 stderr v18 3f6ef76fbe225866 (f: 3189921792.000000, d: 3.780096e-03)
MCL5 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL5 stderr v20 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v21 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v22 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL5 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
MCL5 stderr Module_base_address=0000000104940000 Symbol=classAndLoaderHashEqualFn
MCL5 stderr Symbol_address=0000000104952F08
MCL5 stderr Target=2_90_20220630_89 (Mac OS X 11.4)
MCL5 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL5 stderr ----------- Stack Backtrace -----------
MCL5 stderr ---------------------------------------
MCL5 stderr Assertion failed at /Users/jenkins/workspace/Build_JDK18_aarch64_mac_Nightly/openj9/runtime/compiler/control/CompilationThread.cpp:4444: isDiagnosticThread()
MCL5 stderr 	A compilation thread has finished a compilation but does not hold VM access
MCL5 stderr 
MCL5 stderr 
MCL5 stderr Unhandled exception in signal handler. Protected function: setupRasCrashInfo (0x0)

pshipton avatar Jun 30 '22 04:06 pshipton

InaccessibleAddress=006100760061006A

It looks like the string "java" stored as 16-bit characters in little-endian order.

The leftKey passed to classAndLoaderHashEqualFn() is corrupted.
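The reading of the fault address as text can be checked with a short sketch. It assumes the value is made of 16-bit code units stored little-endian, and `decode_as_utf16le` is a hypothetical helper name, not something from OpenJ9.

```python
def decode_as_utf16le(value: int, width: int = 8) -> str:
    """Interpret an integer register value as little-endian UTF-16 text."""
    raw = value.to_bytes(width, byteorder="little")
    return raw.decode("utf-16-le")

# InaccessibleAddress (also the value in x0 and x9) from the MCL5 crash above.
fault_address = 0x006100760061006A
print(decode_as_utf16le(fault_address))  # prints "java"
```

x0 holding character data where a pointer is expected is consistent with the key being corrupted.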

knn-k avatar Jun 30 '22 04:06 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/101 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/101/system_test_output.tar.gz

MCL1 12:03:08 >> Total classes loaded = 20001
MCL1 stderr Unhandled exception
MCL1 stderr Type=Bus error vmState=0x00000000
MCL1 stderr J9Generic_Signal_Number=00000028 Signal_Number=0000000a Error_Value=00000000 Signal_Code=00000001
MCL1 stderr Handler1=00000001025F6154 Handler2=00000001023EAE14 InaccessibleAddress=0000000158A04E44
MCL1 stderr Unhandled exception
MCL1 stderr Type=Segmentation error vmState=0x00000000
MCL1 stderr x0=000000000000000E x1=0000000118082900 x2=000000037FB80C38 x3=00000001026FAC20
MCL1 stderr x4=000000016DE8AD68 x5=000000016DE8AD58 x6=000000016DE8AD50 x7=000000016DE8AD48
MCL1 stderr x8=0000000158A04E44 x9=0AEC7F49F7B70093 x10=000000016DE8AD40 x11=0000000000010000
MCL1 stderr x12=000000016DE8AD68 x13=000000016DE8AD40 x14=000000037FB80C9C x15=000000000000002E
MCL1 stderr x16=000000016DE8ADA0 x17=00000001F5FBEF28 x18=000000037FB80C10 x19=0000000118082900
MCL1 stderr x20=00000001180748F0 x21=0000000118009420 x22=000000016DE8AF90 x23=0000000107633750
MCL1 stderr x24=000000016DE8AF90 x25=00000001075B7550 x26=000000016DE8B090 x27=000000016DE8B090
MCL1 stderr x28=0000000118084868 x29(FP)=0000000118009420 x30(LR)=0000000106EEEC34 x31(SP)=000000016DE8ADC0
MCL1 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL1 stderr Handler1=00000001025F6154 Handler2=00000001023EAE14 InaccessibleAddress=9EF9400281F9409E
MCL1 stderr x0=0000000000000001 x1=0000000000000000 x2=000000010FE69230 x3=00000000000000B3
MCL1 stderr x4=000000010FE69C50 x5=0000000000000008 x6=0000000000000000 x7=0000000000000000
MCL1 stderr x8=0000000141167330 x9=0000000158A04E45 x10=9EF9400281F94006 x11=00000000000000FF
MCL1 stderr x12=0000000000000103 x13=000000000000007C x14=0000000000000001 x15=000000000000E823
MCL1 stderr x16=0000000000000651 x17=0000000000000000 x18=0000000000000000 x19=00000001489812F0
MCL1 stderr x20=0000000118077C20 x21=0000000141167330 x22=0000000128814D00 x23=0000000000000001
MCL1 stderr x24=0000000141167338 x25=000000012FF1B988 x26=000000010D848DD8 x27=00000001070A5004
MCL1 stderr x28=0000000000000000 x29(FP)=000000016DE76F40 x30(LR)=00000001068D3D44 x31(SP)=000000016DE76B90
MCL1 stderr PC=00000001068D3DCC SP=000000016DE76B90
MCL1 stderr PC=0000000158A04E44 SP=000000016DE8ADC0
MCL1 stderr v0 0000400000004000 (f: 16384.000000, d: 3.476678e-310)
MCL1 stderr v1 ffffffdfffffffdf (f: 4294967296.000000, d: nan)
MCL1 stderr v2 41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
MCL1 stderr v3 2f676e616c2f6176 (f: 1815044480.000000, d: 2.470161e-80)
MCL1 stderr v4 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL1 stderr v5 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL1 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL1 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v16 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL1 stderr v17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v19 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v20 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v21 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v22 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL1 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v0 0000000000014fa0 (f: 85920.000000, d: 4.245012e-319)
MCL1 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL1 stderr v3 bfa894a0949f9cb3 (f: 2493488384.000000, d: -4.800894e-02)
MCL1 stderr v4 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL1 stderr v5 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL1 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL1 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL1 stderr v17 3fd54e3840f0d555 (f: 1089525120.000000, d: 3.328992e-01)
MCL1 stderr v18 3f5c6e0023f1f85a (f: 603060288.000000, d: 1.735211e-03)
MCL1 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL1 stderr v20 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v21 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v22 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL1 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL1 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL1 stderr Module_base_address=0000000106884000 Symbol=_ZN2TR24CompilationInfoPerThread7requeueEv
MCL1 stderr Symbol_address=00000001068D3D04
MCL1 stderr Target=2_90_20220719_102 (Mac OS X 11.4)
MCL1 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL1 stderr ----------- Stack Backtrace -----------
MCL1 stderr ---------------------------------------
MCL1 stderr 
MCL1 stderr Compiled_method=java/lang/ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
MCL1 stderr Target=2_90_20220719_102 (Mac OS X 11.4)
MCL1 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL1 stderr ----------- Stack Backtrace -----------
MCL1 stderr ---------------------------------------

pshipton avatar Jul 19 '22 13:07 pshipton

The crash above occurred while accessing prev->_next->_priority in TR::CompilationInfo::queueEntry(), which is inlined into TR::CompilationInfoPerThread::requeue() (linked below). prev->_next (= 0x9EF9400281F94006 in x10) is corrupted.

https://github.com/eclipse-openj9/openj9/blob/81d749041b56ad9f91da34c04c1e746a9d51b84b/runtime/compiler/control/CompilationThread.cpp#L5158

x9 seems to hold the value of _methodQueue, but it is broken: 0x158A04E45 is not a valid pointer to a structure.
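A quick sanity check of the two register values can be sketched as below. The check is only a heuristic and rests on assumptions that are not from the dump: that structure pointers are 8-byte aligned and that user-space addresses fit in 47 bits. `looks_like_pointer` is a hypothetical helper.

```python
def looks_like_pointer(value: int, alignment: int = 8, user_bits: int = 47) -> bool:
    """Heuristic only: aligned, and inside an assumed 47-bit user address space."""
    return value % alignment == 0 and value >> user_bits == 0

x9 = 0x0000000158A04E45     # believed to hold _methodQueue
x10 = 0x9EF9400281F94006    # prev->_next
fault = 0x9EF9400281F9409E  # InaccessibleAddress of the segmentation error
x19 = 0x00000001489812F0    # another register from the same dump, for comparison

print(looks_like_pointer(x9))   # False: the address is odd
print(looks_like_pointer(x10))  # False: misaligned and high bits set
print(looks_like_pointer(x19))  # True
print(hex(fault - x10))         # 0x98
```

The fault address is x10 + 0x98, which would be consistent with a field load through x10. That the field is _priority is the reading given above, not something this check confirms.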

knn-k avatar Jul 20 '22 09:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/102 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/102/system_test_output.tar.gz

MCL5 12:50:50 >> Total classes loaded = 20001
MCL5 stderr Unhandled exception
MCL5 stderr Type=Segmentation error vmState=0x00000000
MCL5 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL5 stderr Handler1=00000001044520F4 Handler2=000000010881EE14 InaccessibleAddress=000000010F7FFFF8
MCL5 stderr x0=000000010F809D00 x1=000000010F809D00 x2=000000037F7F7FE8 x3=0000000104556C4B
MCL5 stderr x4=000000016BD52A28 x5=000000016BD52A18 x6=000000016BD52A10 x7=000000016BD52A08
MCL5 stderr x8=0000000150A07E3D x9=000000010F800020 x10=000000010F800020 x11=0000000000000000
MCL5 stderr x12=000000016BD52A28 x13=000000016BD52A00 x14=000000037F7F806C x15=000000016BD52DE0
MCL5 stderr x16=000000016BD52A60 x17=00000001FD112F28 x18=000000037F7F7FC0 x19=000000010F809D00
MCL5 stderr x20=000000010F800020 x21=000000011D808220 x22=000000016BD52C50 x23=00000001096C3750
MCL5 stderr x24=000000016BD52C50 x25=0000000109647550 x26=000000016BD52D50 x27=000000016BD52D50
MCL5 stderr x28=000000010F80BC68 x29(FP)=000000016BD52A70 x30(LR)=0000000108F7B818 x31(SP)=000000016BD52A60
MCL5 stderr PC=0000000108F67074 SP=000000016BD52A60
MCL5 stderr v0 000000003f800000 (f: 1065353216.000000, d: 5.263544e-315)
MCL5 stderr v1 000000003f800000 (f: 1065353216.000000, d: 5.263544e-315)
MCL5 stderr v2 0000000048375e80 (f: 1211588224.000000, d: 5.986041e-315)
MCL5 stderr v3 3fd3a64afd694986 (f: 4251535872.000000, d: 3.070247e-01)
MCL5 stderr v4 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL5 stderr v5 0000000100000000 (f: 0.000000, d: 2.121996e-314)
MCL5 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL5 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL5 stderr v17 3fd54f79c74d6555 (f: 3343738112.000000, d: 3.329758e-01)
MCL5 stderr v18 3f5769ef305de14e (f: 811458880.000000, d: 1.429065e-03)
MCL5 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL5 stderr v20 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v21 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v22 00000000ffffffff (f: 4294967296.000000, d: 2.121996e-314)
MCL5 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL5 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL5 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL5 stderr Module_base_address=0000000108914000 Symbol=old_slow_jitHandleInternalErrorTrap
MCL5 stderr Symbol_address=0000000108F67058
MCL5 stderr Target=2_90_20220720_103 (Mac OS X 11.4)
MCL5 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL5 stderr ----------- Stack Backtrace -----------
MCL5 stderr ---------------------------------------

pshipton avatar Jul 20 '22 15:07 pshipton

The failure above with Symbol=old_slow_jitHandleInternalErrorTrap looks the same as #15518. The code tries to access the address [x10, #-40], which crosses a page boundary into a page that has no memory allocated.

> info mmap 0x010F800020
Start Address           End Address             Size                    Size                            Read/Write/Execute
0x000000010f800000      0x000000010fffffff      0x0000000000800000      (8,388,608)
Name:   Image section @ 10f800000 (8388608 bytes)

> info mmap 0x010F7FFFF8
Start Address           End Address             Size                    Size                            Read/Write/Execute
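The address arithmetic can be reproduced with a small sketch, using the x10 value from the register dump and the region bounds from the first `info mmap` output. The function names are hypothetical.

```python
def effective_address(base: int, offset: int) -> int:
    """Address computed by a base-plus-immediate access such as [x10, #-40]."""
    return base + offset

def in_region(address: int, start: int, end: int) -> bool:
    """True if address lies within the inclusive range [start, end]."""
    return start <= address <= end

x10 = 0x000000010F800020
region_start = 0x000000010F800000  # from "info mmap 0x010F800020"
region_end = 0x000000010FFFFFFF

target = effective_address(x10, -40)
print(hex(target))                                  # 0x10f7ffff8
print(in_region(target, region_start, region_end))  # False
print(region_start - target)                        # 8 bytes below the region
```

The computed address matches InaccessibleAddress=000000010F7FFFF8 in the crash output and lies 8 bytes below the start of the mapped region.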

knn-k avatar Jul 21 '22 01:07 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/105 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/105/system_test_output.tar.gz

MCL4 12:17:19 >> Total classes loaded = 20001
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x00000000
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=000000010317E0F8 Handler2=0000000103332E14 InaccessibleAddress=0000000118FFFFF0
MCL4 stderr x0=0000000139085B00 x1=0000000280217DC0 x2=000000037F8676B0 x3=0000000103282C4F
MCL4 stderr x4=000000016D046720 x5=0000000000000000 x6=000000010865D580 x7=000000016D046A08
MCL4 stderr x8=0000000118FFFFF0 x9=0000000000010000 x10=0000000000007700 x11=000000000000002E
MCL4 stderr x12=000000037F867716 x13=0000000000000072 x14=000000037F867714 x15=000000000000002E
MCL4 stderr x16=000000018D7A22A0 x17=00000001FC0FEF28 x18=000000037F8675B8 x19=0000000139085B00
MCL4 stderr x20=00000001032A3660 x21=000000037F8676B0 x22=0000000280217DC0 x23=00000001032A230C
MCL4 stderr x24=0000000119000018 x25=000000016D046E40 x26=00000000000007E0 x27=0000000000056880
MCL4 stderr x28=000000013B009468 x29(FP)=000000016D046710 x30(LR)=000000010316B220 x31(SP)=000000016D046640
MCL4 stderr PC=0000000103161160 SP=000000016D046640
MCL4 stderr v0 000000000000efa8 (f: 61352.000000, d: 3.031192e-319)
MCL4 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL4 stderr v3 3fba92691a4adde5 (f: 441114080.000000, d: 1.037965e-01)
MCL4 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v7 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL4 stderr v17 3fd54be54cdf1555 (f: 1289688448.000000, d: 3.327573e-01)
MCL4 stderr v18 3f62da8201f2bffc (f: 32686076.000000, d: 2.301458e-03)
MCL4 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL4 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
MCL4 stderr Module_base_address=000000010315C000 Symbol=sendLoadClass
MCL4 stderr Symbol_address=0000000103161018
MCL4 stderr Target=2_90_20220723_106 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------

pshipton avatar Jul 25 '22 21:07 pshipton

Not sure what happened here; I guess it hung on exit. https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/110 SharedClasses.SCM23.MultiCL_0

There are no diagnostic files produced (core, javacore, etc.) although the JVM was signalled to produce them. https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/110/system_test_output.tar.gz

STF 12:10:37.360 - **FAILED** Process MCL4 has timed out
STF 12:10:37.360 - Collecting dumps for: MCL4

6.MCL4.stdout

10:13:33 >> Loaded 20000 classes...
10:13:33 >> Total classes loaded = 20001

pshipton avatar Aug 03 '22 16:08 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/110 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/110/system_test_output.tar.gz

MCL1 11:47:36 >> Total classes loaded = 20001
STF 11:47:36.559 - Found dump at: /Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16591440125030/SharedClasses.SCM23.MultiCL_0/20220730-114407-SharedClasses/results/core.20220730.114736.83183.0001.dmp
MCL3 11:47:36 >> Loaded 16000 classes...
MCL1 stderr Unhandled exception
MCL1 stderr Type=Segmentation error vmState=0x00000000
MCL1 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL1 stderr Handler1=0000000100D8A0F8 Handler2=0000000100F3EE14 InaccessibleAddress=0000000000000038
MCL1 stderr x0=0000000000000001 x1=0000000138349F00 x2=0000000000000010 x3=0000000000000010
MCL1 stderr x4=0000000000000000 x5=000000016F9DBF58 x6=000000016F9DBF50 x7=000000016F9DBF48
MCL1 stderr x8=0000000000000010 x9=0000000000000000 x10=0000000000000010 x11=0000000000000000
MCL1 stderr x12=0000000000000053 x13=000000016F9DBF40 x14=000000037FE6CE3C x15=000000016F9DC320
MCL1 stderr x16=0000000192883F60 x17=0000000201192FF8 x18=000000037FE6CDB0 x19=000000016F9DBA78
MCL1 stderr x20=000000013701CE68 x21=0000000138349F00 x22=0000000000000010 x23=000000037FE6CF30
MCL1 stderr x24=0000000000000000 x25=0000000000000000 x26=0000000000000000 x27=000000037FE6CE40
MCL1 stderr x28=000000013701CE68 x29(FP)=000000016F9DBA20 x30(LR)=0000000105EAC204 x31(SP)=000000016F9DB9D0
MCL1 stderr PC=0000000105EAC21C SP=000000016F9DB9D0
MCL1 stderr v0 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL1 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL1 stderr v3 bfc95a5a5cf7013f (f: 1559691520.000000, d: -1.980699e-01)
MCL1 stderr v4 0000000000000003 (f: 3.000000, d: 1.482197e-323)
MCL1 stderr v5 0000000000000002 (f: 2.000000, d: 9.881313e-324)
MCL1 stderr v6 0000080000000800 (f: 2048.000000, d: 4.345847e-311)
MCL1 stderr v7 000000000000000d (f: 13.000000, d: 6.422853e-323)
MCL1 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL1 stderr v17 3fd5436406515555 (f: 105993560.000000, d: 3.322382e-01)
MCL1 stderr v18 3f71e74703808c0f (f: 58756112.000000, d: 4.370954e-03)
MCL1 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL1 stderr v20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL1 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL1 stderr v23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL1 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9gc29.dylib
MCL1 stderr Module_base_address=0000000105DE0000 Symbol=_ZN33MM_IndexableObjectAllocationModelC2EP18MM_EnvironmentBaseP7J9Classjm
MCL1 stderr Symbol_address=0000000105EAC0F0
MCL1 stderr Target=2_90_20220730_111 (Mac OS X 11.4)
MCL1 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL1 stderr ----------- Stack Backtrace -----------
MCL1 stderr ---------------------------------------

pshipton avatar Aug 03 '22 17:08 pshipton

The failing instruction in the crash above loads data from [x9, #56], and x9 is 0x0, so the access hits address 0x38 (the reported InaccessibleAddress). It is in the constructor of MM_IndexableObjectAllocationModel.

   cc200: d7 2c ff 97   bl      0x9755c <__ZN22GC_ArrayletObjectModel17getArrayletLayoutEP7J9Classmm>
   cc204: 60 aa 00 b9   str     w0, [x19, #168] <- LR points here
   cc208: 7f b2 02 39   strb    wzr, [x19, #172]
   cc20c: 88 12 40 f9   ldr     x8, [x20, #32] // x20 is clazz
   cc210: 09 31 40 f9   ldr     x9, [x8, #96]
   cc214: 68 52 40 f9   ldr     x8, [x19, #160] // x19 is env
   cc218: 29 7d 40 f9   ldr     x9, [x9, #248]
   cc21c: 2a 1d 40 f9   ldr     x10, [x9, #56] <- PC points here
   cc220: 5f 05 00 b1   cmn     x10, #1
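
As a cross-check, the instruction offsets in the disassembly can be recovered from the register dump: PC and LR minus Module_base_address should give the offsets marked above, and x9 + 56 should equal InaccessibleAddress. A minimal Python sketch using only values copied from the MCL1 dump (the variable names are mine, not from any tool):

```python
# Values copied from the MCL1 register dump above.
module_base = 0x0000000105DE0000   # Module_base_address of libj9gc29.dylib
pc = 0x0000000105EAC21C            # PC
lr = 0x0000000105EAC204            # x30(LR)
x9 = 0x0                           # x9 at the time of the fault
inaccessible = 0x0000000000000038  # InaccessibleAddress

pc_offset = pc - module_base       # offset of the faulting ldr in the module
lr_offset = lr - module_base       # offset of the instruction after the bl
fault_addr = x9 + 56               # effective address of ldr x10, [x9, #56]

print(hex(pc_offset))              # 0xcc21c
print(hex(lr_offset))              # 0xcc204
print(fault_addr == inaccessible)  # True
```

Both offsets line up with the "PC points here" and "LR points here" markers, which suggests the disassembly and the dump come from the same build of the library.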

knn-k avatar Aug 04 '22 00:08 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/113 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/113/system_test_output.tar.gz

MCL2 12:06:47 >> Total classes loaded = 20001
MCL2 stderr Unhandled exception
MCL2 stderr Type=Segmentation error vmState=0x00000000
MCL2 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL2 stderr Handler1=00000001048E70D4 Handler2=0000000104A97F24 InaccessibleAddress=000000010F7FFFF8
MCL2 stderr x0=000000010F809D00 x1=000000016BD50030 x2=000000037FCC9AE0 x3=00000001049E6C03
MCL2 stderr x4=000000016BD4FFE0 x5=000000016BD4FFD8 x6=000000016BD4FFE8 x7=0000000000000000
MCL2 stderr x8=0000000140A0C245 x9=000000010F800020 x10=000000010F800020 x11=0000000000000000
MCL2 stderr x12=000000016BD4FFF8 x13=000000016BD4FFF8 x14=000000037FCC9B64 x15=000000016BD503B0
MCL2 stderr x16=000000010492C3B4 x17=000000016BD4FFE0 x18=000000016BD50200 x19=000000010F809D00
MCL2 stderr x20=000000010F800020 x21=0000000109CF3284 x22=000000016BD50300 x23=000000016BD50300
MCL2 stderr x24=000000010F80BC68 x25=000000011B813220 x26=0000000000000000 x27=0000000000000000
MCL2 stderr x28=000000011B813220 x29(FP)=000000016BD50040 x30(LR)=000000010962AA7C x31(SP)=000000016BD50030
MCL2 stderr PC=0000000109616380 SP=000000016BD50030
MCL2 stderr v0 000000003f800000 (f: 1065353216.000000, d: 5.263544e-315)
MCL2 stderr v1 000000003f800000 (f: 1065353216.000000, d: 5.263544e-315)
MCL2 stderr v2 00000000484a4a40 (f: 1212828160.000000, d: 5.992168e-315)
MCL2 stderr v3 bfc117918227db7c (f: 2183650048.000000, d: -1.335318e-01)
MCL2 stderr v4 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL2 stderr v5 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL2 stderr v6 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL2 stderr v7 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL2 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL2 stderr v17 3fd54fc2ac0f5555 (f: 2886685952.000000, d: 3.329932e-01)
MCL2 stderr v18 3f5646c42a412521 (f: 708912448.000000, d: 1.359645e-03)
MCL2 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL2 stderr v20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v21 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v22 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL2 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL2 stderr Module_base_address=0000000109000000 Symbol=old_slow_jitHandleInternalErrorTrap
MCL2 stderr Symbol_address=0000000109616364
MCL2 stderr Target=2_90_20220804_114 (Mac OS X 11.4)
MCL2 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL2 stderr ----------- Stack Backtrace -----------
MCL2 stderr ---------------------------------------

pshipton avatar Aug 04 '22 16:08 pshipton

It looks the same as #15518 again. Accessing [x10, #-40] crosses the page boundary into an address where no memory is mapped, which results in the segmentation fault.

InaccessibleAddress=000000010F7FFFF8 x10=000000010F800020

> info mmap 0x010F800000
Start Address           End Address             Size                    Size                            Read/Write/Execute
0x000000010f800000      0x000000010fffffff      0x0000000000800000      (8,388,608)
Name:   Image section @ 10f800000 (8388608 bytes)
> !whatis  0x10f800020
Found 0x000000010F800020 as !udata 0x10f800020: !j9javavm 0x12d012220->j9ras->crashInfo->failingThread->sp
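
The arithmetic behind the page-boundary observation can be sketched as follows. This is only an illustration using the values from the MCL2 dump and the "info mmap" output above; the variable names are made up, and it checks the address against that one reported region only.

```python
# Values copied from the MCL2 register dump and the jdmpview output above.
x10 = 0x000000010F800020
inaccessible = 0x000000010F7FFFF8   # InaccessibleAddress
region_start = 0x000000010F800000   # Start Address from "info mmap"
region_end = 0x000000010FFFFFFF     # End Address from "info mmap" (inclusive)

fault_addr = x10 - 40               # effective address of [x10, #-40]
in_region = region_start <= fault_addr <= region_end
bytes_below = region_start - fault_addr

print(hex(fault_addr))              # 0x10f7ffff8
print(fault_addr == inaccessible)   # True
print(in_region, bytes_below)       # False 8
```

So the access lands 8 bytes below the start of the 8 MB section that x10 points into.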

knn-k avatar Aug 05 '22 03:08 knn-k

I got the exception "Invalid JIT return address" by running the "!stack" command against JIT Compilation Thread-004, as shown below.

> !threads
        !stack 0x10f809d00      !j9vmthread 0x10f809d00 !j9thread 0x12b816050   tid 0xc91f3f7 (210891767) // (JIT Compilation Thread-004)
        !stack 0x11901dd00      !j9vmthread 0x11901dd00 !j9thread 0x11901b050   tid 0xc91f3fd (210891773) // (JIT Diagnostic Compilation Thread-007 Suspended)
        !stack 0x12d020500      !j9vmthread 0x12d020500 !j9thread 0x11901b558   tid 0xc91f418 (210891800) // (JIT-SamplerThread)
        !stack 0x12b818d00      !j9vmthread 0x12b818d00 !j9thread 0x11901ba60   tid 0xc91f419 (210891801) // (IProfiler)
        !stack 0x119048700      !j9vmthread 0x119048700 !j9thread 0x11903dc50   tid 0xc91f43d (210891837) // (Common-Cleaner)
        !stack 0x12d060100      !j9vmthread 0x12d060100 !j9thread 0x12d00b650   tid 0xc91f399 (210891673) // (DestroyJavaVM helper thread)
> !stack 0x10f809d00
<10f809d00>                             JNI call-in frame
<10f809d00>                             known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT

<10f809d00>     !j9method 0x000000017001F788   java/lang/ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
Aug 05, 2022 5:41:17 PM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x000000010F809D00
com.ibm.j9ddr.AddressedCorruptDataException: Invalid JIT return address
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkStackFrames(JITStackWalker.java:287)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker.jitWalkStackFrames(JITStackWalker.java:101)
        (... snip ....)
        at openj9.dtfjview/com.ibm.jvm.dtfjview.DTFJView.main(DTFJView.java:46)

Stack walk result: STACK_CORRUPT

It could be related to Issue #14717.

The JVM was shutting down when the crash occurred, as indicated by the "DestroyJavaVM helper thread" in the "!threads" output.

knn-k avatar Aug 05 '22 08:08 knn-k

https://openj9-jenkins.osuosl.org/job/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/114 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/114/system_test_output.tar.gz

MCL4 12:38:35 >> Total classes loaded = 20001
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x00000000
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=000000010088A0A8 Handler2=0000000100A52E14 InaccessibleAddress=0000000000000008
MCL4 stderr x0=0000000000000000 x1=000000012D972D00 x2=000000016FBD4BB0 x3=0000000000000000
MCL4 stderr x4=0000000100A4BBF0 x5=0000000000000000 x6=0000000000000000 x7=0000000000000005
MCL4 stderr x8=0000000000000002 x9=00000000000F4240 x10=0000000000072ABA x11=000000B2F4B482E2
MCL4 stderr x12=00000000016E3600 x13=0000000000045237 x14=0000000000004693 x15=000000000000E5F4
MCL4 stderr x16=0000000192802C8C x17=000000020118D648 x18=0000000100B95181 x19=000000012D972D00
MCL4 stderr x20=000000012D877C20 x21=000000012D973598 x22=0000000000000000 x23=000000014F014598
MCL4 stderr x24=0000000000000004 x25=0000000100845E28 x26=0000000000000000 x27=0005E5755CDCEB7A
MCL4 stderr x28=000000014E00B3E8 x29(FP)=000000016FBD4B80 x30(LR)=0000000105064D10 x31(SP)=000000016FBD4A10
MCL4 stderr PC=00000001050E4FC4 SP=000000016FBD4A10
MCL4 stderr v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v1 000fffff000fffff (f: 1048575.000000, d: 2.225072e-308)
MCL4 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL4 stderr v3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v7 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v18 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v19 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL4 stderr Module_base_address=0000000105000000 Symbol=_ZN2J97Monitor5enterEv
MCL4 stderr Symbol_address=00000001050E4FC4
MCL4 stderr Target=2_90_20220805_115 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------

pshipton avatar Aug 05 '22 15:08 pshipton

There were failures with J9::Monitor::exit() before, but this is the first case with J9::Monitor::enter(). The "this" object (x0) is NULL.

Call path: initThreadAfterCreation() -> TR::CompilationInfo::acquireCompMonitor() -> J9::Monitor::enter().

x20 points to the compInfo in initThreadAfterCreation(). _compilationMonitor in the compInfo ([x20, #136]) is NULL, which causes the segmentation fault.
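
For anyone who wants to inspect the slot in the core file, the addresses involved can be computed from the MCL4 dump. A small sketch with hypothetical variable names; the only inputs are x0, x20, InaccessibleAddress and the #136 offset quoted above. That the fault is a read at offset 8 from the NULL "this" is my reading of InaccessibleAddress, not something the dump states directly.

```python
# Values copied from the MCL4 register dump above.
x0 = 0x0                            # "this" passed to J9::Monitor::enter()
x20 = 0x000000012D877C20            # compInfo in initThreadAfterCreation()
inaccessible = 0x0000000000000008   # InaccessibleAddress

monitor_slot = x20 + 136            # address of _compilationMonitor in compInfo
this_offset = inaccessible - x0     # offset from "this" that was dereferenced

print(hex(monitor_slot))            # 0x12d877ca8
print(this_offset)                  # 8
```

Reading the pointer stored at monitor_slot in jdmpview should show the NULL value that was loaded into x0.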

knn-k avatar Aug 08 '22 06:08 knn-k

Running the jdmpview "!stack" command on "JIT Compilation Thread-000" gives the following output, which is similar to last week's result for JIT Compilation Thread-004.

> !stack 0x12d882900
<12d882900>                             JNI call-in frame
<12d882900>                             known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT

<12d882900>     !j9method 0x000000038801F788   java/lang/ClassLoader.loadClass(Ljava/lang/String;)Ljava/lang/Class;
Aug 08, 2022 4:21:36 PM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x000000012D882900
com.ibm.j9ddr.AddressedCorruptDataException: Invalid JIT return address
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkStackFrames(JITStackWalker.java:287)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker.jitWalkStackFrames(JITStackWalker.java:101)
        (... snip ...)
        at openj9.dtfjview/com.ibm.jvm.dtfjview.DTFJView.main(DTFJView.java:46)

Stack walk result: STACK_CORRUPT

knn-k avatar Aug 08 '22 07:08 knn-k

I see the following incomplete output in the javacore file for the Test_openjdk18_j9_extended.system_aarch64_mac_Nightly_testList_0/114 job above. The thread is a JIT Compilation Thread, and its Java callstack shows recursive calls to ClassLoader.loadClass(), as was also observed in #15518.

3XMTHREADINFO      "JIT Compilation Thread-000" J9VMThread:0x000000012D882900, omrthread_t:0x000000014F00C060, java/lang/Thread:0x0000000280220CD0, state:R, prio=10
3XMJAVALTHREAD            (java/lang/Thread getId:0x3, isDaemon:true)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x000000028021F458)
3XMTHREADINFO1            (native thread ID:0x86F8238, native priority:0xB, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000060)
3XMTHREADINFO2            (native stack address range from:0x000000016F8BC000, to:0x000000016F9BF000, size:0x103000)
3XMCPUTIME               CPU usage total: 28.381319000 secs, current category="JIT"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=83128 (0x144B8)
1INTERNAL                    Unable to obtain lock context information
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.loadClass(ClassLoader.java(Compiled Code))
4XESTACKTRACE                at java/lang/ClassLoader.
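
To spot this pattern in other javacore files without reading them by eye, something like the following could be used. It is a rough sketch, not an existing tool: the function name, the regular expression and the threshold of 10 are all my own choices, and it only looks at the 4XESTACKTRACE lines in the form shown above.

```python
import re
from itertools import groupby

# Matches "4XESTACKTRACE   at <class>.<method>(" and captures <class>.<method>.
FRAME_RE = re.compile(r"^4XESTACKTRACE\s+at\s+([^\s(]+)\(")

def repeated_frames(javacore_text, threshold=10):
    """Return (frame, count) for each run of identical consecutive frames
    that is at least `threshold` long. Truncated lines are skipped."""
    frames = []
    for line in javacore_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(match.group(1))
    runs = [(name, sum(1 for _ in group)) for name, group in groupby(frames)]
    return [(name, count) for name, count in runs if count >= threshold]

# Sample input mirroring the excerpt above: 18 complete frames, 1 truncated line.
frame = ("4XESTACKTRACE                at java/lang/ClassLoader.loadClass"
         "(ClassLoader.java(Compiled Code))")
sample = "\n".join([frame] * 18 + ["4XESTACKTRACE                at java/lang/ClassLoader."])

print(repeated_frames(sample))  # [('java/lang/ClassLoader.loadClass', 18)]
```

The truncated last line does not match the pattern, so it is not counted.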

knn-k avatar Sep 09 '22 05:09 knn-k

Tentatively tagging as a blocker for amac being removed from EA.

pshipton avatar Sep 13 '22 14:09 pshipton

PR #15907 should fix this issue.

knn-k avatar Sep 16 '22 07:09 knn-k