openj9
openj9 copied to clipboard
IBM Semeru 18.0.1 crashes in libj9gc29.so
Java -version output
openjdk version "18.0.1" 2022-04-19
IBM Semeru Runtime Open Edition 18.0.1.0 (build 18.0.1+10)
Eclipse OpenJ9 VM 18.0.1.0 (build openj9-0.32.0, JRE 18 Linux amd64-64-Bit Compressed References 20220422_62 (JIT enabled, AOT enabled)
OpenJ9 - 9a84ec34e
OMR - ab24b6666
JCL - b577b89345e based on jdk-18.0.1+10)
Summary of problem
We observed crashes in some of our test applications when executed with IBM Semeru 18.0.1. Those tests are running together with our JVMTI agent.
I observed the following message twice so far:
Unhandled exception
Type=Segmentation error vmState=0x00000000
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F4FB0AF3290 Handler2=00007F4FB0851150 InaccessibleAddress=0000000000000000
RDI=00007F4FAC05FB10 RSI=0000000000000007 RAX=0000000000000000 RBX=00000000F00D1E90
RCX=0000000000000000 RDX=0000000099669966 R8=00007F4F2FFFE5F0 R9=00007F4F2FFFE5F0
R10=0000000000000006 R11=0000000000000000 R12=00007F4F2FFFE200 R13=00007F4F2FFFE030
R14=00007F4FAA051340 R15=00007F4E74002C48
RIP=00007F4FA9D23BBD GS=0000 FS=0000 RSP=00007F4F2FFFE030
EFlags=0000000000010246 CS=0033 RBP=00007F4F2FFFE440 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000001000 CR2=0000000000000000
xmm0 0000003000000020 (f: 32.000000, d: 1.018558e-312)
xmm1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm3 000000004153c21e (f: 1096008192.000000, d: 5.415000e-315)
xmm4 000000003eccccd5 (f: 1053609152.000000, d: 5.205521e-315)
xmm5 0000000039347648 (f: 959739456.000000, d: 4.741743e-315)
xmm6 000000004141ecbc (f: 1094839424.000000, d: 5.409226e-315)
xmm7 000000003d884dc2 (f: 1032342976.000000, d: 5.100452e-315)
xmm8 242d643a00080e00 (f: 527872.000000, d: 2.021873e-134)
xmm9 2a3d032500180900 (f: 1575168.000000, d: 3.162448e-105)
xmm10 2eec225f00110700 (f: 1115904.000000, d: 1.158582e-82)
xmm11 3ada3f4600190500 (f: 1639680.000000, d: 3.392366e-25)
xmm12 3b3b0af000100700 (f: 1050368.000000, d: 2.236922e-23)
xmm13 491e6e48001a0500 (f: 1705216.000000, d: 1.696573e+44)
xmm14 4dff21d5000c0500 (f: 787712.000000, d: 5.245758e+67)
xmm15 4ff688cb000e0d00 (f: 920832.000000, d: 1.630812e+77)
Module=/home/jenkins/agent/oa-master/agent/test/java/build/tmp/caches/jdk-ibm-semeru_64-18.0.1.10.0/lib/default/libj9gc29.so
Module_base_address=00007F4FA9B8A000
Target=2_90_20220422_62 (Linux 5.11.0-36-generic)
CPU=amd64 (40 logical CPUs) (0x3ef38ec000 RAM)
----------- Stack Backtrace -----------
_Z27concurrentStackSlotIteratorP8J9JavaVMPP8J9ObjectPvP16J9StackWalkStatePKv+0x39d (0x00007F4FA9D23BBD [libj9gc29.so+0x199bbd])
jitWalkFrame+0x16b (0x00007F4FAA9A4D2B [libj9jit29.so+0x951d2b])
jitWalkStackFrames+0xddc (0x00007F4FAA9A611C [libj9jit29.so+0x95311c])
walkStackFrames+0xb3 (0x00007F4FB0B33483 [libj9vm29.so+0x80483])
_ZN28GC_VMThreadStackSlotIterator9scanSlotsEP10J9VMThreadS1_PvPFvP8J9JavaVMPP8J9ObjectS2_P16J9StackWalkStatePKvEbb+0x76 (0x00007F4FA9BCC8A6 [libj9gc29.so+0x428a6])
_ZN28MM_ConcurrentMarkingDelegate15scanThreadRootsEP18MM_EnvironmentBase+0x1ac (0x00007F4FA9D241BC [libj9gc29.so+0x19a1bc])
_ZN15MM_ConcurrentGC10scanThreadEP18MM_EnvironmentBase+0x61 (0x00007F4FA9CE6301 [libj9gc29.so+0x15c301])
memorySubSpaceAsyncCallbackHandler+0x6d (0x00007F4FA9CB325D [libj9gc29.so+0x12925d])
dispatchAsyncEvents+0xc7 (0x00007F4FB0AF4887 [libj9vm29.so+0x41887])
javaCheckAsyncMessages+0xa8 (0x00007F4FB0ACADE8 [libj9vm29.so+0x17de8])
old_slow_jitCheckAsyncMessages+0x85 (0x00007F4FAA991475 [libj9jit29.so+0x93e475])
(0x00007F4FAA9A89F6 [libj9jit29.so+0x9559f6])
The small diagnostic files for those 2 crashes are attached. If necessary I may be able to provide core dumps as well.
A third crash did not create any diagnostic files and only printed the following:
Thread 000000000038AF00 structures scan: slot 00007F73941A8CF8 has bad value FFFFFFFFFFFFFFFF, iterator state 2
java: ../../../../../../openj9/runtime/gc_glue_java/ScavengerRootScanner.hpp:123: virtual void MM_ScavengerRootScanner
Not sure if this is related or a separate issue.
Diagnostic files
@quijote Would you please provide system core(s)?
All crashes occurred because of bad O-slot in java thread stack frames. For first two crashes we know there are JIT frames.
I can triage this problem and provide more details about it. Adding JIT component label. @0xdaryl FYI
I sent you the core dumps by email now (@ca.ibm.com). Let me know if this did not work.
I have received two cores uploaded to my /team/Dmitri/15017/ folder. Both of the cores have exactly the same problematic pattern - O-slot in JIT Frame for com/dynatrace/agent/casp/thirdparty/ComponentClassifier.isIgnoredClass(Lcom/dynatrace/agent/casp/thirdparty/wrapper/ClassWrapper;)Z points inside (the same) object in Tenure. It is possible (speculation) wrong offset is a result of disregard of expanded header size for Indexable object (and in this case an intention was to point to next object).
For core.20220505.093317.70573.0001.dmp:
Failing Thread: !j9vmthread 0x396c00
Failing Thread ID: 0x11403 (70659)
gpInfo:
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F4FB0AF3290 Handler2=00007F4FB0851150 InaccessibleAddress=0000000000000000
RDI=00007F4FAC05FB10 RSI=0000000000000007 RAX=0000000000000000 RBX=00000000F00D1E90
RCX=0000000000000000 RDX=0000000099669966 R8=00007F4F2FFFE5F0 R9=00007F4F2FFFE5F0
R10=0000000000000006 R11=0000000000000000 R12=00007F4F2FFFE200 R13=00007F4F2FFFE030
R14=00007F4FAA051340 R15=00007F4E74002C48
RIP=00007F4FA9D23BBD GS=0000 FS=0000 RSP=00007F4F2FFFE030
EFlags=0000000000010246 CS=0033 RBP=00007F4F2FFFE440 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000001000 CR2=0000000000000000
Module=/home/jenkins/agent/oa-master/agent/test/java/build/tmp/caches/jdk-ibm-semeru_64-18.0.1.10.0/lib/default/libj9gc29.so
Module_base_address=00007F4FA9B8A000
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
| region | start | end | subspace | flags | size | region type
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
00007f4fac1bd9c0 00000000f0000000 00000000f0700000 00007f4fac1a9840 00000009 700000 ADDRESS_ORDERED
00007f4fac1bc1f0 00000000ffda0000 00000000fff00000 00007f4fac1b54c0 0000000a 160000 ADDRESS_ORDERED
00007f4fac1baa20 00000000fff00000 0000000100000000 00007f4fac1b0260 0000000a 100000 ADDRESS_ORDERED
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
Checking THREAD STACKS... <gc check (1): from debugger: THREAD STACKS: slot 396c00(7f4f2fffe5f0) -> f00d1e90: class pointer is null>
0xF00D1E80 : 00068500 00000002 f00d1e90 00000000 [ ................ ]
0xF00D1E90 : 0000003e 00000000 0006d900 1e01a3d6 [ >............... ] <-------
0xF00D1EA0 : 00000000 00000001 ba6fb858 00000000 [ ........X.o..... ]
0xF00D1EB0 : 00068500 0000000c f00d1ec0 00000000 [ ................ ]
0xF00D1EC0 : 0061006a 00610076 002e0078 00000000 [ j.a.v.a.x....... ]
0xf00d1e90 pointed inside object
> !j9object 0xF00D1E80
!J9IndexableObject 0x00000000F00D1E80 {
struct J9Class* clazz = !j9arrayclass 0x68500 // [B
Object flags = 0x00000000;
U_32 size = 0x00000002;
[0] = 62, 0x3e
[1] = 0, 0x00
}
next object:
> !j9object 0xF00D1E98
J9VMJavaLangString at 0x00000000F00D1E98 {
struct J9Class* clazz = !j9class 0x6D900 // java/lang/String
Object flags = 0x00000000;
[B value = !fj9object 0x1e01a3d6 (offset = 0) (java/lang/String)
B coder = 0x00000001 (offset = 8) (java/lang/String)
I hash = 0xBA6FB858 (offset = 12) (java/lang/String)
Z hashIsZero = 0x00000000 (offset = 16) (java/lang/String)
Lcom/dynatrace/agent/introspection/casp/flawfinder/store/TaintInfo; dynatraceTaintInfo = !fj9object 0x0 (offset = 4) (java/lang/String)
"javax."
}
Problematic JIT frame:
<396c00> JIT frame: bp = 0x00000000003BA3B8, pc = 0x00007F4F9042D043, unwindSP = 0x00000000003BA340, cp = 0x000000000037EDC0, arg0EA = 0x00000000003BA3C8, jitInfo = 0x00007F4F8805F1A8
<396c00> Method: com/dynatrace/agent/casp/thirdparty/ComponentClassifier.isIgnoredClass(Lcom/dynatrace/agent/casp/thirdparty/wrapper/ClassWrapper;)Z !j9method 0x000000000037FBE8
<396c00> Bytecode index = 11, inlineDepth = 0, PC offset = 0x00000000000020C3
<396c00> stackMap=0x00007F4F8805FF74, slots=I16(0x0002) parmBaseOffset=I16(0x0008), parmSlots=U16(0x0002), localBaseOffset=I16(0xFFC8)
<396c00> Described JIT args starting at 0x00000000003BA3C0 for U16(0x0002) slots
<396c00> I-Slot: : a1[0x00000000003BA3C0] = 0x00000000F061D940
<396c00> O-Slot: : a0[0x00000000003BA3C8] = 0x00000000F013B6B8
<396c00> Described JIT temps starting at 0x00000000003BA380 for IDATA(0x0000000000000007) slots
<396c00> I-Slot: : t6[0x00000000003BA380] = 0x0000000000000000
<396c00> I-Slot: : t5[0x00000000003BA388] = 0x00000000FFFB6178
<396c00> I-Slot: : t4[0x00000000003BA390] = 0x0000000000300000
<396c00> I-Slot: : t3[0x00000000003BA398] = 0x00000000F013B6B8
<396c00> I-Slot: : t2[0x00000000003BA3A0] = 0x0000000010000000
<396c00> I-Slot: : t1[0x00000000003BA3A8] = 0x00007F4FAA9B2920
<396c00> I-Slot: : t0[0x00000000003BA3B0] = 0x00007F4F903FF105
<396c00> JIT-RegisterMap = UDATA(0x0000000000001027)
<396c00> JIT-RegisterMap-O-Slot[0x00007F4F2FFFE5E0] = 0x00000000F013B6B8 (jit_rax)
<396c00> JIT-RegisterMap-O-Slot[0x00007F4F2FFFE5E8] = 0x00000000FFDA5A00 (jit_rbx)
<396c00> JIT-RegisterMap-O-Slot[0x00007F4F2FFFE5F0] = 0x00000000F00D1E90 (jit_rcx) <----------------------
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE5F8] = UDATA(0x000000001E0C3B36) (jit_rdx)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE600] = UDATA(0x0000000000000000) (jit_rdi)
<396c00> JIT-RegisterMap-O-Slot[0x00007F4F2FFFE608] = 0x00000000FFDA5A18 (jit_rsi)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE610] = UDATA(0x00007F4F2FFFE740) (jit_rbp)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE618] = UDATA(0x00007F4FAC7E3AB0) (jit_rsp)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE620] = UDATA(0x00000000F00D1E80) (jit_r8)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE628] = UDATA(0x00000000EA6081DC) (jit_r9)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE630] = UDATA(0x00000000FFFFFFFF) (jit_r10)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE638] = UDATA(0x000000001E02D3F8) (jit_r11)
<396c00> JIT-RegisterMap-O-Slot[0x00007F4F2FFFE640] = 0x00000000F061D9B0 (jit_r12)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE648] = UDATA(0x00000000FFFB6198) (jit_r13)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE650] = UDATA(0x00000000FFFB61A0) (jit_r14)
<396c00> JIT-RegisterMap-I-Slot[0x00007F4F2FFFE658] = UDATA(0x00000000F0000000) (jit_r15)
<396c00> JIT-Frame-RegisterMap[0x00000000003BA368] = UDATA(0x00000000FFFB6178) (jit_rbx)
<396c00> JIT-Frame-RegisterMap[0x00000000003BA370] = UDATA(0x0000000000000010) (jit_r9)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE630] = UDATA(0x00000000FFFFFFFF) (jit_r10)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE638] = UDATA(0x000000001E02D3F8) (jit_r11)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE640] = UDATA(0x00000000F061D9B0) (jit_r12)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE648] = UDATA(0x00000000FFFB6198) (jit_r13)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE650] = UDATA(0x00000000FFFB61A0) (jit_r14)
<396c00> JIT-Frame-RegisterMap[0x00007F4F2FFFE658] = UDATA(0x00000000F0000000) (jit_r15)
For core.20220504.140827.18093.0001.dmp - there is bad O-slot in JIT frame for `com/dynatrace/agent/casp/thirdparty/ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix(Lcom/dynatrace/agent/casp/thirdparty/wrapper/ClassWrapper;)Z:
Failing Thread: !j9vmthread 0x396c00
Failing Thread ID: 0x4703 (18179)
gpInfo:
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FBFD0993290 Handler2=00007FBFD06F1150 InaccessibleAddress=0000000000000000
RDI=00007FBFCC066060 RSI=0000000000000007 RAX=0000000000000000 RBX=00000000F00DABD8
RCX=0000000000000000 RDX=0000000099669966 R8=00007FBF52FFE5F0 R9=00007FBF52FFE5F0
R10=0000000040000000 R11=0000000000000000 R12=00007FBF52FFE200 R13=00007FBF52FFE030
R14=00007FBFCA051340 R15=00007FBE94002C48
RIP=00007FBFC9D23BBD GS=0000 FS=0000 RSP=00007FBF52FFE030
EFlags=0000000000010246 CS=0033 RBP=00007FBF52FFE440 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000001000 CR2=0000000000000000
Module=/home/jenkins/agent/oa-master/agent/test/java/build/tmp/caches/jdk-ibm-semeru_64-18.0.1.10.0/lib/default/libj9gc29.so
Module_base_address=00007FBFC9B8A000
Checking THREAD STACKS... <gc check (1): from debugger: THREAD STACKS: slot 396c00(7fbf52ffe5f0) -> f00dabd8: class pointer is null>
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
| region | start | end | subspace | flags | size | region type
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
00007fbfcc1c3f40 00000000f0000000 00000000f0830000 00007fbfcc1afdc0 00000009 830000 ADDRESS_ORDERED
00007fbfcc1c2770 00000000ffd60000 00000000ffde0000 00007fbfcc1bba40 0000000a 80000 ADDRESS_ORDERED
00007fbfcc1c0fa0 00000000ffde0000 0000000100000000 00007fbfcc1b67e0 0000000a 220000 ADDRESS_ORDERED
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
0xF00DABB0 : 0006d900 1e01b579 00000000 00000001 [ ....y........... ]
0xF00DABC0 : 0000002e 00000000 00068500 00000002 [ ................ ]
0xF00DABD0 : f00dabd8 00000000 0000003e 00000000 [ ........>....... ] <--- 0xf00dabd8
0xF00DABE0 : 0006d900 1e01b57f 00000000 00000001 [ ................ ]
0xF00DABF0 : ba6fb858 00000000 00068500 0000000c [ X.o............. ]
> !j9object 0xF00DABC8
!J9IndexableObject 0x00000000F00DABC8 {
struct J9Class* clazz = !j9arrayclass 0x68500 // [B
Object flags = 0x00000000;
U_32 size = 0x00000002;
[0] = 62, 0x3e
[1] = 0, 0x00
}
> !j9object 0xF00DABE0
J9VMJavaLangString at 0x00000000F00DABE0 {
struct J9Class* clazz = !j9class 0x6D900 // java/lang/String
Object flags = 0x00000000;
[B value = !fj9object 0x1e01b57f (offset = 0) (java/lang/String)
B coder = 0x00000001 (offset = 8) (java/lang/String)
I hash = 0xBA6FB858 (offset = 12) (java/lang/String)
Z hashIsZero = 0x00000000 (offset = 16) (java/lang/String)
Lcom/dynatrace/agent/introspection/casp/flawfinder/store/TaintInfo; dynatraceTaintInfo = !fj9object 0x0 (offset = 4) (java/lang/String)
"javax."
}
<396c00> JIT frame: bp = 0x00000000003BA3F8, pc = 0x00007FBFB00743B2, unwindSP = 0x00000000003BA370, cp = 0x000000000037EDC0, arg0EA = 0x00000000003BA408, jitInfo = 0x00007FBFA801CDA8
<396c00> Method: com/dynatrace/agent/casp/thirdparty/ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix(Lcom/dynatrace/agent/casp/thirdparty/wrapper/ClassWrapper;)Z !j9method 0x000000000037FC28
<396c00> Bytecode index = 19, inlineDepth = 0, PC offset = 0x0000000000000DF2
<396c00> stackMap=0x00007FBFA801D522, slots=I16(0x0002) parmBaseOffset=I16(0x0008), parmSlots=U16(0x0002), localBaseOffset=I16(0xFFC8)
<396c00> Described JIT args starting at 0x00000000003BA400 for U16(0x0002) slots
<396c00> O-Slot: : a1[0x00000000003BA400] = 0x00000000FFF7B400
<396c00> O-Slot: : a0[0x00000000003BA408] = 0x00000000F014CEF0
<396c00> Described JIT temps starting at 0x00000000003BA3C0 for IDATA(0x0000000000000007) slots
<396c00> I-Slot: : t6[0x00000000003BA3C0] = 0x00000000FFF0A898
<396c00> I-Slot: : t5[0x00000000003BA3C8] = 0x00000000F0139010
<396c00> I-Slot: : t4[0x00000000003BA3D0] = 0x00000000F0138620
<396c00> I-Slot: : t3[0x00000000003BA3D8] = 0x00007FBFCA9B29B0
<396c00> I-Slot: : t2[0x00000000003BA3E0] = 0x00000000FFF167E8
<396c00> I-Slot: : t1[0x00000000003BA3E8] = 0x0000000000000000
<396c00> I-Slot: : t0[0x00000000003BA3F0] = 0x000000000037FBE8
<396c00> JIT-RegisterMap = UDATA(0x0000000000000904)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE5E0] = UDATA(0x00000000FFF367F8) (jit_rax)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE5E8] = UDATA(0x00000000F00DABC8) (jit_rbx)
<396c00> JIT-RegisterMap-O-Slot[0x00007FBF52FFE5F0] = 0x00000000F00DABD8 (jit_rcx) <-----------------
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE5F8] = UDATA(0x0000000000000018) (jit_rdx)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE600] = UDATA(0x0000000000000000) (jit_rdi)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE608] = UDATA(0x0000000000000001) (jit_rsi)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE610] = UDATA(0x00007FBF52FFE740) (jit_rbp)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE618] = UDATA(0x00007FBFCC801F30) (jit_rsp)
<396c00> JIT-RegisterMap-O-Slot[0x00007FBF52FFE620] = 0x00000000FFF7B410 (jit_r8)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE628] = UDATA(0x0000000000000018) (jit_r9)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE630] = UDATA(0x00000000FFFFFFFF) (jit_r10)
<396c00> JIT-RegisterMap-O-Slot[0x00007FBF52FFE638] = 0x00000000F075E788 (jit_r11)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE640] = UDATA(0x0000000000000004) (jit_r12)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE648] = UDATA(0x00000000FFF167E8) (jit_r13)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE650] = UDATA(0x00000000FFF16800) (jit_r14)
<396c00> JIT-RegisterMap-I-Slot[0x00007FBF52FFE658] = UDATA(0x00000000FFFFFFFF) (jit_r15)
<396c00> JIT-Frame-RegisterMap[0x00000000003BA398] = UDATA(0x00000000BDF7880F) (jit_rbx)
<396c00> JIT-Frame-RegisterMap[0x00000000003BA3A0] = UDATA(0x00000000FFF0A898) (jit_r9)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE630] = UDATA(0x00000000FFFFFFFF) (jit_r10)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE638] = UDATA(0x00000000F075E788) (jit_r11)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE640] = UDATA(0x0000000000000004) (jit_r12)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE648] = UDATA(0x00000000FFF167E8) (jit_r13)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE650] = UDATA(0x00000000FFF16800) (jit_r14)
<396c00> JIT-Frame-RegisterMap[0x00007FBF52FFE658] = UDATA(0x00000000FFFFFFFF) (jit_r15)
<396c00> I2J values: PC = 0x00007FBFCD5C73B2, A0 = 0x00000000003BA438, walkSP = 0x00000000003BA418, literals = 0x000000000037FBE8, JIT PC = 0x00007FBFCA9B29B0, pcAddress = 0x00007FBF52FFE808, decomp = 0x0000000000000000
@a7ehuo : could you investigate this issue please?
I looked at the core from 18093 since the jitdmp from 70573 is incomplete. However, the jitdmp for 18093 shows different inlining from when it was crashed. jitdmp.18093 doesn't have java/lang/StringUTF16.getChars([BII[CI)V inlined, but javacore and stackslots show the top JIT frame is inlined frame java/lang/StringUTF16.getChars([BII[CI)V when the crash happened. The incomplete jitdmp.70573 has java/lang/StringUTF16.getChars([BII[CI)V inlined but it lacks of generated code.
The bad object (f00dabd8) on the JIT frame existed when jitCheckAsyncMessages helper call is called (pc = 0x00007BFB00743B2) [1]. They look like out-of-line instructions. The inlined java/lang/StringUTF16.getChars([BII[CI)V from jitdmp.70573 shows that the aynchcheck is inserted at the end of the loop that stores values from byte[] value into char dst[] [2].
[1]
0x7fbfb0074378 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3512 7:41 e80342931a call 0x7fbfca9a8580 <----- jitNewArray
0x7fbfb007437d {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3517 4d8bde mov r11, r14
0x7fbfb0074380 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3520 4c8b742470 mov r14, qword ptr [rsp + 0x70]
0x7fbfb0074385 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3525 498bff mov rdi, r15
0x7fbfb0074388 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3528 498bd6 mov rdx, r14
0x7fbfb007438b {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3531 498bcd mov rcx, r13
0x7fbfb007438e {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3534 4887c2 xchg rdx, rax
0x7fbfb0074391 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3537 488bc2 mov rax, rdx
0x7fbfb0074394 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3540 e93cf6ffff jmp 0x7fbfb00739d5 U>> ^+1045
0x7fbfb0074399 GCMap Bytecode 11:78 Registers: r11 r14 StackMap: P0x90 P0x98
0x7fbfb0074399 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3545 11:78 e85245931a call 0x7fbfca9a88f0 <<< +807 <----- jitCheckAsyncMessages
0x7fbfb007439e {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3550 e94af5ffff jmp 0x7fbfb00738ed U>> ^+813
0x7fbfb00743a3 GCMap Bytecode 36:25 Registers: r9 r11 r14 StackMap: P0x90 P0x98
0x7fbfb00743a3 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3555 36:25 e84845931a call 0x7fbfca9a88f0 <<< +767 <----- jitCheckAsyncMessages
0x7fbfb00743a8 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3560 e918f5ffff jmp 0x7fbfb00738c5 U>> ^+773
0x7fbfb00743ad GCMap Bytecode 23:38 Registers: rcx r8 r11 StackMap: P0x90 P0x98
0x7fbfb00743ad {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3565 23:38 e83e45931a call 0x7fbfca9a88f0 <<< +549 <----- jitCheckAsyncMessages
0x7fbfb00743b2 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3570 e934f4ffff jmp 0x7fbfb00737eb U>> ^+555 <----- pc
0x7fbfb00743b7 {com/.../ComponentClassifier.doesClassNameStartWithIgnoredPackagePrefix} +3575 4d8beb mov r13, r11 <<< +486
[2]
<396c00> JIT inline frame: bp = 0x00000000003BA3F8, pc = 0x00007FBFB00743B2, unwindSP = 0x00000000003BA370, cp = 0x00000000000F8190, arg0EA = 0x0000000000000000, jitInfo = 0x00007FBFA801CDA8
<396c00> Method: java/lang/StringUTF16.getChars([BII[CI)V !j9method 0x00000000000F78D0
<396c00> Bytecode index = 38, inlineDepth = 4, PC offset = 0x000000000000110A
0 iload1
1 iload2
2 ificmpge 13
5 iload1
6 iload2
7 iload1
8 isub
9 aload0
10 invokestatic 42 java/lang/StringUTF16.checkBoundsOffCount(II[B)V
13 iload1
14 istore 5
16 iload 5
18 iload2
19 ificmpge 41
22 aload3
23 iload 4
25 iinc 4 1
28 aload0
29 iload 5
31 invokestatic 28 java/lang/StringUTF16.getChar([BI)C
34 castore
35 iinc 5 1
38 goto 16 <-- asynccheck at the end of the loop
41 return0
n3622n astore <pending push temp 0>[#797 Auto] [flags 0x7 0x800 ] [0x7f4f4ab44be0] bci=[28,31,256] rc=0 vc=371 vn=- li=- udi=- nc=1
n3611n aload dst<auto slot 1>[#781 Auto] [flags 0x7 0x0 ]
...
...
n3637n sstorei <array-shadow>[#227 Shadow] [flags 0x80000602 0x0 ] [0x7f4f4ab45090] bci=[28,34,256] rc=0 vc=371 vn=- li=- udi=- nc=2
n3636n aladd (internalPtr sharedMemory ) [0x7f4f4ab45040] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=2 flg=0x8000
n3611n ==>aload
n3635n ladd [0x7f4f4ab44ff0] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=2
n3633n lshl [0x7f4f4ab44f50] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=2
n3632n i2l (X>=0 ) [0x7f4f4ab44f00] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=1 flg=0x100
n3612n ==>iload
n3631n iconst 1 [0x7f4f4ab44eb0] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=0
n3634n lconst 16 [0x7f4f4ab44fa0] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=0
n3627n i2s [0x7f4f4ab44d70] bci=[28,34,256] rc=1 vc=371 vn=- li=- udi=- nc=1
n3620n ==>su2i
n3641n istore i<auto slot 5>[#793 Auto] [flags 0x3 0x0 ] [0x7f4f4ab451d0] bci=[28,35,255] rc=0 vc=371 vn=- li=- udi=- nc=1
n3640n iadd [0x7f4f4ab45180] bci=[28,35,255] rc=1 vc=371 vn=- li=- udi=- nc=2
n3638n iload i<auto slot 5>[#793 Auto] [flags 0x3 0x0 ] [0x7f4f4ab450e0] bci=[28,35,255] rc=1 vc=371 vn=- li=- udi=- nc=0
n3639n iconst 1 [0x7f4f4ab45130] bci=[28,35,255] rc=1 vc=371 vn=- li=- udi=- nc=0
n3642n asynccheck jitCheckAsyncMessages[#23 helper Method] [flags 0x400 0x0 ] [0x7f4f4ab45220] bci=[28,38,255] rc=0 vc=371 vn=- li=- udi=- nc=0
n3643n goto --> block_583 BBStart at n3586n [0x7f4f4ab45270] bci=[28,38,255] rc=0 vc=371 vn=- li=- udi=- nc=0
n3609n BBEnd </block_582> =====
We won't be able to get to the bottom of this for 0.35. Moving to 0.36.
@quijote Do you still see this issue? Is there a way for us to reproduce it locally?
We disabled our tests for affected OpenJ9 JVMs for now but yesterday I rolled out J9 0.35 (8, 11, 17) in our CI. Are there any plans to have Java 18 with J9 0.35 as well? Or maybe 19? Did not find them yet via https://developer.ibm.com/languages/java/semeru-runtimes/downloads/.
Currently I don't have a way to reproduce this deterministically. We run thousands of tests each day and observed crashes like this occasionally. We will runs our tests with all new releases of J9 again. I could provide additional dumps or try setting some VM flags if this helps with debugging the problem (and does not cause significant overhead).
Are there any plans to have Java 18 with J9 0.35 as well? Or maybe 19?
I saw the announcement on Monday that OpenJ9 0.35.0 is available for jdk8, 11, 17. @pshipton I'm wondering if you know there might be 0.35.0 for jdk18 or jdk19. Thanks!
There won't be 0.35 for jdk18 or jdk19. jdk18 is out of support now, and jdk19 isn't ready yet. We are trying for jdk19 with 0.36 (in January), but not sure it's going to make it. If it doesn't make it, we'll have to decide whether to have a special jdk19 release afterwards or skip jdk19 and work on jdk20.
There are nightly builds for jdk19 if you want to try it. It's work in progress and there are known problems, but depending what you are running you may not hit them.
https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-JDK19/ Select a build date, click on "build - passing" beside "build" for the platform you want, and there is a download link for the JDK at the top.
Thank you for the insights. Nightly builds are too risky for me to run in our CI. I will wait for the next official release. Feel free to close this ticket if the provided information is not enough to make a change. And thank you for having a look at those crashes. I understand well that getting a crash report without a reproducer can be painful to analyze.
Thank you for your understanding! The jitdmp logs show different methods from com/dynatrace/agent/casp/thirdparty/ComponentClassifier associated with the two crashes. I suspect it could be any other method. Tracing all methods from ComponentClassifier like below might add a bit overhead. I'll keep the issue open for now in case we find more to narrow down the crash
'-Xjit:{com/dynatrace/agent/casp/thirdparty/ComponentClassifier.*}(traceFull,traceILGen,traceCG,log=/tmp/jitlog)'
Moving to 0.38 given the above discussion.
No change in status. Moving out of 0.38.
Moving this to the Backlog for now, as it's been difficult to make a lot of progress investigating the problem.