SharedClasses.SCM crash vmState=0x00000000

Open pshipton opened this issue 3 years ago • 25 comments

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/131 SharedClasses.SCM01.SingleCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/131/system_test_output.tar.gz

SCL5 10:09:23 >> Loaded 20000 classes...
SCL5 10:09:23 >> Total classes loaded = 20001
SCL2 10:09:23 >> Loaded 20000 classes...
SCL2 10:09:23 >> Total classes loaded = 20001
SCL3 10:09:23 >> Loaded 20000 classes...
SCL3 10:09:23 >> Total classes loaded = 20001
STF 10:09:23.760 - Found dump at: /Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16624215336086/SharedClasses.SCM01.SingleCL_0/20220906-100832-SharedClasses/results/core.20220906.100923.45099.0001.dmp
SCL5 stderr Unhandled exception
SCL5 stderr Type=Segmentation error vmState=0x00000000
SCL5 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
SCL5 stderr Handler1=00000001025E7BE4 Handler2=000000010681EE14 InaccessibleAddress=0000000000000038
SCL5 stderr x0=0000000000000001 x1=0000000118019F00 x2=0000000000000030 x3=0000000000000030
SCL5 stderr x4=0000000000000000 x5=0000000000000072 x6=00000001077B4CF4 x7=000000016DD456E8
SCL5 stderr x8=0000000000000030 x9=0000000000000000 x10=0000000000000030 x11=0000000000000000
SCL5 stderr x12=0000000000000072 x13=00001D0000001D00 x14=000000037FDD0D4C x15=000000016DD45AC0
SCL5 stderr x16=00000001A10EA2A0 x17=000000020FA46FF8 x18=000000037FDD0CC0 x19=000000016DD44F68
SCL5 stderr x20=000000013701C668 x21=0000000118019F00 x22=0000000000000030 x23=000000037FDD0E90
SCL5 stderr x24=0000000000000000 x25=0000000000000000 x26=0000000000000000 x27=000000037FDD0D98
SCL5 stderr x28=000000013701C668 x29(FP)=000000016DD44F10 x30(LR)=000000010771F104 x31(SP)=000000016DD44EC0
SCL5 stderr PC=000000010771F11C SP=000000016DD44EC0
SCL5 stderr v0 0000000000000001 (f: 1.000000, d: 4.940656e-324)
SCL5 stderr v1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
SCL5 stderr v3 3faf0a32c01163a6 (f: 3222365184.000000, d: 6.062468e-02)
SCL5 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
SCL5 stderr v17 3fd566a83816d555 (f: 941020480.000000, d: 3.343907e-01)
SCL5 stderr v18 bf715c4a73a51164 (f: 1940197760.000000, d: -4.238406e-03)
SCL5 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
SCL5 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL5 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL5 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL5 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL5 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL5 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9gc29.dylib
SCL5 stderr Module_base_address=0000000107654000 Symbol=_ZN33MM_IndexableObjectAllocationModelC2EP18MM_EnvironmentBaseP7J9Classjm
SCL5 stderr Symbol_address=000000010771EFF0
SCL5 stderr Target=2_90_20220906_139 (Mac OS X 11.4)
SCL5 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
SCL5 stderr ----------- Stack Backtrace -----------
SCL5 stderr ---------------------------------------

SharedClasses.SCM23.MultiCL_0

MCL4 10:16:18 >> Loaded 19000 classes...
MCL1 10:16:18 >> Loaded 13000 classes...
MCL5 10:16:19 >> Loaded 15000 classes...
MCL3 10:16:19 >> Loaded 19000 classes...
MCL2 10:16:22 >> Loaded 20000 classes...
MCL2 10:16:22 >> Total classes loaded = 20001
STF 10:16:23.260 - Found dump at: /Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/aqa-tests/TKG/output_16624215336086/SharedClasses.SCM23.MultiCL_0/20220906-101010-SharedClasses/results/core.20220906.101623.45478.0001.dmp
MCL4 10:16:23 >> Loaded 20000 classes...
MCL4 10:16:23 >> Total classes loaded = 20001
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x00000000
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=000000010012BBE4 Handler2=000000010045EE14 InaccessibleAddress=0000000000000008
MCL4 stderr x0=0000000000000000 x1=0000000149039C78 x2=0000000000000000 x3=0000000000000000
MCL4 stderr x4=0000000000000000 x5=00000000000000A0 x6=0000000000000000 x7=0000000000000000
MCL4 stderr x8=0000000000000007 x9=0000000000000010 x10=0000000000000000 x11=0000000000000002
MCL4 stderr x12=0000000000000002 x13=0000000000000000 x14=0000000000000200 x15=0000000000000001
MCL4 stderr x16=00000001A10EA2A0 x17=000000020FA46E48 x18=000000014AAE7858 x19=000000014907C820
MCL4 stderr x20=0000000000000000 x21=0000000130981030 x22=000000020000BFA0 x23=0000000105019000
MCL4 stderr x24=0000000100253720 x25=0000000100491944 x26=0000000000000001 x27=000000014A840D00
MCL4 stderr x28=000000014A811E20 x29(FP)=000000016FEBEC70 x30(LR)=0000000104848F30 x31(SP)=000000016FEBEBE0
MCL4 stderr PC=00000001048DF7B8 SP=000000016FEBEBE0
MCL4 stderr v0 0000000800010008 (f: 65544.000000, d: 1.697600e-313)
MCL4 stderr v1 fff7fffffff7ffff (f: 4294443008.000000, d: nan)
MCL4 stderr v2 41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
MCL4 stderr v3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v4 0000000000000003 (f: 3.000000, d: 1.482197e-323)
MCL4 stderr v5 0000000000000002 (f: 2.000000, d: 9.881313e-324)
MCL4 stderr v6 0000080000000800 (f: 2048.000000, d: 4.345847e-311)
MCL4 stderr v7 0000000000000006 (f: 6.000000, d: 2.964394e-323)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL4 stderr v17 3fd54de9fd555555 (f: 4250227968.000000, d: 3.328805e-01)
MCL4 stderr v18 3f5da680a5770133 (f: 2776039680.000000, d: 1.809717e-03)
MCL4 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL4 stderr v20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0102564d01763a47 (f: 24525384.000000, d: 8.356133e-304)
MCL4 stderr v26 a1937e85a07ed3cf (f: 2692666368.000000, d: -6.098291e-147)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 08f5503c09d8f55f (f: 165213536.000000, d: 1.652483e-265)
MCL4 stderr v29 00dc2b23015eb2cd (f: 22983372.000000, d: 1.604531e-304)
MCL4 stderr v30 9db62bf99c4fff2b (f: 2622488320.000000, d: -1.503983e-165)
MCL4 stderr v31 0000000041040000 (f: 1090781184.000000, d: 5.389175e-315)
MCL4 stderr JVMDUMP039I Processing dump event "abort", detail "" at 2022/09/06 10:16:23 - please wait.
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL4 stderr Module_base_address=0000000104800000 Symbol=_ZN2J97Monitor9notifyAllEv
MCL4 stderr Symbol_address=00000001048DF7B8
MCL4 stderr Target=2_90_20220906_139 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------

pshipton Sep 06 '22 17:09

@knn-k @dmitripivkine fyi

pshipton Sep 06 '22 17:09

I can see we are in the indexable object allocation path:

> !MM_AllocateDescription 0x16DD45230
MM_AllocateDescription at 0x16dd45230 {
  Fields for MM_Base:
  Fields for MM_AllocateDescription:
 0x0: U64 _bytesRequested = 0x0000000000000028 (40)
 0x8: U64 _allocateFlags = 0x0000000000000040 (64)
 0x10: U32 _objectFlags = 0x00000000 (0)
 0x14: bool _allocationSucceeded = true
 0x18: class MM_MemorySpace* _memorySpace = !mm_memoryspace 0x0000000136A1B310
 0x20: class MM_MemorySubSpace* _memorySubSpace = !mm_memorysubspacegeneric 0x0000000136A18D40
 0x28: U64 _allocationTaxSize = 0x0000000000000000 (0)
 0x30: bool _tlhAllocation = false
 0x31: bool _nurseryAllocation = false
 0x32: bool _loaAllocation = false
 0x38: U64 _spineBytes = 0x0000000000000028 (40)
 0x40: U64 _numArraylets = 0x0000000000000001 (1)
 0x48: bool _chunkedArray = false
 0x49: bool _dataAdjacentToHeader = false
 0x50: struct J9IndexableObject* _spine = !j9indexableobject 0x000000037FDD0D70
 0x58: bool _threadAtSafePoint = true
 0x60: class MM_MemoryPool* _memoryPool = !mm_memorypool 0x0000000000000000
 0x68: bool _collectorAllocateExpandOnFailure = false
 0x69: bool _collectorAllocateSatisfyAnywhere = false
 0x6c: MM_MemorySubSpace$AllocationType _allocationType = 0x0 (0) //ALLOCATION_TYPE_INVALID
 0x70: bool _collectAndClimb = true
 0x71: bool _climb = false
 0x72: bool _completedFromTlh = true
}

> !MM_IndexableObjectAllocationModel 0x16DD45218
MM_IndexableObjectAllocationModel at 0x16dd45218 {
  Fields for MM_Base:
  Fields for MM_AllocateInitialization:
	0x0: const U64 _allocationCategory = 0x0000000000000001 (1)
	0x8: const U64 _requestedSizeInBytes = 0x0000000000000000 (0)
	0x10: bool _isAllocatable = true
	0x18: class MM_AllocateDescription _allocateDescription = !mm_allocatedescription 0x000000016DD45230
  Fields for MM_JavaObjectAllocationModel:
	0x90: struct J9Class* _class = !j9class 0x0000000118019F00 // [B
  Fields for MM_IndexableObjectAllocationModel:
	0x98: const U32 _numberOfIndexedFields = 0x0000000C (12)
	0xa0: const U64 _dataSize = 0x0000000000000010 (16)
	0xa8: const enum GC_ArrayletObjectModelBase::ArrayLayout _layout = 0x1 (1) //InlineContiguous
	0xac: const bool _alignSpineDataSection = false
	0xb0: const U64 _numberOfArraylets = 0x0000000000000001 (1)
}


> !j9indexableobject 0x000000037FDD0D70
!J9IndexableObject 0x000000037FDD0D70 {
    struct J9Class* clazz = !j9arrayclass 0x118019F00   // [B
    Object flags = 0x00000000;
    U_32 size = 0x0000000C;
	[0] =  83, 0x53
	[1] =   0, 0x00
	[2] =  73, 0x49
	[3] =   0, 0x00
	[4] =  71, 0x47
	[5] =   0, 0x00
	[6] =  66, 0x42
	[7] =   0, 0x00
	[8] =  85, 0x55
	[9] =   0, 0x00
	[10] =  83, 0x53
	[11] =   0, 0x00
}

I still don't understand where or why it crashed (evidently a NULL pointer was dereferenced somewhere). I'll continue investigating. @knn-k It would be very helpful if you could provide low-level native details for this crash. Would you please help with this?

dmitripivkine Sep 06 '22 21:09

It seems _omrVM is NULL at the following location in numArraylets(), which is inlined into the MM_IndexableObjectAllocationModel constructor:

https://github.com/eclipse-openj9/openj9/blob/8186943144f98dccaf4ba48699dc923fe81b4e3c/runtime/gc_glue_java/ArrayletObjectModelBase.hpp#L237

See the disassembled instructions below. Register x9, which holds _omrVM, is 0x0 at the instruction PC points to. x9 + 56 equals the InaccessibleAddress, 0x38:

   cb118: 29 81 40 f9   ldr     x9, [x9, #256] // Load _omrVM
   cb11c: 2a 1d 40 f9   ldr     x10, [x9, #56] // leafSize = _omrVM->_arrayletLeafSize / PC points here
   cb120: 5f 05 00 b1   cmn     x10, #1 // Check (UDATA_MAX != leafSize)
   cb124: 80 01 00 54   b.eq    0xcb154
   cb128: 4a 05 00 d1   sub     x10, x10, #1 // leafSizeMask = leafSize - 1
   cb12c: 29 21 40 f9   ldr     x9, [x9, #64] // leafLogSize = _omrVM->_arrayletLeafLogSize
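
The addresses in the left column of the disassembly can be cross-checked against the crash header. The following is a minimal sketch that uses only values copied from the SCL5 register dump above; the variable names are illustrative and not part of any OpenJ9 tool.

```python
# Values copied from the SCL5 crash header above.
pc = 0x10771F11C
module_base = 0x107654000      # Module_base_address of libj9gc29.dylib
symbol_address = 0x10771EFF0   # MM_IndexableObjectAllocationModel constructor
inaccessible = 0x38            # InaccessibleAddress

# Offset of the faulting instruction inside the library; this is the
# address shown in the left column of the disassembly.
module_offset = pc - module_base
print(hex(module_offset))        # 0xcb11c

# Distance from the start of the constructor to the faulting load.
symbol_offset = pc - symbol_address
print(hex(symbol_offset))        # 0x12c

# ldr x10, [x9, #56] faults at x9 + 56, so solve for x9.
x9 = inaccessible - 56
print(hex(x9))                   # 0x0
```

The module offset matches the `cb11c` line, and the recovered x9 agrees with the x9=0 entry in the register dump.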

knn-k Sep 07 '22 03:09

@knn-k Thank you very much!

dmitripivkine Sep 07 '22 12:09

Interesting: _omrVM is set properly. If it were not, every indexable object allocation would crash.

> !mm_gcextensions 0x000000013780BA20
        ......
	0xf8: class GC_ArrayletObjectModel indexableObjectModel = !gc_arrayletobjectmodel 0x000000013780BB18
        ......

> !gc_arrayletobjectmodel 0x000000013780BB18
GC_ArrayletObjectModel at 0x13780bb18 {
  Fields for GC_ArrayletObjectModelBase:
	0x0: void** _vptr$GC_ArrayletObjectModelBase = !j9x 0x000000010785E250
	0x8: struct OMR_VM* _omrVM = !omr_vm 0x0000000137016870 <-------------
	0x10: void* _arrayletRangeBase = !j9x 0x0000000000000000
	0x18: void* _arrayletRangeTop = !j9x 0xFFFFFFFFFFFFFFFF
	0x20: class MM_MemorySubSpace* _arrayletSubSpace = !mm_memorysubspace 0x0000000000000000
	0x28: UDATA _largestDesirableArraySpineSize = 0xFFFFFFFFFFFFFFFF (-1)
  Fields for GC_ArrayletObjectModel:
}

So for this particular array allocation it was somehow read as NULL. Either it was read from the wrong location (e.g. stack corruption?) or this is a machine failure.

dmitripivkine Sep 07 '22 13:09

It looks like the pointer to the GC extensions, OMR_VM->_gcOmrVMExtensions, is broken. It is set to 0x37FDD0D98:

> !omr_vm 0x0000000137016870
OMR_VM at 0x137016870 {
  Fields for OMR_VM:
	0x0: struct OMR_Runtime* _runtime = !omr_runtime 0x0000000137016838
	0x8: void* _language_vm = !j9x 0x0000000137011A20
	0x10: struct OMR_VM* _linkNext = !omr_vm 0x0000000137016870
	0x18: struct OMR_VM* _linkPrevious = !omr_vm 0x0000000137016870
	0x20: struct OMR_VMThread* _vmThreadList = !omr_vmthread 0x000000013701B178
	0x28: struct J9ThreadMonitor* _vmThreadListMutex = !j9threadmonitor 0x0000000137008AB0
	0x30: U64 _vmThreadKey = 0x0000000000000006 (6)
	0x38: U64 _arrayletLeafSize = 0xFFFFFFFFFFFFFFFF (18446744073709551615)
	0x40: U64 _arrayletLeafLogSize = 0x0000000000000000 (0)
	0x48: U64 _compressedPointersShift = 0x0000000000000000 (0)
	0x50: U64 _objectAlignmentInBytes = 0x0000000000000008 (8)
	0x58: U64 _objectAlignmentShift = 0x0000000000000003 (3)
	0x60: void* _gcOmrVMExtensions = !j9x 0x000000037FDD0D98 <-----------------------
	0x68: struct OMR_VMConfiguration _configuration = !omr_vmconfiguration 0x00000001370168D8
	0x70: U64 _languageThreadCount = 0x0000000000000000 (0)
	0x78: U64 _internalThreadCount = 0x0000000000000000 (0)
	0x80: struct OMR_ExclusiveVMAccessStats exclusiveVMAccessStats = !omr_exclusivevmaccessstats 0x00000001370168F0
	0xb0: U64 gcPolicy = 0x0000000000000000 (0)
	0xb8: struct OMR_SysInfo* sysInfo = !omr_sysinfo 0x0000000000010000
	0xc0: struct OMR_SizeClasses* _sizeClasses = !omr_sizeclasses 0x0000000000000010
	0xc8: struct UtInterface* utIntf = !utinterface 0x0000000000000005
	0xd0: struct OMR_Agent* _hcAgent = !omr_agent 0x000000013701697A
	0xd8: struct J9ThreadMonitor* _omrTIAccessMutex = !j9threadmonitor 0x000000037FDD0CA0
	0xe0: struct OMRTraceEngine* _trcEngine = !omrtraceengine 0x0000000000000000
	0xe8: void* _methodDictionary = !j9x 0x0000000000000000
	0xf0: struct J9ThreadMonitor* _gcCycleOnMonitor = !j9threadmonitor 0x0000000000080000
	0xf8: U64 _gcCycleOn = 0x0000000000000000 (0)
}

It should be the same as j9javavm->gcExtensions, 0x13780BA20 <--- this is the correct one.

And yes, if we dereference OMR_VM->_gcOmrVMExtensions->indexableObjectModel._omrVM, we get NULL:

OMR_VM->_gcOmrVMExtensions->indexableObjectModel = 0x37FDD0D98 + 0xf8 = 0x37FDD0E90
(see x23=000000037FDD0E90 from crash registers)

> !gc_arrayletobjectmodel 0x37FDD0E90
GC_ArrayletObjectModel at 0x37fdd0e90 {
  Fields for GC_ArrayletObjectModelBase:
	0x0: void** _vptr$GC_ArrayletObjectModelBase = !j9x 0x0000000000000000
	0x8: struct OMR_VM* _omrVM = !omr_vm 0x0000000000000000 <---------------
	0x10: void* _arrayletRangeBase = !j9x 0x0000000000000000
	0x18: void* _arrayletRangeTop = !j9x 0x0000000000000000
	0x20: class MM_MemorySubSpace* _arrayletSubSpace = !mm_memorysubspace 0x0000000000000000
	0x28: UDATA _largestDesirableArraySpineSize = 0x0000000000000000 (0)
  Fields for GC_ArrayletObjectModel:
}
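
The pointer chain above can be replayed as plain arithmetic. This is a sketch under one assumption that the disassembly excerpt does not show directly: that the base register of `ldr x9, [x9, #256]` holds the extensions pointer at that point. All addresses and field offsets are copied from the debugger output in this thread; the constant names are illustrative.

```python
# Addresses copied from the debugger output above.
corrupted_ext = 0x37FDD0D98    # bogus OMR_VM->_gcOmrVMExtensions
correct_ext = 0x13780BA20      # j9javavm->gcExtensions
OBJECT_MODEL_OFFSET = 0xF8     # indexableObjectModel inside the extensions
OMR_VM_OFFSET = 0x08           # _omrVM inside GC_ArrayletObjectModelBase
LEAF_SIZE_OFFSET = 0x38        # _arrayletLeafSize inside OMR_VM

# Where the crashing thread looked for the object model.
bogus_model = corrupted_ext + OBJECT_MODEL_OFFSET
print(hex(bogus_model))                        # 0x37fdd0e90, same as x23

# Where it should have looked.
good_model = correct_ext + OBJECT_MODEL_OFFSET
print(hex(good_model))                         # 0x13780bb18

# Combined displacement, equal to the #256 in the _omrVM load
# (assumption: the base register is the extensions pointer).
omr_vm_displacement = OBJECT_MODEL_OFFSET + OMR_VM_OFFSET
print(omr_vm_displacement)                     # 256

# _omrVM read from the bogus model is 0, so the next load faults here.
fault_address = 0 + LEAF_SIZE_OFFSET
print(hex(fault_address))                      # 0x38
```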

dmitripivkine Sep 07 '22 13:09

OMR_VM->_gcOmrVMExtensions is originally set (and never modified afterwards) in gcInitializeDefaults(), and it is the same as j9javavm->gcExtensions:

	extensions->setOmrVM(vm->omrVM);
	vm->omrVM->_gcOmrVMExtensions = (void *)extensions;
	vm->gcExtensions = vm->omrVM->_gcOmrVMExtensions;

It looks like we are dealing with memory corruption of OMR_VM->_gcOmrVMExtensions.

dmitripivkine Sep 07 '22 13:09

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_1/132 SharedClasses.SCM23.SingleCL_0

I don't see any more information or diagnostic files. https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_1/132/system_test_output.tar.gz

SCL5 10:18:31 >> Loaded 20000 classes...
SCL5 10:18:31 >> Total classes loaded = 20001
SCL1 10:18:31 >> Loaded 20000 classes...
SCL1 10:18:31 >> Total classes loaded = 20001
STF 10:18:31.818 - **FAILED** Process SCL5 ended with exit code (255) and not the expected exit code/s (0)

pshipton Sep 07 '22 15:09

The slot next to the corrupted one in the OMR VM does not look correct either:

> !omr_vmconfiguration 0x00000001370168D8
OMR_VMConfiguration at 0x1370168d8 {
  Fields for OMR_VMConfiguration:
	0x0: U64 _maximum_thread_count = 0x000000037FDD0D50 (15030095184)
}

dmitripivkine Sep 07 '22 18:09

Both bogus values written over the OMR VM slots, 0x37FDD0D50 and 0x37FDD0D98, are pointers to objects in the heap (Nursery):

> !j9object 0x37FDD0D50
J9VMJavaLangString at 0x000000037FDD0D50 {
struct J9Class* clazz = !j9class 0x11801EC00 // java/lang/String
Object flags = 0x00000000;
[B value = !fj9object 0x37fdd0d70 (offset = 0) (java/lang/String)
B coder = 0x00000001 (offset = 8) (java/lang/String)
I hash = 0x00000000 (offset = 12) (java/lang/String)
Z hashIsZero = 0x00000000 (offset = 16) (java/lang/String)
"SIGBUS"
}
> !j9object 0x000000037FDD0D98
J9VMJavaLangString at 0x000000037FDD0D98 {
struct J9Class* clazz = !j9class 0x11801EC00 // java/lang/String
Object flags = 0x00000000;
[B value = !fj9object 0x0 (offset = 0) (java/lang/String)
B coder = 0x00000000 (offset = 8) (java/lang/String)
I hash = 0x00000000 (offset = 12) (java/lang/String)
Z hashIsZero = 0x00000000 (offset = 16) (java/lang/String)
"<Uninitialized String>"
}
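
The link between the "SIGBUS" String and the byte array from the allocation dump can be checked by decoding the array contents. The sketch below uses the byte values from the !j9indexableobject 0x37FDD0D70 output earlier in the thread. The final adjacency check is only an observation drawn from the dumped numbers, not a statement about how the allocator laid these objects out.

```python
# Byte values copied from the !j9indexableobject 0x37FDD0D70 dump above.
spine_bytes = bytes([83, 0, 73, 0, 71, 0, 66, 0, 85, 0, 83, 0])

# coder = 1 on the String at 0x37FDD0D50 means UTF-16, and aarch64 macOS
# is little-endian, so decode as UTF-16LE.
text = spine_bytes.decode("utf-16-le")
print(text)   # SIGBUS

# Observation only: the array starts at 0x37FDD0D70 and _bytesRequested
# was 0x28, which lands on the address of the uninitialized String.
array_start = 0x37FDD0D70
bytes_requested = 0x28
array_end = array_start + bytes_requested
print(hex(array_end))   # 0x37fdd0d98
```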

dmitripivkine Sep 07 '22 20:09

This is speculation, but the array of String object !j9object 0x000000037FDD0D98 has not been initialized yet (is the object it crashed at, !j9indexableobject 0x000000037FDD0D70, intended for it?), so the code corrupting the OMR VM (a compiled method?) should still be active and close by.

dmitripivkine Sep 07 '22 20:09

A raw dump of memory at the OMR VM location reveals a pattern: something that looks like a linked-list structure has been written over it:

0x137016870 :  0000000137016838 0000000137011a20 [ 8h.7.... ..7.... ]
0x137016880 :  0000000137016870 0000000137016870 [ ph.7....ph.7.... ]
0x137016890 :  000000013701b178 0000000137008ab0 [ x..7.......7.... ]
0x1370168A0 :  0000000000000006 ffffffffffffffff [ ................ ]
0x1370168B0 :  0000000000000000 0000000000000000 [ ................ ]
0x1370168C0 :  0000000000000008 0000000000000003 [ ................ ]
0x1370168D0 :  000000037fdd0d98 000000037fdd0d50 [ ........P....... ]<--- !j9object 0x37fdd0d98 String <Uninitialized String>; !j9object 0x37fdd0d50 String "SIGBUS"
0x1370168E0 :  0000000000000000 0000000000000000 [ ................ ]
0x1370168F0 :  0000000000080000 0000000000000000 [ ................ ]
0x137016900 :  0000000158f03f15 0000000137016912 [ .?.X.....i.7.... ]<--- 0x137016910 "next" pointer + bit 0x2 ???
0x137016910 :  000000037fdd0ce8 0000000280213528 [ ........(5!..... ]<-- !j9object 0x37fdd0ce8 String "java.lang.InternalError"; !j9object 0x280213528 // jdk/internal/loader/ClassLoaders$BootClassLoader
0x137016920 :  0000000000000000 0000000000010000 [ ................ ]
0x137016930 :  0000000000000010 0000000000000005 [ ................ ]
0x137016940 :  000000013701697a 000000037fdd0ca0 [ zi.7............ ]<--- 0x137016978 "next" pointer???; !j9object 0x37fdd0ca0 String "SIGBUS"
0x137016950 :  0000000000000000 0000000000000000 [ ................ ]
0x137016960 :  0000000000080000 0000000000000000 [ ................ ]
0x137016970 :  0000000158f03f15 0000000137016982 [ .?.X.....i.7.... ] <--- 137016980  "next" pointer???; 
0x137016980 :  000000037fdd0c38 0000000280213528 [ 8.......(5!..... ]<--- !j9object 0x37fdd0c38 String "java.lang.InternalError"; !j9object 0x280213528 // jdk/internal/loader/ClassLoaders$BootClassLoader
0x137016990 :  0000000000000000 0000000000010000 [ ................ ]
0x1370169A0 :  0000000000000010 0000000000000005 [ ................ ]
0x1370169B0 :  00000001370169ea 000000037fdd0bf0 [ .i.7............ ]<--- !j9object 0x37fdd0bf0 String "SIGBUS"
0x1370169C0 :  0000000000000000 0000000000000000 [ ................ ]
0x1370169D0 :  0000000000080000 0000000000000000 [ ................ ]
0x1370169E0 :  0000000158f03f15 00000001370169f2 [ .?.X.....i.7.... ]
0x1370169F0 :  000000037fdd0b88 0000000280213528 [ ........(5!..... ]<--- !j9object 0x37fdd0b88 String "java.lang.InternalError"
.... etc.....
0x137016A00 :  0000000000000000 0000000000010000 [ ................ ]
0x137016A10 :  0000000000000010 0000000000000005 [ ................ ]
0x137016A20 :  0000000137016a5a 000000037fdd0b40 [ Zj.7....@....... ]
0x137016A30 :  0000000000000000 0000000000000000 [ ................ ]
0x137016A40 :  0000000000080000 0000000000000000 [ ................ ]
0x137016A50 :  0000000158f03f15 0000000137016a62 [ .?.X....bj.7.... ]
0x137016A60 :  000000037fdd0ad8 0000000280213528 [ ........(5!..... ]
0x137016A70 :  0000000000000000 0000000000010000 [ ................ ]
0x137016A80 :  0000000000000010 0000000000000005 [ ................ ]
0x137016A90 :  0000000137016aca 000000037fdd0a90 [ .j.7............ ]
0x137016AA0 :  0000000000000000 0000000000000000 [ ................ ]
0x137016AB0 :  0000000000080000 0000000000000000 [ ................ ]
0x137016AC0 :  0000000158f03f15 0000000137016ad2 [ .?.X.....j.7.... ]
0x137016AD0 :  000000037fdd0a28 0000000280213528 [ (.......(5!..... ]
0x137016AE0 :  0000000000000000 0000000000010000 [ ................ ]
0x137016AF0 :  0000000000000010 0000000000000005 [ ................ ]
0x137016B00 :  0000000137016b3a 000000037fdd09e0 [ :k.7............ ]
0x137016B10 :  0000000000000000 0000000000000000 [ ................ ]
0x137016B20 :  0000000000080000 0000000000000000 [ ................ ]
0x137016B30 :  0000000158f03f15 0000000137016b42 [ .?.X....Bk.7.... ]
0x137016B40 :  000000037fdd0978 0000000280213528 [ x.......(5!..... ]
0x137016B50 :  0000000000000000 0000000000010000 [ ................ ]
0x137016B60 :  0000000000000010 0000000000000005 [ ................ ]
0x137016B70 :  0000000137016baa 000000037fdd0930 [ .k.7....0....... ]
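
The repeating pattern in the dump can be measured mechanically. The sketch below parses a few rows copied from the dump above, finds every slot holding the repeating word 0x158f03f15, and reports the spacing between them. Treating that word as a per-element marker is an assumption made for illustration; the helper name `parse_words` is hypothetical.

```python
import re

# A few rows copied from the raw dump above (address : two 64-bit words).
DUMP = """\
0x137016900 :  0000000158f03f15 0000000137016912
0x137016910 :  000000037fdd0ce8 0000000280213528
0x137016970 :  0000000158f03f15 0000000137016982
0x137016980 :  000000037fdd0c38 0000000280213528
0x1370169E0 :  0000000158f03f15 00000001370169f2
0x137016A50 :  0000000158f03f15 0000000137016a62
0x137016AC0 :  0000000158f03f15 0000000137016ad2
0x137016B30 :  0000000158f03f15 0000000137016b42
"""

MARKER = 0x158F03F15  # assumption: this word appears once per element

ROW = re.compile(r"(0x[0-9A-Fa-f]+)\s*:\s*([0-9a-f]{16})\s+([0-9a-f]{16})")

def parse_words(dump):
    """Map each 8-byte slot address to the word stored there."""
    words = {}
    for line in dump.splitlines():
        m = ROW.match(line)
        if m:
            base = int(m.group(1), 16)
            words[base] = int(m.group(2), 16)
            words[base + 8] = int(m.group(3), 16)
    return words

words = parse_words(DUMP)
marker_addrs = sorted(a for a, w in words.items() if w == MARKER)

# Distance between consecutive markers, i.e. the apparent element size.
strides = {b - a for a, b in zip(marker_addrs, marker_addrs[1:])}
print([hex(s) for s in strides])   # ['0x70']

# The word after each marker, relative to the marker's own address.
deltas = {words[a + 8] - a for a in marker_addrs}
print([hex(d) for d in deltas])    # ['0x12']
```

A constant delta of 0x12 means each of those words is the address of the following 16-byte row plus 2, which agrees with the "next pointer + bit 0x2" annotation in the dump.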

dmitripivkine Sep 08 '22 14:09

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/133 SharedClasses.SCM23.MultiCL_0 -Xjit -Xgcpolicy:gencon -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/133/system_test_output.tar.gz

MCL3 10:55:56 >> Total classes loaded = 20001
MCL3 stderr Unhandled exception
MCL3 stderr Type=Segmentation error vmState=0x00020019
MCL3 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL3 stderr Handler1=0000000100B03BE0 Handler2=0000000100CD2E14 InaccessibleAddress=F4E8000000013849
MCL3 stderr x0=00000001380192AA x1=0000000100C2B720 x2=0000000000017302 x3=0000000000000000
MCL3 stderr x4=000000016F8E5240 x5=00000000000000A0 x6=0000000000000000 x7=0000000000000000
MCL3 stderr x8=F4E8000000013801 x9=E06A000000013801 x10=F43D000000013801 x11=0000000000000000
MCL3 stderr x12=F438000000013801 x13=0000120000001200 x14=0000000000001200 x15=0000000000000001
MCL3 stderr x16=0000000188CE9278 x17=00000001F77A1AB0 x18=000000037FC4FF10 x19=0000000138019068
MCL3 stderr x20=E06A000000013801 x21=0000000138014778 x22=0000000138021EA0 x23=000000012E018420
MCL3 stderr x24=0000000000000157 x25=000000013F218060 x26=0000000105EF515C x27=000000016F8E56C0
MCL3 stderr x28=000000012F8053E0 x29(FP)=000000016F8E5280 x30(LR)=0000000105D0B510 x31(SP)=000000016F8E5270
MCL3 stderr PC=0000000105D0B5E0 SP=000000016F8E5270
MCL3 stderr v0 00000000000000ff (f: 255.000000, d: 1.259867e-321)
MCL3 stderr v1 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL3 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL3 stderr v3 bfb93358dd593a69 (f: 3713612288.000000, d: -9.843974e-02)
MCL3 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL3 stderr v7 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL3 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
MCL3 stderr v17 3fd54c81d0d55555 (f: 3503641856.000000, d: 3.327946e-01)
MCL3 stderr v18 3f61a22c58eb7472 (f: 1491825792.000000, d: 2.152526e-03)
MCL3 stderr v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
MCL3 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL3 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL3 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL3 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL3 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL3 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9gc29.dylib
MCL3 stderr Module_base_address=0000000105CE0000 Symbol=_ZN23GC_OMRVMThreadInterface16flushCachesForGCEP18MM_EnvironmentBase
MCL3 stderr Symbol_address=0000000105D0B5C8
MCL3 stderr Target=2_90_20220908_141 (Mac OS X 11.5.2)
MCL3 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL3 stderr ----------- Stack Backtrace -----------
MCL3 stderr ---------------------------------------

pshipton Sep 08 '22 23:09

It is in GC_OMRVMThreadInterface::flushCachesForGC(MM_EnvironmentBase *env). The data pointed to by env, or the env pointer itself, seems to be broken. The !mm_environmentbase command fails as follows:

> !mm_environmentbase 0x138019068 (x19 value in the register dump)
Problem running command:
Cannot invoke "com.ibm.j9ddr.StructureReader$StructureDescriptor.getSuperName()" because "desc" is null

> hexdump 0x138019068

138019068: 00000000 00000000 2dec6213 01000000  |........-.b.....|
138019078: 82900138 01000000 58f8c47f 03000000  |...8....X.......|
138019088: f8c62180 02000000 00000000 00000000  |..!.............|
138019098: 00000100 00000000 10000000 00000000  |................|
1380190a8: 05000000 00000000 ea900138 01000000  |...........8....|
1380190b8: 10f8c47f 03000000 00000000 00000000  |................|
1380190c8: 00000000 00000000 00000800 00000000  |................|
1380190d8: 00000000 00000000 2dec6213 01000000  |........-.b.....|
1380190e8: f2900138 01000000 a8f7c47f 03000000  |...8............|
1380190f8: f8c62180 02000000 00000000 00000000  |..!.............|
138019108: 00000100 00000000 10000000 00000000  |................|
138019118: 05000000 00000000 5a910138 01000000  |........Z..8....|
138019128: 60f7c47f 03000000 00000000 00000000  |`...............|
138019138: 00000000 00000000 00000800 00000000  |................|
138019148: 00000000 00000000 2dec6213 01000000  |........-.b.....|
138019158: 62910138 01000000 f8f6c47f 03000000  |b..8............|

000000000002b5c8 <GC_OMRVMThreadInterface::flushCachesForGC(MM_EnvironmentBase*)>:
   ...
   2b5d4: f3 03 00 aa   mov     x19, x0 // x19=0x138019068 (env)
   2b5d8: 00 04 41 f9   ldr     x0, [x0, #520] // x0=0x1380192AA -- not a correct pointer
   2b5dc: 08 00 40 f9   ldr     x8, [x0] // x8=0xF4E8000000013801 -- not a correct pointer, either
   2b5e0: 08 25 40 f9   ldr     x8, [x8, #72] // PC points here; x8+72 = InaccessibleAddress
   2b5e4: e1 03 13 aa   mov     x1, x19
   2b5e8: 00 01 3f d6   blr     x8
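
The hexdump above lists bytes in memory order, so each 8-byte group has to be read as a little-endian word before it can be compared with pointer values. The sketch below decodes three rows copied from the hexdump and repeats the fault-address arithmetic from the disassembly comments. The helper name `decode_row` is hypothetical.

```python
import struct

# Three rows copied from the hexdump above; bytes are in memory order.
ROWS = [
    "138019068: 00000000 00000000 2dec6213 01000000",
    "138019078: 82900138 01000000 58f8c47f 03000000",
    "138019088: f8c62180 02000000 00000000 00000000",
]

def decode_row(row):
    """Return [(address, 64-bit little-endian word), ...] for one row."""
    addr_text, hex_text = row.split(":")
    addr = int(addr_text, 16)
    raw = bytes.fromhex(hex_text.replace(" ", ""))
    values = struct.unpack("<%dQ" % (len(raw) // 8), raw)
    return [(addr + 8 * i, v) for i, v in enumerate(values)]

for row in ROWS:
    for addr, value in decode_row(row):
        print(hex(addr), hex(value))

# ldr x8, [x8, #72] faults at x8 + 72.
x8 = 0xF4E8000000013801
fault_address = x8 + 72
print(hex(fault_address))   # 0xf4e8000000013849, the InaccessibleAddress

# Offset of the faulting instruction inside libj9gc29.dylib.
module_offset = 0x105D0B5E0 - 0x105CE0000
print(hex(module_offset))   # 0x2b5e0
```

Decoded this way, the word at 0x138019078 is 0x138019082, an address just past its own slot rather than a plausible field of an environment structure.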

knn-k Sep 09 '22 00:09

The GC environment address was taken from !j9vmthread 0x138013D00:

> !j9vmthread 0x138013D00 | grep gcExtensions
	0x628: void* gcExtensions = !j9x 0x0000000138019068

The data structure for !j9vmthread 0x138013D00 itself looks intact.

The memory at 0x138019068 is corrupted by some kind of list-like structure with element size 0x70.

dmitripivkine Sep 09 '22 18:09

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Release_testList_0/14/ SharedClasses.SCM23.MultiCL_0 -Xjit -Xgcpolicy:gencon -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Release_testList_0/14/system_test_output.tar.gz

MCL4 01:18:14 >> Loaded 20000 classes...
MCL4 01:18:14 >> Total classes loaded = 20001
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x000501ff
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=0000000100A317DC Handler2=0000000100D16E14 InaccessibleAddress=0000000000000000
MCL4 stderr x0=00000001009D9E28 x1=0000000000000018 x2=000000016FB39DF0 x3=000000014382E500
MCL4 stderr x4=000000016FB39F58 x5=543A254E29E1F295 x6=0000000000000001 x7=0000000000000000
MCL4 stderr x8=0000000000000000 x9=000000007FBAAAF9 x10=000000007FBAAAF8 x11=000000010567EE48
MCL4 stderr x12=0000000000000005 x13=0000000000000046 x14=0000000000000000 x15=0000000000000002
MCL4 stderr x16=0000000100D938F8 x17=0000000131E07000 x18=0000000000000000 x19=000000014382E500
MCL4 stderr x20=0000000105696438 x21=000000000000000B x22=000000014382B850 x23=0000000100D5E000
MCL4 stderr x24=0000000000000018 x25=000000016FB417A0 x26=000000010568BC26 x27=0000000000000018
MCL4 stderr x28=0000000100D16E14 x29(FP)=000000016FB39EB0 x30(LR)=0000000100D16F38 x31(SP)=000000016FB39DB0
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x00000000
MCL4 stderr PC=0000000105053F0C SP=000000016FB39DB0
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=0000000100A317DC Handler2=0000000100D16E14 InaccessibleAddress=00000000000000A8
MCL4 stderr v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v1 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL4 stderr v2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v4 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
MCL4 stderr v5 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
MCL4 stderr v6 000003b8000003f0 (f: 1008.000000, d: 2.020140e-311)
MCL4 stderr v7 00000370000003b8 (f: 952.000000, d: 1.867356e-311)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v17 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v18 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v19 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr x0=0000000000000001 x1=0000000000000000 x2=00000001430B0940 x3=0000000000000190
MCL4 stderr x4=00000001430B2040 x5=0000000000000008 x6=0000000000000000 x7=0000000000000000
MCL4 stderr x8=0000000131E5A990 x9=0000000132877C80 x10=0000000000000010 x11=0000000000000000
MCL4 stderr x12=0000000000000000 x13=000000000000006C x14=0000000000000001 x15=00000000000007A9
MCL4 stderr x16=0000000000000AC9 x17=0000000000000041 x18=0000000000000000 x19=0000000128981DF0
MCL4 stderr x20=0000000132877C20 x21=0000000131E5A990 x22=0000000143837300 x23=0000000000000001
MCL4 stderr x24=0000000131E5A998 x25=0000000124670F08 x26=00000001251A87D8 x27=000000010582100C
MCL4 stderr x28=0000000000000000 x29(FP)=000000016FC4EF40 x30(LR)=000000010504E2DC x31(SP)=000000016FC4EB90
MCL4 stderr PC=000000010504E348 SP=000000016FC4EB90
MCL4 stderr v0 1000000010000000 (f: 268435456.000000, d: 1.288230e-231)
MCL4 stderr v1 ffffffefffffffef (f: 4294967296.000000, d: nan)
MCL4 stderr v2 41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
MCL4 stderr v3 2f676e616c2f6176 (f: 1815044480.000000, d: 2.470161e-80)
MCL4 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v6 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v7 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v18 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v19 0000000000000001 (f: 1.000000, d: 4.940656e-324)
MCL4 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr JVMDUMP039I Processing dump event "abort", detail "" at 2022/09/11 01:18:15 - please wait.
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Release_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL4 stderr Module_base_address=0000000105000000 Symbol=_ZL16jitSignalHandlerP13J9PortLibraryjPvS1_
MCL4 stderr Symbol_address=0000000105053ED0
MCL4 stderr 
MCL4 stderr Method_being_compiled=net/openj9/sc/classes/TestClass_9675.makeString(I)Ljava/lang/String;
MCL4 stderr Target=2_90_20220910_19 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------
MCL4 stderr JVMDUMP039I Processing dump event "gpf", detail "" at 2022/09/11 01:18:15 - please wait.
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Release_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL4 stderr Module_base_address=0000000105000000 Symbol=_ZN2TR24CompilationInfoPerThread7requeueEv
MCL4 stderr Symbol_address=000000010504E29C
MCL4 stderr Target=2_90_20220910_19 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------
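
The output above interleaves two crash reports. A rough sketch (Python) of how the `KEY=HEX` fields can be pulled out and the faulting PC expressed as an offset from the module base and from the reported symbol. The helper names (`parse_fields`, `pc_offsets`) are hypothetical, and pairing each PC with the symbol address just below it is an inference from the log, not something the log states.

```python
import re

# Matches KEY=HEX pairs such as "PC=0000000105053F0C"; keys may contain
# parentheses (e.g. "x30(LR)"). Non-hex values like "Symbol=_ZL16..." are skipped.
FIELD_RE = re.compile(r"([A-Za-z_][\w()]*)=([0-9A-Fa-f]+)\b")

def parse_fields(lines):
    """Collect KEY=HEX fields from crash-report lines into a dict of ints."""
    fields = {}
    for line in lines:
        for key, value in FIELD_RE.findall(line):
            fields[key] = int(value, 16)
    return fields

def pc_offsets(lines):
    """Return (PC - module base, PC - symbol address) for one report."""
    f = parse_fields(lines)
    return f["PC"] - f["Module_base_address"], f["PC"] - f["Symbol_address"]

# Lines copied from the log above, grouped by dump event (assumed pairing).
abort_report = [
    "PC=0000000105053F0C SP=000000016FB39DB0",
    "Module_base_address=0000000105000000 Symbol=_ZL16jitSignalHandlerP13J9PortLibraryjPvS1_",
    "Symbol_address=0000000105053ED0",
]
gpf_report = [
    "PC=000000010504E348 SP=000000016FC4EB90",
    "Module_base_address=0000000105000000 Symbol=_ZN2TR24CompilationInfoPerThread7requeueEv",
    "Symbol_address=000000010504E29C",
]

for name, report in (("abort", abort_report), ("gpf", gpf_report)):
    module_off, symbol_off = pc_offsets(report)
    print(f"{name}: module+{module_off:#x}, symbol+{symbol_off:#x}")
# abort: module+0x53f0c, symbol+0x3c
# gpf: module+0x4e348, symbol+0xac
```

The module-relative offsets are the values one would look up in a disassembly of libj9jit29.dylib.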

pshipton avatar Sep 12 '22 14:09 pshipton

Tagging as a blocker for amac being removed from EA.

pshipton avatar Sep 13 '22 14:09 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk19_j9_extended.system_aarch64_mac_Nightly_testList_1/5 SharedClasses.SCM23.SingleCL_0

No diagnostic files - https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk19_j9_extended.system_aarch64_mac_Nightly_testList_1/5/system_test_output.tar.gz

SCL3 stderr Unhandled exception
SCL3 stderr Type=Segmentation error vmState=0x00000000
SCL3 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
SCL3 stderr Handler1=00000001011AA124 Handler2=000000010137AE1C InaccessibleAddress=000000009EA19B78
SCL3 stderr x0=000000009EA19B78 x1=000000010669DDF3 x2=000000016F5E2A40 x3=0000000000000001
SCL3 stderr x4=000000016F5E2688 x5=000000016F5E2678 x6=000000016F5E2670 x7=0000000000000000
SCL3 stderr x8=0000000000000000 x9=000000009EA19B78 x10=0000000000000070 x11=000000037FDAA00F
SCL3 stderr x12=0000000000000001 x13=0000040000000400 x14=0000000000000001 x15=0000000000000000
SCL3 stderr x16=0000000188D6BC30 x17=00000001F77A6FF8 x18=00000001014BD181 x19=000000012F005C50
SCL3 stderr x20=000000009EA19B78 x21=000000010669DDF3 x22=0000000101155E28 x23=0000000000000001
SCL3 stderr x24=000000011F00C100 x25=00000001013AD944 x26=000000011E83A758 x27=00000001013C2000
SCL3 stderr x28=0000000101155E28 x29(FP)=000000016F5E2AD0 x30(LR)=00000001011C5F74 x31(SP)=000000016F5E2A40
SCL3 stderr PC=0000000188D6BC38 SP=000000016F5E2A40
SCL3 stderr v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v1 000000000000000a (f: 10.000000, d: 4.940656e-323)
SCL3 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
SCL3 stderr v3 000000011f00c100 (f: 520143104.000000, d: 2.378981e-314)
SCL3 stderr v4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v16 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v17 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v18 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v19 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL3 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL3 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL3 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
SCL3 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
SCL3 stderr Module=/usr/lib/system/libsystem_platform.dylib
SCL3 stderr Module_base_address=0000000188D6B000 Symbol=_platform_strcmp
SCL3 stderr Symbol_address=0000000188D6BC30
SCL3 stderr Target=2_90_20220913_71 (Mac OS X 11.5.2)
SCL3 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
SCL3 stderr ----------- Stack Backtrace -----------
SCL3 stderr ---------------------------------------
SCL3 stderr JVMDUMP039I Processing dump event "gpf", detail "" at 2363/01/09 16:09:34 - please wait.
SCL3 stderr 
SCL3 stderr Unhandled exception in signal handler. Protected function: generateDiagnosticFiles (0x0)
SCL3 stderr 
SCL3 stderr 
SCL3 stderr Unhandled exception in signal handler. Protected function: reportThreadCrash (0x0)
SCL3 stderr 
STF 13:15:58.479 - **FAILED** Process SCL3 ended with exit code (255) and not the expected exit code/s (0)

pshipton avatar Sep 13 '22 17:09 pshipton

Created https://github.com/eclipse-openj9/openj9/issues/15879 for the "Unhandled exception in signal handler. Protected function: generateDiagnosticFiles (0x0)" failure.

pshipton avatar Sep 13 '22 17:09 pshipton

The Java processes in the crashes above were about to terminate after printing the message "Total classes loaded = 20001". Let's see whether the crash still occurs after PR #15907 is applied.

knn-k avatar Sep 19 '22 10:09 knn-k

Related? https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_extended.system_aarch64_mac_Nightly_testList_1/140 SharedClasses.SCM01.MultiThread_0

No additional diagnostics. https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_extended.system_aarch64_mac_Nightly_testList_1/140/system_test_output.tar.gz

MT3 12:19:13 >> Loaded 20000 classes...
MT3 stderr 
MT3 stderr 
MT3 stderr *** Invalid JIT return address 000000037D953A48 in 000000016D59F7E8
MT3 stderr 
STF 12:19:13.710 - **FAILED** Process MT3 ended with exit code (255) and not the expected exit code/s (0)

pshipton avatar Sep 19 '22 14:09 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/139 SharedClasses.SCM23.MultiCL_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/139/system_test_output.tar.gz

MCL4 10:13:58 >> Total classes loaded = 20001
MCL4 stderr Unhandled exception
MCL4 stderr Type=Segmentation error vmState=0x00000000
MCL4 stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
MCL4 stderr Handler1=0000000102A0CA7C Handler2=00000001027DAE1C InaccessibleAddress=0000000000000008
MCL4 stderr x0=0000000000000000 x1=0000000111809500 x2=000000016DC98BB0 x3=0000000000000000
MCL4 stderr x4=00000001027D3BF8 x5=0000000000000000 x6=0000000000000000 x7=0000000000000005
MCL4 stderr x8=0000000000000002 x9=00000000000F4240 x10=0000000000009434 x11=000000B2F3D6CF86
MCL4 stderr x12=00000000016E3600 x13=0000000000096078 x14=00000000092E0FA9 x15=000000000000E9A5
MCL4 stderr x16=00000001A10B6C8C x17=000000020FA41648 x18=00000001028BD181 x19=0000000111809500
MCL4 stderr x20=000000012307A820 x21=0000000111809D98 x22=0000000000000000 x23=00000001228142A8
MCL4 stderr x24=0000000000000004 x25=0000000102775E28 x26=0000000000000000 x27=0005E8C03D0AD3F4
MCL4 stderr x28=0000000111009DE8 x29(FP)=000000016DC98B80 x30(LR)=00000001070638C0 x31(SP)=000000016DC98A10
MCL4 stderr PC=00000001070E2600 SP=000000016DC98A10
MCL4 stderr v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v1 0000ffff0000ffff (f: 65535.000000, d: 1.390650e-309)
MCL4 stderr v2 0706050403020100 (f: 50462976.000000, d: 7.949929e-275)
MCL4 stderr v3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v4 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
MCL4 stderr v5 000003f0000003f0 (f: 1008.000000, d: 2.138972e-311)
MCL4 stderr v6 000003b8000003f0 (f: 1008.000000, d: 2.020140e-311)
MCL4 stderr v7 00000370000003b8 (f: 952.000000, d: 1.867356e-311)
MCL4 stderr v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v16 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v17 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v18 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v19 0000001800000018 (f: 24.000000, d: 5.092790e-313)
MCL4 stderr v20 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v21 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v22 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v23 ffffffffffffffff (f: 4294967296.000000, d: nan)
MCL4 stderr v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
MCL4 stderr Module=/Users/jenkins/workspace/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/openjdkbinary/j2sdk-image/lib/default/libj9jit29.dylib
MCL4 stderr Module_base_address=0000000107000000 Symbol=_ZN2J97Monitor5enterEv
MCL4 stderr Symbol_address=00000001070E2600
MCL4 stderr Target=2_90_20220916_147 (Mac OS X 11.4)
MCL4 stderr CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
MCL4 stderr ----------- Stack Backtrace -----------
MCL4 stderr ---------------------------------------
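
The `Symbol=` values in these reports are mangled C++ names. A minimal sketch (Python) that decodes only the simple `_ZN<len><name>...E` nested-name form, which is enough for `_ZN2J97Monitor5enterEv` and `_ZN2TR24CompilationInfoPerThread7requeueEv`. `demangle_simple` is a hypothetical helper, not a general demangler: anything it does not understand is returned unchanged, and a real tool such as `c++filt` should be preferred for the general case.

```python
import re

def demangle_simple(symbol: str) -> str:
    """Decode _ZN<len><name>...E nested names; return other input unchanged."""
    if not symbol.startswith("_ZN"):
        return symbol
    pos, parts = 3, []
    # Each component is a decimal length followed by that many characters.
    while pos < len(symbol) and symbol[pos].isdigit():
        digits = re.match(r"\d+", symbol[pos:]).group()
        length = int(digits)
        pos += len(digits)
        parts.append(symbol[pos:pos + length])
        pos += length
    if pos >= len(symbol) or symbol[pos] != "E":
        return symbol  # qualifiers, templates, etc. are out of scope here
    args = symbol[pos + 1:]
    # "v" means an empty parameter list; other encodings are not decoded.
    suffix = "()" if args == "v" else "(...)"
    return "::".join(parts) + suffix

print(demangle_simple("_ZN2J97Monitor5enterEv"))
# J9::Monitor::enter()
print(demangle_simple("_ZN2TR24CompilationInfoPerThread7requeueEv"))
# TR::CompilationInfoPerThread::requeue()
```

In the report above, PC equals Symbol_address (00000001070E2600), so the faulting instruction is the first one in `J9::Monitor::enter()`.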

pshipton avatar Sep 19 '22 14:09 pshipton

Another crash, but before the potential fix was merged. https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_extended.system_aarch64_mac_Nightly_testList_0/141 SharedClasses.SCM23.MultiCL_0

pshipton avatar Sep 20 '22 18:09 pshipton

I ran SharedClasses.SCM01.SingleCL_0 80 times in total after PR #15907 was merged. No failures. https://openj9-jenkins.osuosl.org/job/Grinder/1282/ and https://openj9-jenkins.osuosl.org/job/Grinder/1283/

knn-k avatar Sep 21 '22 05:09 knn-k

I also ran SharedClasses.SCM01.MultiThread_0 100 times in total. No failures. https://openj9-jenkins.osuosl.org/job/Grinder/1284/ and https://openj9-jenkins.osuosl.org/job/Grinder/1285/

knn-k avatar Sep 21 '22 06:09 knn-k