openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

JDK19 jdk/internal/vm/Continuation/HumongousStack.java - Error evaluating expression: invalid boolean value: null for expression vm.gc.G1

Open JasonFengJ9 opened this issue 2 years ago • 7 comments

Failure link

From an internal build /job/Test_openjdknext_j9_sanity.openjdk_aarch64_linux_Personal/2/consoleFull(ub18-aarch64-4):

12:36:01  openjdk version "19-internal" 2022-09-20
12:36:01  OpenJDK Runtime Environment (build 19-internal-adhoc.jenkins.BuildJDKnextaarch64linuxPersonal)
12:36:01  Eclipse OpenJ9 VM (build loom_patch-1e7a31c30d4, JRE 19 Linux aarch64-64-Bit Compressed References 20220531_28 (JIT enabled, AOT enabled)
12:36:01  OpenJ9   - 1e7a31c30d4
12:36:01  OMR      - e542d083538
12:36:01  JCL      - a66f9edf71a based on jdk-19+24)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

13:11:27  --------------------------------------------------
13:11:28  TEST: jdk/internal/vm/Continuation/HumongousStack.java
13:11:28  
13:11:28  TEST RESULT: Error. Error evaluating expression: invalid boolean value: `null' for expression `vm.gc.G1'

This is a test issue, the property need to be added in extension repo.

JasonFengJ9 avatar Jun 01 '22 16:06 JasonFengJ9

@llxia can someone add the test properly pls.

pshipton avatar Jun 01 '22 16:06 pshipton

@LongyuZhang could you help with this? Thanks

llxia avatar Jun 01 '22 16:06 llxia

Just to confirm vm.gc.G1 should be false for openj9? In this case, the test HumongousStack will not run, since it requires vm.gc.G1 to be true to run. Also, vm.continuations is also required to be true to run this test.

LongyuZhang avatar Jun 08 '22 16:06 LongyuZhang

Also, vm.continuations is also required to be true to run this test.

I would suggest set vm.gc.G1 to true to enable HumongousStack test, if it does fail, we can exclude it later.

JasonFengJ9 avatar Jun 08 '22 17:06 JasonFengJ9

Following are the test results:

  • set either vm.continuations or vm.gc.G1 to false, this test will not run (grinder1, grinder2)
  • set both values to true, the test has ExceptionInInitializerError (grinder):
14:59:42  STDERR:
14:59:42  java.lang.ExceptionInInitializerError
14:59:42  	at java.base/java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:171)
14:59:42  	at HumongousStack.main(HumongousStack.java:53)
14:59:42  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
14:59:42  	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
14:59:42  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
14:59:42  	at java.base/java.lang.Thread.run(Thread.java:1566)
14:59:42  Caused by: java.lang.UnsupportedOperationException: VM does not support continuations
14:59:42  	at java.base/jdk.internal.vm.ContinuationSupport.ensureSupported(ContinuationSupport.java:49)
14:59:42  	at java.base/jdk.internal.vm.Continuation.<clinit>(Continuation.java:52)
14:59:42  	... 5 more
14:59:42  
14:59:42  JavaTest Message: Test threw exception: java.lang.ExceptionInInitializerError
14:59:42  JavaTest Message: shutting down test

LongyuZhang avatar Jun 08 '22 20:06 LongyuZhang

Caused by: java.lang.UnsupportedOperationException: VM does not support continuations

This is expected, the test in question has been excluded, could have another try when ContinuationSupport is ready.

JasonFengJ9 avatar Jun 08 '22 20:06 JasonFengJ9

As per https://github.com/eclipse-openj9/openj9/issues/15152#issuecomment-1251743982, this passes with --enable-preview -Dvm.continuations=true -Xgcpolicy:optthruput -Xint

fengxue-IS avatar Sep 20 '22 01:09 fengxue-IS

@fengxue-IS has this test been re-enabled?

tajila avatar Oct 18 '22 13:10 tajila

Test failure seen with JIT enabled, on aarch64 and plinux similar to #15939

fengxue-IS avatar Oct 25 '22 14:10 fengxue-IS

@fengxue-IS can this be closed?

tajila avatar Feb 14 '23 13:02 tajila

https://github.com/eclipse-openj9/openj9/issues/15939 has now been resolved - can we verify and close this one?

vij-singh avatar Feb 14 '23 14:02 vij-singh

I am still ale to get an Invalid JIT return address failure in Grinder using the OpenJ9 nightly builds. Currently investigating the corefile produced.

fengxue-IS avatar Feb 14 '23 14:02 fengxue-IS

@0xdaryl Can JIT team help with investigating this issue please, I wasn't able to . Heres aGrinder run with nightly build

fengxue-IS avatar Feb 17 '23 20:02 fengxue-IS

@a7ehuo : could you take a look at this from a JIT perspective please?

0xdaryl avatar Feb 21 '23 13:02 0xdaryl

I'm trying to reproduce this locally on Power with JDK19. I don't see the Invalid JIT return address failure yet. @fengxue-IS Are my settings for the test below are correct? Thanks!

javac --add-exports java.base/jdk.internal.vm=ALL-UNNAMED HumongousStack.java
java --add-exports java.base/jdk.internal.vm=ALL-UNNAMED --enable-preview -Dvm.continuations=true -Xgcpolicy:optthruput -Xjit:count=1,disableAsyncCompilation HumongousStack 10000
start depth: 10000
-- Yielding!
-- Resumed!
done: 0
done; x:-2014557164

a7ehuo avatar Feb 21 '23 17:02 a7ehuo

-Xgcpolicy:optthruput -Xjit:count=1,disableAsyncCompilation option isn't used in the Grinder test through Jtreg framework. Also this is an intermittent failure, you may need to running multiple time (or launch via test framework) to reproduce

fengxue-IS avatar Feb 21 '23 17:02 fengxue-IS

-Xgcpolicy:optthruput -Xjit:count=1,disableAsyncCompilation option isn't used in the Grinder test through Jtreg framework

Thanks for the clarification @fengxue-IS. -Xjit:count=1,disableAsyncCompilation are added to force more JIT compilation. I'll remove -Xgcpolicy:optthruput -Xjit:count=1,disableAsyncCompilation and try more.

a7ehuo avatar Feb 21 '23 18:02 a7ehuo

I'm able to repro the issue only by running the test in Grinder (31185) on Power [1], but I'm not able to repro it locally (locally I ran the test 800 times). The invalid JIT return address is detected on the path of walkContinuationStackFrames [1].

HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; is recompiled due to RecompDueToMegamorphicCallProfile (M) [2]. The invalid JIT return address 0x00007FFF912FA9E0 is within the code cache of the old compiled method (00007FFF912FA978-00007FFF912FB05C). HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; is a static method.

[1]

[2023-02-22T23:29:28.105Z] STDOUT:
[2023-02-22T23:29:28.105Z] start depth: 10000
[2023-02-22T23:29:28.105Z] -- Yielding!
[2023-02-22T23:29:28.105Z] STDERR:
[2023-02-22T23:29:28.105Z] 
[2023-02-22T23:29:28.105Z] 
[2023-02-22T23:29:28.105Z] *** Invalid JIT return address 00007FFF912FA9E0 in 00007FFF8AA7B7A8
[2023-02-22T23:29:28.105Z] 
[2023-02-22T23:29:28.105Z] 23:29:19.447 0x23c200    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))
...
#16 0x00007fffb356b27c in invalidJITReturnAddress (walkState=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:1632
#17 0x00007fffb29bd1a4 in jitWalkStackFrames (walkState=0x7fff8aa7b7a8)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/codert_vm/jswalk.c:283
#18 0x00007fffb356a0a4 in walkStackFrames (currentThread=0x23c200, walkState=0x7fff8aa7b7a8)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:384
#19 0x00007fffb3585fec in walkContinuationStackFrames (currentThread=0x23c200, continuation=<optimized out>, 
    walkState=0x7fff8aa7b7a8)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/ContinuationHelpers.cpp:289
#20 0x00007fffb1a66d94 in VM_VMHelpersCompressed::walkContinuationStackFramesWrapper (isGlobalGC=false, 
--Type <RET> for more, q to quit, c to continue without paging--
    isConcurrentGC=<optimized out>, walkState=0x7fff8aa7b7a8, continuationObject=0x80067c38, vmThread=0x23c200)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/oti/VMHelpers.hpp:2167
#21 GC_VMThreadStackSlotIterator::scanSlots (vmThread=0x23c200, continuationObjectPtr=0x80067c38, userData=<optimized out>, 
    oSlotIterator=<optimized out>, includeStackFrameClassReferences=<optimized out>, trackVisibleFrameDepth=<optimized out>, 
    isConcurrentGC=<optimized out>, isGlobalGC=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/gc_structs/VMThreadStackSlotIterator.cpp:147
#22 0x00007fffb1c2d038 in MM_ScavengerDelegate::scanContinuationNativeSlots (this=0x7fffac06efb0, env=0x7fff3c004988, 
    objectPtr=0x80067c38, reason=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/gc_glue_java/ScavengerDelegate.cpp:358
#23 0x00007fffb1c2ea30 in MM_ScavengerDelegate::getObjectScanner (this=0x7fffac06efb0, env=0x7fff3c004988, objectPtr=0x80067c38, 
    allocSpace=0x7fffb1641108, flags=2, reason=<optimized out>, shouldRemember=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/gc_glue_java/ScavengerDelegate.cpp:446
#24 0x00007fffb1bf373c in MM_Scavenger::getObjectScanner (shouldRemember=<optimized out>, reason=SCAN_REASON_SCAVENGE, flags=2, 
    objectScannerState=<optimized out>, objectptr=<optimized out>, env=0x7fff3c004988, this=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/Scavenger.cpp:1877
#25 MM_Scavenger::incrementalScavengeObjectSlots (scanCache=0x7fffb16410d0, objectPtr=0x80067c38, env=<optimized out>, 
    this=<optimized out>) at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/Scavenger.cpp:2111
#26 MM_Scavenger::incrementalScanCacheBySlot (this=0x7fffac06ef50, env=0x7fff3c004988, scanCache=0x7fffb16410d0)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/Scavenger.cpp:2466
#27 0x00007fffb1bf4054 in MM_Scavenger::completeScan (this=0x7fffac06ef50, env=0x7fff3c004988)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/Scavenger.cpp:2542
#28 0x00007fffb1bf4554 in MM_Scavenger::workThreadGarbageCollect (this=0x7fffac06ef50, env=0x7fff3c004988)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/Scavenger.cpp:2586
#29 0x00007fffb1c6d64c in MM_ParallelScavengeTask::run (this=<optimized out>, envBase=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/standard/ParallelScavengeTask.cpp:38
#30 0x00007fffb1b8f488 in MM_ParallelDispatcher::workerEntryPoint (this=0x7fffac06ed20, env=0x7fff3c004988)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/ParallelDispatcher.cpp:187
#31 0x00007fffb1b8e7e0 in dispatcher_thread_proc2 (portLib=<optimized out>, info=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/ParallelDispatcher.cpp:93
#32 0x00007fffb343ab74 in omrsig_protect (portLibrary=0x7fffb3bb1f30 <j9portLibrary>, 
    fn=0x7fffb1b8e680 <dispatcher_thread_proc2(OMRPortLibrary*, void*)>, fn_arg=0x7fffb3c3c740 <main_runJavaMain+2112>, 
    handler=0x7fffb3510ac0 <structuredSignalHandlerVM>, handler_arg=0x7fffac037250, flags=<optimized out>, result=0x7fff8aa7c488)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/port/unix/omrsignal.c:425
#33 0x00007fffb1b8db50 in dispatcher_thread_proc (info=<optimized out>)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/gc/base/ParallelDispatcher.hpp:198
#34 0x00007fffb33cd820 in thread_wrapper (arg=0x7fffac2a4e70)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/omr/thread/common/omrthread.c:1724
#35 0x00007fffb3ed9718 in ?? ()
#36 0x00007fffb3dbb498 in ?? ()
(gdb) p *walkState
$2 = {previous = 0xb, walkThread = 0x7fff8aa7ac88, javaVM = 0x7fffac037250, flags = 610271234, bp = 0x3edac8, unwindSP = 0x3edad0, 
  pc = 0x7fff912fa9e0 "", 
  nextPC = 0x7fffb36ef4d6 <VM_BytecodeInterpreterCompressed::nativeReturnBytecodePC(unsigned long*&, unsigned char*&, J9ROMMethod*)::returnFromNativeBytecodes+4> "\270", sp = 0x3edaa0, arg0EA = 0x3edad8, literals = 0x0, walkSP = 0x38b298, argCount = 2, 
  constantPool = 0x28e970, method = 0x28ef48, jitInfo = 0x0, frameFlags = 0, resolveFrameFlags = 0, skipCount = 140735519635752, 
  maxFrames = 140736175413528, 
  userData1 = 0x7fffb1c2cf00 <stackSlotIteratorForScavenge(J9JavaVM*, J9Object**, void*, J9StackWalkState*, void const*)>, 
  userData2 = 0x7fffac037250, userData3 = 0x7fff8aa7bac0, userData4 = 0x7fffb1c4671c
     <MM_MemorySubSpaceGeneric::collectorAllocateTLH(MM_EnvironmentBase*, MM_Collector*, MM_AllocateDescription*, unsigned long, void*&, void*&)+124>, framesWalked = 8404, 
  frameWalkFunction = 0x7fffb1a66b80 <vmThreadStackFrameIterator(J9VMThread*, J9StackWalkState*)>, 
  objectSlotWalkFunction = 0x7fffb1a66b20 <vmThreadStackDoOSlotIterator(J9VMThread*, J9StackWalkState*, j9object_t*, void const*)>, returnAddressWalkFunction = 0x9900, cache = 0x0, restartPoint = 0x0, restartException = 0x7fffac06ef50, inlinerMap = 0x0, 
  inlineDepth = 0, cacheCursor = 0x7fff8aa7b9e0, decompilationRecord = 0x0, registerEAs = {jit_r0 = 0x0, jit_r1 = 0x0, 
    jit_r2 = 0x0, jit_r3 = 0x0, jit_r4 = 0x0, jit_r5 = 0x0, jit_r6 = 0x0, jit_r7 = 0x0, jit_r8 = 0x0, jit_r9 = 0x0, jit_r10 = 0x0, 
    jit_r11 = 0x0, jit_r12 = 0x0, jit_r13 = 0x0, jit_r14 = 0x0, jit_r15 = 0x0, jit_r16 = 0x7fff40028a20, jit_r17 = 0x38b2d8, 
    jit_r18 = 0x38b2e0, jit_r19 = 0x38b2e8, jit_r20 = 0x38b2f0, jit_r21 = 0x38b2f8, jit_r22 = 0x38b300, jit_r23 = 0x38b308, 
    jit_r24 = 0x38b310, jit_r25 = 0x38b318, jit_r26 = 0x38b320, jit_r27 = 0x38b328, jit_r28 = 0x38b330, jit_r29 = 0x38b338, 
    jit_r30 = 0x3edab8, jit_r31 = 0x3edac0}, walkedEntryLocalStorage = 0x7fff8aa7ac28, i2jState = 0x7fff8aa7ac38, 
  decompilationStack = 0x0, pcAddress = 0x3edac8, outgoingArgCount = 2, 
  objectSlotBitVector = 0x1000 <error: Cannot access memory at address 0x1000>, elsBitVector = 140736175501696, 
  savedObjectSlotWalkFunction = 0x7fffb1d85600, bytecodePCOffset = -1, 
  dropToCurrentFrame = 0x7fffb29bbe40 <jitDropToCurrentFrame>, j2iFrame = 0x0, previousFrameFlags = 1073741824, slotIndex = 1, 
  slotType = 1, currentThread = 0x23c200, linearSlotWalker = 0x0, inlinedCallSite = 0x0, stackMap = 0x0, inlineMap = 0x0, 
  loopBreaker = 0}

(kca) whatis 0x28ef48 // <--- walkState->method
J9Method - {HumongousStack.deep}

[2]

vlog

+ (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 00007FFF912FA978-00007FFF912FB05C OrdinaryMethod - Q_SZ=0 Q_SZI=0 QW=12 j9m=000000000028EF48 bcsz=98 sync GCR compThreadID=0 CpuLoad=0%(0%avg) JvmCpu=74%
+ (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 00007FFF912FB0F8-00007FFF912FB4AC OrdinaryMethod M Q_SZ=0 Q_SZI=0 QW=12 j9m=000000000028EF48 bcsz=98 sync compThreadID=0 CpuLoad=0%(0%avg) JvmCpu=74%
trace log 

23:29:18.437114581  0x0000000000097d00 j9jit.2             Event        (warm) Compiling HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String;
23:29:18.722025011  0x0000000000097d00 j9jit.44            Event       + (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 0x7fff912fa978-0x7fff912fb05c/0x0-0x7fff912fb05c time=284933us 0x28ef48 0x7fff8a6e8a78 RR=- Q_SZ=0 bcsz=98 mem=16384KB  sync   GCR  JSR292
...
23:29:18.723183009  0x0000000000097d00 j9jit.2             Event        (warm) Compiling HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String;
23:29:18.840135775  0x0000000000097d00 j9jit.44            Event       + (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 0x7fff912fb0f8-0x7fff912fb4ac/0x0-0x7fff912fb4ac time=116977us 0x28ef48 0x7fff8a6e8db8 RR=M Q_SZ=0 bcsz=98 mem=16384KB  sync     JSR292

a7ehuo avatar Feb 23 '23 02:02 a7ehuo

To elaborate more on what I found on the above, walkState->method is HumongousStack.deep. HumongousStack.deep has been recompiled with the new code cache 00007FFF912FB0F8-00007FFF912FB4AC. The invalid JIT return address 0x00007FFF912FA9E0 is within the old code cache of the first compilation (00007FFF912FA978-00007FFF912FB05C). Somehow walkState->pc isn't updated to the new code cache.

[2023-02-22T23:29:28.105Z] *** Invalid JIT return address 00007FFF912FA9E0 in 00007FFF8AA7B7A8
23:29:18.437114581  0x0000000000097d00 j9jit.2             Event        (warm) Compiling HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String;
23:29:18.722025011  0x0000000000097d00 j9jit.44            Event       + (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 0x7fff912fa978-0x7fff912fb05c/0x0-0x7fff912fb05c time=284933us 0x28ef48 0x7fff8a6e8a78 RR=- Q_SZ=0 bcsz=98 mem=16384KB  sync   GCR  JSR292
...
23:29:18.723183009  0x0000000000097d00 j9jit.2             Event        (warm) Compiling HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String;
23:29:18.840135775  0x0000000000097d00 j9jit.44            Event       + (warm) HumongousStack.deep(ILjava/lang/String;)Ljava/lang/String; @ 0x7fff912fb0f8-0x7fff912fb4ac/0x0-0x7fff912fb4ac time=116977us 0x28ef48 0x7fff8a6e8db8 RR=M Q_SZ=0 bcsz=98 mem=16384KB  sync     JSR292
...
23:29:19.229080997 *0x0000000000239d00 j9vm.682            Entry      >WalkStackFrames - walkThread=0x7fff8ab0ac88 flags=0x4600002 sp=0x38b298 a0=0x38b2b8 pc=0x3 literals=0x0 els=0x7fff8ab0ac28 j2i=0x0
23:29:19.233432989  0x0000000000239d00 j9vm.683            Exit       <WalkStackFrames - walkThread=0x7fff8ab0ac88, rc=0
...
23:29:19.443432569 *0x000000000023c200 j9vm.682            Entry      >WalkStackFrames - walkThread=0x7fff8aa7ac88 flags=0x4600002 sp=0x38b298 a0=0x38b2b8 pc=0x3 literals=0x0 els=0x7fff8aa7ac28 j2i=0x0
23:29:19.447062561  0x000000000023c200 j9vm.249            Assert     * ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:1632: ((0 ))

I also noticed that stack walk with sp=0x38b298 completes successfully at 23:29:19.233432989 but failed later at 23:29:19.443432569 with this invalid JIT return address. Both stack walks happen after the second compilation of HumongousStack.deep. sp=0x38b298 points to method jdk/internal/vm/Continuation.yieldImpl.

(kca) whatis (0x38b298)
0x38b298: 0x000000000010eac0 J9Method - {jdk/internal/vm/Continuation.yieldImpl}

This issue looks similar to #15939, but it happens on the path of walkContinuationStackFrames during GC instead of JIT helper calls such as jitCheckAsyncMessages from #15939. The stackThread->pc and walkState->walkThread->pc are 0x3 which if I understand correctly, it is J9SF_FRAME_TYPE_NATIVE_METHOD. walkState->frameFlags is 0.

#19 0x00007fffb3585fec in walkContinuationStackFrames (currentThread=0x23c200, continuation=<optimized out>, 
    walkState=0x7fff8aa7b7a8)
    at /home/jenkins/workspace/Build_JDK19_ppc64le_linux_Nightly/openj9/runtime/vm/ContinuationHelpers.cpp:289
289			rc = currentThread->javaVM->walkStackFrames(currentThread, walkState);

(gdb) p stackThread->pc
$13 = (U_8 *) 0x3 <error: Cannot access memory at address 0x3>

(gdb) p walkState->walkThread->pc
$14 = (U_8 *) 0x3 <error: Cannot access memory at address 0x3>

(gdb) p walkState->pc
$15 = (U_8 *) 0x7fff912fa9e0 ""
(gdb) p /x walkState->flags
$73 = 0x24600002 < == J9_STACKWALK_DO_NOT_SNIFF_AND_WHACK | J9_STACKWALK_ITERATE_FRAMES | J9_STACKWALK_ITERATE_O_SLOTS

(gdb) p /x walkState->frameFlags
$75 = 0x0

@fengxue-IS @tajila With what's found so far, I'm wondering if the fix for #15939 might need to be expanded to cover more scenarios, or if there is anything I could check to help narrow down why walkState->pc isn't updated to the new code cache or why the stack frame still contains the address from the old code cache. So far I'm not sure how JIT affects this scenarios as the method is recompiled before walkContinuationStackFrames.

a7ehuo avatar Feb 24 '23 01:02 a7ehuo

Forgot to mention one more thing HumongousStack.deep (walkState->method) is not on any of the active thread stacks.

> !threads
	!stack 0x00095700	!j9vmthread 0x00095700	!j9thread 0x7fffac01cec0	tid 0x23ab7 (146103) // (main)
	!stack 0x00097d00	!j9vmthread 0x00097d00	!j9thread 0x7fffac107330	tid 0x23ab9 (146105) // (JIT Compilation Thread-000)
	!stack 0x0009a200	!j9vmthread 0x0009a200	!j9thread 0x7fffac1078a8	tid 0x23aba (146106) // (JIT Compilation Thread-001 Suspended)
	!stack 0x0009c700	!j9vmthread 0x0009c700	!j9thread 0x7fffac10b260	tid 0x23abb (146107) // (JIT Compilation Thread-002 Suspended)
	!stack 0x0009ec00	!j9vmthread 0x0009ec00	!j9thread 0x7fffac10b7d8	tid 0x23abc (146108) // (JIT Compilation Thread-003 Suspended)
	!stack 0x000a1100	!j9vmthread 0x000a1100	!j9thread 0x7fffac10c790	tid 0x23abd (146109) // (JIT Compilation Thread-004 Suspended)
	!stack 0x000a3700	!j9vmthread 0x000a3700	!j9thread 0x7fffac10cd08	tid 0x23abe (146110) // (JIT Compilation Thread-005 Suspended)
	!stack 0x000a5c00	!j9vmthread 0x000a5c00	!j9thread 0x7fffac10dcc0	tid 0x23abf (146111) // (JIT Compilation Thread-006 Suspended)
	!stack 0x000a8100	!j9vmthread 0x000a8100	!j9thread 0x7fffac10e238	tid 0x23ac0 (146112) // (JIT Diagnostic Compilation Thread-007 Suspended)
	!stack 0x000aa600	!j9vmthread 0x000aa600	!j9thread 0x7fffac171720	tid 0x23ac1 (146113) // (JIT-SamplerThread)
	!stack 0x001b2e00	!j9vmthread 0x001b2e00	!j9thread 0x7fffac171c98	tid 0x23ac8 (146120) // (Common-Cleaner)
	!stack 0x00235300	!j9vmthread 0x00235300	!j9thread 0x7fffac2a2e88	tid 0x23ad2 (146130) // (Concurrent Mark Helper)
	!stack 0x00237800	!j9vmthread 0x00237800	!j9thread 0x7fffac2a3bc0	tid 0x23ad3 (146131) // (GC Worker)
	!stack 0x00239d00	!j9vmthread 0x00239d00	!j9thread 0x7fffac2a4138	tid 0x23ad4 (146132) // (GC Worker)
	!stack 0x0023c200	!j9vmthread 0x0023c200	!j9thread 0x7fffac2a4e70	tid 0x23ad5 (146133) // (GC Worker)
	!stack 0x00244f00	!j9vmthread 0x00244f00	!j9thread 0x7fff4001e850	tid 0x23ad9 (146137) // (Attach API wait loop)
	!stack 0x00246800	!j9vmthread 0x00246800	!j9thread 0x7fffac2a53e8	tid 0x23adc (146140) // (MainThread)
	!stack 0x002b1b00	!j9vmthread 0x002b1b00	!j9thread 0x7fff4001edc8	tid 0x23ae1 (146145) // (Finalizer thread)

> 	!stack 0x00095700
<95700> 	!j9method 0x00000000000AB648   java/lang/Object.waitImpl(JI)V
<95700> 	!j9method 0x00000000000AB628   java/lang/Object.wait(JI)V
<95700> 	!j9method 0x00000000000AB608   java/lang/Object.wait(J)V
<95700> 	!j9method 0x00000000000E3DA8   java/lang/Thread.join(J)V
<95700> 	!j9method 0x00000000000E3DE8   java/lang/Thread.join()V
<95700> 	!j9method 0x000000000028D5E0   com/sun/javatest/regtest/agent/MainWrapper.main([Ljava/lang/String;)V
<95700> 	                        JNI call-in frame
<95700> 	                        Native method frame

> 	!stack 0x001b2e00
<1b2e00> 	!j9method 0x00000000000AB648   java/lang/Object.waitImpl(JI)V
<1b2e00> 	!j9method 0x00000000000AB628   java/lang/Object.wait(JI)V
<1b2e00> 	!j9method 0x00000000000AB608   java/lang/Object.wait(J)V
<1b2e00> 	!j9method 0x0000000000152C90   java/lang/ref/ReferenceQueue.remove(J)Ljava/lang/ref/Reference;
<1b2e00> 	!j9method 0x00000000001A9050   jdk/internal/ref/CleanerImpl.run()V
<1b2e00> 	!j9method 0x00000000000E3AA8   java/lang/Thread.run()V
<1b2e00> 	!j9method 0x00000000001AABB0   jdk/internal/misc/InnocuousThread.run()V
<1b2e00> 	                        JNI call-in frame
<1b2e00> 	                        Native method frame

> !stack 0x00235300
<235300> 	                        Native method frame
> !stack 0x00237800
<237800> 	                        Native method frame
> 	!stack 0x00239d00	
<239d00> 	                        Native method frame
> 	!stack 0x0023c200
<23c200> 	                        Native method frame

> !stack 0x00244f00
<244f00> 	!j9method 0x000000000023CF88   openj9/internal/tools/attach/target/IPC.waitSemaphore()I
<244f00> 	!j9method 0x0000000000243330   openj9/internal/tools/attach/target/CommonDirectory.waitSemaphore(Ljava/lang/String;)I
<244f00> 	!j9method 0x000000000026AF88   openj9/internal/tools/attach/target/WaitLoop.waitForNotification(Z)Lopenj9/internal/tools/attach/target/Attachment;
<244f00> 	!j9method 0x000000000026AFC8   openj9/internal/tools/attach/target/WaitLoop.run()V
<244f00> 	                        JNI call-in frame
<244f00> 	                        Native method frame
> !stack 0x00246800
<246800> 	                        known but unhandled frame type com.ibm.j9ddr.vm29.pointer.U8Pointer @ 0x00000005 

 FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT  FAULT 

<246800> 	!j9method 0x000000000028EEC8   HumongousStack.fillYoungGen(I)[Ljava/lang/Object;
<246800> 	!j9method 0x000000000028EEC8   HumongousStack.fillYoungGen(I)[Ljava/lang/Object;
<246800> 	!j9method 0x000000000028EEC8   HumongousStack.fillYoungGen(I)[Ljava/lang/Object;
<246800> 	!j9method 0x000000000028EEC8   HumongousStack.fillYoungGen(I)[Ljava/lang/Object;
<246800> 	!j9method 0x000000000028EEE8   HumongousStack.main([Ljava/lang/String;)V
<246800> 	!j9method 0x0000000000245EC8   java/lang/invoke/LambdaForm$DMH/0x00000000400250b0.invokeStatic(Ljava/lang/Object;Ljava/lang/Object;)V
<246800> 	!j9method 0x0000000000259AD8   java/lang/invoke/LambdaForm$MH/0x000000004002a6c0.invoke(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
<246800> 	!j9method 0x0000000000259FD8   java/lang/invoke/LambdaForm$MH/0x000000004002ad40.invokeExact_MT(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
<246800> 	!j9method 0x00000000001F9098   jdk/internal/reflect/DirectMethodHandleAccessor.invokeImpl(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
<246800> 	!j9method 0x00000000001F9058   jdk/internal/reflect/DirectMethodHandleAccessor.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
<246800> 	!j9method 0x00000000000DAF70   java/lang/reflect/Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;
<246800> 	!j9method 0x000000000028E948   com/sun/javatest/regtest/agent/MainWrapper$MainThread.run()V
<246800> 	!j9method 0x00000000000E3AA8   java/lang/Thread.run()V
<246800> 	                        JNI call-in frame
<246800> 	                        Native method frame

> !stack 0x002b1b00
<2b1b00> 	!j9method 0x0000000000106618   java/lang/ref/Reference.enqueueImpl()Z
<2b1b00> 	                        JNI call-in frame
<2b1b00> 	                        Native method frame
> 

a7ehuo avatar Feb 24 '23 15:02 a7ehuo

I ran with disableCodeCacheReclamation in Grinder #31220, all 3x passed. Previously, running in Grinder #31185 without specifying this option, all 3x failed. @0xdaryl @fengxue-IS @tajila fyi

Grinder #31220 (pass): -Xjit:disableCodeCacheReclamation,verbose,vlog=vlog,{*HumongousStack*,java/lang/ref/Reference.enqueueImpl*}(traceFull,traceILGen,traceCG,traceInlining,debugInliner,log=jitlog)

Grinder #31185 (fail): -Xjit:verbose,vlog=vlog,{*HumongousStack*,java/lang/ref/Reference.enqueueImpl*}(traceFull,traceILGen,traceCG,traceInlining,debugInliner,log=jitlog)

a7ehuo avatar Feb 24 '23 18:02 a7ehuo

What's the next step with this one?

vij-singh avatar Mar 07 '23 14:03 vij-singh

I've locally tested using a custom build to avoid using VThreadList and use the GC continuation list during JIT code cache reclaim which seem to fix the issue. The changes is part of the perf work to avoid lock contention which still have some issues to be addressed. I will add a link to this issue once the PR is ready for review

fengxue-IS avatar Mar 07 '23 15:03 fengxue-IS

Expecting a PR this week

DanHeidinga avatar Mar 07 '23 15:03 DanHeidinga

Re-opening since the fix also needs to be back-ported to the 0.37 release branch.

babsingh avatar Mar 14 '23 13:03 babsingh

Closing issue, backport fix has been merged to 0.37

fengxue-IS avatar Mar 14 '23 22:03 fengxue-IS

@fengxue-IS @babsingh this issue is closed but jdk/internal/vm/Continuation/HumongousStack.java is still excluded in 21+ under this issue. I'll reopen this in the Java 21 milestone until we can sort it out.

The test is configured with @requires vm.openj9 | vm.gc.G1

pshipton avatar Aug 29 '23 18:08 pshipton

test exclude have been removed, but it should still not be running due to vm.gc.G1 flag default to false

fengxue-IS avatar Sep 21 '23 20:09 fengxue-IS