openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

JDK20 jdk/incubator/concurrent/ScopedValue/StressStackOverflow.java StressStackOverflow$TestFailureException: Unexpected value for ScopedValue

Open JasonFengJ9 opened this issue 2 years ago • 58 comments

Failure link

From an internal build(ub18-aarch64-8):

08:57:42  openjdk version "20-internal" 2023-03-21
08:57:42  OpenJDK Runtime Environment (build 20-internal-adhoc.jenkins.BuildJDK20aarch64linuxPersonal)
08:57:42  Eclipse OpenJ9 VM (build master-8cc902bc75f, JRE 20 Linux aarch64-64-Bit Compressed References 20230316_13 (JIT enabled, AOT enabled)
08:57:42  OpenJ9   - 8cc902bc75f
08:57:42  OMR      - 035ec68ec0c
08:57:42  JCL      - 10a35739269 based on jdk-20+36)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

09:13:44  variation: Mode150
09:13:44  JVM_OPTIONS:  -XX:+UseCompressedOops 

09:18:25  TEST: jdk/incubator/concurrent/ScopedValue/StressStackOverflow.java

09:18:25  STDERR:
09:18:25  WARNING: Using incubator modules: jdk.incubator.concurrent
09:18:25  java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
09:18:25  	at StressStackOverflow.lambda$run$1(StressStackOverflow.java:196)
09:18:25  	at java.base/jdk.internal.vm.ScopedValueContainer.runWithoutScope(ScopedValueContainer.java:114)
09:18:25  	at java.base/jdk.internal.vm.ScopedValueContainer.run(ScopedValueContainer.java:100)
09:18:25  	at jdk.incubator.concurrent/jdk.incubator.concurrent.ScopedValue$Carrier.run(ScopedValue.java:417)
09:18:25  	at StressStackOverflow.run(StressStackOverflow.java:158)
09:18:25  	at StressStackOverflow.main(StressStackOverflow.java:209)
09:18:25  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
09:18:25  	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
09:18:25  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:125)
09:18:25  	at java.base/java.lang.Thread.run(Thread.java:1639)
09:18:25  	Suppressed: jdk.incubator.concurrent.StructureViolationException
09:18:25  		at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:484)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.newException(StructureViolationExceptions.java:45)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.throwException(StructureViolationExceptions.java:72)
09:18:25  		at java.base/jdk.internal.misc.ThreadFlock$ThreadContainerImpl.close(ThreadFlock.java:541)
09:18:25  		at java.base/jdk.internal.misc.ThreadFlock.close(ThreadFlock.java:432)
09:18:25  		at jdk.incubator.concurrent/jdk.incubator.concurrent.StructuredTaskScope.close(StructuredTaskScope.java:702)
09:18:25  		at StressStackOverflow.lambda$run$1(StressStackOverflow.java:159)
09:18:25  		... 9 more
09:18:25  	Suppressed: jdk.incubator.concurrent.StructureViolationException
09:18:25  		at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:484)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.newException(StructureViolationExceptions.java:45)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.newException(StructureViolationExceptions.java:58)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.throwIfFailed(ScopedValueContainer.java:205)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.runWithoutScope(ScopedValueContainer.java:122)
09:18:25  		... 8 more
09:18:25  Caused by: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
09:18:25  	at StressStackOverflow.<clinit>(StressStackOverflow.java:47)
09:18:25  	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
09:18:25  	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1522)
09:18:25  	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:300)
09:18:25  	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newMethodAccessor(MethodHandleAccessorFactory.java:71)
09:18:25  	at java.base/jdk.internal.reflect.ReflectionFactory.newMethodAccessor(ReflectionFactory.java:159)
09:18:25  	at java.base/java.lang.reflect.Method.acquireMethodAccessor(Method.java:724)
09:18:25  	at java.base/java.lang.reflect.Method.invoke(Method.java:575)
09:18:25  	... 2 more
09:18:25  	Suppressed: jdk.incubator.concurrent.StructureViolationException
09:18:25  		at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:67)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
09:18:25  		at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:484)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.newException(StructureViolationExceptions.java:45)
09:18:25  		at java.base/jdk.internal.misc.StructureViolationExceptions.newException(StructureViolationExceptions.java:58)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.throwIfFailed(ScopedValueContainer.java:205)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.doCall(ScopedValueContainer.java:194)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.call(ScopedValueContainer.java:151)
09:18:25  		at jdk.incubator.concurrent/jdk.incubator.concurrent.ScopedValue$Carrier.call(ScopedValue.java:370)
09:18:25  		at StressStackOverflow$DeepRecursion.run(StressStackOverflow.java:72)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:117)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad1(StressStackOverflow.java:120)
09:18:25  		at StressStackOverflow.fibonacci_pad(StressStackOverflow.java:128)
09:18:25  		at StressStackOverflow$DeepRecursion.lambda$run$0(StressStackOverflow.java:72)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.doCall(ScopedValueContainer.java:186)
09:18:25  		at java.base/jdk.internal.vm.ScopedValueContainer.call(ScopedValueContainer.java:151)
09:18:25  		at jdk.incubator.concurrent/jdk.incubator.concurrent.ScopedValue$Carrier.call(ScopedValue.java:370)

09:18:25  JavaTest Message: Test threw exception: java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
09:18:25  JavaTest Message: shutting down test
09:18:25  
09:18:25  STATUS:Failed.`main' threw exception: java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue

09:18:25  TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
09:18:25  --------------------------------------------------
09:18:25  Test results: passed: 275; failed: 1
09:18:30  Report written to /home/jenkins/workspace/Test_openjdk20_j9_extended.openjdk_aarch64_linux_Personal/aqa-tests/TKG/output_16789716238872/jdk_other_0/report/html/report.html
09:18:30  Results written to /home/jenkins/workspace/Test_openjdk20_j9_extended.openjdk_aarch64_linux_Personal/aqa-tests/TKG/output_16789716238872/jdk_other_0/work
09:18:30  Error: Some tests failed or other problems occurred.
09:18:30  -----------------------------------
09:18:30  jdk_other_0_FAILED

Also appeared in other platforms such as x86-64_linux s390x_linux

JasonFengJ9 avatar Mar 20 '23 18:03 JasonFengJ9

  • Pushing it to 20.02 since https://github.com/eclipse-openj9/openj9/issues/16439 is an incubator.
  • This may be related to https://github.com/eclipse-openj9/openj9/issues/16677.

babsingh avatar Apr 12 '23 16:04 babsingh

I modified scopedValueBindings to return a snapshot where inheritedValue is mapped 42. Under 100,000 iterations, the test passed and the unmodified version failed. But under the original 1,000,000 iterations this still failed. Not sure why would this happen but I think this indicates that the failure is caused by the unimplemented Thread.findScopedValueBindings as @babsingh mentioned.

thallium avatar Apr 22 '23 01:04 thallium

@thallium Open a draft PR with your implementation so it can be reviewed. Compare your implementation with the pseudo-code in https://github.com/eclipse-openj9/openj9/issues/16677#issuecomment-1520187982. Can you provide more details for the failure at 1,000,000 iterations; is it the same failure seen without Thread.findScopedValueBindings?

babsingh avatar Apr 24 '23 13:04 babsingh

Open a draft PR with your implementation so it can be reviewed

It's not a fix, just a hack to check if it's related to Thread.findScopedValueBindings .

is it the same failure seen without Thread.findScopedValueBindings?

Yes. @babsingh

thallium avatar Apr 24 '23 14:04 thallium

@thallium Do you have free cycles to implement the pseudo-code in https://github.com/eclipse-openj9/openj9/issues/16677#issuecomment-1520187982? If yes, refer to the below code to implement the pseudo-code.

https://github.com/eclipse-openj9/openj9/blob/ab8319b15d4975e38c0bf772a8433546cb022f94/runtime/jcl/common/jdk_internal_misc_ScopedMemoryAccess.cpp#L93-L106

babsingh avatar Apr 25 '23 16:04 babsingh

@babsingh Yes I'm working on this

thallium avatar Apr 25 '23 16:04 thallium

Ran a 20x grinder with -Xjit:disableInlining, the test is failing intermittently:

WARNING: Using incubator modules: jdk.incubator.concurrent
java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
	at StressStackOverflow.lambda$run$1(StressStackOverflow.java:196)
	at java.base/jdk.internal.vm.ScopedValueContainer.runWithoutScope(ScopedValueContainer.java:114)
	at java.base/jdk.internal.vm.ScopedValueContainer.run(ScopedValueContainer.java:100)
	at jdk.incubator.concurrent/jdk.incubator.concurrent.ScopedValue$Carrier.run(ScopedValue.java:417)
	at StressStackOverflow.run(StressStackOverflow.java:158)
	at StressStackOverflow.main(StressStackOverflow.java:209)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:125)
	at java.base/java.lang.Thread.run(Thread.java:1639)
Caused by: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
	at StressStackOverflow.<clinit>(StressStackOverflow.java:47)
	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1522)
	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:300)
	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newMethodAccessor(MethodHandleAccessorFactory.java:71)
	at java.base/jdk.internal.reflect.ReflectionFactory.newMethodAccessor(ReflectionFactory.java:159)
	at java.base/java.lang.reflect.Method.acquireMethodAccessor(Method.java:724)
	at java.base/java.lang.reflect.Method.invoke(Method.java:575)
	... 2 more

thallium avatar Apr 27 '23 23:04 thallium

Ran the test 20x locally with -Xint (need to adjust the time limit) and all passed.

thallium avatar May 03 '23 15:05 thallium

re https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1533246242: Thanks, @thallium.

  • @thallium clarified that the test passes with -Xint and without the changes for Thread.findScopedValueBindings https://github.com/eclipse-openj9/openj9/issues/16677.
  • I have added the JIT tag to the issue.
  • @thallium The next step will be to generate a core file during the exception and determine from the !stackslots the source of the Unexpected value for ScopedValue. This will give the JIT a better starting point to investigate.

babsingh avatar May 03 '23 15:05 babsingh

the source of the Unexpected value for ScopedValue

@babsingh Not too much I can find. The test tests whether the scoped value can be recovered correctly after a stack overflow. In each recursive call the scoped value el is increased by one and the equality of the value before and after the recursion. The test fails because the scoped value after the recursion is greater than the value before the recursion, but the Snapshot is correctly chained which means the scoped value wasn't been recovered.

thallium avatar May 04 '23 21:05 thallium

JVM option to create core files during StressStackOverflow$TestFailureException: -Xdump:system:events=catch,filter=*TestFailureException*

Java stack when the exception was thrown; the top frames are JIT'ed + inlined:

Scenario 1:

<202500>        !j9method 0x0000000000254EB0   jdk/incubator/concurrent/ScopedValue$Snapshot.find(Ljdk/incubator/concurrent/ScopedValue;)Ljava/lang/Object;
<202500>        !j9method 0x000000000024AFD8   jdk/incubator/concurrent/ScopedValue.findBinding()Ljava/lang/Object;
<202500>        !j9method 0x000000000024AF98   jdk/incubator/concurrent/ScopedValue.slowGet()Ljava/lang/Object;
<202500>        !j9method 0x000000000024AF78   jdk/incubator/concurrent/ScopedValue.get()Ljava/lang/Object;
<202500>        !j9method 0x000000000023F4E8   StressStackOverflow$DeepRecursion.run()V
<202500>        !j9method 0x000000000023E100   StressStackOverflow.fibonacci_pad1(ILjava/lang/Runnable;)J

Scenario 2:

<202500>        !j9method 0x0000000000258238   jdk/incubator/concurrent/ScopedValue$Carrier.runWith(Ljdk/incubator/concurrent/ScopedValue$Snapshot;Ljava/lang/Runnable;)V
<202500>        !j9method 0x0000000000258218   jdk/incubator/concurrent/ScopedValue$Carrier.run(Ljava/lang/Runnable;)V
<202500>        !j9method 0x000000000023F4E8   StressStackOverflow$DeepRecursion.run()V
<202500>        !j9method 0x000000000023E100   StressStackOverflow.fibonacci_pad1(ILjava/lang/Runnable;)J

@thallium To investigate Unexpected value for ScopedValue:

Based on the code in ScopedValue.scopedValueBindings, this issue might still be related to JLA.findScopedValueBindings (https://github.com/eclipse-openj9/openj9/issues/16677).

babsingh avatar May 12 '23 19:05 babsingh

Got seg fault when trying to print the value of the scoped value:

Type=Segmentation error vmState=0x000501ff
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007FADF1BA0940 Handler2=00007FADF1A07AF0 InaccessibleAddress=0000000000000008
RDI=00007FACC800E240 RSI=1FFFFFFFFFFFFFFF RAX=0000000000000000 RBX=0000000000000008
RCX=00007FADEC0178D0 RDX=0000000000000001 R8=0000000000000000 R9=0000000000000000
R10=00000000000501FF R11=0000000000000000 R12=00007FADEBEF0060 R13=00007FADEC4310B0
R14=00007FADF152ACE0 R15=00007FACC800E240
RIP=00007FADF0AB361E GS=0000 FS=0000 RSP=00007FACCDD468A0
EFlags=0000000000010246 CS=0033 RBP=00007FACC8003C00 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000008
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 0000002e000000ba (f: 186.000000, d: 9.761181e-313)
xmm2 000000000000002e (f: 46.000000, d: 2.272702e-322)
xmm3 00000000000000be (f: 190.000000, d: 9.387247e-322)
xmm4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm5 00007facadd96200 (f: 2916704768.000000, d: 6.935675e-310)
xmm6 00007fadec432e48 (f: 3963825664.000000, d: 6.935939e-310)
xmm7 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm8 6f6e67695f525400 (f: 1599230976.000000, d: 5.762047e+228)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 757575767575757e (f: 1970632064.000000, d: 6.444104e+257)
xmm11 0000015200000151 (f: 337.000000, d: 7.172346e-312)
xmm12 0000013d00000140 (f: 320.000000, d: 6.726727e-312)
xmm13 000001380000013f (f: 319.000000, d: 6.620627e-312)
xmm14 0000000008001800 (f: 134223872.000000, d: 6.631540e-316)
xmm15 000001420000013b (f: 315.000000, d: 6.832826e-312)
Module=/home/thallium/openj9-openjdk-jdk20/build/linux-x86_64-server-release/images/jdk/lib/default/libj9jit29.so
Module_base_address=00007FADF086E000

Method_being_compiled=jdk/incubator/concurrent/ScopedValue$Carrier.call(Ljava/util/concurrent/Callable;)Ljava/lang/Object;
Target=2_90_20230329_000000 (Linux 5.15.0-71-generic)
CPU=amd64 (8 logical CPUs) (0x3e808c000 RAM)
----------- Stack Backtrace -----------
_ZN11TR_J9VMBase19getReferenceElementEml+0xde (0x00007FADF0AB361E [libj9jit29.so+0x24561e])
_ZN11TR_J9VMBase45getKnotIndexOfInvokeCacheArrayAppendixElementEPN2TR11CompilationEPm+0x81 (0x00007FADF0AB1961 [libj9jit29.so+0x243961])
_ZN19InterpreterEmulator48updateKnotAndCreateCallSiteUsingInvokeCacheArrayEP19TR_ResolvedJ9MethodPmi+0x3c (0x00007FADF0C1ED9C [libj9jit29.so+0x3b0d9c])
_ZN19InterpreterEmulator35findAndCreateCallsitesFromBytecodesEbb+0x1b0 (0x00007FADF0C23330 [libj9jit29.so+0x3b5330])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0xb2c (0x00007FADF0C1789C [libj9jit29.so+0x3a989c])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0x1ad7 (0x00007FADF0C18847 [libj9jit29.so+0x3aa847])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0x1ad7 (0x00007FADF0C18847 [libj9jit29.so+0x3aa847])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0x1ad7 (0x00007FADF0C18847 [libj9jit29.so+0x3aa847])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0x1ad7 (0x00007FADF0C18847 [libj9jit29.so+0x3aa847])
_ZN21TR_J9EstimateCodeSize20realEstimateCodeSizeEP13TR_CallTargetP12TR_CallStackbRN2TR6RegionE+0x1ad7 (0x00007FADF0C18847 [libj9jit29.so+0x3aa847])
_ZN21TR_J9EstimateCodeSize16estimateCodeSizeEP13TR_CallTargetP12TR_CallStackb.localalias+0x45 (0x00007FADF0C19E55 [libj9jit29.so+0x3abe55])
_ZN19TR_EstimateCodeSize17calculateCodeSizeEP13TR_CallTargetP12TR_CallStackb+0xc3 (0x00007FADF0B87533 [libj9jit29.so+0x319533])
_ZN28TR_MultipleCallTargetInliner13weighCallSiteEP12TR_CallStackP11TR_CallSitebb+0x1fb (0x00007FADF0C0487B [libj9jit29.so+0x39687b])
_ZN28TR_MultipleCallTargetInliner17inlineCallTargetsEPN2TR20ResolvedMethodSymbolEP12TR_CallStackP24TR_InnerPreexistenceInfo+0x16d7 (0x00007FADF0C0D007 [libj9jit29.so+0x39f007])
_ZN14TR_InlinerBase15performInliningEPN2TR20ResolvedMethodSymbolE+0xbb (0x00007FADF0EF013B [libj9jit29.so+0x68213b])
_ZN10TR_Inliner7performEv+0x14a (0x00007FADF0C03DCA [libj9jit29.so+0x395dca])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii.localalias+0x865 (0x00007FADF1008E45 [libj9jit29.so+0x79ae45])
_ZN3OMR9Optimizer19performOptimizationEPK20OptimizationStrategyiii.localalias+0xc2f (0x00007FADF100920F [libj9jit29.so+0x79b20f])
_ZN3OMR9Optimizer8optimizeEv+0x1c3 (0x00007FADF100AB93 [libj9jit29.so+0x79cb93])
_ZN3OMR11Compilation7compileEv+0xa85 (0x00007FADF0DF61F5 [libj9jit29.so+0x5881f5])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadPNS_11CompilationEP17TR_ResolvedMethodR11TR_J9VMBaseP19TR_OptimizationPlanRKNS_16SegmentAllocatorE+0x4c4 (0x00007FADF09E2FA4 [libj9jit29.so+0x174fa4])
_ZN2TR28CompilationInfoPerThreadBase14wrappedCompileEP13J9PortLibraryPv+0x366 (0x00007FADF09E3FA6 [libj9jit29.so+0x175fa6])
omrsig_protect+0x241 (0x00007FADF1A087A1 [libj9prt29.so+0x297a1])
_ZN2TR28CompilationInfoPerThreadBase7compileEP10J9VMThreadP21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x382 (0x00007FADF09E1AC2 [libj9jit29.so+0x173ac2])
_ZN2TR24CompilationInfoPerThread12processEntryER21TR_MethodToBeCompiledRN2J917J9SegmentProviderE+0x12c (0x00007FADF09E1E4C [libj9jit29.so+0x173e4c])
_ZN2TR24CompilationInfoPerThread14processEntriesEv+0x35b (0x00007FADF09E0CFB [libj9jit29.so+0x172cfb])
_ZN2TR24CompilationInfoPerThread3runEv+0x42 (0x00007FADF09E1072 [libj9jit29.so+0x173072])
_Z30protectedCompilationThreadProcP13J9PortLibraryPN2TR24CompilationInfoPerThreadE+0x86 (0x00007FADF09E1126 [libj9jit29.so+0x173126])
omrsig_protect+0x241 (0x00007FADF1A087A1 [libj9prt29.so+0x297a1])
_Z21compilationThreadProcPv+0x17f (0x00007FADF09E14FF [libj9jit29.so+0x1734ff])
thread_wrapper+0x187 (0x00007FADF1B4EC07 [libj9thr29.so+0xbc07])
 (0x00007FADF1F6BB43 [libc.so.6+0x94b43])
 (0x00007FADF1FFDA00 [libc.so.6+0x126a00])
---------------------------------------

Stack trace from gdb:

#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140380164421184) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=11, threadid=140380164421184) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140380164421184, signo=11) at ./nptl/pthread_kill.c:89
#3  0x00007fadf1a06d72 in omrdump_create (portLibrary=0x7fadf1e21400 <j9portLibrary>, filename=0x7faccdd44650 "/home/thallium/aqa-tests/TKG/output_16841642144435/jdk_custom_1/work/scratch/", dumpType=<optimized out>, userData=<optimized out>)
    at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrosdump.c:188
#4  0x00007fadf158c003 in doSystemDump (agent=0x7fadec02c580, label=0x7faccdd44650 "/home/thallium/aqa-tests/TKG/output_16841642144435/jdk_custom_1/work/scratch/", context=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/rasdump/dmpagent.c:759
#5  0x00007fadf1587c49 in protectedDumpFunction (portLibrary=portLibrary@entry=0x7fadf1e21400 <j9portLibrary>, userData=userData@entry=0x7faccdd445b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/rasdump/dmpagent.c:2893
#6  0x00007fadf1a087a1 in omrsig_protect (portLibrary=0x7fadf1e21400 <j9portLibrary>, fn=0x7fadf1587c30 <protectedDumpFunction>, fn_arg=0x7faccdd445b0, handler=0x7fadf1587c50 <signalHandler>, handler_arg=0x0, flags=505, result=0x7faccdd445a8)
    at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrsignal.c:425
#7  0x00007fadf158b476 in runDumpFunction (agent=<optimized out>, label=0x7faccdd44650 "/home/thallium/aqa-tests/TKG/output_16841642144435/jdk_custom_1/work/scratch/", context=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/rasdump/dmpagent.c:2871
#8  0x00007fadf158b601 in runDumpAgent (vm=vm@entry=0x7fadec0178d0, agent=agent@entry=0x7fadec02c580, context=context@entry=0x7faccdd44b20, state=state@entry=0x7faccdd44b10, detail=detail@entry=0x7faccdd44ca0 "", timeNow=timeNow@entry=1684164524279)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/rasdump/dmpagent.c:2790
#9  0x00007fadf15a2c15 in triggerDumpAgents (vm=0x7fadec0178d0, self=0x7facc8003c00, eventFlags=8192, eventData=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/rasdump/trigger.c:1037
#10 0x00007fadf1ba0532 in generateDiagnosticFiles (portLibrary=portLibrary@entry=0x7fadf1e21400 <j9portLibrary>, userData=userData@entry=0x7faccdd45180) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/vm/gphandle.c:1162
#11 0x00007fadf1a087a1 in omrsig_protect (portLibrary=0x7fadf1e21400 <j9portLibrary>, fn=0x7fadf1ba03d0 <generateDiagnosticFiles>, fn_arg=0x7faccdd45180, handler=0x7fadf1b9f9c0 <recursiveCrashHandler>, handler_arg=0x7faccdd45150, flags=505, result=0x7faccdd45148)
    at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrsignal.c:425
#12 0x00007fadf1ba06ca in vmSignalHandler (portLibrary=0x7fadf1e21400 <j9portLibrary>, gpType=24, gpInfo=<optimized out>, userData=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/vm/gphandle.c:837
#13 0x00007fadf1a07d28 in mainSynchSignalHandler (signal=11, sigInfo=0x7faccdd46430, contextInfo=0x7faccdd46300) at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrsignal.c:1066
#14 <signal handler called>
#15 j9javaArrayOfObject_load (index=<optimized out>, array=0x7fadebef0060, vmThread=0x7facc8003c00) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/oti/j9accessbarrierhelpers.h:57
#16 TR_J9VMBase::getReferenceElement (this=<optimized out>, objectPointer=140384964378720, elementIndex=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/env/VMJ9.cpp:1277
#17 0x00007fadf0ab1961 in TR_J9VMBase::getKnotIndexOfInvokeCacheArrayAppendixElement (this=0x7facc800e240, comp=<optimized out>, invokeCacheArray=0x7fadec4310b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/env/VMJ9.cpp:4856
#18 0x00007fadf0c1ed9c in InterpreterEmulator::updateKnotAndCreateCallSiteUsingInvokeCacheArray (this=0x7faccdd46b80, owningMethod=0x7facadd88730, invokeCacheArray=0x7fadec4310b0, cpIndex=-1)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/InterpreterEmulator.cpp:1312
#19 0x00007fadf0c23330 in InterpreterEmulator::findAndCreateCallsitesFromBytecodes (this=this@entry=0x7faccdd46b80, wasPeekingSuccessfull=wasPeekingSuccessfull@entry=false, withState=withState@entry=false)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/InterpreterEmulator.cpp:1180
#20 0x00007fadf0c1789c in TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd889d0, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1335
#21 0x00007fadf0c18847 in TR_J9EstimateCodeSize::estimateCodeSize (recurseDown=true, prevCallStack=0x7faccdd476f0, calltarget=0x7facadd889d0, this=0x7facadcfe220) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#22 TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd87c20, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1556
#23 0x00007fadf0c18847 in TR_J9EstimateCodeSize::estimateCodeSize (recurseDown=true, prevCallStack=0x7faccdd482e0, calltarget=0x7facadd87c20, this=0x7facadcfe220) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#24 TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd4dec0, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1556
#25 0x00007fadf0c18847 in TR_J9EstimateCodeSize::estimateCodeSize (recurseDown=true, prevCallStack=0x7faccdd48ed0, calltarget=0x7facadd4dec0, this=0x7facadcfe220) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#26 TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd483c0, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1556
#27 0x00007fadf0c18847 in TR_J9EstimateCodeSize::estimateCodeSize (recurseDown=true, prevCallStack=0x7faccdd49ac0, calltarget=0x7facadd483c0, this=0x7facadcfe220) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#28 TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd13640, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1556
#29 0x00007fadf0c18847 in TR_J9EstimateCodeSize::estimateCodeSize (recurseDown=true, prevCallStack=0x7faccdd4a6b0, calltarget=0x7facadd13640, this=0x7facadcfe220) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#30 TR_J9EstimateCodeSize::realEstimateCodeSize (this=<optimized out>, calltarget=0x7facadd11b60, prevCallStack=<optimized out>, recurseDown=<optimized out>, cfgRegion=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:1556
#31 0x00007fadf0c19e55 in TR_J9EstimateCodeSize::estimateCodeSize (this=0x7facadcfe220, calltarget=0x7facadd11b60, prevCallStack=0x7faccdd4c3a0, recurseDown=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/J9EstimateCodeSize.cpp:417
#32 0x00007fadf0b87533 in TR_EstimateCodeSize::calculateCodeSize (this=this@entry=0x7facadcfe220, calltarget=calltarget@entry=0x7facadd11b60, callStack=callStack@entry=0x7faccdd4c3a0, recurseDown=recurseDown@entry=true)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/EstimateCodeSize.cpp:92
#33 0x00007fadf0c0487b in TR_MultipleCallTargetInliner::weighCallSite (this=0x7faccdd4d510, callStack=<optimized out>, callsite=0x7faccdd4d1e0, currentBlockHasExceptionSuccessors=<optimized out>, dontAddCalls=<optimized out>)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/InlinerTempForJ9.cpp:3366
#34 0x00007fadf0c0d007 in TR_MultipleCallTargetInliner::inlineCallTargets (this=0x7faccdd4d510, callerSymbol=0x7facadc7a750, prevCallStack=0x0, innerPrexInfo=<optimized out>) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/InlinerTempForJ9.cpp:3087
#35 0x00007fadf0ef013b in TR_InlinerBase::performInlining (this=this@entry=0x7faccdd4d510, callerSymbol=callerSymbol@entry=0x7facadc7a750) at /home/thallium/openj9-openjdk-jdk20/omr/compiler/optimizer/Inliner.cpp:453
#36 0x00007fadf0c03dca in TR_Inliner::perform (this=0x7facadc850a0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/optimizer/InlinerTempForJ9.cpp:2570
#37 0x00007fadf1008e45 in OMR::Optimizer::performOptimization (this=0x7facadd06f50, optimization=<optimized out>, firstOptIndex=<optimized out>, lastOptIndex=<optimized out>, doTiming=<optimized out>)
    at /home/thallium/openj9-openjdk-jdk20/omr/compiler/optimizer/OMROptimizer.cpp:2064
#38 0x00007fadf100920f in OMR::Optimizer::performOptimization (this=<optimized out>, optimization=<optimized out>, firstOptIndex=<optimized out>, lastOptIndex=2147483647, doTiming=0) at /home/thallium/openj9-openjdk-jdk20/omr/compiler/optimizer/OMROptimizer.cpp:1611
#39 0x00007fadf100ab93 in OMR::Optimizer::optimize (this=0x7facadd06f50) at /home/thallium/openj9-openjdk-jdk20/omr/compiler/optimizer/OMROptimizer.cpp:1128
#40 0x00007fadf0dee3fb in OMR::Compilation::performOptimizations (this=this@entry=0x7facadc75000) at /home/thallium/openj9-openjdk-jdk20/omr/compiler/compile/OMRCompilation.cpp:1267
#41 0x00007fadf0df61f5 in OMR::Compilation::compile (this=this@entry=0x7facadc75000) at /home/thallium/openj9-openjdk-jdk20/omr/compiler/compile/OMRCompilation.cpp:1064
#42 0x00007fadf09e2fa4 in TR::CompilationInfoPerThreadBase::compile (this=0x7facddd6d0b0, vmThread=<optimized out>, compiler=0x7facadc75000, compilee=<optimized out>, vm=..., optimizationPlan=<optimized out>, scratchSegmentProvider=...)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:9989
#43 0x00007fadf09e3fa6 in TR::CompilationInfoPerThreadBase::wrappedCompile (portLib=portLib@entry=0x7fadf1e21400 <j9portLibrary>, opaqueParameters=opaqueParameters@entry=0x7faccdd53700)
    at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:9482
#44 0x00007fadf1a087a1 in omrsig_protect (portLibrary=0x7fadf1e21400 <j9portLibrary>, fn=0x7fadf09e3c40 <TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary*, void*)>, fn_arg=0x7faccdd53700,
    handler=0x7fadf09cc6a0 <jitSignalHandler(J9PortLibrary*, uint32_t, void*, void*)>, handler_arg=0x7facc8003c00, flags=505, result=0x7faccdd536d8) at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrsignal.c:425
#45 0x00007fadf09e1ac2 in TR::CompilationInfoPerThreadBase::compile (this=0x7facddd6d0b0, vmThread=0x7facc8003c00, entry=0x7fadec317010, scratchSegmentProvider=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:8467
#46 0x00007fadf09e1e4c in TR::CompilationInfoPerThread::processEntry (this=0x7facddd6d0b0, entry=..., scratchSegmentProvider=...) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:4692
#47 0x00007fadf09e0cfb in TR::CompilationInfoPerThread::processEntries (this=0x7facddd6d0b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:4397
#48 0x00007fadf09e1072 in TR::CompilationInfoPerThread::run (this=this@entry=0x7facddd6d0b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:4232
#49 0x00007fadf09e1126 in protectedCompilationThreadProc (compInfoPT=compInfoPT@entry=0x7facddd6d0b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:4166
#50 0x00007fadf1a087a1 in omrsig_protect (portLibrary=0x7fadf1e21400 <j9portLibrary>, fn=0x7fadf09e10a0 <protectedCompilationThreadProc(J9PortLibrary*, TR::CompilationInfoPerThread*)>, fn_arg=0x7facddd6d0b0, handler=0x7fadf1ba0940 <structuredSignalHandler>,
    handler_arg=0x7facc8003c00, flags=506, result=0x7faccdd54d08) at /home/thallium/openj9-openjdk-jdk20/omr/port/unix/omrsignal.c:425
#51 0x00007fadf09e14ff in compilationThreadProc (entryarg=0x7facddd6d0b0) at /home/thallium/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:4071
#52 0x00007fadf1b4ec07 in thread_wrapper (arg=0x7fadec0ea850) at /home/thallium/openj9-openjdk-jdk20/omr/thread/common/omrthread.c:1724
#53 0x00007fadf1f6bb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#54 0x00007fadf1ffda00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

thallium avatar May 15 '23 15:05 thallium

fyi @0xdaryl A segfault in the JIT is noticed while debugging this failure. Please see https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1548113080 for more details.

babsingh avatar May 15 '23 15:05 babsingh

@nbhuiyan : inliner crash above. Please investigate for 20.0.2.

0xdaryl avatar May 16 '23 10:05 0xdaryl

It is probably worth opening a separate JIT issue for the seg fault to divorce it from the original issue being investigated, unless anyone feels otherwise.

0xdaryl avatar May 16 '23 10:05 0xdaryl

probably worth opening a separate JIT issue for the seg fault

Original issue: https://github.com/eclipse-openj9/openj9/issues/16965#issue-1632608762 Inliner issue: https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1548113080

The original issue goes away with -Xint; ref: https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1533283569. The original and inliner issue might be the same issue. In the original issue, an exception is thrown for the unexpected ScopedValue. The inliner crash occurs when the same unexpected ScopedValue is printed. We should do a preliminary analysis before diverging into two separate issues.

babsingh avatar May 16 '23 14:05 babsingh

findScopedValueBindings() is never called in this test and we don't have a test to validate the implementation.

thallium avatar May 23 '23 19:05 thallium

Removed comp:vm based upon the above finding. This issue can focus on the JIT inliner failure.

babsingh avatar May 30 '23 19:05 babsingh

@babsingh this issue might require some insights from the VM side. The reason why we see the problem during inlining is because of an invalid entry in the invokecache array for an invokedynamic bytecode. This problem is seen when trying to obtain the address of the appendix object, which returns 0x8, which is then dereferenced by the VM helper that does array of object loads, resulting in the segfault we see here. The JIT never modifies the entries in the invoke cache.

nbhuiyan avatar Jun 05 '23 21:06 nbhuiyan

@nbhuiyan Can you provide more details about the invokedynamic bytecode? What is the full bytecode? Which method it belongs to? invokeCacheArray is a two-element sized array; invokeCacheArray[0] is the MemberName and invokeCacheArray[1] is the appendix; is there also a bad value for invokeCacheArray[0] == MemberName?

babsingh avatar Jun 06 '23 20:06 babsingh

The method being compiled (or inlined into the caller) is StressStackOverflow.fibonacci_pad(ILjava/lang/Runnable;)J. Both MemberName and appendix can be invalid since the invokeCache we obtain is invalid. In the example I use here, the problem comes from dealing with an invalid MemberName, and I expect the appendix would be invalid as well. By locally reproducing this crash, I have found that the crash happens during IL generation if this method is prevented from getting inlined for the same underlying reason that caused the crash in the inliner reported earlier. The problematic bytecode is 56:

...
   41 ifne 68
   44 getstatic 26 java/lang/System.out Ljava/io/PrintStream;
   47 getstatic 23 StressStackOverflow.inheritedValue Ljdk/incubator/concurrent/ScopedValue;
   50 invokevirtual 18 jdk/incubator/concurrent/ScopedValue.get()Ljava/lang/Object;
   53 invokestatic 28 java/lang/String.valueOf(Ljava/lang/Object;)Ljava/lang/String;
   56 invokedynamic 0 bsm #0:makeConcatWithConstants(Ljava/lang/String;)Ljava/lang/String;
   59 nop
...

When compilation is isolated to the method where this failure is seen, the backtrace looks like this:

#12 <signal handler called>
#13 getOriginalROMMethodUnchecked (method=method@entry=0xe0004) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/util/romhelp.c:33
#14 0x00007fc32fb49bbd in getOriginalROMMethod (method=0xe0004) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/util/romhelp.c:65
#15 0x00007fc32f383ad6 in TR_J9Method::TR_J9Method (this=this@entry=0x7fc31a6a0a00, fe=fe@entry=0x7fc31400a5c0, trMemory=trMemory@entry=0x7fc32b83e780, aMethod=aMethod@entry=0xe0004) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/env/j9method.cpp:1970
#16 0x00007fc32f383bc4 in TR_ResolvedJ9Method::TR_ResolvedJ9Method (this=0x7fc31a6a0a00, aMethod=0xe0004, fe=0x7fc31400a5c0, trMemory=0x7fc32b83e780, owner=0x7fc32b83e9f8, vTableSlot=0) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/env/j9method.cpp:2005
#17 0x00007fc32f386010 in TR_J9VMBase::createResolvedMethodWithSignature (this=0x7fc31400a5c0, trMemory=0x7fc32b83e780, aMethod=0xe0004, classForNewInstance=0x0, signature=0x0, signatureLength=-1, owningMethod=0x7fc32b83e9f8, vTableSlot=0) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/env/j9method.cpp:241
#18 0x00007fc32f38123f in TR_J9VMBase::createResolvedMethod (this=<optimized out>, trMemory=<optimized out>, aMethod=<optimized out>, owningMethod=<optimized out>, classForNewInstance=<optimized out>) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/env/j9method.cpp:204
#19 0x00007fc32f380cb2 in TR_ResolvedJ9Method::getResolvedDynamicMethod (this=0x7fc32b83e9a0, comp=0x7fc31a600000, callSiteIndex=<optimized out>, unresolvedInCP=<optimized out>, isInvokeCacheAppendixNull=0x7fc32b83975f) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/env/j9method.cpp:6613
#20 0x00007fc32f2aa760 in J9::SymbolReferenceTable::findOrCreateDynamicMethodSymbol (this=0x7fc31a601d70, owningMethodSymbol=0x7fc31a605c00, callSiteIndex=<optimized out>, unresolvedInCP=unresolvedInCP@entry=0x7fc32b83975e, isInvokeCacheAppendixNull=isInvokeCacheAppendixNull@entry=0x7fc32b83975f) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/compile/J9SymbolReferenceTable.cpp:429
#21 0x00007fc32f4194f7 in TR_J9ByteCodeIlGenerator::stashArgumentsForOSR (this=this@entry=0x7fc31a622350, byteCode=J9BCinvokedynamic) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/Walker.cpp:1638
#22 0x00007fc32f42eadf in TR_J9ByteCodeIlGenerator::walker (this=this@entry=0x7fc31a622350, prevBlock=prevBlock@entry=0x7fc31a650500) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/Walker.cpp:249
#23 0x00007fc32f40e980 in TR_J9ByteCodeIlGenerator::genExceptionHandlers (this=this@entry=0x7fc31a622350, lastBlock=lastBlock@entry=0x7fc31a650500) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/IlGenerator.cpp:987
#24 0x00007fc32f412466 in TR_J9ByteCodeIlGenerator::genILFromByteCodes (this=0x7fc31a622350) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/IlGenerator.cpp:364
#25 0x00007fc32f412d9b in TR_J9ByteCodeIlGenerator::internalGenIL (this=this@entry=0x7fc31a622350) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/IlGenerator.cpp:290
#26 0x00007fc32f4131c6 in TR_J9ByteCodeIlGenerator::genIL (this=0x7fc31a622350) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/ilgen/IlGenerator.cpp:144
#27 0x00007fc32f702997 in OMR::ResolvedMethodSymbol::genIL (this=0x7fc31a605c00, fe=0x7fc31400a5c0, comp=comp@entry=0x7fc31a600000, symRefTab=0x7fc31a601d70, customRequest=...) at /root/builds/openj9-openjdk-jdk20/omr/compiler/il/OMRResolvedMethodSymbol.cpp:1207
#28 0x00007fc32f6e7d4c in OMR::Compilation::compile (this=this@entry=0x7fc31a600000) at /root/builds/openj9-openjdk-jdk20/omr/compiler/compile/OMRCompilation.cpp:1008
#29 0x00007fc32f2cc65c in TR::CompilationInfoPerThreadBase::compile (this=this@entry=0x7fc32b858390, vmThread=vmThread@entry=0x1e700, compiler=<optimized out>, compilee=compilee@entry=0x7fc32b83e9f8, vm=..., optimizationPlan=<optimized out>, scratchSegmentProvider=...) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:9988
#30 0x00007fc32f2cd6d9 in TR::CompilationInfoPerThreadBase::wrappedCompile (portLib=portLib@entry=0x7fc335c5c3e0 <j9portLibrary>, opaqueParameters=opaqueParameters@entry=0x7fc32b83e7a0) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/compiler/control/CompilationThread.cpp:9486

In this crash, the callSite entry corresponding to the bytecode was not null, verified by checking if the entry at callSiteIndex 0 (like we have in the bytecode):

https://github.com/eclipse-openj9/openj9/blob/a423ccd489865ade4ec51d1deb185b4408ab95a5/runtime/compiler/env/j9method.cpp#L5912-L5916

This indicates to the JIT that the call site table entry corresponding to that index is resolved, so then we try to load the target method from the MemberName entry and construct a resolved method:

https://github.com/eclipse-openj9/openj9/blob/a423ccd489865ade4ec51d1deb185b4408ab95a5/runtime/compiler/env/j9method.cpp#L6601-L6614

Now going back to the crash backtrace, we have

#13 getOriginalROMMethodUnchecked (method=method@entry=0xe0004) at /root/builds/openj9-openjdk-jdk20/openj9/runtime/util/romhelp.c:33

So the J9Method we got from accessing thevmtarget offset from what we are treating like a MemberName object gots us 0xe0004, which does not look like a valid J9Method that is supposed to be the invokedynamic adapter method:

> !J9Method 0xe0004
J9Method at 0xe0004 {
  Fields for J9Method:
	0x0: U8* bytecodes = !j9x 0x3580804000000000
	0x8: struct J9ConstantPool* constantPool = !j9constantpool 0x0001800700007FC0 (flags = 0x3)
	0x10: void* methodRunAddress = !j9x 0x3580804000000000
	0x18: volatile void* extra = !j9x 0x0001800800007FC3
}
Signature: <FAULT>
ROM Method: !j9rommethod 0x3580803FFFFFFFEC
Next Method: !j9method 0x00000000000E0024

From this, what I can conclude is that this problem comes from having the corresponding invokeCache entry populated with something invalid. The problem does not happen in the initial compilations at lower opt levels, where the invokeCache entry is NULL for which the JIT constructs a fake linkToStatic call instead of trying to construct a call to the adapter method.

nbhuiyan avatar Jun 14 '23 21:06 nbhuiyan

The fix for this problem would perhaps involve a more reliable way to check if the call site or the method type table entries are actually resolved with valid entries rather than just checking if the entries are NULL. Not sure how the call site table entries got populated with invalid stuff, which is where I was hoping you could provide some insights @babsingh

nbhuiyan avatar Jun 14 '23 21:06 nbhuiyan

@thallium @nbhuiyan Can you provide me the exact code that was used to print the scoped value in https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1548113080 or https://github.com/eclipse-openj9/openj9/issues/16965#issuecomment-1592026646? The invokedynamic instruction is generated for the print statement.

babsingh avatar Jun 16 '23 00:06 babsingh

@babsingh

Here's @thallium's modified test: StressStackOverflow.java

You will also need to cherry-pick the changes in #17402 in your build.

nbhuiyan avatar Jun 16 '23 16:06 nbhuiyan

In OJDK MHs, invokedynamic resolution can either yield an array or exception. In this case, the resolution leads to a StackOverflowError.

In BytecodeInterpreter, OJDK MH invokedynamic logic is as follows:

invokedynamic:
	if (J9_EXPECTED(NULL != invokeCacheArray)) {
		if (J9CLASS_IS_ARRAY(clazz)) {
			/* Logic with memberName and appendix */
		} else {
			/* Missing in JIT. */
			VM_VMHelpers::setExceptionPending(_currentThread, invokeCacheArray);
			rc = GOTO_THROW_CURRENT_EXCEPTION;
		}
	} else { /* Resolve the invokedynamic CP entry. */ }

In the JIT, InterpreterEmulator::visitInvokedynamic() is missing the logic to handle an exception for the resolved value. @nbhuiyan has been briefed and is working on adding the missing logic.

Code references

babsingh avatar Jun 19 '23 16:06 babsingh

https://github.com/eclipse-openj9/openj9/pull/17625 has been opened to address this failure.

babsingh avatar Jun 20 '23 20:06 babsingh

Also failing in sanity.openjdk in jdk21 builds.

https://openj9-jenkins.osuosl.org/job/Grinder_testList_0/231/ - amac jdk21 java/lang/ScopedValue/StressStackOverflow.java

15:07:47  java.lang.RuntimeException: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
15:07:47  	at StressStackOverflow.lambda$run$1(StressStackOverflow.java:208)
15:07:47  	at java.base/jdk.internal.vm.ScopedValueContainer.runWithoutScope(ScopedValueContainer.java:112)
15:07:47  	at java.base/jdk.internal.vm.ScopedValueContainer.run(ScopedValueContainer.java:98)
15:07:47  	at java.base/java.lang.ScopedValue$Carrier.run(ScopedValue.java:507)
15:07:47  	at StressStackOverflow.run(StressStackOverflow.java:170)
15:07:47  	at StressStackOverflow.main(StressStackOverflow.java:221)
15:07:47  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
15:07:47  	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
15:07:47  	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
15:07:47  	at java.base/java.lang.Thread.run(Thread.java:1595)
15:07:47  Caused by: StressStackOverflow$TestFailureException: Unexpected value for ScopedValue
15:07:47  	at StressStackOverflow.<clinit>(StressStackOverflow.java:46)
15:07:47  	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized0(Native Method)
15:07:47  	at java.base/jdk.internal.misc.Unsafe.ensureClassInitialized(Unsafe.java:1522)
15:07:47  	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.ensureClassInitialized(MethodHandleAccessorFactory.java:300)
15:07:47  	at java.base/jdk.internal.reflect.MethodHandleAccessorFactory.newMethodAccessor(MethodHandleAccessorFactory.java:71)
15:07:47  	at java.base/jdk.internal.reflect.ReflectionFactory.newMethodAccessor(ReflectionFactory.java:159)
15:07:47  	at java.base/java.lang.reflect.Method.acquireMethodAccessor(Method.java:726)
15:07:47  	at java.base/java.lang.reflect.Method.invoke(Method.java:577)
15:07:47  	... 2 more

pshipton avatar Jun 21 '23 19:06 pshipton

The JDK 21 sanity.openjdk test failure is seen even with the JIT segfault fix in #17625: x86-64 mac JDK21 with 17625

@babsingh mentioned in #16965 (comment) that the two issues (original failure and the JIT crash) could be the same issue. Using a build with the fixes in #17625, the crashes in the JIT when attempting to print ScopeValues no longer happens, but we seem to still see the unexpected ScopedValue failure in the JDK21 sanity.openjdk tests. This suggests that the JIT problem identified here was not responsible for the original failure reported.

nbhuiyan avatar Jun 22 '23 20:06 nbhuiyan

@nbhuiyan I ran the test (10 iterations) on my xLinux vm and it passed.

thallium avatar Jun 23 '23 15:06 thallium

@thallium

I ran the test (10 iterations) on my xLinux vm and it passed.

Are you talking about your modified test attempting to print ScopeValues? If yes, then this is consistent with my findings too, where there is a crash in the JIT, which gets fixed via #17625. Somehow attempting to print these values result in the problem getting hidden.

We do still see the original failure happening. We can observe it (even with #17625 included) when running JDK21 sanity.openjdk, or JDK20 extended.openjdk. @babsingh reported that this is not reproducible with -Xint, tested with an increased test time to try to cause the unexpected value for ScopedValue failure. Therefore, I will now work on investigating the original problem reported here (unexpected value for ScopedValue), since it could be a JIT-related problem.

This is obviously not going to be resolved in time for 0.40. @pshipton / @0xdaryl, so the path forward is to move this issue to a future milestone while I investigate this, and possibly also add java/lang/ScopedValue/StressStackOverflow.java to the JDK20 test excludes file?

nbhuiyan avatar Jun 23 '23 19:06 nbhuiyan