openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

InternalError: mh1 != mh2

Open pshipton opened this issue 1 year ago • 25 comments

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_x86-64_windows_Nightly_testList_2/146/

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_1/141 jdk_util_1 java/util/concurrent/tck/JSR166TestCase.java#forkjoinpool-common-parallelism

06:30:57  STARTED    AtomicLongArrayTest::testCountingInMultipleThreads 'testCountingInMultipleThreads'
06:30:57  java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I})
06:30:57  ------ stacktrace dump start ------
06:30:57  java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I})
06:30:57  java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I})
06:30:57  	at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:147)
06:30:57  	at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1375)
06:30:57  	at java.base/java.util.concurrent.atomic.AtomicLongArray.compareAndSet(AtomicLongArray.java:146)
06:30:57  	at AtomicLongArrayTest$Counter.realRun(AtomicLongArrayTest.java:317)
06:30:57  	at JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1698)
06:30:57  	at java.base/java.lang.Thread.run(Thread.java:857)
06:30:57  java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{
06:30:57      t6:L=DirectMethodHandle.internalMemberName(a0:L);
06:30:57      t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I})
06:30:57  	at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:147)
06:30:57  	at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1375)
06:30:57  	at java.base/java.util.concurrent.atomic.AtomicLongArray.compareAndSet(AtomicLongArray.java:146)
06:30:57  	at AtomicLongArrayTest$Counter.realRun(AtomicLongArrayTest.java:317)
06:30:57  	at JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1698)
06:30:57  	at java.base/java.lang.Thread.run(Thread.java:857)
06:30:57  STATUS:Failed.`main' threw exception: java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{ t6:L=DirectMethodHandle.internalMemberName(a0:L); t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,long[],int,long,long)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:J,a5:J)=>{ t6:L=DirectMethodHandle.internalMemberName(a0:L); t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:J,a5:J,t6:L);t7:I})
06:30:57  "main" prio=5 Id=1 WAITING
06:30:57  	at [email protected]/java.lang.Object.waitImpl(Native Method)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:251)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:219)
06:30:57  	at [email protected]/java.lang.Thread.join(Thread.java:784)
06:30:57  	at [email protected]/java.lang.Thread.join(Thread.java:739)
06:30:57  	at app//com.sun.javatest.regtest.agent.MainWrapper.main(MainWrapper.java:85)
06:30:57  "JIT Compilation Thread-000" prio=10 Id=3 RUNNABLE
06:30:57  "JIT Compilation Thread-001 Suspended" prio=10 Id=4 RUNNABLE
06:30:57  "JIT Compilation Thread-002 Suspended" prio=10 Id=5 RUNNABLE
06:30:57  "JIT Diagnostic Compilation Thread-003 Suspended" prio=10 Id=6 RUNNABLE
06:30:57  "JIT-SamplerThread" prio=10 Id=7 TIMED_WAITING
06:30:57  "IProfiler" prio=5 Id=8 RUNNABLE
06:30:57  "Finalizer thread" prio=5 Id=9 RUNNABLE
06:30:57  "Concurrent Mark Helper" prio=1 Id=10 RUNNABLE
06:30:57  "GC Worker" prio=5 Id=11 RUNNABLE
06:30:57  "GC Worker" prio=5 Id=12 RUNNABLE
06:30:57  "GC Worker" prio=5 Id=13 RUNNABLE
06:30:57  "Attach API wait loop" prio=10 Id=16 RUNNABLE
06:30:57  	at [email protected]/openj9.internal.tools.attach.target.IPC.waitSemaphore(Native Method)
06:30:57  	at [email protected]/openj9.internal.tools.attach.target.CommonDirectory.waitSemaphore(CommonDirectory.java:264)
06:30:57  	at [email protected]/openj9.internal.tools.attach.target.WaitLoop.waitForNotification(WaitLoop.java:66)
06:30:57  	at [email protected]/openj9.internal.tools.attach.target.WaitLoop.run(WaitLoop.java:157)
06:30:57  "MainThread" prio=5 Id=17 WAITING
06:30:57  	at [email protected]/java.lang.Object.waitImpl(Native Method)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:251)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:219)
06:30:57  	at [email protected]/java.lang.Thread.join(Thread.java:784)
06:30:57  	at [email protected]/java.lang.Thread.join(Thread.java:739)
06:30:57  	at app//AtomicLongArrayTest.testCountingInMultipleThreads(AtomicLongArrayTest.java:340)
06:30:57  	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
06:30:57  	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
06:30:57  	at [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
06:30:57  	at [email protected]/java.lang.reflect.Method.invoke(Method.java:574)
06:30:57  	at app//junit.framework.TestCase.runTest(TestCase.java:177)
06:30:57  	at app//JSR166TestCase.runTest(JSR166TestCase.java:404)
06:30:57  	at app//junit.framework.TestCase.runBare(TestCase.java:142)
06:30:57  	at app//JSR166TestCase.runBare(JSR166TestCase.java:395)
06:30:57  	at app//junit.framework.TestResult$1.protect(TestResult.java:122)
06:30:57  	at app//junit.framework.TestResult.runProtected(TestResult.java:142)
06:30:57  	at app//junit.framework.TestResult.run(TestResult.java:125)
06:30:57  	at app//junit.framework.TestCase.run(TestCase.java:130)
06:30:57  	at app//junit.framework.TestSuite.runTest(TestSuite.java:241)
06:30:57  	at app//junit.framework.TestSuite.run(TestSuite.java:236)
06:30:57  	at app//junit.framework.TestSuite.runTest(TestSuite.java:241)
06:30:57  	at app//junit.framework.TestSuite.run(TestSuite.java:236)
06:30:57  	at app//org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:90)
06:30:57  	at app//org.junit.runner.JUnitCore.run(JUnitCore.java:137)
06:30:57  	at app//org.junit.runner.JUnitCore.run(JUnitCore.java:115)
06:30:57  	at app//org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
06:30:57  	at app//org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
06:30:57  	at app//org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
06:30:57  	at app//org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
06:30:57  	at app//org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
06:30:57  	at app//org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
06:30:57  	at app//org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
06:30:57  	at app//com.sun.javatest.regtest.agent.JUnitRunner.runWithJUnitPlatform(JUnitRunner.java:142)
06:30:57  	at app//com.sun.javatest.regtest.agent.JUnitRunner.main(JUnitRunner.java:95)
06:30:57  	at app//com.sun.javatest.regtest.agent.JUnitRunner.main(JUnitRunner.java:61)
06:30:57  	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
06:30:57  	at [email protected]/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
06:30:57  	at [email protected]/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
06:30:57  	at [email protected]/java.lang.reflect.Method.invoke(Method.java:574)
06:30:57  	at app//com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
06:30:57  	at [email protected]/java.lang.Thread.run(Thread.java:857)
06:30:57  "ClassByNameCache Reaper" prio=5 Id=464 WAITING
06:30:57  	at [email protected]/java.lang.Object.waitImpl(Native Method)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:251)
06:30:57  	at [email protected]/java.lang.Object.wait(Object.java:219)
06:30:57  	at [email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:137)
06:30:57  	at [email protected]/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:109)
06:30:57  	at [email protected]/java.io.ClassByNameCache$Reaper.run(ClassByNameCache.java:232)
06:30:57  	at [email protected]/java.lang.Thread.run(Thread.java:857)
06:30:57  "Thread-185" prio=5 Id=677 RUNNABLE
06:30:57  	at [email protected]/com.ibm.java.lang.management.internal.ThreadMXBeanImpl.dumpAllThreadsImpl(Native Method)
06:30:57  	at [email protected]/com.ibm.java.lang.management.internal.ThreadMXBeanImpl.dumpAllThreadsCommon(ThreadMXBeanImpl.java:691)
06:30:57  	at [email protected]/com.ibm.java.lang.management.internal.ThreadMXBeanImpl.dumpAllThreads(ThreadMXBeanImpl.java:673)
06:30:57  	at app//JSR166TestCase.dumpTestThreads(JSR166TestCase.java:1203)
06:30:57  	at app//JSR166TestCase.threadRecordFailure(JSR166TestCase.java:834)
06:30:57  	at app//JSR166TestCase.threadUnexpectedException(JSR166TestCase.java:1019)
06:30:57  	at app//JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1700)
06:30:57  	at [email protected]/java.lang.Thread.run(Thread.java:857)
06:30:57  "Thread-186" prio=5 Id=678 RUNNABLE
06:30:57  	at [email protected]/java.lang.invoke.MethodType.toString(MethodType.java:894)
06:30:57  	at [email protected]/java.lang.String.valueOf(String.java:4988)
06:30:57  	at [email protected]/java.lang.StringBuilder.append(StringBuilder.java:173)
06:30:57  	at [email protected]/java.lang.invoke.MethodHandle.standardString(MethodHandle.java:1600)
06:30:57  	at [email protected]/java.lang.invoke.MethodHandle.toString(MethodHandle.java:1597)
06:30:57  	at [email protected]/java.util.Formatter$FormatSpecifier.printString(Formatter.java:3056)
06:30:57  	at [email protected]/java.util.Formatter$FormatSpecifier.print(Formatter.java:2933)
06:30:57  	at [email protected]/java.util.Formatter.format(Formatter.java:2689)
06:30:57  	at [email protected]/java.util.Formatter.format(Formatter.java:2625)
06:30:57  	at [email protected]/java.lang.String.format(String.java:4913)
06:30:57  	at [email protected]/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1372)
06:30:57  	at [email protected]/java.util.concurrent.atomic.AtomicLongArray.compareAndSet(AtomicLongArray.java:146)
06:30:57  	at app//AtomicLongArrayTest$Counter.realRun(AtomicLongArrayTest.java:317)
06:30:57  	at app//JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1698)
06:30:57  	at [email protected]/java.lang.Thread.run(Thread.java:857)
06:30:57  ------ stacktrace dump end ------

Changes from previous nightly build https://github.com/eclipse-openj9/openj9/compare/ac66d8cf7...edbf0e1a5bc

pshipton avatar Feb 02 '24 16:02 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_1/143 jdk_util_1 java/util/stream/test/org/openjdk/tests/java/util/stream/WhileOpTest.java

07:55:17  test org.openjdk.tests.java.util.stream.WhileOpTest.testDropTakeWhileOps("array:0..1000", RefTestData[array:0..1000]): failure
07:55:17  java.lang.AssertionError: 1 failure(s) for test data: RefTestData[array:0..1000]
07:55:17  1: java.lang.Error: PAR_STREAM_COLLECT_TO_LIST: java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,ForkJoinTask[],int)ForkJoinTask (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I)=>{
07:55:17      t4:L=DirectMethodHandle.internalMemberName(a0:L);
07:55:17      t5:L=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,t4:L);t5:L}); mh2 = MethodHandle(VarHandle,ForkJoinTask[],int)ForkJoinTask (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I)=>{
07:55:17      t4:L=DirectMethodHandle.internalMemberName(a0:L);
07:55:17      t5:L=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,t4:L);t5:L})
07:55:17  java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,ForkJoinTask[],int)ForkJoinTask (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I)=>{
07:55:17      t4:L=DirectMethodHandle.internalMemberName(a0:L);
07:55:17      t5:L=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,t4:L);t5:L}); mh2 = MethodHandle(VarHandle,ForkJoinTask[],int)ForkJoinTask (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I)=>{
07:55:17      t4:L=DirectMethodHandle.internalMemberName(a0:L);
07:55:17      t5:L=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,t4:L);t5:L})
07:55:17  	at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:147)
07:55:17  	at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1375)
07:55:17  	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.getSlot(ForkJoinPool.java:900)
07:55:17  	at java.base/java.util.concurrent.ForkJoinPool.helpComplete(ForkJoinPool.java:1934)
07:55:17  	at java.base/java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:433)
07:55:17  	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:687)
07:55:17  	at java.base/java.util.stream.WhileOps$4.opEvaluateParallel(WhileOps.java:254)
07:55:17  	at java.base/java.util.stream.WhileOps$4.opEvaluateParallelLazy(WhileOps.java:240)
07:55:17  	at java.base/java.util.stream.AbstractPipeline.sourceSpliterator(AbstractPipeline.java:447)
07:55:17  	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:238)
07:55:17  	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
07:55:17  	at java.base/java.util.stream.StreamTestScenario$15.run(StreamTestScenario.java:185)
07:55:17  	at java.base/java.util.stream.StreamTestScenario.run(StreamTestScenario.java:272)
07:55:17  	at java.base/java.util.stream.OpTestCase$ExerciseDataStreamBuilder.exercise(OpTestCase.java:404)
07:55:17  	at org.openjdk.tests.java.util.stream.WhileOpTest.testWhileMulti(WhileOpTest.java:322)
07:55:17  	at org.openjdk.tests.java.util.stream.WhileOpTest.testWhileMulti(WhileOpTest.java:310)
07:55:17  	at org.openjdk.tests.java.util.stream.WhileOpTest.testDropTakeWhileOps(WhileOpTest.java:106)

pshipton avatar Feb 06 '24 16:02 pshipton

@TobiAjila @hzongaro this recently started failing.

pshipton avatar Feb 06 '24 16:02 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_ppc64le_linux_Nightly_testList_0/151 jdk_util_1 java/util/concurrent/tck/JSR166TestCase.java#others

pshipton avatar Feb 06 '24 16:02 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/155 jdk_util_1 java/util/HashSet/Serialization.java.Serialization java/util/Hashtable/DeserializedLength.java.DeserializedLength java/util/Hashtable/ReadObject.java.ReadObject java/util/Hashtable/SerializationDeadlock.java.SerializationDeadlock java/util/concurrent/ConcurrentHashMap/WhiteBox.java.WhiteBox java/util/concurrent/ConcurrentLinkedQueue/WhiteBox.java.WhiteBox java/util/concurrent/TimeUnit/Basic.java.Basic

pshipton avatar Feb 07 '24 12:02 pshipton

@hzongaro I've set this as a blocker. Based on how it fails, I'm guessing it's JIT.

pshipton avatar Feb 07 '24 12:02 pshipton

Also observed at internal JDK21 x86-64_windows build

java version "21.0.2-beta" 2024-01-16
IBM Semeru Runtime Certified Edition 21.0.2+13-202402060359 (build 21.0.2-beta+13-202402060359)
Eclipse OpenJ9 VM 21.0.2+13-202402060359 (build master-c92d58374, JRE 21 Windows Server 2022 amd64-64-Bit Compressed References 20240206_98 (JIT enabled, AOT enabled)
OpenJ9   - c92d58374
OMR      - 410ef0d23
JCL      - a409f230e based on jdk-21.0.2+13)

[2024-02-06T10:44:29.654Z] variation: -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache Mode150
[2024-02-06T10:44:29.994Z] JVM_OPTIONS:  -Xdump:system:none -Xdump:heap:none -Xdump:system:events=gpf+abort+traceassert+corruptcache -XX:+UseCompressedOops -Xverbosegclog 

[2024-02-06T11:24:13.815Z] TEST: java/util/concurrent/tck/JSR166TestCase.java#default

[2024-02-06T11:24:13.824Z] STARTED    AtomicIntegerArrayTest::testCountingInMultipleThreads 'testCountingInMultipleThreads'
[2024-02-06T11:24:13.824Z] java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I})
[2024-02-06T11:24:13.824Z] ------ stacktrace dump start ------
[2024-02-06T11:24:13.824Z] java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I})
[2024-02-06T11:24:13.824Z] java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I})
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:155)
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1498)
[2024-02-06T11:24:13.824Z] 	at java.base/java.util.concurrent.atomic.AtomicIntegerArray.compareAndSet(AtomicIntegerArray.java:146)
[2024-02-06T11:24:13.824Z] 	at AtomicIntegerArrayTest$Counter.realRun(AtomicIntegerArrayTest.java:318)
[2024-02-06T11:24:13.824Z] 	at JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1730)
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.Thread.run(Thread.java:1595)
[2024-02-06T11:24:13.824Z] java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{
[2024-02-06T11:24:13.824Z]     t6:L=DirectMethodHandle.internalMemberName(a0:L);
[2024-02-06T11:24:13.824Z]     t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I})
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:155)
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1498)
[2024-02-06T11:24:13.824Z] 	at java.base/java.util.concurrent.atomic.AtomicIntegerArray.compareAndSet(AtomicIntegerArray.java:146)
[2024-02-06T11:24:13.824Z] 	at AtomicIntegerArrayTest$Counter.realRun(AtomicIntegerArrayTest.java:318)
[2024-02-06T11:24:13.824Z] 	at JSR166TestCase$CheckedRunnable.run(JSR166TestCase.java:1730)
[2024-02-06T11:24:13.824Z] 	at java.base/java.lang.Thread.run(Thread.java:1595)
[2024-02-06T11:24:13.824Z] STATUS:Failed.`main' threw exception: java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{ t6:L=DirectMethodHandle.internalMemberName(a0:L); t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I}); mh2 = MethodHandle(VarHandle,int[],int,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:L,a3:I,a4:I,a5:I)=>{ t6:L=DirectMethodHandle.internalMemberName(a0:L); t7:I=MethodHandle.linkToStatic(a1:L,a2:L,a3:I,a4:I,a5:I,t6:L);t7:I})

[2024-02-06T13:09:11.366Z] jdk_util_0_FAILED

50x internal grinder - 6/40 failed

JasonFengJ9 avatar Feb 07 '24 15:02 JasonFengJ9

@nbhuiyan, may I ask you to take a look at this problem?

hzongaro avatar Feb 07 '24 15:02 hzongaro

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_aarch64_mac_OMR_testList_0/48 jdk_util_1 java/util/concurrent/tck/JSR166TestCase.java#forkjoinpool-common-parallelism

pshipton avatar Feb 08 '24 14:02 pshipton

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_s390x_linux_Nightly_testList_2/148 jdk_util_0 java/util/stream/boottest/java.base/java/util/stream/NodeTest.java

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_0/145 jdk_util_0 java/util/stream/boottest/java.base/java/util/stream/LongNodeTest.java

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_aarch64_mac_Nightly_testList_0/112 jdk_util_1 java/util/concurrent/tck/JSR166TestCase.java#others

pshipton avatar Feb 08 '24 14:02 pshipton

Due to the number of failures, spread across a number of tests making it less desirable to exclude them all, resolving this problem is high priority.

pshipton avatar Feb 08 '24 14:02 pshipton

This test failure started as a result of #18755, which added capability to inline the VarHandle access/operation-related methods. This requires retrieving the MethodHandle for the VarHandle operation during compilation for inlining the call chains. In the generated LambdaForm methods that are part of the VarHandle operation call chains, there is a check that asserts that the receiver MethodHandle is the same as the MethodHandle in the stack, and even though they have identical MethodType signature and LambdaForms, the check fails. I am working on determining why this happens for which I need to look at a failing JIT log. I have not been successful in reproducing this failure in small test cases performing the VarHandle operations seen in the failures above, and strangely 60X Grinders running jdk_util_1 on x86-64 linux and ppc linux (few iterations left as of writing this) passed. I will have further updates as I learn more.

nbhuiyan avatar Feb 08 '24 15:02 nbhuiyan

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_1/146 java/util/stream/test/org/openjdk/tests/java/util/stream/IntUniqOpTest.java

pshipton avatar Feb 09 '24 16:02 pshipton

This is quite strange. Running 100X Grinder (on s390x linux) for the failed test case in https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_s390x_linux_Nightly_testList_2/148.

Also, 100X Grinder (on Windows) of the failed test case in https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_0/145 also passed.

Both the Grinder jobs above used the same nightly builds where the failures were reported.

Locally, 1000+ iterations of java/util/concurrent/tck/JSR166TestCase.java in both x86-64 linux and ppc linux completed without failures.

Next I will try to run these tests on the machines where these failures are happening.

nbhuiyan avatar Feb 09 '24 18:02 nbhuiyan

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.openjdk_x86-64_windows_Nightly_testList_1/146 java/util/stream/test/org/openjdk/tests/java/util/stream/IntUniqOpTest.java

20X Grinder passed on the same machine where the failure was reported: https://openj9-jenkins.osuosl.org/job/Grinder/3290/

20X Grinder in a different Windows machine also passed: https://openj9-jenkins.osuosl.org/job/Grinder/3289/

One thing I did notice is that the test script fails to destroy the SCC in the Windows jobs during setup and teardown.

14:05:45  Attempting to destroy all caches in cacheDir C:\Users\jenkins\AppData\Local\javasharedresources\
14:05:45  
14:05:45  JVMSHRC806I Compressed references persistent shared cache "sharedcc_jenkins" has been destroyed. Use option -Xnocompressedrefs if you want to destroy a non-compressed references cache.
14:05:45  JVMSHRC241E Error: unable to delete shared class cache file
14:05:45  JVMSHRC336E Port layer error code = -100
14:05:45  JVMSHRC337E Platform error message: (32) The process cannot access the file because it is being used by another process.
14:05:45  JVMSHRC430I Failed to remove current generation of shared class cache "jenkins"
14:05:45  
14:05:45  Attempting to destroy all caches in cacheDir C:\Users\jenkins\AppData\Local\javasharedresources\
14:05:45  
14:05:45  JVMSHRC241E Error: unable to delete shared class cache file
14:05:45  JVMSHRC336E Port layer error code = -100
14:05:45  JVMSHRC337E Platform error message: (32) The process cannot access the file because it is being used by another process.
14:05:45  JVMSHRC430I Failed to remove current generation of shared class cache "jenkins"
14:05:45  cache cleanup done
14:05:46  jdk_custom_0 Finish Time: Fri Feb  9 14:05:45 2024 Epoch Time (ms): 1707505545690

Probably not the cause of the test failures since this problem is not seen in the failed tests on other platforms.

nbhuiyan avatar Feb 09 '24 20:02 nbhuiyan

I was able to reproduce the failure on S390x linux by running the entire sanity.openjdk test suite. Now that I know I can reproduce the failure this way, I can work on obtaining logs and implementing a fix.

nbhuiyan avatar Feb 12 '24 18:02 nbhuiyan

Interesting since the JVM is invoked separately for java_lang_N, java_util_N, etc. It probably means something was written into the shared cache by an earlier test that causes the problem.

pshipton avatar Feb 12 '24 18:02 pshipton

I think VarHandle inlining introduced in #18755 may not be responsible for this failure, as I was able to reproduce the failure on s390x linux using a build that has the MethodHandleTransformer and InterpreterEmulator changes in #18755 disabled, disabling any known object info propagation and inlining. There are a few other possible causes I am currently investigating.

nbhuiyan avatar Feb 13 '24 20:02 nbhuiyan

The test failure seems to be introduced by #18847 and/or #18832, which contains changes that affect AOT/SCC. The oldest nightly build with which I can reproduce the failure: https://github.com/eclipse-openj9/openj9/commit/ac66d8cf7c584ff003bfbe8c44d1a79330c3036e Jan 31 Nightly build.

The failure is not reproducible using the nightly build prior to that: https://github.com/eclipse-openj9/openj9/commit/40a4d38b2e1fd6c7ed134dbd764dc1ad800bb85c Jan 30 Nightly build.

FYI @cjjdespres @dsouzai @mpirvu. I would appreciate your thoughts on this.

nbhuiyan avatar Feb 15 '24 23:02 nbhuiyan

@cjjdespres Could you please triage this bug by removing #18847 and #18832 one a time and see which one is at fault? It may not be possible to do an automatic revert due to the recent code that was delivered and depends on those two PRs. So, maybe you should start with an older code base, reproduce, then remove one PR and try to reproduce again, etc.

mpirvu avatar Feb 16 '24 00:02 mpirvu

Sure, I will do that. If one of the two is at fault, my guess would be https://github.com/eclipse-openj9/openj9/pull/18832, but I'll try both separately.

cjjdespres avatar Feb 16 '24 00:02 cjjdespres

Internal grinders on s390x_linux of sanity.openjdk with both PRs included, https://github.com/eclipse-openj9/openj9/pull/18847 removed, and https://github.com/eclipse-openj9/openj9/pull/18832 removed don't seem to have failed yet, though they haven't finished all their iterations yet. How reliable is the failure?

I started at 43aa01d8d0b77a834082d9603f337694a493d165, then reverted each PR individually.

If I should be running it on https://openj9-jenkins.osuosl.org/job/Grinder, can I just take the SDKs from the internal arifactory and use them there?

cjjdespres avatar Feb 16 '24 11:02 cjjdespres

Okay, the grinder https://hyc-runtimes-jenkins.swg-devops.com/job/Grinder/37980/ with both PRs still in appears to have produced some of these failures in one of its test lists. I'll wait for the others.

cjjdespres avatar Feb 16 '24 14:02 cjjdespres

Internal grinders on s390x_linux of sanity.openjdk with both PRs included, https://github.com/eclipse-openj9/openj9/pull/18847 removed, and https://github.com/eclipse-openj9/openj9/pull/18832 removed don't seem to have failed yet, though they haven't finished all their iterations yet. How reliable is the failure?

Running on s390x linux, the failure rate is around 1/5 iterations of the sanity.openjdk target. I have been doing around 15 iterations per build to determine if the failure is reproducible with the build or not, and parallelising them so that they do not take days to complete.

If I should be running it on https://openj9-jenkins.osuosl.org/job/Grinder, can I just take the SDKs from the internal arifactory and use them there?

I am not sure if it is possible to use the builds in the internal artifactory in the external jobs. Since it is reproducible in the internal Grinders, it might be better to continue investigations there since it has a lot more resources.

nbhuiyan avatar Feb 16 '24 17:02 nbhuiyan

I re-tested using a build that excludes both #18847 and #18832 using a personal build at https://github.com/eclipse-openj9/openj9/commit/c0cbb3e8d20a1dab2862bcf15cee0f9f8e33888f, which is right after the Jan 30 nightly build but before those PRs got merged. With 20X iterations, I was able to get 2 test failures, which is a lower failure rate than newer builds. Therefore, if these PRs are doing anything to the failures at all, they are only increasing the likelihood of the failure happening rather than being the cause. The main issue still seems to be that the SCC contains something that it should not have. I will continue to try and isolate when this issue was introduced but in order to do that with builds older than Jan 30 nightly build, I have to increase the number of iterations and it will take some time to get answers.

nbhuiyan avatar Feb 16 '24 17:02 nbhuiyan

Do you know anything about the characteristics of the method that shouldn't be in the SCC? Speaking naively here because I haven't looked at this path in great depth before, but is it possible to add a check for this characteristic (purely as a debug measure) as methods are added to the SCC and then abort immediately when that happens? It might bring you closer to why this is happening in the first place rather than waiting for the existence of the corrupt method to cause problems.

I also wonder if even in passing cases whether such a method exists in the SCC but other factors prevent us from tripping over it. That is to say the corrupt method might always be in the SCC even in passing runs but we just don't realize it.

0xdaryl avatar Feb 17 '24 03:02 0xdaryl

@0xdaryl I am not sure of the characteristic of the method in the SCC that may be responsible for a failure like this. Initially I suspected that we were AOT-compiling compareAndSet as if it was being done in JIT-mode and putting that into the SCC.

public final boolean compareAndSet(int i, long expectedValue, long newValue) {
        return AA.compareAndSet(array, i, expectedValue, newValue); // AA: private static final VarHandle field
    }

However, I was not able to reproduce that behaviour in my local tests, as compareAndSet contains invokehandle bytecode which under AOT aborts the compilation at ILGen.

I am currently checking methods that are added to the SCC when running the failing test while I work on isolating down to a change that may have introduced the problem.

nbhuiyan avatar Feb 20 '24 19:02 nbhuiyan

I suspect that this is caused by a method handle table entry initialization race condition together with the JIT's handling of known objects, and that the apparent connection to AOT shows up just because the race condition is highly sensitive to timing.

I'm able to reliably reproduce this symptom with Java 21 using a bit of instrumentation to artificially introduce a delay into the initialization of the table entry: hack-sleep-jdk21.patch, Test.java

$ java -esa -Xshareclasses:none -Djava.lang.invoke.MethodHandle.CUSTOMIZE_THRESHOLD=0 -Xjit:'limit=go,count=2,disableAsyncCompilation,initialOptLevel=warm,inhibitRecompilation' Test

This results in assertSame() failing with two distinct method handles with identical structure:

Exception in thread "main" java.lang.InternalError: mh1 != mh2: mh1 = MethodHandle(VarHandle,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:I,a3:I)=>{
    t4:L=DirectMethodHandle.internalMemberName(a0:L);
    t5:I=MethodHandle.linkToStatic(a1:L,a2:I,a3:I,t4:L);t5:I}); mh2 = MethodHandle(VarHandle,int,int)boolean (form: DMH.invokeStatic=Lambda(a0:L,a1:L,a2:I,a3:I)=>{
    t4:L=DirectMethodHandle.internalMemberName(a0:L);
    t5:I=MethodHandle.linkToStatic(a1:L,a2:I,a3:I,t4:L);t5:I})
        at java.base/java.lang.invoke.MethodHandleStatics.newInternalError(MethodHandleStatics.java:155)
        at java.base/java.lang.invoke.MethodHandleImpl.assertSame(MethodHandleImpl.java:1498)
        at Test.go(Test.java:37)
        at Test.main(Test.java:31)

They even call internalMemberName() followed by linkToStatic() just like in the output shown in this issue's initial comment. (The signatures here are a bit different just because I used a static field of type int instead of an array element of type long.)

Fundamentally, this race condition results in the table entry getting written twice (or more, but for simplicity let's assume it's only twice.) With a long enough time between the first and second write, the first handle's LF can get customized. If we then compile a caller in which the VarHandle is known, and if the compiler derives a known object as the result of Invokers.checkVarHandleGenericType() (#18755), it can inline the customized LF based on that known object. The resulting code will continue to load the method handle table entry at run time, and before much longer that load will start to see the second write, but the inlined LF belongs to the MH from the first write. If system assertions are enabled (-esa), then assertSame() will detect the discrepancy.

The above repro stops failing if I comment out parts of #18755. One of the grinders that seemed to exonerate #18755 turns out to have hit a different failure instead, and the other has disappeared by now. @nbhuiyan is verifying now with some more tests whether or not any of these failures can still be observed without #18755.

Since it's possible there could be another bug at the root of this issue, here's an experiment we can try: vh-mhtable-cas.patch. This prevents the table entry (and the table reference itself) from being written more than once after it's initialized for the first time using atomic CAS. If this patch prevents the failures seen in this issue, then I think we can be reasonably confident that the problem here is the interaction described above between the race condition and the JIT's known object handling.

I think the best way to fix this race condition/known object interaction would be #16616. In the meantime, though, we could potentially apply the above CAS patch to prevent the method handle table and its entries from changing once initialized. Otherwise, I think we'd have to make sure not to derive known object indices for method handle table entries, and probably also for the method handle table itself.

jdmpapin avatar Feb 21 '24 22:02 jdmpapin

Great analysis @jdmpapin!

0xdaryl avatar Feb 22 '24 12:02 0xdaryl

I believe the race condition that @jdmpapin described in his analysis is the root cause of the test failure. With my latest Grinder runs, I have not been able to reproduce the failure without #18755.

Using vh-mhtable-cas.patch, 20X Grinder passed. I also ran another 20X Grinder without the CAS patch that had 3 of the MH mismatch failures.

As there are no straightforward ways to deal with this race condition within the JIT's control, I think we should go with modifying the getMethodHandle code in the JCL to use CAS to prevent overwrites of the MH table as described by Devin.

nbhuiyan avatar Feb 22 '24 15:02 nbhuiyan

@TobiAjila @babsingh : any opinions on Devin's proposed approach to modify getMethodHandle ?

0xdaryl avatar Feb 22 '24 16:02 0xdaryl