openj9 icon indicating copy to clipboard operation
openj9 copied to clipboard

jdk_foreign_0_FAILED - java/foreign/TestHandshake.java

Open JasonFengJ9 opened this issue 4 years ago • 58 comments

Failure link

https://openj9-jenkins.osuosl.org/job/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/55/consoleFull

22:09:40  openjdk version "16.0.1-internal" 2021-04-20
22:09:40  OpenJDK Runtime Environment (build 16.0.1-internal+0-adhoc.****.BuildJDK16ppc64aixNightly)
22:09:40  Eclipse OpenJ9 VM (build master-074746f7710, JRE 16 AIX ppc64-64-Bit Compressed References 20210718_55 (JIT enabled, AOT enabled)
22:09:40  OpenJ9   - 074746f7710
22:09:40  OMR      - 88427d6420f
22:09:40  JCL      - 5dd08a5cc3f based on jdk-16.0.1+9)

Rerun in Grinder

Optional info

Failure output (captured from console output)

06:14:39  ===============================================
06:14:39  java/foreign/TestHandshake.java
06:14:39  Total tests run: 6, Failures: 1, Skips: 0
06:14:39  ===============================================
06:14:39  

06:14:39  ...
06:14:39  Output overflow:
06:14:39  JT Harness has limited the test output to the text
06:14:39  at the beginning and the end, so that you can see how the
06:14:39  test began, and how it completed.
06:14:39  
06:14:39  If you need to see more of the output from the test,
06:14:39  set the system property javatest.maxOutputSize to a higher
06:14:39  value. The current value is 100000
06:14:39  ...

06:14:39  STDERR:
06:14:39  WARNING: Using incubator modules: jdk.incubator.foreign
06:14:39  java.lang.Exception: failures: 1
06:14:39  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:96)
06:14:39  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:54)
06:14:39  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
06:14:39  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
06:14:39  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
06:14:39  	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
06:14:39  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
06:14:39  	at java.base/java.lang.Thread.run(Thread.java:883)
06:14:39  
06:14:39  JavaTest Message: Test threw exception: java.lang.Exception: failures: 1

06:14:39  rerun:
06:14:39  cd /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/scratch && \
06:14:39  DISPLAY=:0 \
06:14:39  HOME=/home/jenkins \
06:14:39  LANG=en_US \
06:14:39  PATH=/bin:/usr/bin:/usr/sbin \
06:14:39  TZ=EST5EDT \
06:14:39  CLASSPATH=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/classes/java/foreign/TestHandshake.d:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk/java/foreign:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/jvmtest/openjdk/jtreg/lib/testng.jar:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/jvmtest/openjdk/jtreg/lib/jcommander.jar:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/jvmtest/openjdk/jtreg/lib/javatest.jar:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/jvmtest/openjdk/jtreg/lib/jtreg.jar \
06:14:39      /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/openjdkbinary/j2sdk-image/bin/java \
06:14:39          -Dtest.vm.opts='-ea -esa -Xmx512m --add-modules jdk.incubator.foreign -XX:+UseCompressedOops' \
06:14:39          -Dtest.tool.vm.opts='-J-ea -J-esa -J-Xmx512m -J--add-modules -Jjdk.incubator.foreign -J-XX:+UseCompressedOops' \
06:14:39          -Dtest.compiler.opts= \
06:14:39          -Dtest.java.opts= \
06:14:39          -Dtest.jdk=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/openjdkbinary/j2sdk-image \
06:14:39          -Dcompile.jdk=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/openjdkbinary/j2sdk-image \
06:14:39          -Dtest.timeout.factor=8.0 \
06:14:39          -Dtest.nativepath=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/openjdkbinary/openjdk-test-image/jdk/jtreg/native \
06:14:39          -Dtest.root=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk \
06:14:39          -Dtest.name=java/foreign/TestHandshake.java \
06:14:39          -Dtest.file=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk/java/foreign/TestHandshake.java \
06:14:39          -Dtest.src=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk/java/foreign \
06:14:39          -Dtest.src.path=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk/java/foreign \
06:14:39          -Dtest.classes=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/classes/java/foreign/TestHandshake.d \
06:14:39          -Dtest.class.path=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/classes/java/foreign/TestHandshake.d \
06:14:39          -Dtest.class.path.prefix=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/classes/java/foreign/TestHandshake.d:/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/openjdk/openjdk-jdk/test/jdk/java/foreign \
06:14:39          -Dtest.modules='jdk.incubator.foreign java.base/jdk.internal.vm.annotation java.base/jdk.internal.misc' \
06:14:39          --add-modules jdk.incubator.foreign,java.base \
06:14:39          --add-exports java.base/jdk.internal.vm.annotation=ALL-UNNAMED \
06:14:39          --add-exports java.base/jdk.internal.misc=ALL-UNNAMED \
06:14:39          -ea \
06:14:39          -esa \
06:14:39          -Xmx512m \
06:14:39          -XX:+UseCompressedOops \
06:14:39          -Djava.library.path=/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/openjdkbinary/openjdk-test-image/jdk/jtreg/native \
06:14:39          -Xint \
06:14:39          com.sun.javatest.regtest.agent.MainWrapper /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work/java/foreign/TestHandshake.d/testng.1.jta java/foreign/TestHandshake.java false TestHandshake
06:14:39  
06:14:39  TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Exception: failures: 1
06:14:39  --------------------------------------------------
06:15:34  Test results: passed: 30; failed: 1
06:15:43  Report written to /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/jvmtest/openjdk/report/html/report.html
06:15:43  Results written to /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Nightly/aqa-tests/TKG/output_16266607888323/jdk_foreign_0/work
06:15:43  Error: Some tests failed or other problems occurred.
06:15:43  
06:15:43  jdk_foreign_0_FAILED

JasonFengJ9 avatar Jul 19 '21 12:07 JasonFengJ9

A slightly different error at JDK16 0.27 release build https://openj9-jenkins.osuosl.org/job/Test_openjdk16_j9_sanity.openjdk_x86-64_windows_Release/6/consoleFull

08:31:49  openjdk version "16.0.2-internal" 2021-07-21
08:31:49  OpenJDK Runtime Environment (build 16.0.2-internal+0-adhoc.****.buildjdk16x86-64windowsrelease)
08:31:49  Eclipse OpenJ9 VM (build openj9-0.27.0, JRE 16 Windows Server 2012 R2 amd64-64-Bit Compressed References 20210722_7 (JIT enabled, AOT enabled)
08:31:49  OpenJ9   - 1851b0074f8
08:31:49  OMR      - 9db1c87
08:31:49  JCL      - 50b0cc91164 based on jdk-16.0.2+3)

Rerun in Grinder

09:26:56  --------------------------------------------------
09:26:56  TEST: java/foreign/TestHandshake.java

09:26:56  STDERR:
09:26:56  WARNING: Using incubator modules: jdk.incubator.foreign
09:26:56  java.lang.NullPointerException
09:26:56  	at java.base/jdk.internal.util.ArraysSupport.vectorizedMismatch(ArraysSupport.java)
09:26:56  	at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismatchInternal(ScopedMemoryAccess.java:236)
09:26:56  	at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismatch(ScopedMemoryAccess.java:217)
09:26:56  	at jdk.incubator.foreign/jdk.internal.foreign.AbstractMemorySegmentImpl.vectorizedMismatchLargeForBytes(AbstractMemorySegmentImpl.java:203)
09:26:56  	at jdk.incubator.foreign/jdk.internal.foreign.AbstractMemorySegmentImpl.mismatch(AbstractMemorySegmentImpl.java:166)
09:26:56  	at TestHandshake$SegmentMismatchAccessor.doAccess(TestHandshake.java:189)
09:26:56  	at TestHandshake$AbstractSegmentAccessor.run(TestHandshake.java:94)
09:26:56  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
09:26:56  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
09:26:56  	at java.base/java.lang.Thread.run(Thread.java:883)
09:26:56  STATUS:Failed.`main' threw exception: java.lang.NullPointerException

JasonFengJ9 avatar Jul 22 '21 14:07 JasonFengJ9

Another one at JDK 16 0.27 release build https://openj9-jenkins.osuosl.org/job/Test_openjdk16_j9_sanity.openjdk_ppc64le_linux_Release/6/consoleFull

11:03:11  test TestHandshake.testHandshake("SegmentMismatchAccessor", TestHandshake$$Lambda$30/0x000000006ece8dc0@f523f37a): failure
11:03:11  java.lang.AssertionError: expected [true] but found [false]
11:03:11  	at org.testng.Assert.fail(Assert.java:94)
11:03:11  	at org.testng.Assert.failNotEquals(Assert.java:496)
11:03:11  	at org.testng.Assert.assertTrue(Assert.java:42)
11:03:11  	at org.testng.Assert.assertTrue(Assert.java:52)
11:03:11  	at TestHandshake.testHandshake(TestHandshake.java:76)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
11:03:11  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
11:03:11  	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
11:03:11  	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85)
11:03:11  	at org.testng.internal.Invoker.invokeMethod(Invoker.java:639)
11:03:11  	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:821)
11:03:11  	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1131)
11:03:11  	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
11:03:11  	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108)
11:03:11  	at org.testng.TestRunner.privateRun(TestRunner.java:773)
11:03:11  	at org.testng.TestRunner.run(TestRunner.java:623)
11:03:11  	at org.testng.SuiteRunner.runTest(SuiteRunner.java:357)
11:03:11  	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:352)
11:03:11  	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:310)
11:03:11  	at org.testng.SuiteRunner.run(SuiteRunner.java:259)
11:03:11  	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
11:03:11  	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
11:03:11  	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1185)
11:03:11  	at org.testng.TestNG.runSuitesLocally(TestNG.java:1110)
11:03:11  	at org.testng.TestNG.run(TestNG.java:1018)
11:03:11  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:94)
11:03:11  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:54)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
11:03:11  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
11:03:11  	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
11:03:11  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
11:03:11  	at java.base/java.lang.Thread.run(Thread.java:883)

11:03:11  ===============================================
11:03:11  java/foreign/TestHandshake.java
11:03:11  Total tests run: 6, Failures: 1, Skips: 0
11:03:11  ===============================================
11:03:11  
11:03:11  STDERR:
11:03:11  WARNING: Using incubator modules: jdk.incubator.foreign
11:03:11  java.lang.Exception: failures: 1
11:03:11  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:96)
11:03:11  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:54)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
11:03:11  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
11:03:11  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
11:03:11  	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
11:03:11  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
11:03:11  	at java.base/java.lang.Thread.run(Thread.java:883)
11:03:11  
11:03:11  JavaTest Message: Test threw exception: java.lang.Exception: failures: 1

11:04:20  jdk_foreign_0_FAILED

JasonFengJ9 avatar Jul 22 '21 15:07 JasonFengJ9

@EricYangIBM This may be related to https://github.com/eclipse-openj9/openj9/issues/13234

tajila avatar Jul 26 '21 20:07 tajila

I believe the issue is that https://github.com/ibmruntimes/openj9-openjdk-jdk16/blob/openj9/test/jdk/java/foreign/TestHandshake.java#L242 keeps throwing and looping without ever successfully closing the segment because there is always another thread accessing that segment. I'm not sure why it only happens for the SegmentMismatchAccessor variation, I think it is because ScopedMemoryAccess.vectorizedMismatch takes longer and therefore gives fewer opportunities to close the segment.

Would having closeScope0 interrupt threads that are accessing the scope during close with an exception increase the success rate of close? I get the feeling that this is an issue with the test since it is a try until success - there is a probability that close will always happen during segment access.

Also, I am intermittently getting this failure locally. Do the release builds run the test over multiple iterations?

EricYangIBM avatar Jul 27 '21 20:07 EricYangIBM

Release builds don't do anything special, they just run the test.

pshipton avatar Jul 27 '21 20:07 pshipton

Can you set MAX_EXECUTOR_WAIT_SECONDS to a longer timeout to see if that fixes the issue.

tajila avatar Jul 28 '21 12:07 tajila

Can you set MAX_EXECUTOR_WAIT_SECONDS to a longer timeout to see if that fixes the issue.

Testing that right now, but I think it will. https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/17169/console shows SegmentMismatchAccessor iterations that successfully close the segment (e.g. at 348 ms, 5049 ms for iteration 0 and 1 for the last grinder iteration). If the timeout is increased I think there will be more successful SegmentMismatchAccessor iterations.

EricYangIBM avatar Jul 28 '21 14:07 EricYangIBM

I think the problem is that SegmentMismatchAccessor takes too long (it compares two 1 000 000 byte native memory segments) which prevents the main thread from obtaining the lock. In https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/17198/consoleFull you can see that an accessor (thread) often holds the lock for 2000 ms. In the hotspot implementation closeScope interrupts accessor threads so the closer thread can immediately acquire the lock. Our implementation doesn't do this so we get the long hold times.

EricYangIBM avatar Jul 28 '21 17:07 EricYangIBM

Update: Replacing https://github.com/ibmruntimes/openj9-openjdk-jdk16/blob/openj9/test/jdk/java/foreign/TestHandshake.java#L76 with

while (!accessExecutor.isTerminated()) {
    Thread.sleep(5000);
}

eventually results in the test passing (locally on x86-64 linux). I will open a PR to exclude this test and an issue to add interrupts to closeScope0.

EricYangIBM avatar Jul 28 '21 18:07 EricYangIBM

The AssertionError is still observed in latest internal run Test_openjdk16_j9_sanity.openjdk_s390x_linux/72/

openjdk version "16.0.2" 2021-07-20
IBM Semeru Runtime Open Edition 16.0.2.0 (build 16.0.2+7-202108050408)
Eclipse OpenJ9 VM 16.0.2.0 (build master-9af2e37a0, JRE 16 Linux s390x-64-Bit Compressed References 20210804_68 (JIT enabled, AOT enabled)
OpenJ9   - 9af2e37a0
OMR      - 45440608c
JCL      - ee030f1f04 based on jdk-16.0.2+7)

Rerun in Grinder

[2021-08-05T07:38:25.508Z] Running test jdk_foreign_0 ...
[2021-08-05T07:38:25.508Z] ===============================================
[2021-08-05T07:38:25.508Z] jdk_foreign_0 Start Time: Thu Aug  5 00:38:25 2021 Epoch Time (ms): 1628149105425
[2021-08-05T07:38:25.508Z] "/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_s390x_linux/openjdkbinary/j2sdk-image/bin/java" -Xshareclasses:destroyAll; "/home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_s390x_linux/openjdkbinary/j2sdk-image/bin/java" -Xshareclasses:groupAccess,destroyAll; echo "cache cleanup done";
[2021-08-05T07:38:26.308Z] JVMSHRC005I No shared class caches available
[2021-08-05T07:38:26.308Z] JVMSHRC005I No shared class caches available
[2021-08-05T07:38:26.308Z] cache cleanup done
[2021-08-05T07:38:26.308Z] variation: Mode150
[2021-08-05T07:38:26.308Z] JVM_OPTIONS:  -XX:+UseCompressedOops 

[2021-08-05T07:39:40.416Z] --------------------------------------------------
[2021-08-05T07:39:40.416Z] TEST: java/foreign/TestHandshake.java

[2021-08-05T07:39:40.429Z] Segment closed - delay (ms): 10392
[2021-08-05T07:39:40.429Z] test TestHandshake.testHandshake("SegmentMismatchAccessor", TestHandshake$$Lambda$30/0x0000000084e26d88@e3103462): failure
[2021-08-05T07:39:40.429Z] java.lang.AssertionError: expected [true] but found [false]
[2021-08-05T07:39:40.429Z] 	at org.testng.Assert.fail(Assert.java:94)
[2021-08-05T07:39:40.429Z] 	at org.testng.Assert.failNotEquals(Assert.java:496)
[2021-08-05T07:39:40.429Z] 	at org.testng.Assert.assertTrue(Assert.java:42)
[2021-08-05T07:39:40.429Z] 	at org.testng.Assert.assertTrue(Assert.java:52)
[2021-08-05T07:39:40.429Z] 	at TestHandshake.testHandshake(TestHandshake.java:76)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2021-08-05T07:39:40.429Z] 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:85)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.Invoker.invokeMethod(Invoker.java:639)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:821)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1131)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125)
[2021-08-05T07:39:40.429Z] 	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:108)
[2021-08-05T07:39:40.429Z] 	at org.testng.TestRunner.privateRun(TestRunner.java:773)
[2021-08-05T07:39:40.429Z] 	at org.testng.TestRunner.run(TestRunner.java:623)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunner.runTest(SuiteRunner.java:357)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:352)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunner.privateRun(SuiteRunner.java:310)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunner.run(SuiteRunner.java:259)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
[2021-08-05T07:39:40.429Z] 	at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
[2021-08-05T07:39:40.429Z] 	at org.testng.TestNG.runSuitesSequentially(TestNG.java:1185)
[2021-08-05T07:39:40.429Z] 	at org.testng.TestNG.runSuitesLocally(TestNG.java:1110)
[2021-08-05T07:39:40.429Z] 	at org.testng.TestNG.run(TestNG.java:1018)
[2021-08-05T07:39:40.429Z] 	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:94)
[2021-08-05T07:39:40.429Z] 	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:54)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
[2021-08-05T07:39:40.429Z] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2021-08-05T07:39:40.429Z] 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
[2021-08-05T07:39:40.429Z] 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2021-08-05T07:39:40.429Z] 	at java.base/java.lang.Thread.run(Thread.java:883)
[2021-08-05T07:39:40.429Z] ITERATION 0

[2021-08-05T07:39:40.452Z] ===============================================
[2021-08-05T07:39:40.452Z] java/foreign/TestHandshake.java
[2021-08-05T07:39:40.452Z] Total tests run: 6, Failures: 1, Skips: 0
[2021-08-05T07:39:40.452Z] ===============================================
[2021-08-05T07:39:40.452Z] 
[2021-08-05T07:39:40.452Z] STDERR:
[2021-08-05T07:39:40.452Z] WARNING: Using incubator modules: jdk.incubator.foreign
[2021-08-05T07:39:40.452Z] java.lang.Exception: failures: 1
[2021-08-05T07:39:40.452Z] 	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:96)
[2021-08-05T07:39:40.452Z] 	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:54)
[2021-08-05T07:39:40.452Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[2021-08-05T07:39:40.452Z] 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:78)
[2021-08-05T07:39:40.452Z] 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2021-08-05T07:39:40.452Z] 	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
[2021-08-05T07:39:40.452Z] 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
[2021-08-05T07:39:40.452Z] 	at java.base/java.lang.Thread.run(Thread.java:883)

[2021-08-05T07:39:40.452Z] TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Exception: failures: 1
[2021-08-05T07:39:40.452Z] --------------------------------------------------
[2021-08-05T07:39:59.221Z] Test results: passed: 30; failed: 1
[2021-08-05T07:40:05.038Z] Report written to /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_s390x_linux/jvmtest/openjdk/report/html/report.html
[2021-08-05T07:40:05.038Z] Results written to /home/jenkins/workspace/Test_openjdk16_j9_sanity.openjdk_s390x_linux/aqa-tests/TKG/output_16281397277782/jdk_foreign_0/work
[2021-08-05T07:40:05.038Z] Error: Some tests failed or other problems occurred.
[2021-08-05T07:40:05.038Z] 
[2021-08-05T07:40:05.038Z] jdk_foreign_0_FAILED

fyi @llxia

JasonFengJ9 avatar Aug 06 '21 21:08 JasonFengJ9

Seems the test was excluded after the build which failed.

pshipton avatar Aug 09 '21 17:08 pshipton

The test in question has been excluded, sanity.openjdk is green w/ a recent JDK16 AIX build at https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk16_j9_sanity.openjdk_ppc64_aix_Personal/4/consoleFull

08:38:16  openjdk version "16.0.2-internal" 2021-07-20
08:38:16  OpenJDK Runtime Environment (build 16.0.2-internal+0-adhoc.jenkins.BuildJDK16ppc64aixPersonal)
08:38:16  Eclipse OpenJ9 VM (build master-cd8db74f1d4, JRE 16 AIX ppc64-64-Bit Compressed References 20210819_35 (JIT enabled, AOT enabled)
08:38:16  OpenJ9   - cd8db74f1d4
08:38:16  OMR      - ac3ab4821a5
08:38:16  JCL      - 44316ee515b based on jdk-16.0.2+7)

Moving the issue to 0.29.

JasonFengJ9 avatar Aug 19 '21 19:08 JasonFengJ9

Runtimes for TestHandShake: https://github.com/eclipse-openj9/openj9/issues/15204#issuecomment-1155347134

ExceutorService:

Step 1: Launch 10 x Threads {
	outer: while (MemorySegment.session().isAlive()) {}
	    try {
                doAccess(MemorySegment)
	    } catch (IllegalStateException) {
                //suspending - elapased ... ms
		Thread.sleep X ms
		//resuming - elapased ... ms
		continue outer
	    }
	}
	//terminated - elapsed ... ms
}

Step 2: Thread.sleep Y ms

Step 3: Launch Thread {
	while (true) {
	    try {
                MemorySession.close()
	    } catch (IllegalStateException) {
                Thread.onSpinWait
            }
	}
	//Segment closed - elapsed ... ms
}
  • In the old implementation, Step 3 (closeScope) would have interrupted all threads accessing the shared memory segment. This would have thrown an IllegalStateException in the threads launched in Step 1. While the Step 1 threads are in the catch block sleeping for X ms, this would have allowed Step 3 to easily succeed.
  • In the new implementation, Step 3 (closeScope) does not interrupt accessor threads with an IllegalStateException. closeScope will only succeed if no thread is accessing the shared memory segment. Otherwise, an IllegalStateException will ONLY be thrown in the thread that has invoked closeScope.

From the perspective of the new impl:

  • In Step 1, the threads repeatedly access the shared memory segment in a while loop until closeScope succeeds i.e. the memory session is no longer alive.
  • In Step 3, the thread can only successfully closeScope if no threads from Step 1 are executing doAccess i.e. accessing the memory segment.
  • Steps 1 and 3 can end up creating an INFINITE loop if Step 3 is unable to succeed which depends on NO threads from Step 1 to ACCESS the memory segment.
  • If doAccess takes longer time as seen for SegmentMismatchAccessor due to lack of intrinsic support, then chances of Step 3 to succeed further reduce. closeScope will fail many times before satisfying the success condition i.e. no threads from Step 1 should access the memory segment.
  • Due to the INFINITE loop, the test takes longer runtimes.

In order to remove the INFINITE loop, the test needs to restrict the number of times the memory segment is accessed in Step 1 while still creating collisions between memory segment accesses (Step 1: doAccess) and closeScope (Step 3).

There are also alternatives: i) remove timeouts for the test; or ii) exclude the test since it is technically invalid due to the INFINITE loop.

babsingh avatar Jun 15 '22 20:06 babsingh

The issue with vectorizedMismatch() is still detected in JDK19 on Windows as follows:

...
ITERATION 3
Started first thread: "Accessor #" + id ; elapsed (ms): 44
Starting handshaker with delay set to 126 millis
STDERR:
java.lang.NullPointerException
        at java.base/jdk.internal.util.ArraysSupport.vectorizedMismatch(Apport.java:126)
java.lang.NullPointerException
        at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismarnal(ScopedMemoryAccess.java:232)
        at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismapedMemoryAccess.java:213)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.vectosmatchLargeForBytes(AbstractMemorySegmentImpl.java:208)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.mismatractMemorySegmentImpl.java:171)
        at TestHandshake$SegmentMismatchAccessor.doAccess(TestHandshake.j)
        at TestHandshake$AbstractSegmentAccessor.run(TestHandshake.java:1
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThlExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(TolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1573)
STATUS:Failed.`main' threw exception: java.lang.NullPointerException
        at java.base/jdk.internal.util.ArraysSupport.vectorizedMismatch(Apport.java:126)
        at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismarnal(ScopedMemoryAccess.java:232)
        at java.base/jdk.internal.misc.ScopedMemoryAccess.vectorizedMismapedMemoryAccess.java:213)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.vectosmatchLargeForBytes(AbstractMemorySegmentImpl.java:208)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.mismatractMemorySegmentImpl.java:171)
        at TestHandshake$SegmentMismatchAccessor.doAccess(TestHandshake.j)
        at TestHandshake$AbstractSegmentAccessor.run(TestHandshake.java:1
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThlExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(TolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1573)
STATUS:Failed.`main' threw exception: java.lang.NullPointerException

Given this issue has not yet been resolved via https://github.com/eclipse-openj9/openj9/issues/15204, it still needs to be excluded for the moment. So far, this issue is never spotted on other platforms except Windows.

ChengJin01 avatar Nov 01 '22 03:11 ChengJin01

@ChengJin01 is there any update on this? Will it be resolved in 0.44 or should it move out?

pshipton avatar Feb 23 '24 21:02 pshipton

@ChengJin01 is there any update on this? Will it be resolved in 0.44 or should it move out?

@jdmpapin, I notice the issue related to vectorizedMismatch at https://github.com/eclipse-openj9/openj9/issues/15204 was resolved via https://github.com/eclipse-openj9/openj9/pull/16662 in 0.43. If so, there should be no problem with this test suite, am I correct?

ChengJin01 avatar Feb 23 '24 21:02 ChengJin01

I'm not familiar enough at the moment to understand the reason for the change in behaviour described in https://github.com/eclipse-openj9/openj9/pull/15161#issuecomment-1143061836, whereby previously closeScope() would cause accessors to fail, but now accessors will cause close() to fail instead. I'm guessing this was done to match an upstream change, even though https://github.com/eclipse-openj9/openj9/issues/13211#issuecomment-888491766 suggests that HotSpot matches our old behaviour. (If not, then we have a difference in behaviour that introduces accidental infinite loops :open_mouth:)

Based on https://github.com/eclipse-openj9/openj9/issues/13211#issuecomment-1156886243, my understanding is that with a faster vectorizedMismatch this test will have fewer spurious failures - perhaps almost none in practice - but it remains fundamentally broken in that an infinite loop is still possible.

jdmpapin avatar Feb 23 '24 21:02 jdmpapin

I ran a 5x grinder on jdk17,jdk21,jdk22 on each AIX, Windows, pLinux, zLinux. Two of these failed. Windows jdk22 crashed. I'll move this issue into the Java 22 milestone to investigate the crash.

https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3340 - plinux jdk17, iteration 2 The failure isn't shown in the test output due to an Output overflow.

https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3339 - Windows jdk22, iteration 1

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder/3339/openjdk_test_output.tar.gz

14:51:13  Type=Segmentation error vmState=0x00000000
14:51:13  Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFBF3D5E6E8 ContextFlags=0010005f
14:51:13  Handler1=00007FFBF3E20CF0 Handler2=00007FFBF40AABA0 InaccessibleReadAddress=0000020793C54838
14:51:13  RDI=0000003A787FF858 RSI=0000003A787FF860 RAX=0000020793C54838 RBX=0000003A787FF860
14:51:13  RCX=0000000000387100 RDX=000000000034FEB8 R8=000000000008B430 R9=000000000000000F
14:51:13  R10=0000000000000000 R11=0000020780085F7A R12=0000003A787FF850 R13=0000003A787FF848
14:51:13  R14=0000000000387100 R15=0000000000000000
14:51:13  RIP=00007FFBF3D5E6E8 RSP=0000003A787FF370 RBP=0000000000387100 EFLAGS=0000000000010246
14:51:13  FS=0053 ES=002B DS=002B
14:51:13  Unhandled exception
14:51:13  Type=Segmentation error vmState=0x00000000
14:51:13  Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFBF3D5E6E8 ContextFlags=0010005f
14:51:13  Handler1=00007FFBF3E20CF0 Handler2=00007FFBF40AABA0 InaccessibleReadAddress=0000020793D3FCA8
14:51:13  RDI=0000003A788FF458 RSI=0000003A788FF460 RAX=0000020793D3FCA8 RBX=0000003A788FF460
14:51:13  RCX=000000000035D900 RDX=00000000003591B8 R8=0000000000000001 R9=0000000000000018
14:51:13  R10=0000000000000000 R11=0000020780085F7A R12=0000003A788FF450 R13=0000003A788FF448
14:51:13  R14=000000000035D900 R15=0000000000000000
14:51:13  RIP=00007FFBF3D5E6E8 RSP=0000003A788FEF70 RBP=000000000035D900 EFLAGS=0000000000010246
14:51:13  FS=0053 ES=002B DS=002B
14:51:13  Unhandled exception
14:51:13  Type=Segmentation error vmState=0x00000000
14:51:13  Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFBF3D5E6E8 ContextFlags=0010005f
14:51:13  Handler1=00007FFBF3E20CF0 Handler2=00007FFBF40AABA0 InaccessibleReadAddress=0000020793B7AE90
14:51:13  RDI=0000003A789FF328 RSI=0000003A789FF330 RAX=0000020793B7AE90 RBX=0000003A789FF330
14:51:13  RCX=000000000034AE00 RDX=000000000035C208 R8=0000000000000001 R9=0000000000000018
14:51:13  R10=0000000000000000 R11=0000020780085F7A R12=0000003A789FF320 R13=0000003A789FF318
14:51:13  R14=000000000034AE00 R15=0000000000000000
14:51:13  RIP=00007FFBF3D5E6E8 RSP=0000003A789FEE40 RBP=000000000034AE00 EFLAGS=0000000000010246
14:51:13  FS=0053 ES=002B DS=002B
14:51:13  Unhandled exception
14:51:13  Type=Segmentation error vmState=0x00000000
14:51:13  Windows_ExceptionCode=c0000005 J9Generic_Signal=00000004 ExceptionAddress=00007FFBF3D5E6E8 ContextFlags=0010005f
14:51:13  Handler1=00007FFBF3E20CF0 Handler2=00007FFBF40AABA0 InaccessibleReadAddress=0000020793F24B78
14:51:13  RDI=0000003A78AFF6C8 RSI=0000003A78AFF6D0 RAX=0000020793F24B78 RBX=0000003A78AFF6D0
14:51:13  RCX=0000000000355C00 RDX=0000000000360EC8 R8=0000000000000001 R9=0000000000000018
14:51:13  R10=0000000000000000 R11=0000020780085F7A R12=0000003A78AFF6C0 R13=0000003A78AFF6B8
14:51:13  R14=0000000000355C00 R15=0000000000000000
14:51:13  RIP=00007FFBF3D5E6E8 RSP=0000003A78AFF1E0 RBP=0000000000355C00 EFLAGS=0000000000010246
14:51:13  FS=0053 ES=002B DS=002B
14:51:13  XMM0 0000020780085f7a (f: 2148032256.000000, d: 1.102377e-311)
14:51:13  XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM6 0000000000360ef0 (f: 3542768.000000, d: 1.750360e-317)
14:51:13  XMM7 0000000000360ef0 (f: 3542768.000000, d: 1.750360e-317)
14:51:13  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  Module=C:\Users\jenkins\workspace\Grinder\jdkbinary\j2sdk-image\bin\default\j9vm29.dll
14:51:13  Module_base_address=00007FFBF3D50000 Offset_in_DLL=000000000000e6e8
14:51:13  Target=2_90_20240223_22 (Windows Server 2019 10.0 build 17763)
14:51:13  CPU=amd64 (4 logical CPUs) (0x3fff77000 RAM)
14:51:13  ----------- Stack Backtrace -----------
14:51:13  XMM0 0000020780085f7a (f: 2148032256.000000, d: 1.102377e-311)
14:51:13  XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM6 000000000035c230 (f: 3523120.000000, d: 1.740653e-317)
14:51:13  XMM7 000000000035c230 (f: 3523120.000000, d: 1.740653e-317)
14:51:13  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  Module=C:\Users\jenkins\workspace\Grinder\jdkbinary\j2sdk-image\bin\default\j9vm29.dll
14:51:13  Module_base_address=00007FFBF3D50000 Offset_in_DLL=000000000000e6e8
14:51:13  Target=2_90_20240223_22 (Windows Server 2019 10.0 build 17763)
14:51:13  CPU=amd64 (4 logical CPUs) (0x3fff77000 RAM)
14:51:13  ----------- Stack Backtrace -----------
14:51:13  XMM0 0000020780085f7a (f: 2148032256.000000, d: 1.102377e-311)
14:51:13  XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM6 00000000003591e0 (f: 3510752.000000, d: 1.734542e-317)
14:51:13  XMM7 00000000003591e0 (f: 3510752.000000, d: 1.734542e-317)
14:51:13  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  Module=C:\Users\jenkins\workspace\Grinder\jdkbinary\j2sdk-image\bin\default\j9vm29.dll
14:51:13  Module_base_address=00007FFBF3D50000 Offset_in_DLL=000000000000e6e8
14:51:13  Target=2_90_20240223_22 (Windows Server 2019 10.0 build 17763)
14:51:13  CPU=amd64 (4 logical CPUs) (0x3fff77000 RAM)
14:51:13  ----------- Stack Backtrace -----------
14:51:13  XMM0 0000020780085f7a (f: 2148032256.000000, d: 1.102377e-311)
14:51:13  XMM1 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM3 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM5 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM6 000000000034fee0 (f: 3473120.000000, d: 1.715949e-317)
14:51:13  XMM7 000000000034fee0 (f: 3473120.000000, d: 1.715949e-317)
14:51:13  XMM8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
14:51:13  Module=C:\Users\jenkins\workspace\Grinder\jdkbinary\j2sdk-image\bin\default\j9vm29.dll
14:51:13  Module_base_address=00007FFBF3D50000 Offset_in_DLL=000000000000e6e8
14:51:13  Target=2_90_20240223_22 (Windows Server 2019 10.0 build 17763)
14:51:13  CPU=amd64 (4 logical CPUs) (0x3fff77000 RAM)
14:51:13  ----------- Stack Backtrace -----------
14:51:13  (0x00007FFBF3D5E6E8 [j9vm29+0xe6e8])
14:51:13  (0x0000003A78AFF6C0)
14:51:13  (0x000000000008BAB0)
14:51:13  (0x0000000000360F10)
14:51:13  (0x0000003A78AFF6D0)
14:51:13  (0x0000003A78AFF2C0)
14:51:13  (0x0000000100000001)
14:51:13  (0x000000000000000A)
14:51:13  (0x0000003A78AFF380)
14:51:13  (0x0000000000000002)
14:51:13  (0x0000003A78AFF6D0)
14:51:13  (0x0000003A78AFF6C8)
14:51:13  (0x0000000000360EF0)
14:51:13  (0x0000003A78AFF6B8)
14:51:13  (0x0000003A78AFF6E0)
14:51:13  (0x0000003A78AFF6C0)
14:51:13  ---------------------------------------

pshipton avatar Feb 26 '24 21:02 pshipton

It looks like the crash above at https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder/3339/openjdk_test_output.tar.gz was still related to vectorizedMismatch as follows:

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "pool-16-thread-3" J9VMThread:0x000000000034AE00, omrthread_t:0x0000020793AF5498, java/lang/Thread:0x00000000FFE4D300, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x5A, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x00000000E004D948)
3XMTHREADINFO1            (native thread ID:0x4D4, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00000020)
3XMCPUTIME               CPU usage total: 1.093750000 secs, user: 1.015625000 secs, system: 0.078125000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3           Java callstack:
4XESTACKTRACE                at jdk/internal/misc/Unsafe.getLong(Native Method)
4XESTACKTRACE                at jdk/internal/misc/Unsafe.getLongUnaligned(Unsafe.java:5540)
4XESTACKTRACE                at jdk/internal/util/ArraysSupport.vectorizedMismatch(ArraysSupport.java:130)
4XESTACKTRACE                at jdk/internal/misc/ScopedMemoryAccess.vectorizedMismatchInternal(ScopedMemoryAccess.java:232)
4XESTACKTRACE                at jdk/internal/misc/ScopedMemoryAccess.vectorizedMismatch(ScopedMemoryAccess.java:213)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.vectorizedMismatchLargeForBytes(AbstractMemorySegmentImpl.java:222)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.mismatch(AbstractMemorySegmentImpl.java:695)
4XESTACKTRACE                at java/lang/foreign/MemorySegment.mismatch(MemorySegment.java:2571)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.mismatch(AbstractMemorySegmentImpl.java:289)
4XESTACKTRACE                at TestHandshake$SegmentMismatchAccessor.doAccess(TestHandshake.java:232)
4XESTACKTRACE                at TestHandshake$AbstractSegmentAccessor.run(TestHandshake.java:106)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
4XESTACKTRACE                at java/lang/Thread.runWith(Thread.java:1595)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:1582)

ChengJin01 avatar Feb 26 '24 22:02 ChengJin01

Hi @jdmpapin, any input/update as to the test crash above related to vectorizedMismatch?

ChengJin01 avatar Mar 04 '24 21:03 ChengJin01

As discussed with Devin offline, the crash shouldn't happen whatever it is supposed to be given the intention of the test suite is to verify the lifetime of segment which is only related to the scopedMemoryAccess. Assuming the crash is unrelated to JIT, the only reason I can image is that the problem might still come from scopedMemoryAccess in which case the native access with unsafe occurred after the underlying memory segment was closed (so the crash was caused by the invalid access to the native memory).

ChengJin01 avatar Mar 04 '24 22:03 ChengJin01

Hi @babsingh, could you take a look at the dumps at https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder/3339/openjdk_test_output.tar.gz to how this happened via ScopedMemoryAccess given part of related native code exists in OpenJ9?

ChengJin01 avatar Mar 04 '24 22:03 ChengJin01

could you take a look at the dumps

Taking a look ...

babsingh avatar Mar 06 '24 16:03 babsingh

In JDK22, the implementation of ScopedMemoryAccess.closeScope0 has changed; see https://github.com/ibmruntimes/openj9-openjdk-jdk22/commit/3dd9ec970a375e480e278f746cf84a1b52f1063e. The new implementation resolves the following issue: https://bugs.openjdk.org/browse/JDK-8319782.

In the Windows crash, memory operations are performed while the memory session/scope is closed. Memory operations should either be aborted by throwing an exception or not performed if the meomory session/scope is closed. To resolve the crash, we will need to update ScopedMemoryAccess.closeScope0 as per https://github.com/ibmruntimes/openj9-openjdk-jdk22/commit/3dd9ec970a375e480e278f746cf84a1b52f1063e.

babsingh avatar Mar 07 '24 23:03 babsingh

@pshipton, could you help run grinders to see whether the crash disappears given the fix at https://github.com/eclipse-openj9/openj9/pull/19167 was merged?

ChengJin01 avatar Mar 19 '24 16:03 ChengJin01

Windows 5x x 3 https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3406/ - passed 10x x 3 https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3410/ - some failures, see next comment plinux 10x x 3 https://openj9-jenkins.osuosl.org/view/Test/job/Grinder/3409/ - passed AIX 10x x 3 https://openj9-jenkins.osuosl.org/job/Grinder/3412 - passed zlinux 10x x 3 https://openj9-jenkins.osuosl.org/job/Grinder/3413 - passed

pshipton avatar Mar 19 '24 21:03 pshipton

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_0/455

18:51:42  JVMCDRT000E Unable to locate JIT stack map - aborting VM
18:51:42  JVMCDRT001E Method: jdk/internal/util/ArraysSupport.vectorizedMismatch(Ljava/lang/Object;JLjava/lang/Object;JII)I (00000000000AB030)
18:51:42  JVMCDRT002E Failing PC: 00007FF8E8640586 (offset 00000000000000DE), metaData = 000001D9135BB068
18:51:42  22:51:07.615 0x3a8f00j9codertvm(j9ji.110    *   ** ASSERTION FAILED ** at f:\users\jenkins\workspace\build_jdk22_x86-64_windows_nightly\openj9\runtime\codert_vm\jswalk.c:534: ((0 ))

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_0/455/openjdk_test_output.tar.gz


https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/456

18:36:26  ITERATION 4
18:36:26  Started first thread: "Accessor #" + id ; elapsed (ms): 0
18:36:26  Starting handshaker with delay set to 265 millis
18:36:26  test TestHandshake.testHandshake("SegmentCopyAccessor", TestHandshake$$Lambda/0x0000000070c23750@bc203ac): failure
18:36:26  java.lang.AssertionError: expected [true] but found [false]
18:36:26  	at org.testng.Assert.fail(Assert.java:99)
18:36:26  	at org.testng.Assert.failNotEquals(Assert.java:1037)
18:36:26  	at org.testng.Assert.assertTrue(Assert.java:45)
18:36:26  	at org.testng.Assert.assertTrue(Assert.java:55)
18:36:26  	at TestHandshake.testHandshake(TestHandshake.java:86)

pshipton avatar Mar 19 '24 22:03 pshipton

The javacore at https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_0/455/openjdk_test_output.tar.gz indicates ArraysSupport.vectorizedMismatch was still in JIT as follows (the assertion failure occurred in jitWalkFrame at openj9\runtime\codert_vm\jswalk.c:534):

1XMCURTHDINFO  Current thread
3XMTHREADINFO      "pool-16-thread-4" J9VMThread:0x00000000003A8F00, omrthread_t:0x000001D96EC686A0, java/lang/Thread:0x00000000FFE4B1E0, state:R, prio=5
3XMJAVALTHREAD            (java/lang/Thread getId:0x61, isDaemon:false)
3XMJAVALTHRCCL            jdk/internal/loader/ClassLoaders$AppClassLoader(0x00000000E0030988)
3XMTHREADINFO1            (native thread ID:0xA04, native priority:0x5, native policy:UNKNOWN, vmstate:R, vm thread flags:0x00100020)
3XMCPUTIME               CPU usage total: 0.265625000 secs, user: 0.203125000 secs, system: 0.062500000 secs, current category="Application"
3XMHEAPALLOC             Heap bytes allocated since last GC cycle=4096 (0x1000)
1INTERNAL                    Unable to obtain lock context information
3XMTHREADINFO3           Java callstack:
---> 4XESTACKTRACE                at jdk/internal/util/ArraysSupport.vectorizedMismatch(ArraysSupport.java(Compiled Code))
4XESTACKTRACE                at jdk/internal/misc/ScopedMemoryAccess.vectorizedMismatchInternal(ScopedMemoryAccess.java:232)
4XESTACKTRACE                at jdk/internal/misc/ScopedMemoryAccess.vectorizedMismatch(ScopedMemoryAccess.java:213)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.vectorizedMismatchLargeForBytes(AbstractMemorySegmentImpl.java:222)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.mismatch(AbstractMemorySegmentImpl.java:695)
4XESTACKTRACE                at java/lang/foreign/MemorySegment.mismatch(MemorySegment.java:2571)
4XESTACKTRACE                at jdk/internal/foreign/AbstractMemorySegmentImpl.mismatch(AbstractMemorySegmentImpl.java:289)
4XESTACKTRACE                at TestHandshake$SegmentMismatchAccessor.doAccess(TestHandshake.java:232)
4XESTACKTRACE                at TestHandshake$AbstractSegmentAccessor.run(TestHandshake.java:106(Compiled Code))
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144(Compiled Code))
4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
4XESTACKTRACE                at java/lang/Thread.runWith(Thread.java:1595)
4XESTACKTRACE                at java/lang/Thread.run(Thread.java:1582)

@jdmpapin, does it mean the work with ArraysSupport.vectorizedMismatch is not yet ready at https://github.com/eclipse-openj9/openj9/issues/16717 or anything associated with the fix at https://github.com/eclipse-openj9/openj9/pull/16662?

ChengJin01 avatar Mar 20 '24 02:03 ChengJin01

I don't think that off-heap is in use (see #14005), so #16717 should be irrelevant

I'm pretty sure the transformation from #16662 is not involved. If it were, then the caller (vectorizedMismatchInternal) would also have to be compiled. Additionally, when the transformation replaces the call with (mainly) arraycmplen, the replacement IL doesn't have a GC point, so we shouldn't be walking the stack while running it. It's possible that there could be an unexpected segfault during arraycmplen, which could be taken as the failure of an implicit null check and trigger stack walking, but if that were to happen, then ArraysSupport.vectorizedMismatch wouldn't appear on the stack in the javacore

jdmpapin avatar Mar 22 '24 20:03 jdmpapin

Is it possible anymore to get the build that was running in https://openj9-jenkins.osuosl.org/job/Grinder_iteration_0/455?

jdmpapin avatar Mar 28 '24 15:03 jdmpapin