
RISC-V test plan

Open luhenry opened this issue 1 year ago • 13 comments

Relates to https://github.com/adoptium/temurin-build/issues/3591

Action Items

  • [ ] Rework gnu.testlet.java.lang.Math and gnu.testlet.java.lang.StrictMath in Mauve directly; the tests are broken on platforms that have canonical NaNs
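
For context, the "got" value in every failing Mauve case listed under Test Failures, 9221120237041090560, is 0x7ff8000000000000, the canonical quiet NaN, while the expected values are NaNs with other payload or sign bits. Below is a minimal sketch of a comparison that ignores NaN payloads. The class and method names are hypothetical, and this is not the Mauve harness code, only one way such a check could look.

```java
// Hypothetical sketch; names are illustrative and not taken from Mauve.
class NanCompareSketch {
    /** Bit pattern of the canonical quiet NaN (Double.doubleToLongBits returns this for any NaN). */
    static final long CANONICAL_NAN_BITS = 0x7ff8000000000000L;

    /** Treats any two NaNs as equal; otherwise compares exact bit patterns (so 0.0 differs from -0.0). */
    static boolean sameResult(double expected, double actual) {
        if (Double.isNaN(expected) || Double.isNaN(actual)) {
            return Double.isNaN(expected) && Double.isNaN(actual);
        }
        return Double.doubleToRawLongBits(expected) == Double.doubleToRawLongBits(actual);
    }

    public static void main(String[] args) {
        long got = 9221120237041090560L;       // "got" value from the failure log
        long expected = 9223231299366420480L;  // first "expected" value from the failure log
        System.out.println(Long.toHexString(got));       // 7ff8000000000000
        System.out.println(Long.toHexString(expected));  // 7fff800000000000
        // Both bit patterns are NaNs, so a payload-insensitive comparison accepts them.
        System.out.println(sameResult(Double.longBitsToDouble(expected),
                                      Double.longBitsToDouble(got)));
    }
}
```

A comparison like this still distinguishes 0.0 from -0.0, so it only relaxes the check for NaN results.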

Test Failures

MiniMix_aot_5m
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 22)
LT  got 9221120237041090560 but expected 9223231299366420480
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 23)
LT  got 9221120237041090560 but expected -140737488355328
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 24)
LT  got 9221120237041090560 but expected 9223232550370790895
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 25)
LT  got 9221120237041090560 but expected -139486483984913
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 26)
LT  got 9221120237041090560 but expected 9223090561878065153
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 27)
LT  got 9221120237041090560 but expected -281474976710655
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 28)
LT  got 9221120237041090560 but expected 9223220665868348875
LT  FAIL: gnu.testlet.java.lang.StrictMath.tanh_strictfp (number 29)
LT  got 9221120237041090560 but expected -151370986426933
MiniMix_5m
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 25)
LT  got 9221120237041090560 but expected 9223231299366420480
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 26)
LT  got 9221120237041090560 but expected -140737488355328
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 27)
LT  got 9221120237041090560 but expected 9223232550370790895
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 28)
LT  got 9221120237041090560 but expected -139486483984913
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 29)
LT  got 9221120237041090560 but expected 9223090561878065153
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 30)
LT  got 9221120237041090560 but expected -281474976710655
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 31)
LT  got 9221120237041090560 but expected 9223220665868348875
LT  FAIL: gnu.testlet.java.lang.StrictMath.atan (number 32)
LT  got 9221120237041090560 but expected -151370986426933
MauveSingleThrdLoad_HS_5m
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 25)
LT  got 9221120237041090560 but expected 9223231299366420480
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 26)
LT  got 9221120237041090560 but expected -140737488355328
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 27)
LT  got 9221120237041090560 but expected 9223232550370790895
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 28)
LT  got 9221120237041090560 but expected -139486483984913
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 29)
LT  got 9221120237041090560 but expected 9223090561878065153
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 30)
LT  got 9221120237041090560 but expected -281474976710655
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 31)
LT  got 9221120237041090560 but expected 9223220665868348875
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 32)
LT  got 9221120237041090560 but expected -151370986426933
MauveSingleInvocLoad_HS_5m
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 25)
LT  got 9221120237041090560 but expected 9223231299366420480
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 26)
LT  got 9221120237041090560 but expected -140737488355328
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 27)
LT  got 9221120237041090560 but expected 9223232550370790895
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 28)
LT  got 9221120237041090560 but expected -139486483984913
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 29)
LT  got 9221120237041090560 but expected 9223090561878065153
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 30)
LT  got 9221120237041090560 but expected -281474976710655
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 31)
LT  got 9221120237041090560 but expected 9223220665868348875
LT  FAIL: gnu.testlet.java.lang.Math.acos (number 32)
LT  got 9221120237041090560 but expected -151370986426933
MauveMultiThrdLoad_5m
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 25)
LT  got 9221120237041090560 but expected 9223231299366420480
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 26)
LT  got 9221120237041090560 but expected -140737488355328
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 27)
LT  got 9221120237041090560 but expected 9223232550370790895
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 28)
LT  got 9221120237041090560 but expected -139486483984913
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 29)
LT  got 9221120237041090560 but expected 9223090561878065153
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 30)
LT  got 9221120237041090560 but expected -281474976710655
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 31)
LT  got 9221120237041090560 but expected 9223220665868348875
LT  FAIL: gnu.testlet.java.lang.StrictMath.acos_strictfp (number 32)
LT  got 9221120237041090560 but expected -151370986426933
ConcurrentLoadTest_5m
LT  testFailure: testAPI(net.adoptopenjdk.test.concurrent.atomic.AtomicLongArrayTest): 30 : weakCompareAndSet() expected:<true> but was:<false>
LT  junit.framework.AssertionFailedError: 30 : weakCompareAndSet() expected:<true> but was:<false>
LT  	at junit.framework.Assert.fail(Assert.java:57)
LT  	at junit.framework.Assert.failNotEquals(Assert.java:329)
LT  	at junit.framework.Assert.assertEquals(Assert.java:78)
LT  	at junit.framework.Assert.assertEquals(Assert.java:174)
LT  	at junit.framework.TestCase.assertEquals(TestCase.java:333)
LT  	at net.adoptopenjdk.test.concurrent.atomic.AtomicLongArrayTest.testAPI(AtomicLongArrayTest.java:140)
LT  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
LT  	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
LT  	at junit.framework.TestCase.runTest(TestCase.java:176)
LT  	at junit.framework.TestCase.runBare(TestCase.java:141)
LT  	at junit.framework.TestResult$1.protect(TestResult.java:122)
LT  	at junit.framework.TestResult.runProtected(TestResult.java:142)
LT  	at junit.framework.TestResult.run(TestResult.java:125)
LT  	at junit.framework.TestCase.run(TestCase.java:129)
LT  	at junit.framework.TestSuite.runTest(TestSuite.java:252)
LT  	at junit.framework.TestSuite.run(TestSuite.java:247)
LT  	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:86)
LT  	at org.junit.runners.Suite.runChild(Suite.java:128)
LT  	at org.junit.runners.Suite.runChild(Suite.java:27)
LT  	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
LT  	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
LT  	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
LT  	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
LT  	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
LT  	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
LT  	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
LT  	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
LT  	at net.adoptopenjdk.loadTest.adaptors.JUnitAdaptor.executeTest(JUnitAdaptor.java:130)
LT  	at net.adoptopenjdk.loadTest.LoadTestRunner$2.run(LoadTestRunner.java:182)
LT  	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
LT  	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
LT  	at java.base/java.lang.Thread.run(Thread.java:1583)
renaissance-philosophers
Exception in thread "Thread-125" java.lang.IllegalMonitorStateException: current thread is not owner
	at java.base/java.lang.Object.notifyAll(Native Method)
	at scala.concurrent.stm.ccstm.TxnLevelImpl.notifyCompleted(TxnLevelImpl.scala:138)
	at scala.concurrent.stm.ccstm.TxnLevelImpl.setCommitted(TxnLevelImpl.scala:104)
	at scala.concurrent.stm.ccstm.InTxnImpl.attemptTopLevelComplete(InTxnImpl.scala:723)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelComplete(InTxnImpl.scala:618)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelAttempt(InTxnImpl.scala:529)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelAtomicImpl(InTxnImpl.scala:398)
	at scala.concurrent.stm.ccstm.InTxnImpl.atomic(InTxnImpl.scala:259)
	at scala.concurrent.stm.ccstm.CCSTMExecutor.apply(CCSTMExecutor.scala:24)
	at org.renaissance.scala.stm.RealityShowPhilosophers$PhilosopherThread.$anonfun$run$1(RealityShowPhilosophers.scala:36)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
	at org.renaissance.scala.stm.RealityShowPhilosophers$PhilosopherThread.run(RealityShowPhilosophers.scala:27)
Exception in thread "Thread-28" java.lang.ClassCastException: class scala.concurrent.stm.Txn$Committing$ cannot be cast to class scala.concurrent.stm.Txn$RolledBack (scala.concurrent.stm.Txn$Committing$ and scala.concurrent.stm.Txn$RolledBack are in unnamed module of loader java.net.URLClassLoader @79ad8b2f)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelComplete(InTxnImpl.scala:623)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelAttempt(InTxnImpl.scala:529)
	at scala.concurrent.stm.ccstm.InTxnImpl.topLevelAtomicImpl(InTxnImpl.scala:398)
	at scala.concurrent.stm.ccstm.InTxnImpl.atomic(InTxnImpl.scala:259)
	at scala.concurrent.stm.ccstm.CCSTMExecutor.apply(CCSTMExecutor.scala:24)
	at org.renaissance.scala.stm.RealityShowPhilosophers$PhilosopherThread.$anonfun$run$1(RealityShowPhilosophers.scala:29)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
	at org.renaissance.scala.stm.RealityShowPhilosophers$PhilosopherThread.run(RealityShowPhilosophers.scala:27)
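
One possible factor in the AtomicLongArrayTest failure above, offered as an assumption rather than a diagnosis: the Javadoc for the weakCompareAndSet family of methods allows the operation to fail spuriously, so a single call is not guaranteed to return true even when the current value matches the expected one. Below is a minimal sketch of the usual retry idiom. The class and helper names are hypothetical and this is not the test's code.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical sketch; not taken from AtomicLongArrayTest.
class WeakCasRetrySketch {
    /**
     * Retries a weak CAS for as long as the element still holds the expected value.
     * Returns true once the update is applied, false if another value is observed.
     */
    static boolean casWithRetry(AtomicLongArray array, int index, long expected, long update) {
        while (array.get(index) == expected) {
            // A weak CAS may return false spuriously, so loop instead of asserting on one call.
            if (array.weakCompareAndSetVolatile(index, expected, update)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AtomicLongArray array = new AtomicLongArray(4);
        array.set(2, 30L);
        System.out.println(casWithRetry(array, 2, 30L, 31L));
        System.out.println(array.get(2));
    }
}
```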

luhenry avatar Jan 16 '24 16:01 luhenry

From a look over the results of https://ci.adoptium.net/job/Test_openjdk17_hs_extended.openjdk_riscv64_linux/10, I can identify at least two fixes that we need to make in OpenJDK, none of which is specific to RISC-V:

  1. Backport https://bugs.openjdk.org/browse/JDK-8286447 to fix all the java.lang.UnsatisfiedLinkError: Can't load library: /path/to/lib/runtime/lib/libawt_xawt.so errors. This error is the root cause of all failures in tools/jpackage.
  2. Use an IP address in java/net/Socket/B8312065.java that is guaranteed not to be in use on the local network; see https://www.rfc-editor.org/rfc/rfc5737 for the right way of doing it.
  3. Increase the timeout for jdk/incubator/vector/VectorMaxConversionTests.java#id1. The test doesn't deadlock, as it's still running when it times out. The cause of the timeout is simply the lack of hardware acceleration (the hardware doesn't support it) combined with slow hardware.
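
To illustrate item 2: RFC 5737 reserves 192.0.2.0/24 (TEST-NET-1), 198.51.100.0/24 (TEST-NET-2) and 203.0.113.0/24 (TEST-NET-3) for documentation, so addresses in those blocks should not belong to a real host on the local network. Below is a minimal sketch of a check for those ranges. The class and method names are hypothetical, and this is not the code of the OpenJDK fix.

```java
// Hypothetical sketch; not taken from the OpenJDK test or its fix.
class TestNetSketch {
    /** Returns true if the dotted-quad IPv4 literal lies in one of the RFC 5737 documentation blocks. */
    static boolean isRfc5737(String literal) {
        String[] parts = literal.split("\\.");
        if (parts.length != 4) {
            return false;
        }
        int[] octets = new int[4];
        for (int i = 0; i < 4; i++) {
            try {
                octets[i] = Integer.parseInt(parts[i]);
            } catch (NumberFormatException e) {
                return false; // not a numeric IPv4 literal
            }
            if (octets[i] < 0 || octets[i] > 255) {
                return false;
            }
        }
        return (octets[0] == 192 && octets[1] == 0 && octets[2] == 2)      // TEST-NET-1
            || (octets[0] == 198 && octets[1] == 51 && octets[2] == 100)   // TEST-NET-2
            || (octets[0] == 203 && octets[1] == 0 && octets[2] == 113);   // TEST-NET-3
    }

    public static void main(String[] args) {
        System.out.println(isRfc5737("192.0.2.1"));
        System.out.println(isRfc5737("192.168.1.1"));
    }
}
```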

I still have to identify the cause of the crash for sun/tools/jhsdb/HeapDumpTest.java.

luhenry avatar Jan 19 '24 21:01 luhenry

  1. Backport https://bugs.openjdk.org/browse/JDK-8286447 to fix all the java.lang.UnsatisfiedLinkError: Can't load library: /path/to/lib/runtime/lib/libawt_xawt.so errors. This error is the root cause of all failures in tools/jpackage.
  • [ ] Should be fixed by https://github.com/openjdk/jdk17u-dev/pull/2150
  2. Use an IP address in java/net/Socket/B8312065.java that is guaranteed not to be in use on the local network; see https://www.rfc-editor.org/rfc/rfc5737 for the right way of doing it.
  • [ ] Should be fixed by https://github.com/openjdk/jdk17u-dev/pull/2149
  3. Increase the timeout for jdk/incubator/vector/VectorMaxConversionTests.java#id1. The test doesn't deadlock, as it's still running when it times out. The cause of the timeout is simply the lack of hardware acceleration (the hardware doesn't support it) combined with slow hardware.
  • [x] Should be fixed by https://github.com/adoptium/aqa-tests/pull/4990

I still have to identify the cause of the crash for sun/tools/jhsdb/HeapDumpTest.java

From early debugging, it's most likely related to the message WARNING: getThreadIntegerRegisterSet0: get_lwp_regs failed for lwp in jmap's stderr in the test

luhenry avatar Jan 22 '24 09:01 luhenry

I still have to identify the cause of the crash for sun/tools/jhsdb/HeapDumpTest.java

From early debugging, it's most likely related to the message WARNING: getThreadIntegerRegisterSet0: get_lwp_regs failed for lwp in jmap's stderr in the test

  • [x] Should be fixed by https://github.com/openjdk/jdk17u-dev/pull/2152

luhenry avatar Jan 22 '24 11:01 luhenry

From the latest run on 21.0.2+13:

  • MiniMix in extended.system failed with the January aqa-tests branch but passed with master, so that issue has been resolved by the PRs
  • sanity.system TBC (Mauve failures may have been resolved in master)
  • The renaissance-philosophers test in extended.perf still seems to be a problem (but re-grinding with master to verify)
  • Several failures in extended.openjdk: TestJcmdDefaults (jdk_tools_1) and VectorMaxConversionTests_Zsinglegen (jdk_vector_[01]) plus anything from testList_2 when it completes

sxa avatar Mar 08 '24 16:03 sxa

The renaissance-philosophers test in extended.perf still seems to be a problem

We excluded that perf target for AIX here; it is not unreasonable to exclude it for RISC-V, though I have not seen how it fails, so that will be worth understanding before excluding.

smlambert avatar Mar 08 '24 17:03 smlambert

The renaissance-philosophers test in extended.perf still seems to be a problem

We excluded that perf target for AIX here; it is not unreasonable to exclude it for RISC-V, though I have not seen how it fails, so that will be worth understanding before excluding.

Noting that Grinder 9060 seemed to have the philosophers test disabled, and therefore extended.perf passed overall so I guess it was excluded in master on riscv64.

sxa avatar Mar 11 '24 10:03 sxa

TestJlmRemoteNotifierProxyAuth_0 failed in sanity.system at https://ci.adoptium.net/job/Test_openjdk21_hs_sanity.system_riscv64_linux/56/console but seemed to pass in the automatic re-run at https://ci.adoptium.net/job/Test_openjdk21_hs_sanity.system_riscv64_linux_rerun/9/consoleFull (both were on the scaleway-1 machine).

Grinding with 10 iterations at https://ci.adoptium.net/job/Grinder/9083/ (EDIT: Failed 1/10)

testList_2 had a few failed targets in the run above, but a subsequent run passed at https://ci.adoptium.net/job/Test_openjdk21_hs_extended.openjdk_riscv64_linux_testList_2/5/console

So I believe we're only left with the final bullet point from the earlier comment: Failures in extended.openjdk: TestJcmdDefaults (jdk_tools_1) and VectorMaxConversionTests_Zsinglegen (jdk_vector_[01])

sxa avatar Mar 11 '24 10:03 sxa

VectorMaxConversionTests_Zsinglegen passed in https://ci.adoptium.net/job/Grinder/9111/ on master. TestJcmdDefaults passed in https://ci.adoptium.net/job/Grinder/9103/testResults (compare pass, fail).

sxa avatar Mar 12 '24 15:03 sxa

This looks good. We should gather the .tap files from test jobs and Grinders and attach to this issue to close.

smlambert avatar Mar 12 '24 15:03 smlambert

This looks good. We should gather the .tap files from test jobs and Grinders and attach to this issue to close.

TAP files: jdk21.0.2+13-riscv64-releasetaps.tar.gz. This excludes two Grinders which didn't have TAP files as artifacts:

  • 9059 (sanity.openjdk AKSerialNumber re-run)
  • 9060 (sanity.system without philosophers)
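
When checking the gathered .tap files, a quick tally of results can help. Below is a minimal sketch, assuming standard TAP result lines that begin with `ok` or `not ok`. The class name is hypothetical, and this is not part of TKG or the Jenkins TAP plugin.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch; not part of TKG or the Jenkins TAP plugin.
class TapTallySketch {
    /** Returns {passed, failed}; lines with a "# skip" directive are counted with the passes. */
    static int[] tally(List<String> lines) {
        int passed = 0;
        int failed = 0;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.equals("not ok") || line.startsWith("not ok ")) {
                failed++;
            } else if (line.equals("ok") || line.startsWith("ok ")) {
                passed++;
            }
            // Plan lines ("1..N"), comments and diagnostics are ignored.
        }
        return new int[] {passed, failed};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: TapTallySketch <file.tap>");
            return;
        }
        int[] counts = tally(Files.readAllLines(Path.of(args[0])));
        System.out.println("passed=" + counts[0] + " failed=" + counts[1]);
    }
}
```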

sxa avatar Mar 12 '24 16:03 sxa

  • 9059 (sanity.openjdk AKSerialNumber re-run)
  • 9060 (sanity.perf without philosophers)

Not sure why those Grinders would not have a TAP file artifact. From the console of Grinder/9060:

12:56:38  Processing '/home/jenkins/.jenkins/jobs/Grinder/builds/9060/tap-master-files/aqa-tests/TKG/output_170991517469/Grinder_20240308160003.tap'
12:56:38  Parsing TAP test result [/home/jenkins/.jenkins/jobs/Grinder/builds/9060/tap-master-files/aqa-tests/TKG/output_170991517469/Grinder_20240308160003.tap].
12:56:38  TAP Reports Processing: FINISH
[Pipeline] echo
12:56:38  Saving aqa-tests/testenv/testenv.properties file on jenkins.
[Pipeline] archiveArtifacts
12:56:38  Archiving artifacts
12:56:38  Recording fingerprints
[Pipeline] echo
12:56:38  Saving aqa-tests/TKG/**/*.tap file on jenkins.
[Pipeline] archiveArtifacts
12:56:38  Archiving artifacts
12:56:38  Recording fingerprints
[Pipeline] sh
12:56:39  + tar -cf benchmark_test_output.tar.gz ./aqa-tests/TKG/output_170991517469
[Pipeline] echo
12:56:39  ARTIFACTORY_SERVER is not set. Saving artifacts on jenkins.
[Pipeline] archiveArtifacts
12:56:39  Archiving artifacts

Also, https://ci.adoptium.net/job/Grinder/9060/tapResults/ indicates the TAP file was available for the TAP plugin to present. The benchmark_test_output.tar.gz and testenv.properties files should also have been archived but are not present.

(Screenshot 2024-03-12 at 3 33 09 PM): this should not matter for testenv.properties and .tap files, which should always be archived.

smlambert avatar Mar 12 '24 19:03 smlambert

I've re-run those Grinders (9135, 9136), and collected the additional two tap files: jdk21.0.2+13-riscv64-grindertaps.tar.gz

Regarding the comment from earlier in this issue: the re-run passed AKSerialNumber but failed java/security/misc/Versions.java (when running from the aqa-tests master branch). My assumption here is that something has changed since 21.0.2 which has caused this failure, as it worked correctly using the test material directly from the 21.0.2 release in the initial sanity.openjdk run, so this is not a stop-ship issue.

sxa avatar Mar 13 '24 10:03 sxa

Given we have seen all AQAvit and TCK tests passing for this platform, it is 👍 to publish.

smlambert avatar Mar 13 '24 12:03 smlambert

https://github.com/adoptium/infrastructure/issues/3984 has also been seen on riscv. Noting here because the jdk17 exclude file says:

## linux-riscv64 excluded tests are tracked with https://github.com/adoptium/aqa-tests/issues/4976

adamfarley avatar Jul 28 '25 09:07 adamfarley