jdk-25 Windows aarch64 aqa testcases "hang" or take a very long time...

Open andrew-m-leonard opened this issue 3 months ago • 10 comments

jdk-25 Windows aarch64 aqa testcases (in at least sanity.openjdk) seem to "hang" indefinitely, e.g. https://ci.adoptium.net/job/Test_openjdk25_hs_sanity.openjdk_aarch64_windows/22/console

Saved the large console output in case it is useful; attached:

Win_aarch64_console.txt

Looks like the jdk_lang_0 virtual thread tests are still running... This bug may be related: https://bugs.openjdk.org/browse/JDK-8344577 https://github.com/openjdk/jdk/pull/22357 Possibly an issue on Win aarch64 as well...

Running a Grinder of jdk_lang_0, with timeoutfactor reduced to x1 and a 100 hour timeout: https://ci.adoptium.net/job/Grinder/14619/

andrew-m-leonard avatar Sep 24 '25 09:09 andrew-m-leonard

~~I see that some of the failures in the Grinder console output give this error: Use -nativepath to specify the location of native code. Are we missing/not building the native test libraries for this platform, and is that why the jdk_lang testcases are hanging/timing out?~~ Strike that: this was seen on a run against the Azul JDK25 aarch64 Windows build, and I failed to pass a native test library during this run: https://ci.adoptium.net/view/Test_grinder/job/Grinder/14656/consoleFull

13:41:29          -classpath 'C:\Users\jenkins\workspace\Grinder\aqa-tests\openjdk\openjdk-jdk\test\jdk\java\lang\StackWalker;C:\Users\jenkins\workspace\Grinder\aqa-tests\TKG\output_17587351345823\jdk_lang_0\work\classes\0\java\lang\StackWalker\NativeMethod.d' 'C:\Users\jenkins\workspace\Grinder\aqa-tests\openjdk\openjdk-jdk\test\jdk\java\lang\StackWalker\NativeMethod.java'
13:41:29  
13:41:29  TEST RESULT: Error. Use -nativepath to specify the location of native code
13:41:29  --------------------------------------------------
13:42:16  TEST: java/lang/String/nativeEncoding/StringPlatformChars.java

Rerun with testimage in https://ci.adoptium.net/view/Test_grinder/job/Grinder/14657/

smlambert avatar Sep 24 '25 19:09 smlambert

Noting for future reference that this is not new in JDK25; it is also why we did not ship JDK24 earlier this year. (Ref previous issue at https://github.com/adoptium/aqa-tests/issues/6231 - that one should probably have been kept open rather than this one, but 🤷🏻)

sxa avatar Sep 25 '25 08:09 sxa

re: https://github.com/adoptium/aqa-tests/issues/6623#issuecomment-3330284517

I have aborted https://ci.adoptium.net/view/Test_grinder/job/Grinder/14657/ (the run with the Azul build was also taking too long and not progressing).

smlambert avatar Sep 26 '25 22:09 smlambert

I wonder how it's possible that Microsoft OpenJDK, which is apparently based on Temurin's build scripts and is tested against the same test suite, has published the binaries for Windows Aarch64: https://learn.microsoft.com/en-us/java/openjdk/download#openjdk-2500-lts 🤔 Did they ignore these failing tests? Did they manage to fix them in their build? Maybe it's worth asking someone from Microsoft about it? (Sorry, I don't remember anyone's handle)

Alovchin91 avatar Sep 30 '25 11:09 Alovchin91

> I wonder how it's possible that Microsoft OpenJDK, which is apparently based on Temurin's build scripts and is tested against the same test suite, has published the binaries for Windows Aarch64: https://learn.microsoft.com/en-us/java/openjdk/download#openjdk-2500-lts 🤔 Did they ignore these failing tests? Did they manage to fix them in their build? Maybe it's worth asking someone from Microsoft about it? (Sorry, I don't remember anyone's handle)

Microsoft here :-). Despite running the same test suites (and more), we do run on different infra internally. That said, there are a bunch of fixes / ideas / config / exclusions that we intend to bring over in the coming days to help get the Temurin build out.

karianna avatar Sep 30 '25 12:09 karianna

@karianna That's great to know, thank you!

Alovchin91 avatar Sep 30 '25 12:09 Alovchin91

Here are a few additional test runs on Adoptium's Windows 11/aarch64 Azure systems, since we don't have any other results that specifically narrow down which test cases are causing the problems. In general these are tests from java/lang/Thread/virtual/stress (chosen because they were seen to be a little slower than others on the x64 runs):

| Test | JDK | Result |
| --- | --- | --- |
| Skynet | Temurin 25 | 53 minutes per iteration (x64 finishes in <3 minutes) - tests fail hitting timeout |
| Skynet | Azul 25 | 53 minutes per iteration - tests fail |
| Skynet | Microsoft | 53 minutes per iteration - tests fail |
| ParkALot | Temurin | Passes quickly (1m34) - took ~10m on x64 |
| STraceALot | Temurin | All fail. (Some about 30 seconds, others <5s) Unexpected exit from test |
| TimedWaitALot | Temurin | TBC - Still running after 1 day ... 3m33s per iteration on x64 |
| Skynet100k | Temurin | Varying iteration times on -2 but all <5m. Was 3m39 per test on x64. Failed 4/10 (failures were in <5s Unexpected exit from test 25 on x64 |
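To put the iteration times above in proportion, here is a small arithmetic sketch. It uses only the figures from the table; the class and method names are made up for illustration. Because x64 Skynet is reported as "<3 minutes", plugging in 3 minutes gives a lower bound of roughly 17.7x slower on aarch64, while ParkALot comes out at about 0.16x, i.e. faster than the x64 run.

```java
import java.time.Duration;

public class IterationRatio {
    // Ratio of aarch64 iteration time to x64 iteration time; a value above 1 means aarch64 is slower.
    static double ratio(Duration aarch64, Duration x64) {
        return (double) aarch64.toMillis() / x64.toMillis();
    }

    public static void main(String[] args) {
        // Skynet: 53 minutes on aarch64; x64 is "<3 minutes", so 3 minutes yields a lower bound.
        double skynet = ratio(Duration.ofMinutes(53), Duration.ofMinutes(3));
        // ParkALot: 1m34 on aarch64 versus roughly 10 minutes on x64.
        double parkALot = ratio(Duration.ofSeconds(94), Duration.ofMinutes(10));
        System.out.println("Skynet aarch64/x64 ratio (lower bound): " + skynet);
        System.out.println("ParkALot aarch64/x64 ratio (approximate): " + parkALot);
    }
}
```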

For reference this is the full name of the tests above if needed for a jdk_custom Grinder run:

| Shortname | Test name |
| --- | --- |
| Skynet | java/lang/Thread/virtual/stress/Skynet.java |
| ParkALot | java/lang/Thread/virtual/stress/ParkALot.java |
| STraceALot | java/lang/Thread/virtual/stress/GetStackTraceALotWhenBlocking.java |
| TimedWaitALot | java/lang/Thread/virtual/stress/TimedWaitALot.java |
| Skynet100k | java/lang/Thread/virtual/stress/Skynet100kWithMonitors.java |
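For anyone who wants a rough timing comparison outside jtreg, the following is a minimal sketch of a Skynet-style fan-out using virtual threads. It is not the OpenJDK test itself: the class name, the `sum` helper and the default size are all hypothetical, and it needs JDK 21 or later. It only shows the general shape of the workload, where each thread starts ten child virtual threads and waits for their results.

```java
import java.util.ArrayList;
import java.util.List;

public class VirtualThreadProbe {
    // Sums the leaf numbers first..first+size-1 by recursively fanning out to child
    // virtual threads. Assumes size is a power of fanout (e.g. 1000 with fanout 10).
    static long sum(long first, int size, int fanout) {
        if (size <= 1) {
            return first; // leaf: contributes its own number
        }
        int childSize = size / fanout;
        long[] partial = new long[fanout];
        List<Thread> children = new ArrayList<>();
        for (int i = 0; i < fanout; i++) {
            final int slot = i;
            final long childFirst = first + (long) i * childSize;
            // Each child runs in its own virtual thread and writes to its own slot.
            children.add(Thread.ofVirtual().start(
                    () -> partial[slot] = sum(childFirst, childSize, fanout)));
        }
        long total = 0;
        for (int i = 0; i < fanout; i++) {
            try {
                children.get(i).join(); // join makes the child's write visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
            total += partial[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical default size; pass a power of ten as the first argument to change it.
        int size = args.length > 0 ? Integer.parseInt(args[0]) : 100_000;
        long start = System.nanoTime();
        long result = sum(0, size, 10);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("sum=" + result + " elapsed_ms=" + elapsedMs);
    }
}
```

Running the same class on x64 and aarch64 Windows with the same JDK build would give a crude comparison only; it says nothing about whether the jtreg tests themselves pass.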

sxa avatar Oct 14 '25 11:10 sxa

Hi @karianna Do you have any updates to share?

Alovchin91 avatar Oct 21 '25 15:10 Alovchin91

It's going to be a few weeks yet at best; we're finding more issues internally as well (which we are upstreaming fixes for).

karianna avatar Oct 23 '25 00:10 karianna

As this interferes with plans to adopt Java 25, what is the status here?

akurtakov avatar Dec 01 '25 13:12 akurtakov