Fix intermittent LogGeneratedClassesTest failures

Open singh264 opened this issue 1 year ago • 10 comments

Fix intermittent LogGeneratedClassesTest failures by removing the lambda expression from the static block of InternalCRIUSupport. This avoids the unexpected creation of lambda class files for InternalCRIUSupport during the test.

Issue: https://github.com/eclipse-openj9/openj9/issues/18556

Signed-off-by: Amarpreet Singh [email protected]
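
For readers unfamiliar with why a lambda matters here, the following is a minimal sketch rather than the actual InternalCRIUSupport code. The class and method names are hypothetical. It contrasts a lambda, whose implementing class is generated at run time on first use, with an anonymous class, which javac compiles to its own class file ahead of time.

```java
// Hypothetical demo class; the names are illustrative and not taken from OpenJ9.
final class LambdaVersusAnonymous {
    // A lambda: the class that implements Runnable is generated at run time.
    static Runnable viaLambda() {
        return () -> { };
    }

    // An anonymous class: javac emits a separate class file at compile time,
    // so nothing has to be generated when this code runs.
    static Runnable viaAnonymousClass() {
        return new Runnable() {
            @Override
            public void run() { }
        };
    }

    public static void main(String[] args) {
        // The lambda's class name is JVM-specific; the anonymous class has a
        // compiler-assigned name derived from its enclosing class.
        System.out.println(viaLambda().getClass().getName());
        System.out.println(viaAnonymousClass().getClass().getName());
    }
}
```

A test that counts generated lambda classes can therefore be affected by any lambda that happens to run during startup, which is the situation the description above avoids.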

singh264 avatar Feb 16 '24 16:02 singh264

@tajila requesting your review.

Could openj9 grinders be started on the x86-64 linux machine ub16x64j95 to reproduce the intermittent LogGeneratedClassesTest failures and verify whether the code changes resolve the issue? I am unable to start openj9 grinders because I lack the permissions.

It seems like the test failures would be reproducible in openj9 grinders on ub16x64j95, where the test has previously failed and where I have also reproduced the issue locally. The test failures are not reproducible in the internal grinders on x86-64 linux machines when the test is run 1000 times for jdk11, jdk17 and jdk21.

singh264 avatar Feb 16 '24 17:02 singh264

I'm not sure this is the right approach. The test used to pass before, so changing the test is not the solution. Unless the test was always broken?

tajila avatar Feb 16 '24 18:02 tajila

@TobiAjila this isn't a test change.

pshipton avatar Feb 16 '24 20:02 pshipton

My mistake, didn't look carefully.

tajila avatar Feb 16 '24 20:02 tajila

We only need unsafe for registerRestoreEnvVariables, so we can put that init code in a function and call it when the restore hook is registered. That way we can keep the static init empty.
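
A minimal sketch of this suggestion, assuming a simplified stand-in class. `LazyRestoreSupport`, `ensureHelperInitialized`, `initCount` and `registeredHooks` are hypothetical names, and a plain `Object` stands in for the unsafe instance; only `registerRestoreEnvVariables` is a name from the discussion. The point is that the class has no static initializer work at all, and the setup runs once, on the first hook registration.

```java
// Hypothetical stand-in; this is not the real InternalCRIUSupport.
final class LazyRestoreSupport {
    // Stand-in for the dependency that used to be set up in the static block.
    private static Object helper;
    // Counters that make the behaviour observable in this sketch.
    private static int initCount;
    private static int registeredHooks;

    private LazyRestoreSupport() { }

    // Plain named method instead of a lambda in a static block:
    // nothing runs, and no class is generated, when this class is loaded.
    private static void ensureHelperInitialized() {
        if (helper == null) {
            helper = new Object();
            initCount++;
        }
    }

    // Setup happens here, only when a restore hook is actually registered.
    static synchronized void registerRestoreEnvVariables(String name) {
        ensureHelperInitialized();
        registeredHooks++;
    }

    static synchronized int initCount() { return initCount; }

    static synchronized int registeredHooks() { return registeredHooks; }

    public static void main(String[] args) {
        registerRestoreEnvVariables("FIRST");
        registerRestoreEnvVariables("SECOND");
        System.out.println(initCount()); // prints 1: setup ran once
    }
}
```

The trade-off in this sketch is a null check on each registration in exchange for an empty static initializer.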

tajila avatar Feb 16 '24 21:02 tajila

@tajila re-requesting your review.

singh264 avatar Feb 17 '24 02:02 singh264

@tajila re-requesting your review.

singh264 avatar Feb 19 '24 02:02 singh264

jenkins test sanity plinux jdk17

tajila avatar Feb 20 '24 14:02 tajila

jenkins test sanity alinux64 jdk21

tajila avatar Feb 20 '24 14:02 tajila

jenkins test sanity plinux jdk17
jenkins test sanity alinux64 jdk21

It seems like the ppc64le linux build and the aarch64 linux build failed due to omr changes and openj9 changes that were recently reverted.

10:22:33  /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Personal/omr/compiler/optimizer/ValuePropagationCommon.cpp:2360:37: error: 'java_lang_StringCoding_implEncodeAsciiArray' is not a member of 'TR'; did you mean 'java_lang_StringCoding_implEncodeISOArray'?
10:22:33   2360 |    bool isAsciiEncoder = (rm == TR::java_lang_StringCoding_implEncodeAsciiArray) && !disableImplEncodeAsciiArray;
10:22:33        |                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10:22:33        |                                     java_lang_StringCoding_implEncodeISOArray

singh264 avatar Feb 20 '24 15:02 singh264

jenkins test sanity alinux64 jdk21

tajila avatar Feb 20 '24 17:02 tajila

jenkins test sanity plinux jdk17

tajila avatar Feb 20 '24 17:02 tajila

jenkins test sanity alinux64 jdk21
jenkins test sanity plinux jdk17

The ppc64le linux build passed, but the aarch64 linux build failed due to a testDateScheduledBeforeCheckpointDone timeout.

Testing: Create CRIU checkpoint image and restore three times - testDateScheduledBeforeCheckpointDone
Test start time: 2024/02/20 20:11:07 Coordinated Universal Time
Running command: bash /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java " -Xgcpolicy:optthruput " "org.openj9.criu.TimeChangeTest testDateScheduledBeforeCheckpointDone" 3 3 false false
Time spent starting: 11 milliseconds
***[TEST INFO 2024/02/20 20:16:07] ProcessKiller detected a timeout after 300000 milliseconds!***
***[TEST INFO 2024/02/20 20:16:07] executing /usr/bin/gdb -batch -x /tmp/debugger687231450370899115.txt bash 215349***
GDB OUT No shared libraries loaded at this time.
INFO: Running '/usr/bin/gdb' failed with rc = 1
GDB ERR Could not attach to process.  If your uid matches the uid of the target
GDB ERR process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
GDB ERR again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
GDB ERR ptrace: Operation not permitted.
GDB ERR /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/output_1708455580932/cmdLineTester_criu_nonPortableRestore_5/215349: No such file or directory.
GDB ERR /tmp/debugger687231450370899115.txt:2: Error in sourced command file:
GDB ERR The program has no registers now.

INFO: Sleep for 60000 millis before next capture.
***[TEST INFO 2024/02/20 20:17:07] executing /usr/bin/gdb -batch -x /tmp/debugger687231450370899115.txt bash 215349***
GDB OUT No shared libraries loaded at this time.
INFO: Running '/usr/bin/gdb' failed with rc = 1
GDB ERR Could not attach to process.  If your uid matches the uid of the target
GDB ERR process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
GDB ERR again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
GDB ERR ptrace: Operation not permitted.
GDB ERR /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/output_1708455580932/cmdLineTester_criu_nonPortableRestore_5/215349: No such file or directory.
GDB ERR /tmp/debugger687231450370899115.txt:2: Error in sourced command file:
GDB ERR The program has no registers now.

***[TEST INFO 2024/02/20 20:17:08] executing kill -ABRT 215349***
Time spent executing: 360797 milliseconds
Test result: FAILED
Output from test:
 [OUT] start running script
 [OUT] export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC,-XSAVE,-AVX2,-ERMS,-AVX,-AVX_Fast_Unaligned_Load
 [OUT] export LD_BIND_NOT=on
 [OUT] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java -XX:+EnableCRIUSupport  -Xgcpolicy:optthruput  -cp /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criu.jar org.openj9.criu.TimeChangeTest testDateScheduledBeforeCheckpointDone 3 3
 [ERR] /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh: line 41: 215350 Killed                  $2 -XX:+EnableCRIUSupport $3 -cp "$1/criu.jar" $4 $5 $6 > testOutput 2>&1

***[TEST INFO 2024/02/20 20:17:08] kill -ABRT signal sent***
***[TEST INFO 2024/02/20 20:17:08] ABRT completed***

singh264 avatar Feb 20 '24 21:02 singh264