AIX build failure in ClassFileOracle
See https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/760, for example
[2025-05-08T11:28:06.393Z] [ 36%] Building CXX object runtime/bcutil/CMakeFiles/j9dyn.dir/ClassFileOracle.cpp.o
[2025-05-08T11:28:16.426Z] 1500-004: (U) INTERNAL COMPILER ERROR while compiling ClassFileOracle::LocalVariablesIterator::hasGenericSignature(). Compilation ended. Contact your Service Representative and provide the following information: Internal abort. For more information visit: http://www.ibm.com/support/docview.wss?uid=swg21110810
[2025-05-08T11:28:17.818Z] 1586-346 (U) An error occurred during code generation. The code generation return code was 1.
[2025-05-08T11:28:19.316Z] gmake[6]: *** [runtime/bcutil/CMakeFiles/j9dyn.dir/build.make:133: runtime/bcutil/CMakeFiles/j9dyn.dir/ClassFileOracle.cpp.o] Error 1
[2025-05-08T11:28:19.316Z] gmake[6]: *** Waiting for unfinished jobs....
The last successful build I found for JDK11 on AIX is https://openj9-jenkins.osuosl.org/job/Pipeline-OMR-Acceptance/873. The first failing build was https://openj9-jenkins.osuosl.org/job/Pipeline-OMR-Acceptance/876.
Changes:
- openj9: https://github.com/eclipse-openj9/openj9/compare/eb473bd9d39..a9136b8a79f
- omr: https://github.com/eclipse-omr/omr/compare/38fbca611ff..38fbca611ff
- jdk11: https://github.com/ibmruntimes/openj9-openjdk-jdk11/compare/aef46aabde...01db13aecfc
It still builds in the nightly builds, so I think it's intermittent, e.g. last night's https://openj9-jenkins.osuosl.org/job/Pipeline-Build-Test-JDK11/1062/
That nightly succeeded on p8-java1-ibm10 while https://openj9-jenkins.osuosl.org/job/Pipeline-OMR-Acceptance/879 failed (again) on p8-java1-ibm08. Failures were on several different machines:
- https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/759 - p8-java1-ibm12
- https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/760 - p8-java1-ibm09
- https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/761 - p8-java1-ibm08
I can't think of any reason it would pass in the nightly builds but fail in the OMR builds. Last night it passed on p8-java1-ibm12, which includes the OMR changes from yesterday. https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_Nightly/1047/ - p8-java1-ibm12
I think "intermittent" is a rather generous term to describe the situation. That it only seems to fail for jdk11 seems relevant, but I didn't find any changes in that source file, nor any included file, that would explain this.
The only difference I can find is the build directory name. Build_JDK11_ppc64_aix_Nightly vs Build_JDK11_ppc64_aix_OMR
A Nightly build job that I filled in with the same parameters as a failing OMR build job passed. https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_Nightly/1051/
https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/766
@zl-wang can the XLC team take a look at this INTERNAL COMPILER ERROR with 16.01.0000.0020?
https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/781/
There is a fixpack 21, maybe we need to try it.
Created https://github.ibm.com/runtimes/infrastructure/issues/10828 so we can try it out.
Yes, try out the latest PTF first, before I get the xlC team involved.
@zl-wang we tried 16.1.0.21 but the same problem occurs.
IBM XL C/C++ for AIX, V16.1.0 (5725-C72, 5765-J12)
Version: 16.01.0000.0021
https://openj9-jenkins.osuosl.org/job/Build_JDK11_ppc64_aix_OMR/784/
08:55:18 1500-004: (U) INTERNAL COMPILER ERROR while compiling ClassFileOracle::LocalVariablesIterator::hasGenericSignature(). Compilation ended. Contact your Service Representative and provide the following information: Internal abort. For more information visit: http://www.ibm.com/support/docview.wss?uid=swg21110810
Repeat the problematic compilation command, but with the additional option -P (I believe), i.e. preprocessing only. That generates a preprocessed file (written to <OriginalFileName>.i) in which every include etc. is consolidated. Send it to me and I will let them take over, so that they can investigate with that file alone (no need for header files etc.).
I am right about the option: -P Preprocesses the C or C++ source files named in the compiler invocation and creates an output preprocessed source file for each input source file. The preprocessed output file has the same name as the input file, with a .i suffix.
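For anyone repeating this step, here is a rough sketch of deriving the preprocess-only command from the failing compile command. It only builds the command string and does not run the compiler. The helper name `to_preprocess_command` and the shortened sample command are made up for illustration. Dropping `-c` and `-o <file>` is my assumption about what a preprocess-only run wants, and the `-P` behaviour is taken from the option description quoted above, so check it against your compiler version.

```python
import shlex


def to_preprocess_command(compile_cmd):
    """Return compile_cmd rewritten as a preprocess-only command.

    Hypothetical helper: removes '-c' and '-o <file>' (an assumption)
    and adds '-P' right after the compiler executable.
    """
    args = shlex.split(compile_cmd)
    out = []
    skip_next = False
    for arg in args:
        if skip_next:
            # This is the output file name that followed '-o'.
            skip_next = False
            continue
        if arg == "-o":
            skip_next = True
            continue
        if arg == "-c":
            continue
        out.append(arg)
    # Put -P directly after the compiler name.
    out.insert(1, "-P")
    return shlex.join(out)


if __name__ == "__main__":
    # Shortened, illustrative version of the failing command line.
    failing = (
        "/opt/IBM/xlC/16.1.0/bin/xlclang++ -DAIXPPC -q64 -O3 -g "
        "-qstackprotect -o ClassFileOracle.cpp.o -c ClassFileOracle.cpp"
    )
    print(to_preprocess_command(failing))
```

Going by the quoted description, the resulting command should leave the preprocessed source next to the input with a .i suffix.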
The following recreates it for me on both AIX 7.2 and 7.3 machines.
/opt/IBM/xlC/16.1.0/bin/xlclang++ -x c++ -DAIXPPC -DIPv6_FUNCTION_SUPPORT -DJ9_INTERNAL_TO_VM -DOPENJ9_BUILD -DPPC -DPPC64 -DRS6000 -D_ALL_SOURCE -D_LARGE_FILES -qnoeh -fno-exceptions -g -qalias=noansi -qxflag=LTOL:LTOL0 -q64 -qxlcompatmacros -O3 -qstackprotect -fno-rtti -qlanglvl=extended0x -qlanglvl=extended0x -qnortti -qsuppress=1540-1087:1540-1088:1540-1090 -fPIC -qhalt=w -o ClassFileOracle.cpp.o -c ClassFileOracle.i
A defect was opened on the xlC side:
https://compjazz.rtp.raleigh.ibm.com:9443/jazz/resource/itemName/com.ibm.team.workitem.WorkItem/174570
ICE goes away if -qstackprotect option is removed though.
Update in the RTC defect (gist of it: looks like a normal OOM issue):
The traceback comes from AS, the final assembly pass, which does binary encoding and object file creation. Specifically the top level driver for AS - when doing a memory allocate. The stack protect code was created by epilogue.cpp a long time before, and AS doesn't have anything directly to do with it. I think this is just an out of memory error. Following that theory, I was able to see a successful compile if I removed -g from the compile command, or if I used -qlinedebug in place of -g, or if I added -qcompact. I looked at the code listings just before AS, and I only see < 10 instructions related to stack protect, in 1 function, so I don't see much evidence that it is doing anything crazy to blow things up. And the compilation will still work with stackprotect, if we change other options to reduce memory.
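To retry the option changes mentioned in this thread (dropping -qstackprotect, dropping -g, using -qlinedebug in place of -g, adding -qcompact), something like the sketch below can generate the variant command lines from one baseline command. It only produces strings and does not invoke the compiler. The function name `option_variants` and the shortened sample command are invented for illustration, and where `-qcompact` goes on the command line is an arbitrary choice.

```python
import shlex


def option_variants(compile_cmd):
    """Return a dict mapping a label to a modified command line.

    Hypothetical helper covering only the option changes reported
    in this thread as making the compile succeed.
    """
    args = shlex.split(compile_cmd)

    def without(flag):
        return [a for a in args if a != flag]

    def replaced(old, new):
        return [new if a == old else a for a in args]

    return {
        "baseline": shlex.join(args),
        "without -qstackprotect": shlex.join(without("-qstackprotect")),
        "without -g": shlex.join(without("-g")),
        "-qlinedebug instead of -g": shlex.join(replaced("-g", "-qlinedebug")),
        # Placement right after the compiler name is arbitrary.
        "with -qcompact": shlex.join(args[:1] + ["-qcompact"] + args[1:]),
    }


if __name__ == "__main__":
    # Shortened, illustrative version of the recreate command.
    baseline = (
        "/opt/IBM/xlC/16.1.0/bin/xlclang++ -q64 -O3 -g -qstackprotect "
        "-o ClassFileOracle.cpp.o -c ClassFileOracle.i"
    )
    for label, cmd in option_variants(baseline).items():
        print(f"{label}:\n  {cmd}")
```

Running each variant on the AIX machine and noting which ones hit the 1500-004 error would show whether the failure tracks memory use, as the defect update suggests.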