Look at moving macstadium machines to orka
I need to request a new machine:
- New machine operating system (e.g. linux/windows/macos/solaris/aix): mac
- New machine architecture (e.g. x64/aarch32/arm32/ppc64/ppc64le/sparc): x64
- Provider (leave blank if it does not matter): macstadium
- Desired usage: Replacement for the non-orka machine that we have to reduce costs
- Any unusual specification/setup required: Standard playbooks, although if there is an opportunity for more lower spec ones that could be beneficial
- How many of them are required: Start with 2, look at increasing.
Please explain what this machine is needed for:
As per the discussion a few weeks ago, the action is on me to progress this. George and I will look at this migration together.
Related: https://github.com/adoptium/temurin-build/issues/3354
Our orka systems have been deprovisioned due to inactivity. We are currently in negotiations to determine a way forward.
Discussions with MacStadium have indicated that an orka-based solution (which would not be sponsored at present) would be approximately twice the cost of the static systems which we have at present, so we are looking at alternative options.
Here is a breakdown of the number of systems and their types we have at macstadium:
| Use | x64 | aarch64 |
|---|---|---|
| Build | 2xG3 (4core) | 2xG5G |
| Test | 6xG3B (4core) 1xG4B (6core) | 2xG5A |
| TCK | 2xC3D (sml) 1xG4D (lge) | 2xG5E |
So that's a total of 4+9+5 = 18 systems. We currently have two hosted with MacInCloud, with a potential option to increase that, particularly for x64 capacity.
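As a sanity check on that total, the per-row counts can be summed directly. This is only a sketch: the counts are transcribed by hand from the table above and the variable names are illustrative.

```bash
#!/usr/bin/env bash
# Counts transcribed from the table above (x64 + aarch64 for each use).
build=$(( 2 + 2 ))       # 2xG3 + 2xG5G
test_=$(( 6 + 1 + 2 ))   # 6xG3B + 1xG4B + 2xG5A
tck=$(( 2 + 1 + 2 ))     # 2xC3D + 1xG4D + 2xG5E
total=$(( build + test_ + tck ))
echo "build=$build test=$test_ tck=$tck total=$total"
```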
Looking at the performance of various systems, here are some runs of the JDK8/x64 extended.openjdk suite on the different machines:
| System | Time | Failures? |
|---|---|---|
| TC G4D [*] | 2h28 | 17 (hostname issues) 3702 |
| TC G3D - i5/2C/8G [*] | 6h51 | Same hostname issues as G4D 3701 |
| G3B - i7/4C/16G | 3h03 | All passed |
| G3B - i7/4C/16G [*] | 3h38 | Three failures in java.nio |
| aarch64 (Rosetta) [*] | 2h24 | 14 failures |
| MacInCloud i7-8700B | 3h15 | 1 failure in com/sun/jndi/ldap |
| G4B i7/6C/32G [*] | 1h46 | 10 failures in net/nio/rmi |
[*] - These machines have not typically been used for running the openjdk suites in the past so these may be newly visible failures. The second G3B machine was one of the build machines rather than one tagged for test.
So with the exception of the second line, the performance of these for running the full extended.openjdk suite looks reasonable. It should be noted that it is between 2x and 2.5x slower to run the same tests on JDK21 so around 8h for a G3B and 3h30 for a G4B.
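To make the JDK21 estimate reproducible, here is a rough sketch that scales the JDK8 times from the table by 2x and 2.5x. The function name `scale_duration` is made up for illustration, and the results are extrapolations, not measured runs.

```bash
#!/usr/bin/env bash
# Scale an "XhYY" duration by a factor given in tenths (20 = 2.0x, 25 = 2.5x).
# Uses integer arithmetic, so fractions of a minute are truncated.
scale_duration() {
  local hours=${1%%h*} mins=${1#*h} tenths=$2
  # 10# forces base 10 so that minutes like "08" are not read as octal.
  local total=$(( (10#$hours * 60 + 10#$mins) * tenths / 10 ))
  printf '%dh%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# JDK8 times taken from the table: G3B (3h03) and G4B (1h46).
for t in 3h03 1h46; do
  echo "$t on JDK8 -> $(scale_duration "$t" 20) to $(scale_duration "$t" 25) on JDK21 (estimate)"
done
```

With those inputs the G3B range comes out at 6h06 to 7h37 and the G4B range at 3h32 to 4h25, which is in line with the rough figures quoted above.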
Some other pieces of note:
- We are currently using a number of older machines running macos 10. We should consider whether we wish to retain any such machines for compatibility testing.
- We have also tried some runs with cross-compiling from aarch64 to x64, with a view to reducing the number of (larger) build machines we require by not needing dedicated x64 build systems. This has been generally positive, although JDK8 has not yet been verified (we'll need to get a suitable boot JDK installed, since Adoptium doesn't produce one for JDK8 on aarch64).
- The cross-compilation described above would also be useful if we switch to using dynamically created VMs using the macos Virtualization Framework on aarch64, in support of having fully isolated environments for each build.
Noting that JDK8 will not build on macos12 with Xcode 13:
checking for xcodebuild... /usr/bin/xcodebuild
configure: error: Xcode 6, 9-12 is required to build JDK 8, the version found was 13.1. Use --with-xcode-path to specify the location of Xcode or make Xcode active by using xcode-select.
No configurations found for /Users/jenkins/sxa/temurin-build/build-farm/workspace/build/src/! Please run configure to create a configuration.
Makefile:55: *** Cannot continue. Stop.
OpenJDK make failed, archiving make failed logs
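A pre-flight check along the following lines could flag an unsuitable Xcode before configure runs. It is only a sketch that mirrors the wording of the error message above (major version 6, or 9 to 12); the function name `jdk8_xcode_ok` is hypothetical and the real configure logic may differ in detail.

```bash
#!/usr/bin/env bash
# Returns 0 if the given Xcode version has a major version that the JDK 8
# configure message above lists as acceptable (6, or 9 through 12).
jdk8_xcode_ok() {
  local major=${1%%.*}
  case "$major" in
    6|9|10|11|12) return 0 ;;
    *) return 1 ;;
  esac
}

# Example versions: 11.7 and 13.1 both appear in this thread.
for v in 11.7 13.1; do
  if jdk8_xcode_ok "$v"; then
    echo "Xcode $v: accepted for JDK 8"
  else
    echo "Xcode $v: rejected for JDK 8"
  fi
done
```

On a real machine the version string would come from something like the first line of `xcodebuild -version` rather than a hard-coded list.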
If I try a cross-compile from macos11/aarch64 with Xcode 12, I need to make a couple of other changes.
It can be made to try a build by adjusting mac.sh to ensure `xcode-select -switch /` is run, and using `--openjdk-target=x86_64-apple-darwin` in the configure args. However, for JDK8 the build fails with some more errors:
error: use of undeclared identifier 'finite'; did you mean 'isfinite'?
Which seems to have been deprecated and then removed in earlier Xcode versions (Possible backport?)
Error: value size does not match register size specified by the constraint and modifier [-Werror,-Wasm-operand-widths]
may be more problematic
Just to be rigorous, I've kicked off the AQA test pipeline on all of our mac machines: JDK8 and 11 for x64, just 11 for arm. The focus is the build and test-macstadium machines; the other machines can be used as a 'control'.
- test-macstadium-macos1014-x64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/158/console
- test-macstadium-macos1014-x64-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/157/console
- test-macstadium-macos11-arm64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/162/console
- test-macstadium-macos11-arm64-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/161/console
- test-macstadium-macos1014-x64-3 https://ci.adoptium.net/job/AQA_Test_Pipeline/163/console
- test-macstadium-macos1014-x64-4 https://ci.adoptium.net/job/AQA_Test_Pipeline/164/console
- test-macstadium-macos1015-x64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/165/console
- build-macstadium-macos11-arm64-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/166/console
- build-macstadium-macos11-arm64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/167/console
- build-macstadium-macos1014-x64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/168/console
- build-macstadium-macos1014-x64-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/169/console
- test-macincloud-macos1201-x64-1 https://ci.adoptium.net/job/AQA_Test_Pipeline/170/console
- test-macincloud-macos1201-x64-2 https://ci.adoptium.net/job/AQA_Test_Pipeline/171/console
Bit of a bad idea to run all of them at the same time. Some of the test jobs have expired even after 1 day.
Sifting through the tests that have finished and not expired, and avoiding duplicates (i.e. if jdk_security1_0 and jdk_security1_1 have the same failed tests, only jdk_security1_0 is shown):
test-macstadium-macos11-arm64-1 jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_net_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0
test-macstadium-macos11-arm64-2 (same failures as -1) jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_net_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0
build-macstadium-macos11-arm64-2 jdk_math_1,jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_security3_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0
build-macstadium-macos11-arm64-1 (same failures as -2) jdk_math_1,jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_security3_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0
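The test and build machine lists above are nearly identical, so it is easier to see the difference mechanically. A small sketch using `comm`, with the two lists pasted in by hand from the results above; the helper names are just for illustration.

```bash
#!/usr/bin/env bash
# Use the C locale so that sort and comm agree on ordering.
export LC_ALL=C

# Failing targets copied from the lists above.
test_arm64="jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_net_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0"
build_arm64="jdk_math_1,jdk_security1_0,jdk_security4_0,jdk_util_0,jdk_svc_sanity_0,jvm_compiler_0,jdk_io_0,jdk_other_0,jdk_net_0,jdk_security3_0,jdk_time_0,jdk_tools_0,jdk_jfr_0,jdk_jdi_0,jdk_security_infra_0"

# Turn a comma-separated list into sorted, de-duplicated lines.
to_sorted_lines() { tr ',' '\n' <<<"$1" | tr -d ' ' | sort -u; }

# Print entries that appear in the second list but not in the first.
only_in_second() {
  comm -13 <(to_sorted_lines "$1") <(to_sorted_lines "$2")
}

echo "Failing only on the build machines:"
only_in_second "$test_arm64" "$build_arm64"
```

For these two lists the only targets unique to the build machines are jdk_math_1 and jdk_security3_0, and nothing fails on the test machines alone.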
So the failures you've got are only from the arm64 ones? And are all those targets from the openjdk suite? Were the other targets all good? I'm a bit surprised we're seeing issues on arm64 when using the arm64 builds. I would expect some issues when trying to run the x64 ones on arm64, but it looks like you've run those with the real arm64 build. Is that correct? I'm particularly interested in test-macstadium-macos1014-x64-4 and the build-x64 ones, so if those results have got lost we should get those re-run.
| Machine | Xcode version | JDK11 x64 build | JDK17 x64 build | JDK20 x64 build |
|---|---|---|---|---|
| build-macstadium-macos11-arm64-1 | Apple clang version 12.0.0 (clang-1200.0.32.29) | build ✅ | build ✅ | build ✅ |
| build-macstadium-macos11-arm64-2 | Apple clang version 12.0.0 (clang-1200.0.32.29) | build ✅ | build ✅ | build ✅ |
| test-macstadium-macos11-arm64-1 | Apple clang version 12.0.0 (clang-1200.0.32.29) | build ✅ | build ✅ | build ✅ |
| test-macstadium-macos11-arm64-2 | Apple clang version 12.0.0 (clang-1200.0.32.29) | build ✅ | build ✅ | build ✅ |
| test-macincloud-macos1201-x64-1 | Apple clang version 13.0.0 (clang-1300.0.29.3) | build ✅ | build ✅ | build ✅ |
| test-macincloud-macos1201-x64-2 | Apple clang version 13.1.6 (clang-1316.0.21.2.3) | build | build ✅ | build ✅ |
Can only kick off one build job at a time and on one machine at a time 😅 , this will take a while
A couple of other things to add to this list - see if we can build ok on clang13 on macos12 (The two macincloud machines) but also see if we can install the older version of xcode (The one used for JDK8) on a newer macos version.
Notes from building x64 jdk8 on my m1 mac
Install xcode11.7. I can do this on my own mac (with GUI), need to find a way to do this headless
Switch to xcode 11.7
xcode-select -switch 'path to Xcode11.7'
Install 'intel' homebrew into /usr/local/Homebrew, requires a new Rosetta bash shell
arch -x86_64 /usr/bin/env bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Back to a non Rosetta shell:
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig"
Install intel libpng (for freetype)
arch -x86_64 brew install libpng
Command to run build
arch -x86_64 ./makejdk-any-platform.sh --clean-git-repo --jdk-boot-dir 'path to x64 jdk8 mac binary'/Contents/Home --configure-args '--with-toolchain-type=clang --openjdk-target=x86_64-apple-darwin --with-cups=/opt/homebrew/opt/cups/' --target-file-name jdk8_x64.tar.gz --build-variant temurin jdk8u
If there are still errors with the freetype compilation, install intel freetype and rerun the build
arch -x86_64 brew install freetype
I built another x64 jdk8 binary on build-macstadium-macos11-arm64-1 and uploaded it to jenkins here
I kicked off the aqa test pipeline, https://ci.adoptium.net/job/AQA_Test_Pipeline/173/console. Only sanity openjdk failed https://ci.adoptium.net/job/Test_openjdk8_hs_sanity.openjdk_x86-64_mac/883/
jdk_jdi_jdk8_0
com/sun/jdi/RedefineCrossEvent.java.RedefineCrossEvent
com/sun/jdi/PrivateTransportTest.sh.PrivateTransportTest
In the interest of seeing how x64 mac tests run on arm64 mac, I kicked off https://ci.adoptium.net/job/AQA_Test_Pipeline/174/console (jdk11 aqa tests on test-macstadium-macos11-arm64-1).
Most tests passed. Failing ones are:
Jlink_ReqMod
MathLoadTest_all_5m
jdk_io
java/io/Serializable/serialFilter/GlobalFilterTest.java
jdk_time
java/time/test/java/time/format/TestUTCParse.java
jdk_jfr_0 44 failed tests
jdk_jdi
com/sun/jdi/JdbOptions.java
jdk_security_infra
security/infra/java/security/cert/CertPathValidator/certification/GoogleCA.java
jdk_svc_sanity
jdk/jfr/jcmd/TestJcmdStartStopDefault.java
Ref https://github.com/adoptium/infrastructure/issues/2536#issuecomment-1714401394
com/sun/jdi/RedefineCrossEvent.java.RedefineCrossEvent is excluded on openj9, https://github.com/adoptium/aqa-tests/blob/80e978693163b65ce6d3caabeb823ba594766167/openjdk/excludes/ProblemList_openjdk8-openj9.txt#L333
Known issue https://github.com/adoptium/aqa-tests/issues/227, it fails the same way
Execution failed: `main' threw exception: com.sun.jdi.VMDisconnectedException: connection is closed
Rerunning com/sun/jdi/PrivateTransportTest.sh.PrivateTransportTest on test-macstadium-macos1014-x64-2 https://ci.adoptium.net/job/Grinder/7564/console. Test passed ✅
So a cross compiled x64 jdk8 binary passes the tests in the AQA pipeline. Excellent news
Ref https://github.com/adoptium/infrastructure/issues/2536#issuecomment-1721206291
Rerunning the failing tests on different arm64 mac machines to rule out infra related failure
Jlink_ReqMod, MathLoadTest_all_5m https://ci.adoptium.net/view/Test_grinder/job/Grinder/7568/console on build-macstadium-macos11-arm64-2
MathLoadTest_all_5m passed, rerunning Jlink_ReqMod on build-macstadium-macos11-arm64-1 https://ci.adoptium.net/view/Test_grinder/job/Grinder/7574/console
On build-macstadium-macos11-arm64-1
java/io/Serializable/serialFilter/GlobalFilterTest.java https://ci.adoptium.net/job/Grinder/7569/console ✅
java/time/test/java/time/format/TestUTCParse.java https://ci.adoptium.net/job/Grinder/7570/console
com/sun/jdi/JdbOptions.java https://ci.adoptium.net/job/Grinder/7571/console ✅
security/infra/java/security/cert/CertPathValidator/certification/GoogleCA.java https://ci.adoptium.net/job/Grinder/7572/console
jdk/jfr/jcmd/TestJcmdStartStopDefault.java https://ci.adoptium.net/job/Grinder/7573/console ✅
security/infra/java/security/cert/CertPathValidator/certification/GoogleCA.java rerun
https://ci.adoptium.net/job/Grinder/7575/console on build-macstadium-macos11-arm64-2
java/time/test/java/time/format/TestUTCParse.java rerun https://ci.adoptium.net/job/Grinder/7576/console on build-macstadium-macos11-arm64-2
I've modified https://github.com/adoptium/infrastructure/blob/6dff77f14bab907d90d2f16b61ac8f0e96b60b3a/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/roles/Xcode/tasks/main.yml#L76 to install Xcode11.7 onto arm64 macs, but when I run the playbook it hangs at that task for a considerable amount of time (so far it's been 30 minutes with no change).
On the remote machine I can see the xcversion process running.
If I try to install Xcode11.7 in an ssh session using the ansible commands linked above, I get this error:
%xip: error: The archive “Xcode_11.7.xip” is damaged and can’t be expanded.
No `Xcode.app(or Xcode-beta.app)` found in XIP. Please remove /Users/administrator/Library/Caches/XcodeInstall/Xcode_11.7.xip if you suspect a corrupted download or run `xcversion update` to see if the version you tried to install has been pulled by Apple. If none of this is true, please open a new GH issue.
administrator@test-macstadium-macos11-arm64-1 ~ %
I've tried an `xcversion update` but it still fails.
It seems xcversion is no longer supported: https://github.com/xcpretty/xcode-install/blob/master/MIGRATION.md
I'm trying out the suggested alternative, https://github.com/XcodesOrg/xcodes, but am hitting library errors:
administrator@test-macstadium-macos11-arm64-1 ~ % xcodes --help
dyld: lazy symbol binding failed: can't resolve symbol _swift_task_create in /opt/homebrew/bin/xcodes because dependent dylib @rpath/libswift_Concurrency.dylib could not be loaded
dyld: can't resolve symbol _swift_task_create in /opt/homebrew/bin/xcodes because dependent dylib @rpath/libswift_Concurrency.dylib could not be loaded
zsh: abort xcodes --help
I think @rpath/libswift_Concurrency.dylib comes with xcode 13 which is not yet on the machine
ref https://github.com/adoptium/ci-jenkins-pipelines/pull/825#issuecomment-1759488201
I have temporarily added x64 labels to build-macstadium-macos11-arm64-1 and build-macstadium-macos11-arm64-2 to allow x64 mac jdk build jobs to run on them.
The PATH variable in their jenkins config has (temporarily) been changed from /usr/local/bin/:$PATH:/opt/homebrew/bin to /opt/homebrew/Cellar/git/2.42.0/bin:/usr/local/bin/:$PATH:/opt/homebrew/bin
Work on the orka setup is still ongoing at macstadium - will update this ticket when things move forward
I have managed to get an Arm64 Orka VM to successfully compile JDK8u using Haroon's changes to the playbook:
https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk8u/job/jdk8u-mac-x64-temurin/436/
Now onto JDK11+, which will require a different version of Xcode to be installed.
Intel tests are passing on the intel VM image. I'm going to take the fixed ones offline to see if Orka can cope
JDK11 build completed using XCode command line tools (same as before) https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk11u/job/jdk11u-mac-x64-temurin/327/
~~JDK17 x64 build: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-mac-x64-temurin/405/~~ ~~JDK17 aarch64 build: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-mac-aarch64-temurin/351/~~
Trying again with Xcode 15.0.1:
- JDK21 x64: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk21u/job/jdk21u-mac-x64-temurin/37/
- JDK21 aarch64: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk21u/job/jdk21u-mac-aarch64-temurin/35/
- JDK17 x64: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-mac-x64-temurin/406/
- JDK17 aarch64: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-mac-aarch64-temurin/352/
Right now the main issues I'm seeing are with the VPN expiring after a certain amount of time, this should be resolved once the firewall is configured to allow Jenkins in
@gdams Not sure it's been explicitly mentioned in here, but since it came up in the PMC this week, can you clarify the reason for moving to Xcode 15? The openjdk build matrix lists 12 as the Oracle-supported compiler, with 13.1 as "known good" too. It seems possible that this is the cause of a lot of warnings showing in the build: https://github.com/adoptium/temurin-build/issues/3562 so we should consider how to handle this.