Update RISC-V base image to 24.04

Open luhenry opened this issue 1 year ago • 3 comments

Currently, the riscv64 docker images are based on Ubuntu 20.04. That choice uses the earliest glibc version available for RISC-V, to guarantee the broadest compatibility with other distributions. However, it also means using an ancient GCC compiler (GCC 10), which misses all the optimizations and development of GCC 11, 12, 13 and 14.

RISC-V being such a rapidly evolving architecture and SW ecosystem, I want to propose the use of Ubuntu 24.04 as the base image for building Eclipse Temurin on RISC-V. Ubuntu 24.04 is IMO the right target as it has the same glibc version as Fedora 40 (glibc 2.39; https://packages.fedoraproject.org/pkgs/glibc/glibc/; https://packages.ubuntu.com/noble/libc6), the distribution which is going to be the base for AlmaLinux 10 and RockyLinux 10.

This update would give us access to GCC 14, which fixes some of the limitations we're running into with older GCC versions, among which: dependency on libatomic, absence of autovectorization, poor bitmanip optimizations, and overall worse code generation.

luhenry avatar Jun 28 '24 12:06 luhenry

cc @sxa @gdams

luhenry avatar Jun 28 '24 12:06 luhenry

If the only reason is to get newer gcc functionality then my preference would very much be to do the same as we do on all other Linux platforms, i.e. build a custom gcc or (ideally, since we're doing this for most Linuxes now) use a devkit build for this.

sxa avatar Jun 28 '24 17:06 sxa

use a devkit build for this.

I've had a look at this, and while the process puts up a bit of a fight on riscv64 I have managed to get an initial devkit build for riscv64 that seems to work. My original devkit, built on Ubuntu 24.04, required a machine with GLIBC_2.38 so would not run on Ubuntu 20.04 (although the openjdk that is built with it would have been happy on glibc 2.27 systems), but I have now built a second devkit on Ubuntu 20.04; however, it is missing gdb (not required for the builds). The devkits are built using the rpms from Fedora 27 to allow it to build a JDK which runs on distributions based on the earlier libc version, similar to what we do on other Linux platforms using the CentOS7 rpms. Fedora was chosen here as there are no CentOS versions available for RISC-V.

Some links. Note that these should not be considered permanent:

To use it to build OpenJDK:

  • curl -L https://ci.adoptium.net/userContent/devkit/riscv64-linux-gnu-to-riscv64-linux-gnu-F27-devkit.glibc217.tar.xz | tar -C /usr/local -xpJf -
  • git clone https://github.com/adoptium/temurin-build
  • cd temurin-build/build-farm
  • export CONFIGURE_ARGS=--with-devkit=/usr/local/riscv64-linux-gnu-to-riscv64-linux-gnu/riscv64-linux-gnu
  • ./make-adopt-build-farm.sh jdk21u

And it will produce something that will happily run on the distributions that the current builds support (technically earlier, as the devkit is 2.27 and the previous Ubuntu 20.04 builds were 2.32), but using the gcc 13.2 from the devkit.
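
As a rough illustration of the compatibility claim above, the following sketch checks whether a host's glibc is at least the 2.27 baseline of the devkit sysroot. The function names are made up for this example, and it assumes GNU sort (for -V) and a glibc system where getconf GNU_LIBC_VERSION is available; it is not part of the Adoptium tooling.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort), so 2.9 sorts before 2.27.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: report the host glibc version.
# Assumes "getconf GNU_LIBC_VERSION" prints something like "glibc 2.31".
host_glibc_version() {
    getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}'
}

# 2.27 is the devkit sysroot version quoted in this thread.
if version_at_least "$(host_glibc_version)" 2.27; then
    echo "host glibc is new enough for a JDK built against the 2.27 sysroot"
else
    echo "host glibc is older than 2.27 (or could not be determined)"
fi
```

A plain string or numeric comparison would get cases like 2.9 versus 2.27 wrong, which is why the sketch leans on version sort.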

Before continuing with any effort to put this into production I'd like to get it building reliably on systems within the Adoptium infrastructure (I've had to run this in multiple phases just to get the single devkit out for the first version, and the second excludes gdb - not sure yet why it's being so problematic) but for now I might put this devkit into the build images (but not use it in the default jenkins pipelines) so it is available for testing.

sxa avatar Aug 17 '24 12:08 sxa

I've been able to successfully build a devkit with GCC14.2 using the RPMs from Fedora 27/28 with glibc 2.27. Unlike my earlier attempts it seems to build reliably on the Scaleway machines and takes around 14 hours.

sxa avatar Feb 18 '25 12:02 sxa

New devkit at https://ci.adoptium.net/userContent/devkit/devkit.riscv64.GCC142.F27F28.scaleway9.tar.xz (sha256sum b1ded6f28dfe3260e0b061fbede07e9f754ca218278c26bdfb9af4f0f6bffcea) This was built (as you might guess from the name) on scaleway machine 9 running the existing Ubuntu 20.04 build container using the sysroot rpms from Fedora 27/28. I've added two additional files - the devkit build log and a list of the rpms and sha256sums that were downloaded for the sysroot.
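
For anyone downloading this, a minimal sketch of checking the tarball against the sha256sum quoted above might look like the following. The function name verify_sha256 is invented for this example; it assumes sha256sum from GNU coreutils is on the PATH.

```shell
#!/bin/sh
# Succeeds when the SHA-256 of file $1 equals the expected hex digest $2.
# A missing file yields an empty digest, so the comparison fails.
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Example use (not run here; URL and digest are the ones quoted above):
#   curl -LO https://ci.adoptium.net/userContent/devkit/devkit.riscv64.GCC142.F27F28.scaleway9.tar.xz
#   verify_sha256 devkit.riscv64.GCC142.F27F28.scaleway9.tar.xz \
#       b1ded6f28dfe3260e0b061fbede07e9f754ca218278c26bdfb9af4f0f6bffcea \
#       || echo "checksum mismatch, do not extract" >&2
```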

The times to run the devkit build can be seen from the timestamps below - 13½ hours for GCC:

sxa@test-rise-ubuntu2404-riscv64-9:~/jdk/build/devkit/riscv64-linux-gnu$ find . -name log.build -print | xargs ls -lart
-rw-rw-r-- 1 sxa zeus   175805 Feb 18 17:50 ./riscv64-linux-gnu/binutils-2.41/log.build
-rw-rw-r-- 1 sxa zeus   305569 Feb 18 17:56 ./riscv64-linux-gnu/gmp-6.3.0/log.build
-rw-rw-r-- 1 sxa zeus   722637 Feb 18 17:57 ./riscv64-linux-gnu/mpfr-4.2.0/log.build
-rw-rw-r-- 1 sxa zeus    74474 Feb 18 17:58 ./riscv64-linux-gnu/mpc-1.3.1/log.build
-rw-rw-r-- 1 sxa zeus 12809227 Feb 19 07:31 ./riscv64-linux-gnu/gcc-14.2.0/log.build
-rw-rw-r-- 1 sxa zeus   118027 Feb 19 07:36 ./riscv64-linux-gnu/binutils-2.41-lib/log.build
-rw-rw-r-- 1 sxa zeus   613795 Feb 19 08:07 ./riscv64-linux-gnu/gdb-13.2/log.build
-rw-rw-r-- 1 sxa zeus      915 Feb 19 08:08 ./riscv64-linux-gnu/ccache-3.7.12/log.build
sxa@test-rise-ubuntu2404-riscv64-9:~/jdk/build/devkit/riscv64-linux-gnu$ 
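
The GCC figure can be reproduced from the listing with a small sketch like the one below. It assumes GNU date (for -d), that the components were built one after another so that GCC started when mpc finished, and that the year is 2025 (ls does not print it); the function name elapsed_hm is made up for this example.

```shell
#!/bin/sh
# Print the time between two timestamps as "<hours>h<minutes>m".
# Assumes GNU date; TZ=UTC avoids daylight-saving surprises.
elapsed_hm() {
    start=$(TZ=UTC date -d "$1" +%s) || return 1
    end=$(TZ=UTC date -d "$2" +%s) || return 1
    minutes=$(( (end - start) / 60 ))
    printf '%dh%02dm\n' $(( minutes / 60 )) $(( minutes % 60 ))
}

# mpc finished at Feb 18 17:58 and gcc at Feb 19 07:31 in the listing above.
elapsed_hm "2025-02-18 17:58" "2025-02-19 07:31"   # prints 13h33m
```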

sxa avatar Feb 19 '25 10:02 sxa

Build of jdk21 with the devkit seems to work ok 👍🏻 I haven't run anything through the full set of tests yet but I wouldn't anticipate too many issues in that respect. Follow-on activities now that the prototype is successful:

  • Create jenkins job to be able to build the devkit (Based on the jdk head devkit, although if we want to use GCC14.2 instead of 13.2 we'll need to patch that ... Not a problem as we currently patch it elsewhere) [*]
  • Adjust the playbooks to pick up the devkit (I've tested with my "preview" devkit linked in the comment above and it works with the latest commit in https://github.com/sxa/infrastructure/commits/riscv64_devkit1/)
  • Verify that builds with the devkit are also reproducible (Which should be feasible now)
  • Aim to have GCC14.2 builds enabled for March's JDK24 release.

For reference for anyone not used to using the devkit: The configure command should be run with --with-devkit=/usr/local/devkit/riscv64-linux-gnu-to-riscv64-linux-gnu or wherever you've extracted it to.

[*] - Note that the current devkit is based on some older Fedora repositories which have a mix of F27 and F28 rpms. Using a devkit based on a newer version of Fedora (Not sure there is one prior to 36) would impact its ability to run on Ubuntu 20.04 or other versions that people may have out there. We should ensure that no-one objects to using this. The patch that I've used for Tools.gmk is as follows:

$ git diff
diff --git a/make/devkit/Tools.gmk b/make/devkit/Tools.gmk
index 249eaa66247..56e348b29b6 100644
--- a/make/devkit/Tools.gmk
+++ b/make/devkit/Tools.gmk
@@ -68,7 +68,8 @@ else ifeq ($(BASE_OS), Fedora)
     BASE_OS_VERSION := $(DEFAULT_OS_VERSION)
   endif
   ifeq ($(ARCH), riscv64)
-    BASE_URL := http://fedora.riscv.rocks/repos-dist/f$(BASE_OS_VERSION)/latest/$(ARCH)/Packages/
+#    BASE_URL := http://fedora.riscv.rocks/repos-dist/f$(BASE_OS_VERSION)/latest/$(ARCH)/Packages/
+    BASE_URL := https://secondary.fedoraproject.org/pub/alt/risc-v/archive/RPMS/riscv64
   else
     LATEST_ARCHIVED_OS_VERSION := 35
     ifeq ($(filter x86_64 armhfp, $(ARCH)), )
@@ -92,9 +93,9 @@ endif
 # Define external dependencies
 
 # Latest that could be made to work.
-GCC_VER := 13.2.0
-ifeq ($(GCC_VER), 13.2.0)
-  gcc_ver := gcc-13.2.0
+GCC_VER := 14.2.0
+ifeq ($(GCC_VER), 14.2.0)
+  gcc_ver := gcc-14.2.0
   binutils_ver := binutils-2.41
   ccache_ver := ccache-3.7.12
   mpfr_ver := mpfr-4.2.0
$ 

sxa avatar Feb 19 '25 14:02 sxa

Job at https://ci.adoptium.net/job/build-scripts/job/utils/job/devkit/job/devkit-gcc-linux-riscv64/33/ completed on one of our riscv64 boards and produced the devkit as an artifact on that job (Upload step failed, but that's not critical for this).

Running a build as follows after extracting the tarball into /usr/local/devkit:

git clone https://github.com/adoptium/temurin-build
cd temurin-build/build-farm
CONFIGURE_ARGS=--with-devkit=/usr/local/devkit ./make-adopt-build-farm.sh jdk21u
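
Before kicking off a long build it may be worth confirming that the tarball really landed where --with-devkit points. The sketch below assumes the devkit root contains bin/gcc, which matches the /usr/local/devkit layout used in this comment; the function name devkit_configure_arg is invented for this example and other devkit tarballs may unpack differently.

```shell
#!/bin/sh
# Print the configure argument for a devkit rooted at $1, or fail if the
# directory does not look like an extracted devkit (no executable bin/gcc).
devkit_configure_arg() {
    if [ -x "$1/bin/gcc" ]; then
        printf '%s%s\n' '--with-devkit=' "$1"
    else
        echo "no executable gcc under $1/bin" >&2
        return 1
    fi
}

# Example use (not run here):
#   CONFIGURE_ARGS=$(devkit_configure_arg /usr/local/devkit) && export CONFIGURE_ARGS
```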

sxa avatar Mar 21 '25 11:03 sxa

Result:

Tools summary:
* Boot JDK:       openjdk version "21.0.5" 2024-10-15 LTS OpenJDK Runtime Environment Temurin-21.0.5+11 (build 21.0.5+11-LTS) OpenJDK 64-Bit Server VM Temurin-21.0.5+11 (build 21.0.5+11-LTS, mixed mode, sharing) (at /usr/lib/jvm/jdk21)
* Toolchain:      gcc (GNU Compiler Collection)
* Devkit:         gcc-14.2.0 - Fedora_28 (/usr/local/devkit)
* C Compiler:     Version 14.2.0 (at /usr/local/devkit/bin/gcc)
* C++ Compiler:   Version 14.2.0 (at /usr/local/devkit/bin/g++)

[...]

=JAVA VERSION OUTPUT=
openjdk version "21.0.7-beta" 2025-04-15
OpenJDK Runtime Environment Temurin-21.0.7+4-202503211136 (build 21.0.7-beta+4-202503211136)
OpenJDK 64-Bit Server VM Temurin-21.0.7+4-202503211136 (build 21.0.7-beta+4-202503211136, mixed mode, sharing)
=/JAVA VERSION OUTPUT=
===GENERATING RELEASE FILE===

So we need to

  1. determine if we're ok with using this particular devkit or if we want to base it on a later Fedora
  2. make it publish to devkit-binaries
  3. include it in the build image (For prototyping we could put a PR into the playbooks to install it from the jenkins job without it being formally published i.e. skip step 2)
  4. enable it in the build

sxa avatar Mar 21 '25 11:03 sxa

I've done two consecutive jdk24 builds with the devkit and the contents of the binary tarballs are binary identical which is consistent with most other Linux platforms.

export CONFIGURE_ARGS=--with-devkit=/home/sxa/jdk/build/devkit/result/riscv64-linux-gnu-to-riscv64-linux-gnu
export SCM_REF=jdk-24+36_adopt
export VARIANT=temurin
BUILD_ARGS="--release --clean-libs --tag jdk-24+36_adopt" ./make-adopt-build-farm.sh jdk24

Extracting under /usr/local/devkit/gcc-14.2.0-Fedora_28-b00 allows it to be built if you add --use-adoptium-devkit gcc-14.2.0-Fedora_28-b00 to BUILD_ARGS (in this case, CONFIGURE_ARGS from the above instructions should be unset). Note that in order to be reproducible the devkit has to be in the same location on the machine (whether used via --use-adoptium-devkit or --with-devkit).
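
For anyone wanting to repeat the comparison of two build outputs, here is a rough sketch. It is not the comparison tooling used for the builds above: it only summarises file paths and file contents (not permissions, timestamps or archive ordering), so it is a weaker check than a byte-for-byte comparison. The function name tarball_content_digest is made up, and GNU tar, find, sort and xargs are assumed.

```shell
#!/bin/sh
# Print one digest summarising the regular files inside tarball $1.
# Two tarballs with the same paths and file contents give the same digest.
tarball_content_digest() {
    tmpdir=$(mktemp -d) || return 1
    if ! tar -xf "$1" -C "$tmpdir"; then
        rm -rf "$tmpdir"
        return 1
    fi
    # Hash every file in a stable order, then hash the list of hashes.
    digest=$( (cd "$tmpdir" && find . -type f -print0 \
        | LC_ALL=C sort -z | xargs -0 -r sha256sum) | sha256sum | cut -d' ' -f1)
    rm -rf "$tmpdir"
    printf '%s\n' "$digest"
}

# Example use (not run here; file names are placeholders):
#   [ "$(tarball_content_digest build1.tar.gz)" = "$(tarball_content_digest build2.tar.gz)" ] \
#       && echo "contents match"
```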

sxa avatar Mar 26 '25 18:03 sxa

Sample JDK24 build pipeline utilising the devkit in the updated build image: https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk24/job/jdk24-linux-riscv64-temurin/16/console (I ran it as a headful build too)

sxa avatar Apr 01 '25 09:04 sxa

Re-opening to track updating the compiler for earlier versions, but JDK25+34 build in https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk/job/jdk-linux-riscv64-temurin/205/ is the first of the builds from the regular pipelines to be built with the GCC14 devkit.

sxa avatar May 22 '25 23:05 sxa

Test results seem comparable to the previous versions so I think this is good :-)

Noting that this will be the first platform on GCC14. Others will be looked at as part of https://github.com/adoptium/temurin-build/issues/4110

sxa avatar May 23 '25 12:05 sxa

@luhenry We've got this running regularly on JDK25 now. Does it make sense to roll out the GCC14 build process onto earlier versions (21, 17)? I haven't done a test build with the new devkit on either of those yet (I haven't tried JDK17 with any devkit so far on any platform ;-) )

sxa avatar Aug 04 '25 10:08 sxa

Generally speaking, it would be great to be on the latest stable release of GCC on RISC-V, as there are still quite a few bugs getting fixed. So it's not only about performance but also about correctness of the generated code.

I would then support using GCC 14 for JDK 21 and 17 if possible.

luhenry avatar Aug 04 '25 10:08 luhenry

Trialling with the devkit options in the following jobs:

  • https://ci.adoptium.net/job/build-scripts/job/jobs/job/jdk17u/job/jdk17u-linux-riscv64-temurin/13/
  • https://ci.adoptium.net/job/build-scripts/job/jobs/job/evaluation/job/jobs/job/jdk11u/job/jdk11u-evaluation-linux-riscv64-temurin/111/

sxa avatar Aug 04 '25 11:08 sxa

The jdk17u failures in *.functional are a known problem: java.lang.UnsatisfiedLinkError: Can't load library: /home/jenkins/workspace/Test_openjdk17_hs_dev.functional_riscv64_linux/jdkbinary/j2sdk-image/lib/libawt_xawt.so with https://bugs.openjdk.org/browse/JDK-8324305

luhenry avatar Aug 05 '25 10:08 luhenry

* https://ci.adoptium.net/job/build-scripts/job/jobs/job/evaluation/job/jobs/job/jdk11u/job/jdk11u-evaluation-linux-riscv64-temurin/111/

Noting that this pipeline failed overall because the JDK11/extended.perf job #9 (running on test-rise-ubuntu2404-riscv64-2) hit its 25 hour timeout. The last successful run of that job (albeit back in November, which seems to have been the last time it was able to build it from a tag) took 1h15.

EDIT: Re-runs of the job in Grinder look ok, although they took varying amounts of time:

Grinder #  machine     time  result
13902      rise2404-2  1h30  four renaissance tests failed
13903      rise2404-1  2h05  three renaissance tests failed
13904      rise2404-7  -     Timed out after 25h

sxa avatar Aug 07 '25 08:08 sxa

Builds are running with the Fedora-28 based devkit - closing. Documentation on the versions has been added in https://github.com/adoptium/temurin-build/pull/4241

sxa avatar Sep 02 '25 10:09 sxa