build icon indicating copy to clipboard operation
build copied to clipboard

Upgrading the compiler toolchain

Open miladfarca opened this issue 7 months ago • 31 comments

V8 recently introduced std::format, but the change was reverted due to lack of support on older compilers. This feature is only available starting from GCC 13.1 and Clang 14 , which means the existing GCC base version (12.2) might become obsolete soon.

Additionally, V8 officially discontinued support for GCC at the end of July 2024. Any remaining GCC support now mainly comes from third-party platforms on V8 (e.g., ppc64 and s390x).

Rust was also added as a dependency to help support Temporal. Since Rust relies on LLVM for code generation, maintaining support for the LLVM toolchain (and Clang) will be necessary going forward.

Considering the availability of GCC vs Clang across different platforms (Ubuntu, Debian, RHEL, AIX, etc.), should the community aim to upgrade to GCC 13.1+ or transition entirely to Clang?

A mixed toolchain approach may be less than ideal, as platforms sticking with GCC would need to continue patching compilation errors as they arise.

miladfarca avatar Jun 03 '25 16:06 miladfarca

Are there platforms that we support and don't have Clang ?

targos avatar Jun 03 '25 18:06 targos

/cc @nodejs/build

miladfarca avatar Jun 04 '25 11:06 miladfarca

As discussed we need to map supported Clang versions across the OS versions we support.

miladfarca avatar Jun 19 '25 18:06 miladfarca

Are there platforms that we support and don't have Clang ?

Clang is being worked on for IBM i, but is not currently available (neither is GCC 13+).

kadler avatar Jun 19 '25 18:06 kadler

Initial changes have landed for ppc64le/s390x and we are now building/testing V8 (release) with Clang on RHEL: https://chromium-review.googlesource.com/c/chromium/src/+/6705431

miladfarca avatar Jul 22 '25 19:07 miladfarca

With https://github.com/nodejs/node/pull/59048, we will require Clang >=19.

targos avatar Jul 22 '25 19:07 targos

Correct, currently using clang version 19.1.7 on RHEL 8.10 .

miladfarca avatar Jul 22 '25 19:07 miladfarca

fyi @nodejs/platform-smartos

miladfarca avatar Jul 31 '25 16:07 miladfarca

fyi @nodejs/platform-smartos

Our current Jenkins build zones we provide you do not have gcc13 in them. The ones I use to fix bugs, however:

nuc-build-2024(~)[0]% gcc --version
gcc (GCC) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

nuc-build-2024(~)[0]% 

do. We should probably discuss updating them^H^H^H^H the build zones, possibly even spinning new ones if my recent contributions to v8 (which still haven't landed upstream in google's v8 yet) work as well as they should.

danmcd avatar Jul 31 '25 18:07 danmcd

We also have clang in pkgsrc:

nuc-build-2024(~)[0]% pkgin avail | grep clang
clang-18.1.8nb3      C language family frontend for LLVM
opencl-clang-18.1.0nb3 OpenCL-oriented wrapper library around clang
nuc-build-2024(~)[0]% 

and I should see if I can build using that.

danmcd avatar Jul 31 '25 18:07 danmcd

Also, this v8 fix isn't in node yet (never mind upstream V8), but needs to be if we wanna solve the gcc13 and/or clang problem: nodejs/node#58237

danmcd avatar Jul 31 '25 20:07 danmcd

So we can start experimenting, here is a PR to install Clang on the Fedora-based CI hosts: https://github.com/nodejs/build/pull/4116

targos avatar Aug 06 '25 15:08 targos

I'm working on the RHEL-based hosts. Currently working through an issue with linking to libatomic when using clang (trying out potential solutions). Will link to the PR when I open it.

richardlau avatar Aug 14 '25 13:08 richardlau

@richardlau fyi I'm currently using use_custom_libcxx = true to solve it.

miladfarca avatar Aug 14 '25 13:08 miladfarca

@richardlau fyi I'm currently using use_custom_libcxx = true to solve it.

FWIW for now I can build the current main of Node.js with libatomic -- I just have to install a matching gcc-toolset-*-libatomic-devel (rather surprisingly, the clang RPM depends on a gcc-toolset (14 for the current clang RPM) and by default picks that to look for headers/libraries). The complication being that it doesn't look like UBI 8 has an RPM for gcc-toolset-14-libatomic-devel (but RHEL 8/RHEL 9/UBI 9 does). I am currently exploring:

  • Install gcc-toolset-14-libatomic-devel. Source the RPM from somewhere else for the UBI 8 based containers, e.g. https://repo.almalinux.org/almalinux/8/AppStream/x86_64/os/Packages/gcc-toolset-14-libatomic-devel-14.2.1-1.1.el8_10.x86_64.rpm. (See if we can get someone in Red Hat to consider adding the package to UBI 8.)
  • Pass --gcc-install-dir to clang/clang++ to point to the already installed gcc-toolset-12 (what we are currently using and already has gcc-toolset-12-libatomic-devel installed). This can be done either:

Using use_custom_libcxx = true may be something we end up doing, but that would involve a Node.js gyp change and possibly pulling in more of V8's dependencies? We only selectively include stuff from V8's DEPS: https://github.com/nodejs/node-core-utils/blob/main/lib/update-v8/constants.js

richardlau avatar Aug 14 '25 13:08 richardlau

It works (for now) yes but mixing libatomic from gcc with clang could become problematic in the future. Apart from libatomic, use_custom_libcxx is also the preferred method when using clang and could become the only supported method down the road: https://chromium-review.googlesource.com/c/chromium/src/+/5963336

And yes you will need to pull more dependencies: https://chromium.googlesource.com/chromium/src/+/main/build/config/BUILD.gn#292

miladfarca avatar Aug 14 '25 13:08 miladfarca

Right, but we can do that later (maybe as part of the next V8 update? cc @targos ). We're not currently doing it on the Fedora/Debian machines that are now using clang to build main.

richardlau avatar Aug 14 '25 13:08 richardlau

Not sure it's related but I saw in the logs from the Fedora/Debian machines that it's using something from GCC automatically. For example: https://ci.nodejs.org/view/All/job/targos-node-test-commit-linux/4/nodes=fedora-latest-x64/consoleFull

11:38:48 ../deps/v8/src/wasm/wasm-code-manager.h:746:17: warning: 'atomic_load<v8::base::OwnedVector<const unsigned char>>' is deprecated: use 'std::atomic<std::shared_ptr<T>>' instead [-Wdeprecated-declarations]
11:38:48   746 |     return std::atomic_load(&wire_bytes_)->as_vector();
11:38:48       |                 ^
11:38:48 /usr/bin/../lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/bits/shared_ptr_atomic.h:140:5: note: 'atomic_load<v8::base::OwnedVector<const unsigned char>>' has been explicitly marked deprecated here
11:38:48   140 |     _GLIBCXX20_DEPRECATED_SUGGEST("std::atomic<std::shared_ptr<T>>")
11:38:48       |     ^
11:38:48 /usr/bin/../lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/x86_64-redhat-linux/bits/c++config.h:2058:45: note: expanded from macro '_GLIBCXX20_DEPRECATED_SUGGEST'
11:38:48  2058 | # define _GLIBCXX20_DEPRECATED_SUGGEST(ALT) _GLIBCXX_DEPRECATED_SUGGEST(ALT)
11:38:48       |                                             ^
11:38:48 /usr/bin/../lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/x86_64-redhat-linux/bits/c++config.h:2026:19: note: expanded from macro '_GLIBCXX_DEPRECATED_SUGGEST'
11:38:48  2026 |   __attribute__ ((__deprecated__ ("use '" ALT "' instead")))
11:38:48       |                   ^

targos avatar Aug 14 '25 14:08 targos

One slightly annoying chicken and egg problem is that switching to clang on RHEL has broken the V8 CI on Linux ppc64le and s390x.

Part of that is that tools/make-v8.sh has some hard coded references to gcc/g++, which I can fix, but we also need this fix to V8's build DEP: https://chromium-review.googlesource.com/c/chromium/src/+/6705431 to avoid e.g. https://ci.nodejs.org/job/node-test-commit-v8-linux/6735/nodes=rhel8-s390x,v8test=v8test/console

16:18:22 ERROR at //build/config/clang/BUILD.gn:211:11: Assertion failed.
16:18:22           assert(false)  # Unhandled cpu type
16:18:22           ^-----
16:18:22 See //build/config/clang/BUILD.gn:254:1: whence it was called.
16:18:22 clang_lib("compiler_builtins") {
16:18:22 ^-------------------------------
16:18:22 See //build/config/BUILDCONFIG.gn:407:5: which caused the file to be included.
16:18:22     "//build/config/clang:unsafe_buffers",
16:18:22     ^------------------------------------

I tried rolling forward https://github.com/nodejs/node/blob/538186b84fb45ed725ea3d6353ec2353253e3ccd/deps/v8/DEPS#L132 on Node.js' main branch to 8e6f1ed0fb9d39e2e4017b151d551d1f52db7343 but that led to other configuration errors.

It's not clear to me if it's possible to build the current version of V8 on main with clang on Linux ppc64le or s390x. I might need to force the V8 CI to use gcc on those platforms (either through select-compiler.sh or tools/make-v8.sh) until we're able to update V8 in Node.js.

FWIW the Node.js build is working with clang on RHEL as we're not dependent on any of the V8 build toolchain for those.

richardlau avatar Sep 04 '25 16:09 richardlau

For now I've edited https://ci.nodejs.org/job/node-test-commit-v8-linux/ to use gcc 12 for Linux ppc64le and s390x. We'll need to undo that when we update V8 on main to a version that includes https://chromium.googlesource.com/chromium/src/build/+/8e6f1ed0fb9d39e2e4017b151d551d1f52db7343.

Node.js builds will still use clang as they don't rely on any of V8's gn stuff.

richardlau avatar Sep 04 '25 16:09 richardlau

I had a look at this node-daily-master test run.

Platforms that are still on GCC:

Note that we really need at least Clang 19. I had to upgrade the canary builds from Clang 18 because they had compiler errors with it (https://github.com/nodejs/node-v8/pull/303).

targos avatar Sep 06 '25 09:09 targos

For Ubuntu, Clang 19 doesn't seem to be available by default on 22.04 (I tried apt update and apt search clang on one of the containers). I see two options:

  • Install Clang from the LLVM repository: https://apt.llvm.org/
  • Upgrade to Ubuntu 24.04

targos avatar Sep 08 '25 12:09 targos

Is it important that illumos/SmartOS jump to clang 19, as opposed to using a gcc that does the right C++ thing? Moving to the most recent SmartOS-specific build environment (pkgsrc 24.4) still means gcc (gcc 13.3.0) as there's no clang19 there. Furthermore, I'm not sure how nicely clang/llvm plays with illumos; they may suffer a bit of all-the-world's-Linux like V8 seems to.

I've mentioned previously that we may need to update the Jenkins agents for SmartOS/illumos, as well as maybe offering up a non-SmartOS illumos distribution Jenkins agent as well. I don't wish to clutter up this ticket with it, but I want to make sure we're doing our part to help things move along.

danmcd avatar Sep 08 '25 13:09 danmcd

@danmcd If a platform needs to continue using gcc then they may need to maintain V8 on it. Up until now much of the gcc maintenance has been done by ppc64/s390x ports on RHEL as V8 upstream no longer supports it. We have recently switched to Clang as well to build V8 and i've noticed gcc builds are already broken. I also noticed BUILDING.md has been updated to reflect this here: https://github.com/nodejs/node/pull/59782/files tho the GCC >= 12.2 part really depends on if someone maintains it.

miladfarca avatar Sep 08 '25 13:09 miladfarca

V8 14.2 is not compatible with GCC 12: https://github.com/nodejs/node/pull/60111

targos avatar Oct 03 '25 13:10 targos

V8 14.2 is not compatible with GCC 12: nodejs/node#60111

I know pkgsrc 22.4 LTS has gcc12, but 23.4 LTS has gcc13.2 so it SHOULD build ("23" jenkins agents).

pkgsrc 24.4 LTS (which we use to BUILD SmartOS now) has gcc13.3. I can do local builds on 24.4 to see what's happening. ALSO given you folks accepted my madvise() and 64-bit-pointer fixes in your downstream V8 (because I've had ENOCYCLES to engage putting these into V8 directly), I'm pretty sure it'll be MUCH EASIER TO provide you a pkg 24.4 build zone that'll be agile enough to keep you moving forward. (Current ones had to use OLD illumos/SmartOS-PI to workaround the madvise() problem I've since fixed for you.)

As for llvm and pkgsrc on SmartOS, I'll have to consult more with Jonathan P.

ALSO, I may/should start providing another illumos distro for reality checking. OmniOS LTS is new (r151054) and its compiler chain seems good for you:

nowhere(~/build/illumos-gate)[0]% pkg list | egrep "gcc|clang|llvm"
developer/gcc10                                   10.5.0-151054.0            im-
developer/gcc13                                   13.3.0-151054.0            im-
developer/gcc14                                   14.2.0-151054.0            im-
ooce/developer/llvm-15 (extra.omnios)             15.0.7-151054.0            im-
system/library/gcc-runtime                        14-151054.0                i--
system/library/gccgo-runtime                      11-151054.0                i--
nowhere(~/build/illumos-gate)[0]% 

(Thought it may need a llvm-runtime/clang-runtime? I'll speak with OmniOS folks about this.)

Assuming targos:v8-142 is the correct branch, I'll see what a local 24.4 build of it does, and see if I can get any needed fixes to you.

danmcd avatar Oct 03 '25 14:10 danmcd

I have two C++ problems with my 24.4 build, one is illumos-specific and not anyone here's concern.

The other is errors like this (which seem to be the same with gcc13.3 in pkgsrc 24.4 LTS as the gcc13.2 errors in pkgsrc 23.4 LTS):

<SNIP!>
09:04:11 ../deps/v8/src/compiler/turboshaft/builtin-call-descriptors.h:31:7: error: declaration of 'static constexpr v8::base::tmp::append_t<v8::base::tmp::list<>, v8::internal::compiler::turboshaft::V<v8::internal::compiler::turboshaft::FloatWithBits<64> > > v8::internal::compiler::turboshaft::builtin::CheckTurboshaftFloat64Type::Arguments::make_args_type_list_n(v8::internal::compiler::turboshaft::detail::IndexTag<1>)' changes meaning of 'make_args_type_list_n' [-Wchanges-meaning]
09:04:11    31 |       make_args_type_list_n(detail::IndexTag<name##_index + 1>);               \
09:04:11       |       ^~~~~~~~~~~~~~~~~~~~~
09:04:11 ../deps/v8/src/compiler/turboshaft/builtin-call-descriptors.h:757:7: note: in expansion of macro 'ARG'
09:04:11   757 |       ARG(V<Float64>, value)
09:04:11       |       ^~~
<SNIP!>

The full logs in https://ci.nodejs.org/job/node-test-commit-smartos/nodes=smartos23-x64/62692/console have more.

I'm recovering from sickness so I'll be slow in response.

danmcd avatar Oct 03 '25 16:10 danmcd

@abmusse and I have investigated this one already and there's an open CL for it already: https://chromium-review.googlesource.com/c/v8/v8/+/6987408 Both index_counter (mentioned in the CL) and make_args_type_list_n have the same issue.

kadler avatar Oct 03 '25 18:10 kadler

Question regarding the Rust toolchain support: Should we consider migrating from ccache to sccache (https://github.com/mozilla/sccache) ? That would expand the caching footprint to include rust.

ryanaslett avatar Nov 12 '25 18:11 ryanaslett

I was looking at recommended ways to install Rust on Fedora: https://developer.fedoraproject.org/tech/languages/rust/rust-installation.html

I guess we need to decide on a general strategy. Do we want to:

  • Depend on the version distributed by the OS?
  • Install rustup and setup Rust with Ansible?
  • Something else?

targos avatar Nov 15 '25 08:11 targos