toolchains_llvm icon indicating copy to clipboard operation
toolchains_llvm copied to clipboard

[wrapper] Make `cc_wrapper.sh` POSIX-compliant and use `#!/bin/sh`

Open RyanTorok opened this issue 3 months ago • 10 comments

Some Linux distributions, such as NixOS, do not provide /bin/bash. This PR aims to make the LLVM toolchain compatible with these platforms by only using POSIX-compatible features in the shell script and replacing the shebang with #!/bin/sh.

Prior to this change, attempting to use the LLVM toolchain to build a C/C++ target on NixOS causes an error like:

src/main/tools/linux-sandbox-pid1.cc:548: "execvp(external/toolchains_llvm~~llvm~llvm_toolchain/bin/cc_wrapper.sh, 0x1ce8a310)": No such file or directory

where the No such file or directory error is referring to /bin/bash, not the cc_wrapper.sh script itself. In fact, trying to run bazel-REPONAME/external/toolchains_llvm~~llvm~llvm_toolchain/bin/cc_wrapper.sh manually afterwards gives a Required file not found error.

This PR emerged out of a discussion on #543, where it was noticed that replacing the #!/bin/bash shebang with #!/usr/bin/env bash causes issues when the calling rule does not include /usr/bin in PATH (rules_rust hits this case).

This PR does not update the corresponding MacOS script, as all MacOS platforms should have /bin/bash.

Supercedes #543.

RyanTorok avatar Sep 10 '25 02:09 RyanTorok

This is a case in which Copilot comments are highly relevant and much more thorough than what I could provide. Please take a look.

fmeum avatar Sep 10 '25 07:09 fmeum

This is a case in which Copilot comments are highly relevant and much more thorough than what I could provide. Please take a look.

Copilot was quick to remind me that POSIX shells don't support the same nice syntax for regular expressions as bash. After fixing the issues it pointed out, I was able to get several of the test scripts to pass in an Ubuntu container, so hopefully CI will pass this time :)

RyanTorok avatar Sep 11 '25 00:09 RyanTorok

Trying to rewrite cc_wrapper.sh.tpl to be POSIX-compliant caused too many bugs, and more importantly, required eliding a bunch of safety checks. The original script uses arrays and regular expressions everywhere, and replacing these with POSIX-compatible equivalents was very unidiomatic and unsafe.

As an alternative, I moved the current cc_wrapper.sh.tpl to cc_wrapper_inner.sh.tpl, and created a new cc_wrapper.sh.tpl that searches for an instance of bash on the host, either directly at /bin/bash or through /usr/bin/env/PATH. Once a valid instance of bash is found, it is used to call cc_wrapper_inner.sh.

RyanTorok avatar Sep 17 '25 03:09 RyanTorok

Can a maintainer please re-run CI on this PR?

RyanTorok avatar Sep 29 '25 23:09 RyanTorok

Can a maintainer please re-run CI on this PR?

I checked the errors and it appears that the new inner script is still not present. So somewhere it is not being copied as it needs to.

helly25 avatar Oct 02 '25 11:10 helly25

I was able to replicate this failure locally in an Ubuntu container. I think the issue might be that the path of the cc_wrapper_inner.sh script is hard-coded relative to cc_wrapper.sh. I'm digging into how the cc_wrapper.sh script is invoked; it may require $(location) expansion.

RyanTorok avatar Oct 09 '25 00:10 RyanTorok

I believe I've fixed the issue. The cc_wrapper_inner.sh script was being found before, but the version of it I had before was inadvertently a combination of the original cc_wrapper.sh and the version from my failed attempt earlier of making it POSIX-compliant.

RyanTorok avatar Oct 09 '25 00:10 RyanTorok

@RyanTorok the version that conditionally generates the inner script wrapper fails on some tests as the file is then missing. Making the dependency dependent on the operating system class fails in cross compile. I'll push another version that creates an empty file for darwin/macos.

helly25 avatar Oct 09 '25 10:10 helly25

@RyanTorok the version that conditionally generates the inner script wrapper fails on some tests as the file is then missing. Making the dependency dependent on the operating system class fails in cross compile. I'll push another version that creates an empty file for darwin/macos.

Yep -- I realized as such. I believe the reason the other tests timed out is because I was using exec to execute the inner script, which might have been breaking the way Bazel executes the cc_wrapper script. I put together a more robust testing environment for myself and I'm checking this now.

To make this work on MacOS, we can just use the cc_wrapper_inner handoff everywhere now.

RyanTorok avatar Oct 09 '25 10:10 RyanTorok

The CI output indicates a runaway chain of processes spawning (i.e. a fork bomb). This PR has had it all!

RyanTorok avatar Oct 10 '25 01:10 RyanTorok