rules_haskell
rules_haskell fails when using docker sandbox (and --experimental_use_sandboxfs)
Describe the bug
external/rules_haskell_ghc_linux_amd64/bin/ghc-pkg: line 12:
$HOME/.cache/bazel/_bazel_$USER/.../sandbox/sandboxfs/752/external/\
rules_haskell_ghc_linux_amd64/bin/../lib/bin/ghc-pkg: No such file or directory
To Reproduce
- install sandboxfs
- run bazel build --experimental_use_sandboxfs //...
Expected behavior
I expect the build to complete.
Environment
- OS name + version: Debian Buster
- Bazel version: 2.2.0
- Version of the rules: b41234677c9381982aae98098fb473a5b733c945
A similar error occurs when you use docker sandboxes.
I can reproduce the issue:
11:02:39 ch@ka repos/rules_haskell/rules_haskell_1265 ±|master|> bazel build --experimental_use_sandboxfs //...
Starting local Bazel server and connecting to it...
INFO: Analyzed 654 targets (173 packages loaded, 5372 targets configured).
INFO: Found 654 targets...
ERROR: /media/crypt1/repos/rules_haskell/rules_haskell_1265/tests/hsc/BUILD.bazel:9:1: HaskellRegisterPackage tests/hsc/link-config-hsc-lib/link-config-hsc-lib.conf failed (Exit 127) ghc-pkg failed: error executing command external/rules_haskell_ghc_linux_amd64/bin/ghc-pkg recache '--package-db=bazel-out/k8-fastbuild/bin/tests/hsc/link-config-hsc-lib' -v0 --no-expand-pkgroot
external/rules_haskell_ghc_linux_amd64/bin/ghc-pkg: 12: exec: external/rules_haskell_ghc_linux_amd64/bin/../lib/bin/ghc-pkg: not found
INFO: Elapsed time: 155.119s, Critical Path: 0.17s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
For the record, I get a similar error when provisioning with nixpkgs (build --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs and
run --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs in .bazelrc.local):
09:58:44 ch@ka impure repos/rules_haskell/rules_haskell ±|*|> bazel clean --expunge && bazel build --experimental_use_sandboxfs //...
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
Starting local Bazel server and connecting to it...
INFO: Analyzed 624 targets (172 packages loaded, 5294 targets configured).
INFO: Found 624 targets...
ERROR: /media/crypt1/repos/rules_haskell/rules_haskell/tools/runfiles/BUILD.bazel:31:1: HaskellRegisterPackage tools/runfiles/link-config-bin/link-config-bin.conf failed (Exit 1) ghc-pkg failed: error executing command external/rules_haskell_ghc_nixpkgs/bin/ghc-pkg recache '--package-db=bazel-out/k8-fastbuild/bin/tools/runfiles/link-config-bin' -v0 --no-expand-pkgroot
src/main/tools/linux-sandbox-pid1.cc:427: "execvp(external/rules_haskell_ghc_nixpkgs/bin/ghc-pkg, 0x1b4a920)": No such file or directory
INFO: Elapsed time: 15.646s, Critical Path: 0.11s
INFO: 0 processes.
FAILED: Build did NOT complete successfully
Bazel version: bazel 2.0.0- (@non-git); version of the rules: 2c4c3567e1bef3528bb378dbfa98dd74546160e5
@iphydf Thank you for trying this out and reporting the issue. @smelc Thanks for looking into this.
Regarding the issue
GHC's bindist goes through a configure step that generates wrapper scripts for the GHC binaries. In case of ghc-pkg it looks like this:
bazel-<workspace>/external/rules_haskell_ghc_linux_amd64/bin/ghc-pkg
#!/bin/sh
DISTDIR="$( dirname "$(resolved="$0"; while tmp="$(readlink "$resolved")"; do resolved="$tmp"; done; echo "$resolved")" )/.."
exedir="$DISTDIR/lib/bin"
exeprog="ghc-pkg"
executablename="$exedir/$exeprog"
datadir="$DISTDIR/lib"
bindir="$DISTDIR/bin"
topdir="$DISTDIR/lib"
#!/bin/sh
PKGCONF="$topdir/package.conf.d"
exec "$executablename" --global-package-db "$PKGCONF" ${1+"$@"}
So, bin/ghc-pkg in turn calls out to lib/bin/ghc-pkg. We're not tracking these interdependencies accurately in Bazel, so the actual binaries are missing from the sandbox.
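To make the failure mode concrete, here is a minimal sketch that mimics it outside of Bazel. Everything in it is a stand-in: the fake distribution layout, the simplified wrapper, and the two "sandbox" directories are assumptions for illustration, not the real bindist or Bazel's sandbox.

```shell
#!/bin/sh
# Hypothetical reproduction: a fake "distribution" with a wrapper in bin/ and
# the real program in lib/bin/, mirroring the layout described above.
work="$(mktemp -d)"
dist="$work/dist"
mkdir -p "$dist/bin" "$dist/lib/bin"

# Stand-in for the real lib/bin/ghc-pkg binary.
cat > "$dist/lib/bin/ghc-pkg" <<'EOF'
#!/bin/sh
echo "real ghc-pkg called"
EOF

# Simplified wrapper: locate the distribution relative to $0, then exec.
cat > "$dist/bin/ghc-pkg" <<'EOF'
#!/bin/sh
DISTDIR="$(dirname "$0")/.."
exec "$DISTDIR/lib/bin/ghc-pkg" ${1+"$@"}
EOF
chmod +x "$dist/bin/ghc-pkg" "$dist/lib/bin/ghc-pkg"

# "Sandbox" A stages only bin/, as when only the wrapper is a declared input.
mkdir -p "$work/sandbox_a"
cp -R "$dist/bin" "$work/sandbox_a/bin"
status_wrapper_only=0
"$work/sandbox_a/bin/ghc-pkg" >/dev/null 2>&1 || status_wrapper_only=$?

# "Sandbox" B stages the whole distribution, so the exec target exists.
mkdir -p "$work/sandbox_b"
cp -R "$dist/bin" "$dist/lib" "$work/sandbox_b/"
status_full=0
output_full="$("$work/sandbox_b/bin/ghc-pkg")" || status_full=$?

echo "wrapper only:      exit $status_wrapper_only"
echo "full distribution: exit $status_full ($output_full)"
rm -rf "$work"
```

The wrapper-only case should fail with a "not found" style exit status like the Exit 127 in the report above, while the full copy succeeds. That is the reason for staging the rest of the distribution alongside the wrapper.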
To fix this we need to track these dependencies. An easy way would be to simply always include the whole GHC distribution when a GHC binary is called. Since the configure step happens in a repository rule (at least for now) we could add a corresponding filegroup to the generated BUILD file that captures these. Once that works we could narrow these dependencies down to avoid excessive mounts into the sandbox.
Regarding @smelc's reproduction
It looks like your first test is running within a Nix shell but using the bindist. That is not supported. The implementation is geared towards either all Nix or no Nix.
For the record, I get a similar error when provisioning with nixpkgs (build --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs and
run --host_platform=@rules_haskell//haskell/platforms:linux_x86_64_nixpkgs in .bazelrc.local):
With a Nix provided toolchain you're depending on files located within the Nix store, i.e. under /nix/store. If the Nix store (or at least the required store paths) is not mounted into the sandbox, then the build will fail, as you observe. I don't see a Bazel flag to add default mounts to the sandbox. Maybe this could be done with a wrapper around the sandboxfs binary which Bazel is then pointed to via --experimental_sandboxfs_path. That may be a way to get rules_nixpkgs to work with sandboxfs.
Regarding @smelc's reproduction
It looks like your first test is running within a Nix shell but using the bindist. That is not supported. The implementation is geared towards either all Nix or no Nix.
Yes indeed, my bad; only made worse since we discussed that yesterday. Sorry for the noise. I've updated the message to correct that.
The state in which to test this issue in a manner similar to @iphydf is described here: https://github.com/tweag/rules_haskell/issues/1173#issuecomment-606723360.
To fix this we need to track these dependencies. An easy way would be to simply always include the whole GHC distribution when a GHC binary is called. Since the configure step happens in a repository rule (at least for now) we could add a corresponding filegroup to the generated BUILD file
@aherrmann> would we do that by extending this group?
https://github.com/tweag/rules_haskell/blob/2c4c3567e1bef3528bb378dbfa98dd74546160e5/haskell/ghc.BUILD.tpl#L17 (this one captures the content of the bin directory after ghc's install right?)
To fix this we need to track these dependencies. An easy way would be to simply always include the whole GHC distribution when a GHC binary is called. Since the configure step happens in a repository rule (at least for now) we could add a corresponding filegroup to the generated BUILD file
@aherrmann> would we do that by extending this group?
https://github.com/tweag/rules_haskell/blob/2c4c3567e1bef3528bb378dbfa98dd74546160e5/haskell/ghc.BUILD.tpl#L17
That filegroup is meant to only capture the actual executables since it's passed to the toolchain's tools attribute. For a start it's probably easiest to just add a second filegroup to cover all parts of the toolchain and pass them to the toolchain in a new attribute. Once this works, you can refine from there.
Another possibility would be to add the rest of the distribution to the :bin filegroup's data attribute to add it to bin's runfiles. However, this requires some care to avoid issues such as https://github.com/tweag/rules_nixpkgs/pull/103. So, I wouldn't start there.
(this one captures the content of the bin directory after ghc's install right?)
Yes, that's right.
Regarding the failure with nixpkgs, there are actually two issues: The first one is indeed that the actual binaries are inside /nix (which is easily circumvented by using a wrapper around sandboxfs to bind-mount /nix), and the second one is that the executables picked by the toolchain implementation are just (relative) symlinks to the actual executables, and the actual executables aren't included in the input of the rules.
For example, the ghc_pkg executable is external/rules_haskell_ghc_nixpkgs/bin/ghc-pkg, which is a symlink to ghc-pkg-8.8.3.
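As a rough illustration of the second issue, the sketch below stages only a relative symlink into a scratch directory, shows that it dangles, and then resolves the link to find the file that would also have to be an input. The directory names and the resolve_link helper are hypothetical; the loop just follows the same readlink idea as the bindist wrapper quoted earlier.

```shell
#!/bin/sh
# Follow a chain of symlinks and print the final path. Relative link targets
# are interpreted relative to the directory that contains the link.
resolve_link() {
    path="$1"
    while target="$(readlink "$path")"; do
        case "$target" in
            /*) path="$target" ;;
            *)  path="$(dirname "$path")/$target" ;;
        esac
    done
    printf '%s\n' "$path"
}

# Hypothetical toolchain layout: bin/ghc-pkg is a relative symlink to the
# versioned executable next to it.
work="$(mktemp -d)"
mkdir -p "$work/toolchain/bin"
printf '#!/bin/sh\necho ok\n' > "$work/toolchain/bin/ghc-pkg-8.8.3"
chmod +x "$work/toolchain/bin/ghc-pkg-8.8.3"
ln -s ghc-pkg-8.8.3 "$work/toolchain/bin/ghc-pkg"

# Staging only the symlink (cp -P keeps it a link) leaves it dangling.
mkdir -p "$work/sandbox/bin"
cp -P "$work/toolchain/bin/ghc-pkg" "$work/sandbox/bin/ghc-pkg"
status_dangling=0
"$work/sandbox/bin/ghc-pkg" >/dev/null 2>&1 || status_dangling=$?

# Resolve the link to learn which file must be staged as well, then stage it.
resolved="$(resolve_link "$work/toolchain/bin/ghc-pkg")"
resolved_name="$(basename "$resolved")"
cp "$resolved" "$work/sandbox/bin/"
status_complete=0
"$work/sandbox/bin/ghc-pkg" >/dev/null 2>&1 || status_complete=$?

echo "symlink only:        exit $status_dangling"
echo "with $resolved_name: exit $status_complete"
rm -rf "$work"
```

In rule terms this suggests that the link targets need to be part of the action inputs too, not just the symlinks the toolchain currently picks up.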