buck2
buck2 copied to clipboard
Allow NIX_* environment variables to be passed to rustc
Otherwise if you are running buck2, and by extension rustc, from within nix any C(++) builds will fail.
All credit to @fnichol who pointed me to this fix in their prelude.
Let me know if this fix is better applied somewhere else.
This is, IMO, not the right place to put this. It is far too tied to the specifics of the Nixpkgs/Rust toolchain you are using (nixpkgs? rust-overlay? mozilla's overlay? your own custom binaries? and what git versions are those?) and may become fragile if the version of Buck2's prelude and nixpkgs
get desynchronized somehow, e.g. after a merge from staging
and you run something like nix flake update
.
Besides all of that, nobody at Meta uses Nix for anything like this, so it will inevitably bitrot and nobody there will be able to maintain it. They won't be in a position to fix or migrate it or know when to add or remove things, or understand why and now those options may break (or help enforce!) hermeticity.
It is probably something better attached to the actual toolchain definition, i.e. system_rust_toolchain
should take an inherited_envs
parameter or something that lists environment variables that it should inherit from the buck2 daemon
directly. Maybe all toolchains should take that parameter; I don't know.
That's just one possible fix, though. I'm not even sure that's the best one. Really, in an ideal world, system_rust_toolchain
would be able to discover this on the fly somehow. But you need some form of dynamic dependencies for that. I'd have to think more about it.
Yeah I am not attached to this solution at all. Something like inherited_envs
sounds good to me, but curious to hear from folks more experienced than I.
I can think of three ways to go:
- Have inherited env vars as @thoughtpolice suggest. I think the issue with that is really the NIX_ENV_VAR_PREFIXES field. That seems a bit grim. It also requires quite a lot of configuration.
- Accept that this is a bit grim whichever way you slice it, and just hardcode that anything with a
NIX_
prefix is preserved. Only a few lines of code. No configuration. Not beautiful, but not horrible, and works out of the box for most people. - Make Nix users specify a rustc binary that has these environment variables prebaked in. I think that's actually not too hard, just do something like (untested):
genrule(
name = "rustc",
cmd = "echo \$NIX_WHATEVER=$NIX_WHATEVER rustc > $OUTPUT && chmod +x $OUTPUT",
output = "rustc",
executable = True,
)
Now you have produced a rustc that captures the environment variables, so even if the Rust toolchain omits them, users get it back.
That said, option 2 seems simple and effective? Although I'd be curious what the Rust rule authors think, e.g. @zertosh @davidbarsky
I think that solution 3 actually leads to a more complete solution 4 which should be the real way for people to use Nix for Buck2: there should, instead of a system_rust_toolchain
, be a nix_rust_toolchain
, and this rule should invoke nix
itself in order to:
- Get access to the
rustc
binary, i.e. by copying a closure from https://cache.nixos.org - Interrogate the list of shell variables needed from the resulting
shellHook
- Create a new derivation that wraps
rustc
as noted above and captures those variables explicitly, - Returns the
rustc
created from step 3 as the compiler to use.
There are two complications:
- We need to also write a
nix_cxx_toolchain
too as a hard dependency. - We also need to express what other Nix expressions should be included in the closure (i.e. you may want
zlib
available so you use the one from nixpkgs.)
This would mean that you do not need direnv or any other tool like nix shell
to populate your terminal environment. You could also specify a version of nixpkgs
this way, and it would be independent of anything being hard coded.
I haven't really needed this myself (yet), but maybe I can try to get a working nix_cxx_toolchain
going or something.
Is it likely that some people would want the whole Nix hermetic toolchain (which should certainly be available) and some other Nix users would want to just use direnv? If so, is doing both the proper toolchain and a hack to let NIX_ variables through worthwhile?
That said, option 2 seems simple and effective? Although I'd be curious what the Rust rule authors think, e.g. @zertosh @davidbarsky
As a preface, I'm on a 9 hour flight, so the amount of oxygen I have to think isn't particularly high, nor have I had a decent amount for minute. Anyways:
- I'm a (bad) nix user who uses it as a homebrew replacement, but I also have
fbsource
. I may or may or may not notice breakage, but it'd be crummy if this messed with toolchains bundled within the monorepos. That said, this might be a self-inflicted issue. - I'm not sure I'm an author as so much I occasionally poke at the implementation details. That being said, if the use case I mentioned in the first point continues to work, I don't really have objections. even if it doesn't, it's arguable this is a self-inflicted problem.
There is a rules_nixpkgs
for Bazel. They don't seem to be using the NIX_ environment variables to do linking, at all: rather, they construct symlinks into /nix/store and pass those through the normal bazel rules. E.g. here, here, here. You can definitely use this to make a "hermetic" (big caveat) local-only toolchain. If more of Buck's rules supported uploading to the action cache, then this would be viable, as you could allow developer machines to push and pull from the cache.
The problem is with remote execution. The hermetic seal is drawn outside buck-out. Making nix-based toolchains = symlinking to /nix/store, but if you try to use such a symlink (buck-out/v2/gen/.../__glibc__/out/lib/libc.so.6 -> /nix/store/...
on a remote executor, there is no /nix/store. So remote execution is very difficult. Your toolchain's binaries are all symlinks to outside buck-out. The executables' RUNPATHs point at /nix/store paths. So do the RUNPATHs of all the shared libraries. There are bash scripts etc that call other scripts and read files in /nix/store. None of these things can work.
The rules_nixpkgs
project has been discussing how to get remote execution working since early 2022. Most people are building a custom docker image for RE using nix, that has everything you can possibly need already available in /nix/store, in advance. Executing an action like "link (the linker is a symlink into /nix/store) main.o with glibc (which happens to symlink to /nix/store)" just uses whatever's been pre-populated in the docker image. Any new nix dependency => create a new docker image, push it, update the RE configuration at the same time.
I think that's the only idea that's not insane for anyone who doesn't have 10,000 developers to throw at the problem.
There are 3 main alternatives, and all of them sound like pretty big undertakings.
-
https://github.com/tweag/rules_nixpkgs/pull/404 -- You deploy an RBE server that is also a Nix remote store. You configure the RBE action runner or a local machine with an environment variable/host address, it connects to the remote, runs the nix build there, and copies the results back. On the RBE itself, copying the results back is a no-op, I think. The Nix remote store stores everything at /nix/store on the RBE. This would surely only work if you deployed your own RBE, with management of chroots etc done yourself.
-
https://skillsmatter.com/skillscasts/17673-remote-execution-with-rules-nixpkgs -- mount an NFS server at /nix/store in all the RBE executors. A separate nix remote store manages the contents (r+w). On the developer machine, a local-only action runs in the rule that writes /nix/store symlinks, that tells the the remote store to build and save a binary for the nix derivation and save it in NFS. Then once an RE action is executing, it's already there on the readonly NFS mount. This is apparently quite efficient as NFS only sends the blocks you actually try to read. But it's obviously a pretty involved setup.
-
You can always... rewrite all the /nix/store paths to point to
./buck-out/...
. This sounds absolutely horrifying, but it could work. There is a lot of existing nix tooling around rewriting store paths, likepatchelf
andnix bundle
with e.g. relocatable.nix. A difficulty is the lack of sharing. If you want glibc, you could get the closure (about 4 or 5 /nix/store directories) and rewrite all the nix paths in them to refer to./buck-out/...
, writing the output to buck-out. Cool. But almost every package needs glibc, so they will all do that independently. Once your dependencies are large... you addclang
, and thenlldb
, both of which rewrite and save their own copy ofllvm
etc... now you have 3GB of stuff, when it should be 1GB. Sounds pretty slow in every way.- My best idea for Buck is to do store path rewriting, but use some combo of anon targets (to dedupe multiple glibcs, llvms, etc) and dynamic dependencies (to see what nix packages are direct inputs to clang (
nix-store --query --references /nix/store/...clang-15.0.7
=> llvm, glibc, libgcc, etc) that also need to be called as anon targets, in order to do the rewriting in forclang
itself). Then buck would basically mirror /nix/store inbuck-out/v2/anon
, deduplicated just as /nix/store is, but with all the paths rewritten, built on-demand, and available in the cache and on RBE. Alas that is not how dynamic dependencies work. You have to run actions, I don't think you can bind an output using an anon target. So this is a pipe dream.
- My best idea for Buck is to do store path rewriting, but use some combo of anon targets (to dedupe multiple glibcs, llvms, etc) and dynamic dependencies (to see what nix packages are direct inputs to clang (
Using a docker image with the required nix dependencies in it seems doable and doesn't require deploying and administering a bunch of servers. At least it's possible to have consistent versions and bust caches when they change. I have an improved version of buck2-nix
that does basically what rules_nixpkgs
does, which I'm polishing up. I made a clang toolchain with it, not even using Nix's clang wrapper script, that has the deps and exec deps set correctly such that it will happily cross-compile to linux and link glibc, by using --target-platforms
on a Mac. So it's promising, at least.
I think it will also need symlinks, even dangling ones, to be preserved in buck2's re_upload
. At the moment it just looks straight through symlinks and recursively copies /nix/store/some-path to the remote executor.
OK, so this seems a complicated issue. It would be great to make some progress on it.
- @davidbarsky - are you saying that if we land a variant of this patch we break stuff for you? And is it OK to break stuff?
- @cormacrelf - I think dynamic actions could bind existing outputs - if you symlink to the actions is that enough?
- @cormacrelf - "If more of Buck's rules supported uploading to the action cache" - this seems quite feasible. You could imagine that the upload to cache could be controlled so even if False you can specify a config to make them all true.
In the meantime, is some variant of whitelist all NIX_
variables a reasonable way to go? Or is the genrule
trick I mentioned better?
@ndmitchell
The genrule approach can't work with system_rust_toolchain as-is, because it doesn't allow configuring RustToolchainInfo.compiler
. If it did allow that, the genrule would look SOMETHING like this, but even this doesn't work, because using nix is sometimes like bashing your head against a brick wall: nix flakes only work if every single file it references is checked into git... (side note why on earth should nix care? but it does.) srcs
in a genrule are copied to buck-out, not in git.
non-working concept
You need to use some kind of nix command to capture the environment, because the NIX_ env vars are huge (multiple kilobytes each) and quite unwieldy to paste into a genrule.
# root BUCK file
genrule(
name = "nix-rustc-script",
srcs = glob([ "**/*.nix" ]) + ["flake.lock"],
bash = """
(
echo '#!/usr/bin/env bash';
nix develop . --command bash -c set \
| grep '^PATH=\|^NIX_\|^PKG_CONFIG_\|^LD_' \
| sed 's/^/export /';
echo 'exec rustc \"$@\"';
) > $OUT
chmod +x $OUT
""",
out = "rustc.sh",
)
command_alias(
name = "nix-rustc",
exe = ":nix-rustc-script",
visibility = ["toolchains//..."],
)
Given that, it would be nicer to say to Nix folks (a core audience for buck) that there's a flag you can set to make it work. I would add a hack via:
- Flag in rustc_action.py to pass through any variable starting with a prefix
- Field on RustToolchainInfo:
env_prefix_whitelist
, default []. - Field on system_rust_toolchain to enable it, so as not to interfere if you use nix but don't want it in buck2. Set it to `env_prefix_whitelist = ["NIX_"].
- No need to whitelist individual variables because it's going to be too fragile.
- @davidbarsky - are you saying that if we land a variant of this patch we break stuff for you? And is it OK to break stuff?
I would not be surprised if things break, but I also know I'm not doing standard stuff. There's a small group of nix users that I'll ask at work about this, but I have the sudden, sinking feeling that I might have the most expertise on interactions between the buck rules, fbsource
, and nix.