cargo
Inform build scripts whether cargo is `check`ing or `build`ing
Because in cargo check there may be work that build.rs can skip
cc @wycats
Sounds like a good idea to me! The only catch that I can think of is that we need to ensure we don't cache the check'd output and use it for the real output, other than that should be relatively easy to do with a new env var!
The only catch that I can think of is that we need to ensure we don't cache the check'd output and use it for the real output
that is what is currently happening, ie.:
1. touch build.rs
2. cargo check updates e.g. target/debug/build/recompile_self-feb93e46b6e91788/output
3. cargo build doesn't build anything (because of 2.), but it should!
tested: cargo 0.25.0 release: 0.25.0
I think this is an ok thing to expose to build scripts, but it further highlights that a better factoring around build scripts might reduce footguns
This kind of solution will help people who remember to check for the mode, but breaking up build.rs into more hooks, many of which wouldn't semantically run at all during check, would have a bigger effect.
But as long as build.rs is the only game in town, I'm cool with this!
I'd expand the specific cases described in the title and first comment to include any subcommand, as I've seen Stack Overflow questions for "only do something for documentation" and "only do something for tests"
I have a use case where a bunch of heavy work in build.rs script can (and should) be skipped for both the check and test commands.
This would be helpful for https://github.com/rust-lang/rust/issues/76444
This would be great. My use case is that I'm wrapping a C++ library that has static globals whose constructors read env vars (:scream:), and I'd like to set those appropriately for testing without impacting the normal build.
FWIW I'd prefer to just run a different script altogether if it exists (e.g. build/tests.rs), but not picky.
Is there any solution or path towards providing this feature?
Also have a use case for this like the above: big C++ dependency that Rust builds via build.rs, but only necessary if things get as far as the link step.
Big +1 on this one:
Also have a use case for this like the above: big C++ dependency that Rust builds via build.rs, but only necessary if things get as far as the link step.
A beefy C++ dependency is effectively rendering cargo check useless for a project of mine.
+1 I would like to distinguish between cargo build, cargo clippy and cargo test in build.rs to avoid repeating heavy work.
I have found two things that have helped me reduce the extra undesired work somewhat, although not a full fix yet:
In the Cargo.toml of the package with the build.rs build script that does so much work, avoid running it on tests and doctest (if that is acceptable for your crate):
[[bin]]
name = "flowstdlib"
path = "main.rs"
test = false
doctest = false
[lib]
name = "flowstdlib"
path = "lib.rs"
test = false
doctest = false
This is useful for me as I have a workspace project with many crates, and this only deactivates that for this crate, and cargo test will test all other crates in the workspace.
I have also used the CARGO_PRIMARY_PACKAGE env var to detect when my crate was being built as a dependency, not the primary package, and skip the heavy work.
Here is some sample code you can use in build.rs to do that if it helps your case:
if option_env!("CARGO_PRIMARY_PACKAGE").is_none() {
    println!("cargo:warning='flowstdlib' is not the primary package being built, skipping WASM generation");
    std::process::exit(0);
}
I have tried to implement this feature in #10126, but it has a few open questions such as the naming of the env var and its values, etc. If my approach is OKed, I will add the unit tests/try to get it to pass CI. (I'm new to Rust, please be patient :smiley_cat:)
I wanted to give an update on at least my personal thinking on this issue given that my last thoughts on this were over 4 years ago.
I'm not 100% certain this still fits in Cargo myself. The major downside of implementing a feature like this is that cargo check && cargo build gets slower than it currently is. Cargo currently caches build script invocations for those two commands, which means that all the work done by cargo check is reused by cargo build and isn't redone.
Originally I thought this was somewhat of a pedantic "oh ok now Cargo keeps myself warmer in the winter but is that really a bug?" but we're 4 years out from that last comment and basically every build script ever written is not ready for a "check mode". This means that if this were to be implemented then this would be doubly-slower until build scripts actually read the appropriate env var for "I'm in check mode". Otherwise cargo check would build your huge C++ project, then cargo build would rebuild that whole C++ project yet again. The only way to fix this would be to update all build scripts in the wild to respect the "am I in check mode?" item.
As an author of crates like cc and cmake it's also somewhat ambiguous to me about what "check mode" would do for libraries like that. Should they do nothing? Type-check the C code? Make sure it all compiles? Basically I don't think that C/C++ have any real meaningful distinction like Rust does for cargo check and cargo build, so there's not really an obvious choice of what these library crates would do, which would require even more opt-in or configuration on behalf of all users.
My thoughts obviously aren't set in stone, though, but I wanted to bring up some downsides to "let's just simply implement this".
@alexcrichton is there a way for build scripts to opt-in to having this variable set? That way they'd only have to be rerun if they're actually using it.
Originally I thought of reusing rerun-if-env-changed, but that means cargo can't enforce that caching is correct, it just has to trust the build script ... but I think it has to do that anyway since the build script could be looking at any file, and its rerun-if-changed directives could be incomplete?
In my use case I always generate a rust "manifest" file and all rust in the lib is checked/clippied/built/tested always....
But on "build" I spend about 10mins generating other files, that are used in my app, but no use to check, clippy or test.
So, I think that fits your (compiled rust) use case, and allowing build.rs to know which one is being done doesn't detract from it?
After much investigation, I just "bailed" and implemented this as a "feature" of my lib, used in build.rs, that is off by default.
"cargo clippy" clippies all 7 crates in my workspace quickly.
"cargo build --features "do hard work" " does the heavy lifting.
Only inconvenience is I have to remember to invoke with the features flag.
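For anyone wanting to try the same workaround, here is a minimal sketch of what the build.rs side could look like. It relies on Cargo exposing each enabled feature to build scripts as a CARGO_FEATURE_<NAME> environment variable (uppercased, with `-` replaced by `_`). The feature name `heavy-work` is made up for illustration, and the "expensive step" is only a placeholder.

```rust
// build.rs (sketch; the feature name `heavy-work` is hypothetical)
use std::env;

/// Maps a feature name to the environment variable Cargo sets for
/// build scripts when that feature is enabled.
fn feature_env_name(feature: &str) -> String {
    format!("CARGO_FEATURE_{}", feature.to_uppercase().replace('-', "_"))
}

/// True when the hypothetical `heavy-work` feature is enabled.
fn heavy_work_enabled() -> bool {
    env::var_os(feature_env_name("heavy-work")).is_some()
}

fn main() {
    if !heavy_work_enabled() {
        // Cheap path: used by plain `cargo check`, `cargo clippy`, `cargo test`.
        println!("cargo:warning=feature `heavy-work` is off, skipping expensive generation");
        return;
    }
    // Placeholder for the expensive step (for example, the slow file generation).
    println!("cargo:warning=running expensive generation");
}
```

With this in place, `cargo build --features heavy-work` takes the expensive path and everything else takes the cheap one. The feature also has to be declared in the [features] table of Cargo.toml, and is left out of `default`.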
In our use-case we have a heavy C++ library dependency that we cannot pull in and link to as a pre-compiled binary artifact as we want to be in control over the entire build start to finish (and even if we chose to use binaries: there simply aren't any provided for that project) and as we need to be able to pass certain compilation flags to that external build.
That external build tends to take about 4-5min on clean builds on a beefy MBP though (and ~10sec on unchanged re-builds), which renders rust-analyzer useless for several minutes after any cargo clean since it runs cargo check. We need to have rust-analyzer run cargo check though as we also use bindgen in our build script to generate bindings for that external library, which major parts of our project then depend on.
Being able to skip the costly external build for cargo check and only run the bindgen step would be a huge boost in developer productivity for us as the external library is never actually edited by us and pinned to a specific well-tested release.
We would hope for a similar benefit when running cargo clippy as we have to run CI jobs for several platforms (due to heavy use of #[cfg(target_arch = "…")]) and configurations (due to heavy use of features) and currently every one of these jobs has to build the external library, adding 4-5min to the clock per platform, even though it never gets linked for clippy builds.
(For what it's worth we also looked into it but chose not to go the path of a "feature" workaround as that would break --all-features/--no-default-features and is not really what features are intended to be used for.)
My far simpler case is to throw an error if cargo bench is accidentally run without the +nightly. Perhaps it is possible to do it via some other means, but the build script is how I tried to do it (and failed due to missing "cargo mode").
To reiterate I don't mean to say that this is a pointless feature with no use cases, this thread is a testament to how useful something along these lines would be. I wanted to write down my thoughts about what the naive solution of "just add the env var and cache appropriately" would have an impact on. I don't personally have time to design a different solution and weigh its tradeoffs, that's ideally where an enterprising contributor would step in and lead the charge.
I don't personally have time to design a different solution and weigh its tradeoffs, that's ideally where an enterprising contributor would step in and lead the charge.
I think that's already happened, though? If we went with my idea of using rerun-if-env-changed then #10126 should work basically as written :)
To add one more to the list, this would also help a lot for Miri in cross-target mode. Miri is perfectly able to interpret code for an arbitrary target, but build scripts can break this if they insist on building some C code for that target first -- C code that is not useful for Miri anyway.
As an author of crates like cc and cmake it's also somewhat ambiguous to me about what "check mode" would do for libraries like that. Should they do nothing? Type-check the C code? Make sure it all compiles? Basically I don't think that C/C++ have any real meaningful distinction like Rust does for cargo check and cargo build, so there's not really an obvious choice of what these library crates would do, which would require even more opt-in or configuration on behalf of all users.
They should only do whatever is needed to get a check-build of the Rust code to pass. Since that does not involve any linking, I think that means that typically, they should just do nothing in check builds.
@jyn514 I assume your proposal is to have them emit rerun-if-env-changed=CARGO_MODE or so? We'd still also want cargo to have separate caching for different modes (so that a check doesn't destroy the build cache), but then that seems like it should work, yeah.
@jyn514 I assume your proposal is to have them emit rerun-if-env-changed=CARGO_MODE or so? We'd still also want cargo to have separate caching for different modes (so that a check doesn't destroy the build cache), but then that seems like it should work, yeah
Yes, exactly :)
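To make the opt-in idea concrete, here is a rough sketch of a build script written against it. The `cargo:rerun-if-env-changed` directive is real, but CARGO_MODE is only the name floated in this thread: Cargo does not set it today, and the value "check" is an assumption. Until something like it exists, this script always falls through to the full build unless you set the variable by hand.

```rust
// build.rs (sketch; CARGO_MODE and its "check" value are hypothetical)
use std::env;

#[derive(Debug, PartialEq)]
enum Mode {
    Check,
    Other,
}

/// Interprets the value of the hypothetical CARGO_MODE variable.
/// Anything other than "check" (including an unset variable) is
/// treated as a full build, so the script stays correct without it.
fn parse_mode(value: Option<&str>) -> Mode {
    match value {
        Some("check") => Mode::Check,
        _ => Mode::Other,
    }
}

fn main() {
    // Opt in: ask Cargo to re-run this script when the variable changes.
    println!("cargo:rerun-if-env-changed=CARGO_MODE");

    let raw = env::var("CARGO_MODE").ok();
    match parse_mode(raw.as_deref()) {
        // Only do what a check build of the Rust code needs (e.g. bindings).
        Mode::Check => println!("cargo:warning=check mode, skipping native build"),
        // Placeholder for the expensive native (C/C++) build and link setup.
        Mode::Other => println!("cargo:warning=full build, compiling native code"),
    }
}
```

Build scripts that never emit the directive would keep sharing one cached run between cargo check and cargo build, which is the point of making it opt-in. Cargo would still need to keep the check and build outputs of opted-in scripts apart, as noted above.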
It could be useful to explore whether or not cc can compile the code without optimization or debug symbols, then never link the final objects. I would see value in knowing that I broke all my C code if I'm writing part of the Rust project in C.
Unfortunately I don't think cmake can do this (well, not reliably. You can ask it for a list of commands it would have run. But let's not...).
Unfortunately I don't think cmake can do this (well, not reliably.
It can. Use object libraries.
Ah I see... I was thinking of "what is the least amount of work needed to produce rmeta files", since I view that as the goal of cargo check, but that is not the only way to think about this.
You can't ask cmake to produce an object library on someone's behalf without modifying their CMake source, and so such a solution would fall outside the realm of libraries. This feature is good for that too, just not something we could feasibly ask the cmake crate for.
What might be useful there though is adding like -DIS_RUST_CHECKING=TRUE or something automatically, and documenting that this happens. However even then as a user, you can't just transparently use object libraries because iirc cmake doesn't propagate them via target_link_libraries without the generator expressions as well and so you would have to restructure your project for it in some form or another.
GPT-4 led me to believe cargo exposes this in build scripts via the "CARGO_CFG" environment variable. I was all excited to use it, but turns out it's not a thing. Wish it was....
Please don't use LLMs for things like this.
Ok let me rephrase, we really should add something that gives build scripts some context about what is being built and why. It is ridiculous that we have all this information in the parent process, but none of it is shared with build scripts, and I don't see any good reason for why it is not shared.
This should include:
- read access to all arguments passed to cargo
- read access to any RUSTC arguments that may have been set
And write access to the above as well as the ability to mutate+inject environment variables that are used later in the build process
I have tried to find a workaround and checked for differences in environment variables within the build script with:
let mut vars = String::from("VARS:\n");
for (key, value) in std::env::vars() {
    vars.push_str(&format!("{}={}\n", key, value));
}
std::fs::write("env-vars.log", vars).expect("Unable to write env-vars.log");
The most promising variable on my machine (ubuntu-22.04) seems to be 'RUST_BACKTRACE':
if std::env::var_os("RUST_BACKTRACE").unwrap_or("".into()) == "short" {
    std::fs::write("_is-checking.log", "-").expect("Unable to write file");
    std::process::exit(0);
} else {
    std::fs::write("_is-building.log", "-").expect("Unable to write file");
}
You can also just check whether this variable exists at all.