clap_builder builds non-deterministically with Bazel

Open • rejuvenile opened this issue 3 months ago • 4 comments

Rust Version

rustc 1.92.0-nightly (f6aa851db 2025-10-07)

Clap Version

HEAD, 4.5.49

Minimal reproducible code

cargo build

Steps to reproduce the bug with the above code

cd clap/clap_builder && cargo build --release
cd ../../clap2/clap_builder && cargo build --release
cd ../..
bcomp (or diff) clap/target/release/libclap_builder.rlib clap2/target/release/libclap_builder.rlib
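
For anyone without bcomp, the comparison step can be approximated with a short script. This is only a sketch: `diff_runs` and `compare_files` are hypothetical helper names, not part of clap, cargo, or Bazel, and the script reports where two files differ without interpreting the rlib format.

```python
from pathlib import Path


def diff_runs(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each contiguous run of differing bytes."""
    runs = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differences begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, common - start))
    # A length mismatch is reported as one extra trailing run.
    if len(a) != len(b):
        runs.append((common, abs(len(a) - len(b))))
    return runs


def compare_files(path_a: str, path_b: str) -> list[tuple[int, int]]:
    """Read both files fully and return their differing runs."""
    return diff_runs(Path(path_a).read_bytes(), Path(path_b).read_bytes())
```

Calling `compare_files` on the two `libclap_builder.rlib` paths from the steps above gives a list whose length is the number of differing regions. An empty list means the files are byte-identical.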

Actual Behaviour

  • Absolute paths to the source tree are embedded in the rlib, even in release mode.
  • There are at least several hundred other byte-level differences (some look like padding, perhaps garbage or uninitialized memory).
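
The first point can be checked mechanically by searching the artifact for the source-tree prefix. A minimal sketch, assuming the paths are stored as plain printable ASCII; `embedded_paths` is a hypothetical helper name and the root passed in is whatever directory the checkout lives in:

```python
import re


def embedded_paths(data: bytes, root: str) -> list[str]:
    """Return the distinct printable strings in `data` that start with `root`.

    Matching stops at the first non-printable byte or space, so a path
    containing spaces is reported truncated.
    """
    pattern = re.compile(re.escape(root.encode()) + rb"[\x21-\x7e]*")
    found = {m.group().decode("utf-8", "replace") for m in pattern.finditer(data)}
    return sorted(found)
```

Running this over the bytes of each rlib with its own checkout directory as `root` shows which source paths ended up in the file. An empty result would suggest the paths are either absent or stored in some other encoding.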

Expected Behaviour

Two cargo builds in different source trees, on the same host and at the same version, should produce binary-identical rlibs.

Additional Context

This causes distributed build, cache, and test systems like Bazel to be less efficient. Because the clap_builder dependency in a binary may be different across hosts, the resulting binary will hash differently and expensive tests may be rerun. No other clap crate has this issue.
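
To illustrate why a single differing byte matters to a content-addressed cache, here is a simplified sketch. It assumes a SHA-256 digest over the inputs; `cache_key` is a hypothetical function and does not reproduce how Bazel actually computes action keys.

```python
import hashlib


def cache_key(inputs: dict[str, bytes]) -> str:
    """Derive one key from all input files (hypothetical scheme, not Bazel's)."""
    h = hashlib.sha256()
    for name in sorted(inputs):  # sort so the key does not depend on dict order
        h.update(name.encode())
        h.update(b"\0")
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()
```

Under this scheme, two hosts that produce rlibs differing in even one byte compute different keys for every downstream action that consumes the rlib, so cached binaries and test results are not reused.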

Debug Output

No response

rejuvenile • Oct 14 '25 16:10

What is it that you saw that is clap specific about this problem rather than being a general cargo reproducibility problem?

Note that cargo requires extra steps to create reproducible binaries. There is unstable support for trim-paths which simplifies this but even that still has known reproducibility issues.
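
One such extra step on stable toolchains is rustc's `--remap-path-prefix=FROM=TO` flag, which rewrites a source prefix in the compiler's output. The sketch below builds an environment that passes it through `RUSTFLAGS`. The helper names and the `/src` alias are assumptions for illustration, and remapping one prefix may not remove every embedded path (dependencies under the cargo registry, for example, live elsewhere).

```python
import os
import subprocess


def remap_env(src_root: str, alias: str = "/src", base_env=None) -> dict[str, str]:
    """Return an environment whose RUSTFLAGS remaps `src_root` to `alias`."""
    env = dict(os.environ if base_env is None else base_env)
    flag = f"--remap-path-prefix={src_root}={alias}"
    # Append to any existing RUSTFLAGS rather than overwriting them.
    env["RUSTFLAGS"] = f"{env.get('RUSTFLAGS', '')} {flag}".strip()
    return env


def build_release(src_root: str) -> None:
    """Run a release build in `src_root` with the remapped prefix."""
    subprocess.run(
        ["cargo", "build", "--release"],
        cwd=src_root,
        env=remap_env(src_root),
        check=True,
    )
```

Building both checkouts this way and comparing the results again would show how much of the difference is explained by source paths alone.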

epage • Oct 14 '25 16:10

Thanks for the quick response! I actually don't build using Cargo. I used Cargo as an example to simplify the repro. I actually build using Bazel and rules_rust. Of the hundred or so dependent crates in Bazel, clap_builder is the only one which doesn't build reproducibly in this setup, which is why I suspect something specific.

BTW, in my build environment all of the source paths are the same across all hosts, so that's not causing the non-determinism (however, it would cause non-determinism in less hermetic environments). I've attached a binary diff of the two rlibs in the Bazel setup (without source code paths). One can see some structure to the differences.

Image

Here's most of the cargo built diff:

Image

rejuvenile • Oct 14 '25 17:10
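
Since an rlib is an `ar` archive holding object files plus `lib.rmeta`, one way to localize the differences seen in the diffs is to compare the archives member by member. The following is a rough sketch, not a full `ar` parser: `ar_members` and `differing_members` are hypothetical names, member names are reported as the raw header field (GNU long-name references such as `/12` are not resolved), and header fields other than name and size are ignored.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def ar_members(data: bytes) -> list[tuple[str, bytes]]:
    """Split an ar archive into (raw_name, body) pairs."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError(f"bad member header at offset {pos}")
        name = header[0:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        body = data[pos + HEADER_LEN:pos + HEADER_LEN + size]
        members.append((name, body))
        # Member data is padded to an even offset.
        pos += HEADER_LEN + size + (size & 1)
    return members


def differing_members(a: bytes, b: bytes) -> list[str]:
    """List 'index:name' for members whose name or body differs."""
    ma, mb = ar_members(a), ar_members(b)
    changed = [
        f"{i}:{na}"
        for i, ((na, ba), (nb, bb)) in enumerate(zip(ma, mb))
        if na != nb or ba != bb
    ]
    if len(ma) != len(mb):
        changed.append(f"member count {len(ma)} vs {len(mb)}")
    return changed
```

If only the metadata member differs, the non-determinism is probably in what rustc serializes there. If specific object files differ, the codegen for those units is the more likely place to look.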

Can't think of what would cause this. clap_builder doesn't even depend on clap_derive, so this can't be a case of the derive macro generating code in a non-deterministic way (e.g. by using a HashMap, which it doesn't). I did a quick scan of the macros we invoke and I'm not seeing anything.

This will likely need investigation by someone interested enough and with a reproduction case.

epage • Oct 14 '25 17:10

On two separate machines with the same source path, cargo delivers deterministic builds. So I think that perhaps all the differences in the cargo version are due to different source paths, and in the Bazel version due to some interaction with rules_rust.

rejuvenile • Oct 14 '25 17:10