Re-organize build-dir by package + hash, rather than artifact type
Implementation:
- [x] #15848
- [x] #15947
Documentation: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#build-dir-new-layout
Known issues
- As a side effect, we pass a lot more parameters to rustc, likely making `cargo -vv` more annoying, similar to #13941
  - It also makes it impossible to just copy-paste commands into your shell if the limit is exceeded. See bjorn3's comment.
- Increases the need for a solution to #13672.
Unresolved issues
Open questions
- [ ] Transition plan: while `build-dir` isn't stable, enough tools rely on the layout that we'd want to set up a transition plan so they can have time to test against the new layout and work to support both
- [ ] What do we call the directory? I said `build/` as it's all-encompassing
- [ ] Can the old `build/` and `deps/` content live in the same place?
- [ ] How should we handle `incremental/`?
  - rustc loads incremental artifacts only for local crates; cross-crate is from rmeta. Therefore, per-crate incremental artifacts should be fine.
  - Given the incremental directory has its own flock mechanism, we don't need to add a flock for that directory.
  - See #t-compiler > ✔ Cargo switching to one `-C incremental` directory per c...
- [ ] Can we share across `<profile>` at least?
  - `<hash>` is the `-C extra-filename` hash and doesn't encompass all of fingerprinting, so we'd need to audit whether there are cases that don't change the hash that we'd still need per-profile
    - Changing of local source is one example, so at least local packages still need to be scoped by profile
  - Blocked on #4282
- [ ] Remove redundant `-Cextra-filename` in files where possible.
  - It may not be possible. rustc relies on it. See https://github.com/rust-lang/cargo/pull/15947#issuecomment-3393205232
- [ ] Re-evaluate if we want platform to be unconditionally included in the build-dir layout, or if we can completely drop it (blocked on #4282)
- [ ] Is `<pkgname>/<hash>` good enough or do we need to go with prefixes to reduce the number of items within a directory? See bjorn3's comment.
- [ ] Can we simplify how fingerprints are stored, reducing pressure on path lengths?
- [ ] Under the new layout, should `cargo clean -p` also clean old layout paths? See https://github.com/rust-lang/cargo/pull/15947#discussion_r2407136912
- [ ] Is the current handling of build scripts sufficient or should we explicitly split them into separate entries?
Future extensions
- #4282
- #5931
- #5026
- Inconsistent use of `-Cextra-filename` (#8332)
- Shadowing binaries on Windows (#7919)
- Collisions between intermediate build artifacts (#8794)
- Reducing redundant or extraneous information to reduce pressure in Windows for path lengths (e.g. `-Cextra-filename` is redundant with the build-unit's `<name>/<hash>`)
About tracking issues
Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.
Original
note: I specify build-dir to clarify which half of #14125 I'm referring to. The files and layout of build-dir do not have compatibility guarantees (source).
Currently, build-dir is laid out like

```
target/
  <target-platform>/?
    <profile>/
      incremental/
      build/
        <package>-<hash>/
          build*script-*
      deps/
        <package>-<hash>*
```
Currently,
- `cargo clean -p <package>` will operate on everything for `<package>`
In the future, we could have
- GC will track and operate on everything for `<package>-<hash>` (#5026)
- Change the locking so only overlapping `<package>-<hash>` that are being built block (#4282)
- Centrally cache `<package>-<hash>` artifacts across all projects (#5931)
- Reduced overhead on systems that are slow when there are a lot of files within a directory (see also rust-lang/cargo#15691)
These could be aided by re-arranging the build-dir to be organized around `<package>-<hash>`, like

```
target/
  <target-platform>/?
    <profile>/
      incremental/
      build/
        <package>-<hash>/
          build*script-*
          *.d
```
Side effects
- We'll have to change how we invoke rustc, which will increase the length of the command line
  - Currently, we blindly point rustc at `deps/` and rustc finds the files it needs. We'll instead need to point to each individual artifact rustc may need.
> These could be aided by re-arranging the `build-dir` to be organized around `<package>-<hash>`, like
>
> ```
> target/
>   <target-platform>/?
>     <profile>/
>       incremental/
>       build/
>         <package>-<hash>/
>           build*script-*
>           *.d
> ```
I am assuming that the final binary would still be located at `target/<target-platform>/<profile>/<bin-name>` (ie. `target/debug/foo`). Is that correct?
> Transition plan: while `build-dir` isn't stable, enough tools rely on the layout that we'd want to set up a transition plan so they can have time to test against the new layout and work to support both
I think we should check how many of the most popular tools rely on the layout. Depending on the impact, we can adjust how aggressive we want to be with the migration.
Regarding strategies, I have 2 ideas.
- Simply write the files to both the new and old layout directories. This of course comes with the additional overhead of doubling the disk writes and storage, and the complexity of cleaning up multiple directories. But this provides the best backwards-compatibility story.
- Only write the files to the new layout and create symlinks in the previous layout. This would help mitigate the overhead of the disk writes/storage.
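The second strategy could look roughly like the following sketch. This is only an illustration of the idea, not Cargo's actual code; the helper name and paths are made up.

```python
import os
import tempfile

def write_artifact_with_compat_link(new_path, old_path, data):
    """Write an artifact into the new-layout location and leave a symlink
    at the old-layout location for tools that still expect it (sketch)."""
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    with open(new_path, "wb") as f:
        f.write(data)
    os.makedirs(os.path.dirname(old_path), exist_ok=True)
    if os.path.lexists(old_path):
        os.remove(old_path)  # replace a stale link from a previous build
    os.symlink(os.path.abspath(new_path), old_path)

# usage: new layout holds the file, old layout path still resolves
root = tempfile.mkdtemp()
new = os.path.join(root, "build", "foo-abc123", "deps", "libfoo-abc123.rlib")
old = os.path.join(root, "deps", "libfoo-abc123.rlib")
write_artifact_with_compat_link(new, old, b"rlib contents")
```

One caveat, noted further down in the thread: tools that scan the old directory would see the symlinks, but tools that delete from it would only remove the links, not the real files.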
> What do we call the directory? I said `build/` as it's all-encompassing
I do not have a strong opinion on this. I was thinking perhaps packages or crates since it's a list of the built packages. But I think build makes sense here too 😄 Maybe build might be better to leave the door open for future possibilities of adding other artifacts to this directory?
> I am assuming that the final binary would still be located at `target/<target-platform>/<profile>/<bin-name>` (ie. `target/debug/foo`). Is that correct?
This does not touch final artifacts. I'd recommend reading up on the following note
> note: I specify build-dir to clarify which half of #14125 I'm referring to.
> I think we should check how many of the most popular tools rely on the layout. Depending on the impact we can adjust how aggressive we want to be with the migration.
I know there is at least
- https://crates.io/crates/cargo-cache
- https://crates.io/crates/cargo-sweep
There might be some other tools that do weirder stuff, like inspecting debug files or rlibs.
> Regarding strategies, I have 2 ideas.
Writing to both or symlinks won't work for the above two tools.
A common approach we take is to have a feature be opt-in and then transition it to opt-out. A question in this is if we'd want to still support the old layout, for which we'd do this through a config, or if we'll only support the new layout, for which we use an env variable and after a sufficient time we remove the opt-out.
Okay, I finally had some time to read up on #14125 and some other related threads.
Would it be better to focus on making progress on separating target-build-dir/target-artifact-dir out as proposed here before attempting to re-organize the layout? Doing it all at once would lead to less fragmentation of "build layouts", but adding to the scope would be more work and make it harder to land #14125.
I am leaning towards doing this re-organization after separating the build/artifact dirs.
I've wondered about doing the build-dir change first, like you said. It would make the scope of the change clear and it would help to communicate out what has compatibility guarantees.
I did some investigating on this. I created a small (and incomplete) prototype on my fork (https://github.com/rust-lang/cargo/commit/6a644ff535c4c37c2c3be23cc19437a43b08644f) where the dep-info files are stored in target/<target-platform>/<profile>/build/<package>-<hash>.
One notable side effect of doing this is that the rustc command starts to get very large for projects with many dependencies.
This is because, as mentioned in the issue description, we currently only add `target/<profile>/deps` to the library search path (`-L`).
If we reorganize the deps dir, we need to add each library to the rustc lib search path, which makes the process command very large if you have many dependencies.
This does not seem like an immediate issue, but I am not sure if there are limits on some operating systems that might limit the size of a process command. (I created a synthetic test on my Linux machine and was able to create a rustc process with a 60MB command with no issues. I didn't bother trying anything larger.)
> This does not seem like an immediate issue, but I am not sure if there are limits on some operating systems that might limit the size of a process command. (I created a synthetic test on my Linux machine and was able to create a rustc process with a 60MB command with no issues. I didn't bother trying anything larger.)
Further down in the layers of calls to rustc, we automatically roll over from CLI args to an argfile.
Some potentially relevant questions
- How much more frequently are we using argfiles?
- What is the cost of switching to argfiles?
- What is the cost of making the argfiles bigger?
Unsure how much we need to answer this in depth.
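For reference, the rollover behavior can be sketched like this. This is a simplified illustration, not the actual logic in rustc's command.rs; the size estimate and helper name are assumptions.

```python
import tempfile

def maybe_use_argfile(program, args, arg_limit):
    """Return the argv to spawn; roll the args over into an @argfile when
    the estimated command-line size would exceed the platform limit (sketch)."""
    # rough estimate: program name plus each arg and a separating space
    estimated = len(program) + sum(len(a) + 1 for a in args)
    if estimated <= arg_limit:
        return [program, *args]
    # rustc accepts @<path> argfiles: one argument per line, UTF-8 encoded
    f = tempfile.NamedTemporaryFile(
        "w", suffix=".args", delete=False, encoding="utf-8")
    f.write("\n".join(args))
    f.close()
    return [program, "@" + f.name]

short = maybe_use_argfile("rustc", ["-L", "dep=a"], arg_limit=2_621_440)
long = maybe_use_argfile("rustc", ["-L", "dep=a"] * 1000, arg_limit=64)
```

With a large limit the command line is used directly; with a small one, everything after the program name collapses into a single `@file` argument.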
On my Linux machine the limit is around 2.6MiB
```console
$ getconf ARG_MAX
2621440
```
Looking at how the thresholds are calculated: https://github.com/rust-lang/rust/blob/d2eadb7a94ef8c9deb5137695df33cd1fc5aee92/compiler/rustc_codegen_ssa/src/back/command.rs#L145-L205. I feel like on Windows it is way easier to hit the limit.
> What is the cost of switching to argfiles?
One failed rustc invocation + writing big argfiles + only allow UTF-8 encoding (IIRC). The good news is that only the last few rustc calls would have such long command line arguments.
> The good news is that only the last few rustc calls would have such long command line arguments.
Yes, this is what I observed while testing with my changes. The closer we get to the root of the dependency graph, the larger the command grows.
> How much more frequently are we using argfiles?
I think it will depend on the system settings. My machine has an ARG_MAX of 2097152 (2MB)
I did some testing in a dummy project with about ~360 total dependencies and the final rustc invocation clocked in at about 190KB
With some basic extrapolation I would start hitting the arg max on my machine for a project with ~4,000 dependencies.
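That extrapolation checks out as rough arithmetic, using the numbers measured above:

```python
arg_max = 2_097_152        # ARG_MAX on the machine above (2 MB)
cmd_bytes = 190 * 1024     # observed size of the final rustc invocation
deps = 360                 # total dependencies in the dummy project

bytes_per_dep = cmd_bytes / deps          # roughly 540 bytes of flags per dependency
deps_at_limit = arg_max / bytes_per_dep   # roughly 3,900 dependencies to hit ARG_MAX
print(round(deps_at_limit))
```

This assumes command size grows linearly with dependency count, which is optimistic since paths near the root of the graph accumulate transitive deps, so the real crossover point is likely somewhat lower.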
I think a project of this size is pretty large, so the overhead of a few argfiles on the last few rustc calls will probably not be noticeable.
However, I am not so sure about Windows. I don't have a Windows machine handy, but this Stack Overflow post seems to suggest that it's much lower at 2^16 chars (~32KB). But that post was from 13 years ago, so I am not sure if the limit has increased since then.
> However, I am not so sure about Windows. I don't have a Windows machine handy, but this Stack Overflow post seems to suggest that it's much lower at 2^16 chars (~32KB). But that post was from 13 years ago, so I am not sure if the limit has increased since then.
See the link in my previous comment https://github.com/rust-lang/cargo/issues/15010#issuecomment-2842205488 for Windows arg limit, which is 6k in rustc.
Would it make sense to always add the target-platform for build dir layout?
As far as I can tell, if we build using `cargo build --target <platform>` we will use `build-dir/<target-platform>/<profile>` as the main build directory, then uplift files to `build-dir/<profile>`.
Given that the build-dir is not for human consumption, I think we don't need to do any uplifting inside of the build-dir. I think this would simplify the uplift logic and make the layout more consistent.
Within build-dir, we can change however much we want, and if we can make things more consistent, then sure.
Personally I prefer to always have the target triple in the path (in whatever form, could be a hash as well), if we need to distinguish them.
I have finally had some time to put together a design for the new layout.
Firstly, layout.rs contains an overview of the current layout for reference.
I am proposing we move to the following build-dir layout:
```
build-dir/
  .rustc-info.json
  <target>                # e.g. x86_64-unknown-linux-gnu
    <profile>             # e.g. debug/release
      build
        $pkgname-$META
          .fingerprint
            dep-$targetkind-$targetname
            invoked.timestamp
            lib-$targetname
            lib-$targetname.json
          deps
            $pkgname-$META.$kind
          incremental
            ...           # The contents are opaque to cargo
          build-script
            build-script-build-$META
            build-script-build
            build-script-build-$META.d
          build-script-execution
            invoked.timestamp
            out/
            output
            root-output
            stderr
```
Design Notes:
- The layout changes above are only applicable to `build-dir`. The `artifact-dir` layout would not change.
- `<target>` would always be included even if the user does not pass a target.
  - When `build-dir` and `artifact-dir` are the same, the final artifacts will be in `target/<profile>` and the build artifacts will be in `target/<target>/<profile>`. (See example below for more details)
- The `deps`/`build-script`/`build-script-execution` subdirectories are not strictly required and are primarily for human readability.
  - Only 1 of `build-script` and `build-script-execution` will be populated, as the $META hash will change between the build-script build and execution
  - A notable side effect of these sub-directories is that the build script files will no longer collide as they do with the current implementation. It's unclear to me if this is a pro or con.
- The new design would pass deps to rustc individually instead of a single `-L` pointing at the `deps` dir.
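To illustrate the last note, the change in how search paths get passed might look roughly like this sketch. The helper name and paths are hypothetical; this is not the PoC's actual code.

```python
def dep_search_flags(per_package_deps_dirs, shared_deps_dir=None):
    """Build rustc -L flags: the current layout points at one shared deps/
    directory, while the proposed layout would pass each package's own
    deps/ directory (sketch)."""
    if shared_deps_dir is not None:
        # current layout: a single search path covers every dependency
        return ["-L", f"dependency={shared_deps_dir}"]
    flags = []
    for d in per_package_deps_dirs:
        # proposed layout: one search path per build unit
        flags += ["-L", f"dependency={d}"]
    return flags

old = dep_search_flags([], shared_deps_dir="target/debug/deps")
new = dep_search_flags([
    "target/x/release/build/syn-f7018c4e957f487b/deps",
    "target/x/release/build/proc-macro2-6f8d13bc0bbd4bff/deps",
])
```

This is where the command-line growth discussed earlier comes from: the flag count now scales with the number of dependencies instead of staying constant.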
I put together a proof of concept on my fork and did not find any obvious blockers.
Below is an example of a foo bin with dependency on syn
```
target
├── CACHEDIR.TAG
├── release
│   ├── .cargo-lock
│   ├── examples
│   ├── foo
│   └── foo.d
├── .rustc_info.json
└── x86_64-unknown-linux-gnu
    ├── CACHEDIR.TAG
    └── release
        ├── build
        │   ├── foo-5b6794832030febc
        │   │   ├── deps
        │   │   │   ├── foo-5b6794832030febc
        │   │   │   └── foo-5b6794832030febc.d
        │   │   └── .fingerprint
        │   │       ├── bin-foo
        │   │       ├── bin-foo.json
        │   │       ├── dep-bin-foo
        │   │       └── invoked.timestamp
        │   ├── proc-macro2-13c0700341350001
        │   │   ├── build-script
        │   │   │   ├── build-script-build
        │   │   │   ├── build_script_build-13c0700341350001
        │   │   │   └── build_script_build-13c0700341350001.d
        │   │   ├── deps
        │   │   └── .fingerprint
        │   │       ├── build-script-build-script-build
        │   │       ├── build-script-build-script-build.json
        │   │       ├── dep-build-script-build-script-build
        │   │       └── invoked.timestamp
        │   ├── proc-macro2-6f8d13bc0bbd4bff
        │   │   ├── deps
        │   │   │   ├── libproc_macro2-6f8d13bc0bbd4bff.rlib
        │   │   │   ├── libproc_macro2-6f8d13bc0bbd4bff.rmeta
        │   │   │   └── proc_macro2-6f8d13bc0bbd4bff.d
        │   │   └── .fingerprint
        │   │       ├── dep-lib-proc_macro2
        │   │       ├── invoked.timestamp
        │   │       ├── lib-proc_macro2
        │   │       └── lib-proc_macro2.json
        │   ├── proc-macro2-db590e4856f9cba8
        │   │   ├── build-script-execution
        │   │   │   ├── invoked.timestamp
        │   │   │   ├── out
        │   │   │   ├── output
        │   │   │   ├── root-output
        │   │   │   └── stderr
        │   │   ├── deps
        │   │   └── .fingerprint
        │   │       ├── run-build-script-build-script-build
        │   │       └── run-build-script-build-script-build.json
        │   ├── syn-f7018c4e957f487b
        │   │   ├── deps
        │   │   │   ├── libsyn-f7018c4e957f487b.rlib
        │   │   │   ├── libsyn-f7018c4e957f487b.rmeta
        │   │   │   └── syn-f7018c4e957f487b.d
        │   │   └── .fingerprint
        │   │       ├── dep-lib-syn
        │   │       ├── invoked.timestamp
        │   │       ├── lib-syn
        │   │       └── lib-syn.json
        │   └── unicode-ident-520beb59e27cced9
        │       ├── deps
        │       │   ├── libunicode_ident-520beb59e27cced9.rlib
        │       │   ├── libunicode_ident-520beb59e27cced9.rmeta
        │       │   └── unicode_ident-520beb59e27cced9.d
        │       └── .fingerprint
        │           ├── dep-lib-unicode_ident
        │           ├── invoked.timestamp
        │           ├── lib-unicode_ident
        │           └── lib-unicode_ident.json
        ├── .cargo-lock
        └── examples
```
(syn was chosen as the example as it has proc macros and build scripts to test the layout while keeping a small file tree)
Feedback welcomed :)
Besides the argument list issue, the other concerns around expanding to multiple `-L` flags are
- Would the order of `-L` change between different builds?
  - If yes, that might push cargo a bit more away from determinism.
- Does it affect how rustc wrappers like sccache determine their cache keys?
  - I believe yes, though wrappers might be able to work around this. For example, sccache already tries being deterministic on `--extern`.
- Other than extra allocations, would multiple `-L` flags cause any noticeable performance issue in the search algorithm in rustc?
Small suggestion: Can we remove the leading `.` and stop hiding the `.fingerprint` directory?
By the way, we might not need to move incremental into a package-specific directory. The incremental directory atm
- is completely opaque to cargo
- seems to have per-package directories already
- seems to have its own flock mechanism for each package already
```
target/debug/incremental/
└── foo-32xpvjjguvo8o/
    ├── s-h9eqb8u5e4-02ifki0-cw28b41hhwifyrtgntdidbhyq/
    │   ├── d8m9ubxqw5ry9cqer4mjobw0c.o
    │   ├── dep-graph.bin
    │   ├── query-cache.bin
    │   └── work-products.bin
    └── s-h9eqb8u5e4-02ifki0.lock*
```
though it may not matter as Cargo only sets `-C incremental` for path-local packages.
> Would the order of `-L` change between different builds?
Ideally no, the order of these flags should be deterministic.
> Does it affect how rustc wrappers like sccache determine their cache keys?
>
> I believe yes, though wrappers might be able to work around this. For example, sccache already tries being deterministic on `--extern`.
For sccache, it appears that it only considers -L native and -L all in the hash key.
> Other than extra allocations, would multiple `-L` flags get any noticeable performance issue on the search algorithm in rustc?
I spent some time investigating this and we do see a regression in performance for the extreme cases (1,500+ args), but it's generally negligible under common conditions.
Benchmarking details
I ran rustc directly with different amounts of -L flags compiling the same project.
- `few-args.sh`: the existing cargo behavior, `-L dependency=/.../target/debug/deps`
- `lots-of-args.sh`: passes 196 dependencies directly to rustc via `-L`.
- `even-more-args.sh`: same as `lots-of-args.sh` but with 1561 `-L` args.
  - Note: There were duplicates so as to avoid changing dependencies.
```console
> hyperfine --runs 200 ./few-args.sh ./lots-of-args.sh ./even-more-args.sh
Benchmark 1: ./few-args.sh
  Time (mean ± σ):     159.1 ms ±  20.2 ms    [User: 41.0 ms, System: 43.8 ms]
  Range (min … max):   108.0 ms … 214.0 ms    200 runs

Benchmark 2: ./lots-of-args.sh
  Time (mean ± σ):     157.7 ms ±  19.9 ms    [User: 43.5 ms, System: 43.2 ms]
  Range (min … max):   114.6 ms … 216.8 ms    200 runs

Benchmark 3: ./even-more-args.sh
  Time (mean ± σ):     181.6 ms ±  22.0 ms    [User: 62.4 ms, System: 49.7 ms]
  Range (min … max):   130.7 ms … 257.4 ms    200 runs

Summary
  ./lots-of-args.sh ran
    1.01 ± 0.18 times faster than ./few-args.sh
    1.15 ± 0.20 times faster than ./even-more-args.sh
```
Above we can see that ./lots-of-args.sh and ./few-args.sh are within the margin of error and are effectively the same.
We do see that ./even-more-args.sh ran slower, but it was still technically within the margin of error.
> Small suggestion: Can we remove the leading `.` and stop hiding the `.fingerprint` directory?
I think this is a good idea. I don't see any value in having hidden directories in the build-dir.
> By the way, we might not need to move incremental into a package-specific directory. The incremental directory atm
> - is completely opaque to cargo
> - seems to have per-package directories already
> - seems to have its own flock mechanism for each package already
>
> ```
> target/debug/incremental/
> └── foo-32xpvjjguvo8o/
>     ├── s-h9eqb8u5e4-02ifki0-cw28b41hhwifyrtgntdidbhyq/
>     │   ├── d8m9ubxqw5ry9cqer4mjobw0c.o
>     │   ├── dep-graph.bin
>     │   ├── query-cache.bin
>     │   └── work-products.bin
>     └── s-h9eqb8u5e4-02ifki0.lock*
> ```
>
> though it may not matter as Cargo only sets `-C incremental` for path-local packages.
Oh interesting, I was not aware that rustc had its own internal locking system. The motivation for splitting it up was to allow Cargo to manage the locking so that crates in the same local workspace could potentially compile in parallel.
I could not find very much documentation about how rustc incremental builds work outside of this short description 😅
> > Would the order of `-L` change between different builds?
>
> Ideally no, the order of these flags should be deterministic.

> > Does it affect how rustc wrappers like sccache determine their cache keys?
> >
> > I believe yes, though wrappers might be able to work around this. For example, sccache already tries being deterministic on `--extern`.
>
> For sccache, it appears that it only considers `-L native` and `-L all` in the hash key.
By "ideally", do you mean you hope it doesn't, or have you verified that it doesn't?
If my understanding is still valid, the dependency resolution in Cargo is not entirely deterministic. `build_runner.unit_deps(unit)` may return an array in a different order. That is why `--extern` flags may shuffle. The PoC uses the same `unit_deps`, so I guess it shares the same symptom.
As for sccache, it considers `-L dependency=` as well, on line 1104.
Anyway, this might not be a blocker for experiments, but we may need to consider and communicate with external tools if we cannot avoid it.
> By "ideally", do you mean you hope it doesn't, or have you verified that it doesn't? If my understanding is still valid, the dependency resolution in Cargo is not entirely deterministic. `build_runner.unit_deps(unit)` may return an array in a different order. That is why `--extern` flags may shuffle. The PoC uses the same `unit_deps`, so I guess it shares the same symptom.
My thought was that the args could be sorted before they are added to the process builder. Am I missing something that would make this approach too naive?
> As for sccache, it considers `-L dependency=` as well, on line 1104.
Looking at generate_hash_key, crate_link_paths does not appear to be included in the hasher and appears only to be used to write the dep-info files later in execution.
I was looking at the wrong place. Apparently sccache hashes the cli args again and sorts them https://github.com/mozilla/sccache/blob/5b1e93e5dbd8113bb66ac990148e8ba354dd3545/src/compiler/rust.rs#L1439-L1470.
> My thought was that the args could be sorted before they are added to the process builder. Am I missing something that would make this approach too naive?
That could work I think, as our dependencies always have unique hashes in their name.
BTW I don't really remember where the non-deterministic behavior came from. Perhaps from -j threads or some other places.
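The sorting idea above could be as simple as the following sketch (the helper name is made up). Since every unit directory name embeds a unique `-C extra-filename` hash, lexicographic order is well defined:

```python
def deterministic_l_flags(dep_dirs):
    """Emit -L search-path flags in sorted order so the rustc command line
    does not depend on whatever order unit_deps happened to return (sketch)."""
    flags = []
    for d in sorted(set(dep_dirs)):
        flags += ["-L", f"dependency={d}"]
    return flags

# the same set of deps yields the same flags regardless of input order
a = deterministic_l_flags(["build/syn-f7018c4e/deps", "build/proc-macro2-6f8d13bc/deps"])
b = deterministic_l_flags(["build/proc-macro2-6f8d13bc/deps", "build/syn-f7018c4e/deps"])
```

A wrapper like sccache that sorts the args again before hashing would then see an identical key across rebuilds.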
Some areas of question / concern
- Should we cache by package or by build unit, including build script execution being its own directory.
- How will this work with pipelined builds when we get to user-wide caching? Each library build unit will have 2 artifacts but unsure if we can split them. Maybe we can handle this in the caching part and only move things into the cache when both parts are done.
> Should we cache by package or by build unit, including build script execution being its own directory.
Good question, in my proposal above I was being conservative with build scripts. I suppose we could potentially group a unit and its build script unit together. In the common case we would generally need both of these together.
If we were to group these together, it's unclear to me if we would also need to include a hash in the build script dir name to avoid different results. ie.

```
$pkgname-$META/
  build-script-$HASH   # Do we need a hash or will the $META above be enough?
  ...
```
To me it appears that $META would be enough, but I might be missing a scenario where that would produce an incorrect result.
> How will this work with pipelined builds when we get to user-wide caching? Each library build unit will have 2 artifacts but unsure if we can split them. Maybe we can handle this in the caching part and only move things into the cache when both parts are done.
If we can merge everything into a single unit directory it would keep this simple. (Though there is some potential that many build script units cause the size of the cache to grow over time which might be problematic when downloading from remote caches. But I think this concern could be handled later)
The more we group things, the less caching we get, especially since we won't be caching build script execution (and everything that depends on it) in the first iteration.
Also, in the ideal case the cache is immutable, so if we put multiple build script outputs in the same package directory, the cache would need to be mutable.
So sounds like we'll need to cache on a per-unit basis.
Would the proposed design above be enough to move forward with review? Or would this need something a bit more formal? Also not sure what kind of review is needed and whether this change would warrant an RFC.
Looking to start moving this forward in the near future :)
This is an internal-only change so no RFC is needed. Those are for when we want wider input from the project and community.
The bar for feature-flagged changes is also fairly low. Probably will be easier to discuss what it should be with a PR posted or merged.
sounds good, I can open a PR to first add the feature flag and then a follow up PR with an initial implementation that we can iterate on
@rustbot claim
The 1.90 release notes asked users of the existing build-dir contents to chime in here. Debian is using it to determine what got (statically) linked into a Rust executable (binary or cdylib) to record this in the package metadata. If there is a better way to get this information, we can of course consider switching to some other source; the current implementation is rather janky anyway and hard-codes a lot of assumptions that don't always hold.
basically what we currently do is:
- inspect the target dir
- look through all `deps/*.d` files
- filter out references to our "registry" (which is actually all the packaged crates pretending to be crates.io ;))
- map those references back to the packages shipping their respective contents
- emit the information that contents of those packages (dependencies) are statically linked into the currently built package
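The steps above can be sketched minimally like this, assuming the Makefile-style dep-info format cargo writes; the function name and registry path are illustrative, and real `.d` files escape spaces in paths with backslashes, which this simplified split ignores:

```python
def registry_sources(dep_info_text, registry_root):
    """Collect source paths referenced by a cargo dep-info (.d) file and
    keep only those under the vendored-registry root (sketch)."""
    found = set()
    for line in dep_info_text.splitlines():
        # each rule line looks like "<output>: <prereq> <prereq> ..."
        target, sep, prereqs = line.partition(": ")
        if not sep:
            continue
        found.update(prereqs.split())
    return sorted(p for p in found if p.startswith(registry_root))

d_file = "target/debug/deps/libfoo-abc.rlib: src/main.rs /registry/serde-1.0.0/src/lib.rs\n"
hits = registry_sources(d_file, "/registry/")
```

Mapping the surviving paths back to packages would then be a lookup against the vendored-crate directory names, which is exactly the part the new layout would relocate.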
`cargo metadata --filter-platform ...` can approximate it but can include extras.
https://doc.rust-lang.org/cargo/reference/unstable.html#sbom will report exactly what is in the build.
Ack, that SBOM sounds more appropriate going forward for finding this information! I didn't find any information about planned stabilization though, and force-enabling unstable features for all package builds doesn't seem ideal.