
Implement 'multidep' (RFC-3176)


Tracking issue: #10030 Depends on: #9992

Tasks

  • [x] 'multidep' feature toggle
  • [x] spread out multi-name dependencies to the same crate
    • [x] represent these in cargo-metadata
  • [x] multiple targets may be specified in renamed artifact dependencies
  • [ ] 🧹 cleanup and finalization
    • [ ] ensure no TODO(ST) markers are left in code
    • [ ] ensure no tests are ignored

Implementation and review notes

  • cargo metadata ignores the new feature for now and was adapted to work exactly as before, skipping duplicate packages. This is due to the suggestion made in this review comment.

  • 🧪 tests, possibly for more subtle RFC constraints

    • [x] without the 'multidep' toggle, the previous behaviour is preserved

Review Progress

  • Leftover from RFC 3028
    • [x] #10437
    • [x] simplify filtering - I took it as far as I could but will probably need guidance again once the review begins
    • [ ] ⚠️write all files of artifacts into deps and uplift them later (asked for direction in this comment)
    • [x] rename a dependency to try to hit this expect() - a test fails unless we exclude non-artifacts, a change which @ehuss already added.
    • [x] #10407 - asked for direction - rolled back test, keeping new ones for now.
    • [ ] ❓ not sure if this issue about build-std interaction should be tackled just yet; it seems it could benefit from artifact info in the crates registry
    • bindep issues - maybe some of these can be tackled or could be validated as part of this PR
      • [ ] #10525
      • [ ] #10526
      • [ ] #10527
      • [ ] #10593

Sponsored by Profian

Byron avatar Nov 09 '21 02:11 Byron

r? @ehuss

(rust-highfive has picked a reviewer for you, use r? to override)

rust-highfive avatar Nov 09 '21 02:11 rust-highfive

I have created a draft of simple renames in resolve metadata, which results in some ambiguity in the deps.name field, as it now possibly corresponds to multiple names which are also repeated in the "extern_name" field of the respective "dep_kinds". For now I have set deps.name to an empty string to be somewhat backwards compatible, but since the feature is gated I could imagine it becoming an array of names instead.
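
For context, here is a minimal sketch of the kind of manifest that produces this ambiguity; the crate names, versions and targets are illustrative, and the 'multidep' gate name is taken from the task list above, so the exact gating may differ:

```toml
# hypothetical sketch: the same package `bar` appears under two names as
# artifact dependencies for different targets, so a single resolve node
# maps one package to several extern names
cargo-features = ["bindeps", "multidep"]

[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[dependencies]
bar-gnu = { package = "bar", version = "1.0", artifact = "bin", target = "x86_64-unknown-linux-gnu" }
bar-musl = { package = "bar", version = "1.0", artifact = "bin", target = "x86_64-unknown-linux-musl" }
```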

CC @joshtriplett

Byron avatar Nov 14 '21 13:11 Byron

:umbrella: The latest upstream changes (presumably #10081) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Nov 14 '21 16:11 bors

@ehuss I believe the RFC is implemented as far as I can see, with a few tests to support that. Maybe that qualifies it for a first review round. What makes this more difficult is that this PR depends on another one which isn't merged yet, so you and @joshtriplett, as reviewer of that one, might want to align on an approach that minimizes work. My clear preference is to focus on this PR and mostly forget about the other one, implementing both RFCs at once when this one is merged and avoiding further changes to #9992 entirely.

Both PRs have reviewer notes, some of which describe limitations.

Thanks for your time, I am looking forward to hearing from you :).

Byron avatar Nov 17 '21 10:11 Byron

:umbrella: The latest upstream changes (presumably #10172) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Dec 13 '21 16:12 bors

:umbrella: The latest upstream changes (presumably #10201) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Dec 16 '21 01:12 bors

:umbrella: The latest upstream changes (presumably #10265) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jan 12 '22 01:01 bors

:umbrella: The latest upstream changes (presumably #10324) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jan 25 '22 05:01 bors

:umbrella: The latest upstream changes (presumably #10341) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Feb 03 '22 18:02 bors

:umbrella: The latest upstream changes (presumably #9992) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Feb 22 '22 04:02 bors

Now that the PR for RFC-3028 has been merged (🎉) I am in the process of readying this one for review.

When rebasing against master, the filter complexity that @ehuss had generously fixed returned. I 'just made it work' for now and will try to simplify it along with finishing some other tasks, as tracked under the 'Leftover from RFC 3028' heading.

Byron avatar Feb 23 '22 04:02 Byron

After trying to address the leftovers from RFC-3028 I have arrived at a state where I need input on how to proceed from here.

Besides the question around metadata tests, I tried to change file placement by uplifting individual files instead of adjusting the unit's output directory as a whole, as suggested by @joshtriplett. The intermediate results can be found in this branch; some tests are failing, but they are fixable with more time.

Even though uplifting is generally working, it also leads to tests indicating filename collisions. This probably stems from the fact that the library is depended on both by the main manifest and by the binary. Several attempts to deduplicate this failed, and one reason is probably that the job server would have to avoid scheduling a duplicate job while still allowing the unit to be used as a dependency (when building the rustc command line) as it normally would, waiting for similar jobs to finish before proceeding. Doing it like this would have the benefit of avoiding duplicate work while further reducing the chance for collisions, but it seems like a major time investment, both into the job server and into the collision check.

This commit shows the only way I could make it work with uplifting: effectively separating the bulk of the files produced into their own directory (here deps/artifact_deps) while uplifting the desirable files into the folder described in the RFC, i.e. deps/artifacts.

Thus I wonder how @joshtriplett feels about this. Is there something obvious I am missing to make uplifting work the way you had in mind? I'd be glad for directions on how to proceed here.

Thanks for your help.

Byron avatar Feb 24 '22 03:02 Byron

@Byron I answered the metadata test question in the corresponding linked issue.

On further consideration, I don't think you need to put any effort at all into reducing the number of rustc invocations. Trying to unify builds across artifact and non-artifact dependencies in the uncommon case where they even can be unified seems like unnecessary complexity in an already complex implementation.

(You do need to deduplicate multiple artifact dependencies for the same target, in multiple places in the tree. But don't worry as much about trying to deduplicate between host and same-target-as-host.)

You should not need to modify the job server at all here; any deduplication should be happening in the crate graph before it gets built.

I would focus on correctness, and on placing files in the desired locations, rather than on deduplication.

I would have expected that you could build each dependency in a subdirectory of the same directory, even multiple copies of the same dependency, since they should have different hashes and those hashes are included in the subdirectory name. Am I missing something that's leading to the conflict?

joshtriplett avatar Mar 14 '22 08:03 joshtriplett

Thanks for sharing, @joshtriplett. For a moment I thought I knew how to proceed, but then the moment vanished, and here is why 😅.

I think the following parts are key to understanding what I think is going on (this goes with the usual disclaimer of me probably not really knowing what's happening, but here we go anyway).

I would have expected that you could build each dependency in a subdirectory of the same directory, even multiple copies of the same dependency, since they should have different hashes and those hashes are included in the subdirectory name. Am I missing something that's leading to the conflict?

This is something that was surprising to me at first, but it is probably exactly why we occasionally see Windows failures like this recent one. The binary artifact is built multiple times into the same directory because the builds are exactly the same, just occurring at multiple positions in the dependency graph under different names in the manifest. The package name is used instead of the name in the manifest, so the hashes end up being the same. The easiest way out would be to change the name of the target directory to contain the name in the TOML if it differs. Would that be OK for you?

(You do need to deduplicate multiple artifact dependencies for the same target, in multiple places in the tree. But don't worry as much about trying to deduplicate between host and same-target-as-host.)

This is something I consider very difficult, as it's an entirely new concept for the job server, I believe. Previously, building the same library multiple times was only possible if the copies differed in version, which yields different outputs. What I don't understand is how it could ever work that a library (not even an artifact, in a single version) is used multiple times in the same dependency graph while only being built once (and into the same location), as the unit dependencies don't seem to do any kind of deduplication either, which means a build would be triggered multiple times. The reason this works is probably that builds of the same library aren't usually done concurrently (even though that could happen), so the job server detects they are fresh and won't rebuild because of that. Otherwise the current system seems quite able to build the same library multiple times and concurrently, causing races around writing files on Windows. Actual deduplication could happen at the unit-dependency level, but it would require knowing which instance of the crate will be built first, while turning the other instances into 'proxies' which aren't built or scheduled but somehow wait until the actual instance is done building so it can be used (I hope this makes sense at all). The alternative to such a system would probably be to ensure each instance of a crate gets its own unique name so they never overlap.

I hope that I am missing something essential, but until then I am more or less in the same state as I was when I wrote this comment right above the test that tends to fail due to races.

Fixing this could probably be partially accomplished by putting dependencies into a more uniquely named directory that takes their name in the TOML into consideration, but that wouldn't fix the possibility of races, which I think has existed before but is now exacerbated by the introduction of artifact dependencies.

All of the above is probably more like a brain dump but I hope it helps to point me towards the solution which I seem to be unable to see without complicating things tremendously (and I would want to do better).

Byron avatar Mar 15 '22 13:03 Byron

The package name is used instead of the name in the manifest, so the hashes end up being the same.

Ah, I see! I think, for artifact dependencies, it may make sense to feed whether the build is an artifact build or not into both the metadata calculation and the fingerprint calculation, so that artifact and non-artifact builds get different hashes. (The target name is already fed in.) That should eliminate collisions between artifact and non-artifact builds. (If, in the future, it becomes feasible to deduplicate between artifact and non-artifact builds, we can always drop that change.)

See the documentation at the top of https://github.com/rust-lang/cargo/blob/master/src/cargo/core/compiler/fingerprint.rs for more details there.

What I don't understand is how it could ever work that a library (not even an artifact, in a single version) is used multiple times in the same dependency graph while only being built once (and into the same location), as the unit dependencies don't seem to do any kind of deduplication either, which means a build would be triggered multiple times. The reason this works is probably that builds of the same library aren't usually done concurrently (even though that could happen), so the job server detects they are fresh and won't rebuild because of that. Otherwise the current system seems quite able to build the same library multiple times and concurrently, causing races around writing files on Windows.

I may be missing something myself, but as far as I can tell, the dependency graph does already get deduplicated. I just did a test with a crate toplevel depending on b and c, both of which depend on a. Cargo built a and waited for it to be done before building either b or c.
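
For reference, the shape described above would look roughly like this (paths and metadata are hypothetical):

```toml
# toplevel/Cargo.toml of a diamond graph: toplevel -> {b, c} -> a,
# without any artifact dependencies; b/Cargo.toml and c/Cargo.toml
# each contain `a = { path = "../a" }` in their [dependencies]
[package]
name = "toplevel"
version = "0.1.0"
edition = "2021"

[dependencies]
b = { path = "../b" }
c = { path = "../c" }
```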

Perhaps this deduplication is happening as part of unification, and since you're not supporting unification, you're getting duplicates in the dependency graph? You might try a test with a similar diamond-shaped dependency graph like the one I described above, without any artifact dependencies, and look at how stock Cargo processes that to see where a gets deduplicated. I would explore that first, before making any changes related to the metadata or fingerprint hashes.

In any case, the deduplication should definitely not happen in the job server; it should just run the jobs it's handed. Any deduplication should happen in the graph before ever handing it to the job server.

joshtriplett avatar Mar 15 '22 16:03 joshtriplett

Perhaps this deduplication is happening as part of unification, and since you're not supporting unification, you're getting duplicates in the dependency graph? You might try a test with a similar diamond-shaped dependency graph like the one I described above, without any artifact dependencies, and look at how stock Cargo processes that to see where a gets deduplicated. I would explore that first, before making any changes related to the metadata or fingerprint hashes.

In any case, the deduplication should definitely not happen in the job server; it should just run the jobs it's handed. Any deduplication should happen in the graph before ever handing it to the job server.

Thanks so much, I regained hope :), and what you say makes sense. Indeed, there must be some mechanism happening that avoids duplicate work, and probably it's more than just freshness of fingerprint files as the example you mentioned wouldn't have worked otherwise. Once the mechanism is discovered, it should be possible to either adjust fingerprinting to know about artifacts and/or make sure the unification handles artifact dependencies.

Byron avatar Mar 16 '22 00:03 Byron

Alright, I think I am onto something here and should be able to eventually get to the bottom of this.

The recent failure on Windows is indeed because…

  • units aren't deduplicated (after all, they use different profiles for different dependency types); deduplication happens in the unit interner
  • the extra-filename hash that disambiguates both outputs on Unix isn't present here, and maybe that's the reason both binaries are placed into the same directory as well: they lack the extra-filename information, which does affect the generated build dir.

Here are the log lines produced on Windows, for completeness.

  Running `rustc --crate-name bar bar\src\main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C opt-level=1 -C embed-bitcode=no -C debuginfo=2 -C debug-assertions=on -C metadata=5d165022df23eea2 --out-dir D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps\artifact\bar-8f668ecb076f65c2\bin -L dependency=D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps --extern bar=D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps\libbar-43b90b8924029221.rlib`
     Running `rustc --crate-name bar bar\src\main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C opt-level=3 -C embed-bitcode=no -C debuginfo=2 -C debug-assertions=on -C metadata=83de8e9fff94a191 --out-dir D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps\artifact\bar-8f668ecb076f65c2\bin -L dependency=D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps --extern bar=D:\a\cargo\cargo\target\tmp\cit\t56\foo\target\debug\deps\libbar-bfc0e72712f573b8.rlib`
error: linking with `link.exe` failed: exit code: 1104

The target_short_hash() would probably have to learn about ~~the name in the manifest to further disambiguate, and I will try that~~ something to disambiguate.

Edit: The disambiguation must be independent of the name in the manifest; this test apparently only renames to avoid clashes on MSVC, and that doesn't work as the name doesn't matter. It looks more like a general problem on systems where filename extensions are missing, whose fix contradicts the purpose of using a stable hash in the first place. The metadata hash should be reproducible, so maybe it's OK to use it in conjunction with the stable hash, but why would we not want to use the metadata hash in place of the stable hash? The only way to fix this without fundamentally changing the way stable hashes work would be to detect that case in a post-processing step and adjust the stable hash only when needed.

Once that is working though, I think the uplifting shouldn't suffer from collisions anymore either.

Byron avatar Mar 16 '22 02:03 Byron

:umbrella: The latest upstream changes (presumably #10433) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Mar 16 '22 03:03 bors

Just to follow up on the previous question about deduplication, in short: the existing deduplication mechanism works exactly as it should. Issues occur due to uplifting when there is a lib = true artifact dependency like in this test.
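
For illustration, such a dependency can be sketched like this; the crate names and artifact kinds are illustrative and not the exact setup of that test:

```toml
# hypothetical sketch of a `lib = true` artifact dependency: the artifact
# kinds are built as separate units, while `lib = true` additionally pulls
# in the regular library of `bar`
cargo-features = ["bindeps"]

[package]
name = "foo"
version = "0.1.0"
edition = "2021"

[dependencies]
bar = { path = "../bar", artifact = ["bin", "cdylib", "staticlib"], lib = true }
```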

Since the artifact units are built separately for the bin, cdylib and staticlib kinds, one gets duplication when the library itself is built, which covers all three kinds in one. This is where the output filename collision occurs: certain artifacts are actually built multiple times. A simple fix doesn't actually remove the duplication, but it avoids the collision by placing artifact libraries into their own directory.

Doing the actual deduplication appears like it would complicate unit dependency generation more, along with the outstanding changes needed to get the uplifting itself to work consistently.

Now I wonder if uplifting is a feature really worth having at the cost of such deduplication. Can I proceed with what's there and place duplicate artifact library dependencies into their own folders to avoid collisions, or is deduplication at the unit-dependency level the way to go? Thanks again for guidance.


On another note, I have added 05a76cb2222ae943858a07830c2b80f6382000e0 to fix a flaky test in a way that I think works when file name extensions aren't supported. We will see a lot of these kinds of collisions with artifact dependencies and multidep, at least on certain platforms, so I think it's valuable to correct them if we can. The implementation feels a bit forced as it rides on the back of the collision check, which may now try to fix output destinations. I'd be happy to learn about better ways :).

Byron avatar Mar 16 '22 09:03 Byron

@Byron I think it's perfectly fine to skip uplifting for the initial PR, and we can re-evaluate it before stabilization. And if that allows skipping changes to the scheduler, that sounds like a great simplification. (Update: based on our conversation, it sounds like uplifting will actually end up working after this PR. Thank you!)

It's also fine to skip de-duplication for now; having the same dependency for both an artifact dependency and a regular dependency will be an unusual case, and if taking a bit more time to build that twice substantially simplifies the code, that's not a problem for now.

The one case I want to make sure this does address correctly, though, is feature unification. If you build something twice, once for the regular dependency and once for the artifact dependency, that's fine, as long as feature unification occurs for builds targeting the same target, as specified in the RFC.

Can I proceed with what's there and place duplicate artifact library dependencies into their own folders to avoid collisions, or is deduplication at the unit-dependency level the way to go?

Go ahead and place them in their own folders to avoid collisions, so that you don't have to poke at the scheduler for now.

joshtriplett avatar Apr 26 '22 02:04 joshtriplett

having the same dependency for both an artifact dependency and a regular dependency will be an unusual case

We actually do have a use for this, so perhaps not that unusual. :P But in our case our bindep and our direct dep are built for different targets so you'd have to compile their dependencies twice anyway (though I suppose it would be nice if the proc macro crates only get built once, for the host).

The one case I want to make sure this does address correctly, though, is feature unification. If you build something twice, once for the regular dependency and once for the artifact dependency, that's fine, as long as feature unification occurs for builds targeting the same target, as specified in the RFC.

For our use case, we may have some general objection to the idea of aggressively unifying features; there's an argument to be made that artifact dependencies might want to select distinctly different features from the main crate, even when built for the same target. However, this is just me registering our future concern, and I don't think that we need to block this PR on that, especially if it contradicts the RFC.

bstrie avatar Apr 27 '22 17:04 bstrie

The one case I want to make sure this does address correctly, though, is feature unification. If you build something twice, once for the regular dependency and once for the artifact dependency, that's fine, as long as feature unification occurs for builds targeting the same target, as specified in the RFC.

We actually have a case where this behavior is problematic. Our project is complicated, but imagine we are creating a VM hypervisor. We have host code and guest code. The guest code is built for -musl. The host code is currently built for -gnu. This currently allows us to have independent dependency graphs. However, we are now unable to build a static binary for the host code because this would unify the dependency graphs when we need them independent.
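
As a rough sketch of that layout (names and targets illustrative), the host crate pulls in the guest binary as an artifact dependency for a different target:

```toml
# hypothetical host Cargo.toml: the guest binary is built for -musl while
# the host itself targets -gnu, which keeps the two dependency graphs
# independent today; building the host statically for -musl would unify them
cargo-features = ["bindeps"]

[package]
name = "host"
version = "0.1.0"
edition = "2021"

[dependencies]
guest = { path = "../guest", artifact = "bin", target = "x86_64-unknown-linux-musl" }
```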

I understand that in the typical case, unification probably is the right decision. But I would love the option to disable unification in bindeps even on the same target.

npmccallum avatar Apr 29 '22 15:04 npmccallum

:umbrella: The latest upstream changes (presumably #10868) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Jul 16 '22 19:07 bors