cargo icon indicating copy to clipboard operation
cargo copied to clipboard

Coerce major wildcards

Open nikita240 opened this issue 1 year ago • 14 comments

What does this PR try to resolve?

Fixes ~#13534~ #9029

This PR implements a change that will make major wildcards in sub-dependencies be coerced by a more restrictive dependency to avoid unnecessary duplicate dependencies.

It does this by presorting the remaining_candidates array to prioritize candidates that already have activations (if you'd like an easier diff to understand the change, see the first commit.

How should we test and review this PR?

There are unit tests for the new behavior.

Additional information

There is a bit of a special case here that needs to be talked about. When there are multiple possible activated candidates that could be used to coerce the wildcard, a choice needs to be made.

I did the simplest and most intuitive implementation, which is to always choose the candidate with the highest SemVer.

An argument could be made for an alternative: If the root crate has a direct dependency on the package we are coercing, always pick whatever the root activates. This is nice because it seemingly allows for maximum compatibility to the end user, but feels very inconsistent as it will inevitably favor binaries over libraries.

nikita240 avatar Mar 05 '24 07:03 nikita240

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @epage (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

rustbot avatar Mar 05 '24 07:03 rustbot

r? @Eh2406

epage avatar Mar 05 '24 12:03 epage

I will try and make time to think about this more carefully as soon as I can. In the meantime I'm gonna quote something Ed routinely says:

When this is ready for review, I'd recommend reworking the commits

  • An initial commit should exist that has passing tests, showing the current behavior
  • A follow up commit should change the behavior and update the tests so they still pass
    • This helps show we are testing the right thing and makes the change in behavior very clear
  • Any refactorings along the way should be separate commits

Specifically in this case I'm pretty sure some of these tests would already pass before this PR. Which is fine, more tests for the existing functionality are always helpful. But it's not clear as a reviewer if you expected there to be a behavior change.

Eh2406 avatar Mar 05 '24 16:03 Eh2406

I will try and make time to think about this more carefully as soon as I can. In the meantime I'm gonna quote something Ed routinely says:

When this is ready for review, I'd recommend reworking the commits

  • An initial commit should exist that has passing tests, showing the current behavior

  • A follow up commit should change the behavior and update the tests so they still pass

    • This helps show we are testing the right thing and makes the change in behavior very clear
  • Any refactorings along the way should be separate commits

Specifically in this case I'm pretty sure some of these tests would already pass before this PR. Which is fine, more tests for the existing functionality are always helpful. But it's not clear as a reviewer if you expected there to be a behavior change.

Understood, I will force-push the commit tree to update it as soon as I'm done brainstorming more tests.

nikita240 avatar Mar 05 '24 16:03 nikita240

Specifically in this case I'm pretty sure some of these tests would already pass before this PR.

Yes, test_wildcard_minor would pass. I just wrote that next to test_wildcard_major to demonstrate how the current behavior differs and is inconsistent based on where the wildcard is placed.

nikita240 avatar Mar 05 '24 16:03 nikita240

Ok I fixed the commit tree.

Here is a summary of the tests

New tests with current implementation

test test_range_major ... FAILED
test test_wildcard_major ... FAILED
test test_wildcard_major_coerced_by_indirect_subdependency ... FAILED
test test_wildcard_major_coerced_by_subdependency ... FAILED
test test_wildcard_major_duplicate_selection ... ok
test test_wildcard_minor ... ok

New tests with proposed implementation

test test_range_major ... ok
test test_wildcard_major ... ok
test test_wildcard_major_coerced_by_subdependency ... ok
test test_wildcard_major_coerced_by_indirect_subdependency ... ok
test test_wildcard_major_duplicate_selection ... ok
test test_wildcard_minor ... ok

There is a regression however:

test resolving_with_sys_crates ... FAILED

This is because we actually fixed an unneccessary duplicate dependency within that test.

To address this, I changed the assertion to remove the duplicate dependency and then added a new test that replicates the old test by explicitelly pinning the requirements to generate duplicate dependencies (see this commit):

test resolving_with_sys_crates ... ok
test resolving_with_sys_crates_duplicates ... ok

nikita240 avatar Mar 05 '24 20:03 nikita240

@Nikita240 to clarify what @Eh2406 asked, each commit should be passing tests. This helps to test the tests and makes changes in behavior clearer to reviewers.

From my quick look at the commits and with your last message, it sounds like this wasn't done.

#13505 is an example of this

  • 0955a80a07d81290c566fbd7a9f073968e75aa17 adds tests, they pass
  • 0f9d4a352aa606d3b8faf560158cf0b3845436a8 changes behavior and updates tests

epage avatar Mar 05 '24 20:03 epage

@Nikita240 to clarify what @Eh2406 asked, each commit should be passing tests. This helps to test the tests and makes changes in behavior clearer to reviewers.

From my quick look at the commits and with your last message, it sounds like this wasn't done.

#13505 is an example of this

  • 0955a80 adds tests, they pass
  • 0f9d4a3 changes behavior and updates tests

Ahh, I misunderstood. Thanks

nikita240 avatar Mar 05 '24 20:03 nikita240

Rebuilt the tree again 👍🏻

nikita240 avatar Mar 05 '24 21:03 nikita240

I got a chance to look this over. My brings a little bit exploding with different ways I want to respond to this PR. Sorry if this comment ends up all over the place. So before I forget, thank you for working on this very tricky issue!

Edit: Got sucked into a meeting and so this comment is unaware of the past two hours of progress, although I don't think it affects things that much.

I'm going to try and respond from most detailed out toward most general.

The most detailed feedback is about how things are named and where comments go, which I'm largely going to save till after we discussed the bigger questions. Except to thank you for working on tests. This stuff is tricky and hard to get right, I deeply appreciate your use of tests.

At the next level the critical question here is what property is this trying to uphold? If we can define it strictly enough we can add a proptest and discover all kinds of fun counterexamples. Even without that level of strictness we can add an assert to the code and watch proptest generate examples that should be standalone tests.

Specifically things get tricky with regard to what order operations are evaluated in. Most concretely this code moves checking cx.activations.get from RemainingCandidates::next to ValidCandidates::new. I don't remember the structure of loops well enough, what ensures cx.activations will not change in between new and next?

Similarly in these small examples the wildcard requirements will get evaluated after the specific requirements, because we have a heuristic that requirements that match fewer versions get processed first. But this is not always true. We could have a situation the direct dependency is wildcard, only after we've evaluated that to be discovered that a transitive dependency adds a specific requirement. So I suspect that whatever property were trying to uphold this will only make it "heuristically" true and not "mathematically" true. Nothing wrong with that, the resolver involves a lot of load bearing heuristics, but a different kind of discussion.

In the last level of abstraction is a question of design, is this "Doesn't unnecessarily include newer versions" a property we even want to cargos resolver to have?

  • Pro: There are clearly a lot of people who've been asking us for this functionality, just look at the linked issues. Also we already use a related heuristic when a new dependencies discovered, if you add foo = "*" to cargo.toml it will select a version of foo that's already in your lock file, originally to allow more builds to work off-line. (Perhaps this "optimization" should be backed out now that we have --offline.)
  • Con: We don't know how many people who intentionally want "the latest version of their dependencies even if there's another version in the tree", because that is what cargo currently does so they have no reason to complain to us.

Eh2406 avatar Mar 05 '24 21:03 Eh2406

The most detailed feedback is about how things are named and where comments go, which I'm largely going to save till after we discussed the bigger questions.

I appreciate that. We can always work that out once there is consensus on the direction to take this.

At the next level the critical question here is what property is this trying to uphold? If we can define it strictly enough we can add a proptest and discover all kinds of fun counterexamples.

We should indeed define this. Here is my attempt:

"When a dependency is explicitly requested to have a non-SemVer-compatible range, it's compatibility across SemVer boundaries should be resolved in the same exact way that a SemVer compatible range would be. For example: 0.* should follow the same rules as 0.1.*."

Specifically things get tricky with regard to what order operations are evaluated in. Most concretely this code moves checking cx.activations.get from RemainingCandidates::next to ValidCandidates::new. I don't remember the structure of loops well enough, what ensures cx.activations will not change in between new and next?

Similarly in these small examples the wildcard requirements will get evaluated after the specific requirements, because we have a heuristic that requirements that match fewer versions get processed first. But this is not always true. We could have a situation the direct dependency is wildcard, only after we've evaluated that to be discovered that a transitive dependency adds a specific requirement. So I suspect that whatever property were trying to uphold this will only make it "heuristically" true and not "mathematically" true. Nothing wrong with that, the resolver involves a lot of load bearing heuristics, but a different kind of discussion.

I share all of these concerns with you. I think it would be wise for me to refactor the code to ensure that these heuristics are strictly upheld (and to make it difficult to break in the future).

My goal with the current implementation was to get working code with a minimal change-set to make it simpler to grok what the proposal does. Refactoring the code to be more strict would require a lot of code changes, probably a few levels up from the current function, so it would have been a huge diff to try to process.

In the last level of abstraction is a question of design, is this "Doesn't unnecessarily include newer versions" a property we even want to cargos resolver to have?

  • Pro: There are clearly a lot of people who've been asking us for this functionality, just look at the linked issues. Also we already use a related heuristic when a new dependencies discovered, if you add foo = "*" to cargo.toml it will select a version of foo that's already in your lock file, originally to allow more builds to work off-line. (Perhaps this "optimization" should be backed out now that we have --offline.)
  • Con: We don't know how many people who intentionally want "the latest version of their dependencies even if there's another version in the tree", because that is what cargo currently does so they have no reason to complain to us.

This is the main question we have to answer.

My opinion:

You have two possible conditions to optimize for:

  • Maximize SemVer
  • Minimize duplicates

Which one of these a user actually wants is likely based on whether or not the dependency is private or public, which as far as I understand one of the goals of the public-private-dependencies RFC.

However, until we have that, here are some assertions we can make:

  • With the current resolver implementation, when a library specifies version = ">0.1, <=0.10 it's essentially interpreted as version = "^0.10" by the resolver. Specifying version = ">0.1, <=0.10 only allows the user to manually update the lock file with cargo update --precise. This is likely unintuitive to most users.
  • A library author is likely only going to specify non-SemVer-compatible ranges for a dependency if it's a public dependency.

Hence, I think the following logic is a fair compromise:

  • Maximize SemVer, UNLESS the major version is unclear, in which case, minimze duplicates while maximizing semver.

I basically view version = "0.*" or version = ">0.1, <=0.10" as an open invitation by the library author to coerce the dependency.

nikita240 avatar Mar 05 '24 21:03 nikita240

I basically view version = "0.*" or version = ">0.1, <=0.10" as an open invitation by the library author to coerce the dependency.

This seems pretty convincing to me, but I started a conversation with the rest of the cargo team on zulip.

I share all of these concerns with you. I think it would be wise for me to refactor the code to ensure that these heuristics are strictly upheld (and to make it difficult to break in the future).

New better abstractions would certainly help hear. The existing ones are not great. That being said if we find a way to make sure this "always" happens then we have proven SAT == MAX-SAT. We need to be comfortable with the fact that there are going to be cases where the resolver does not give the most "optimal" solution.

We should indeed define this. Here is my attempt: "When a dependency is explicitly requested to have a non-SemVer-compatible range, it's compatibility across SemVer boundaries should be resolved in the same exact way that a SemVer compatible range would be. For example: 0.* should follow the same rules as 0.1.*."

I think this is a necessary but insufficient condition for the properties we care about. As both of those examples are handled "the same exact way" currently, they match the largest version that makes the system work.

(On a side note I will be out for a few days and may not be able to respond promptly until next week.)

Eh2406 avatar Mar 06 '24 17:03 Eh2406

I am back. Let me know how I can help.

Eh2406 avatar Mar 11 '24 16:03 Eh2406

:umbrella: The latest upstream changes (presumably #14583) made this pull request unmergeable. Please resolve the merge conflicts.

bors avatar Sep 27 '24 22:09 bors

:umbrella: The latest upstream changes (possibly 081d7bac633bbc72883fb74fb4993bb1b1a2df4a) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot avatar Dec 20 '24 15:12 rustbot

With this waiting-on-author for over a year, I'm going to go ahead and close this. We're still very much interested in this area being improved and feel free for someone to pick back up this conversation on how to solve this problem!

epage avatar Sep 11 '25 14:09 epage