cargo icon indicating copy to clipboard operation
cargo copied to clipboard

Lint againt non semver-open dependencies

Open matklad opened this issue 7 years ago • 31 comments

EDIT: the issue now has mentoring instructions. Note that to implement this, we'll need to add additional API to semver crate as well!

Cargo more-or-less enforces contract on crate authors that version 1.x+n.u can always be used instead of 1.x.v. That means that any semver requirement, except for ^x.y.z, does not make sense. Moreover, exotic version requirements may version resolution significantly more complicated.

So we should lint against any sevmer requirement other than x, x.y or x.y.z.

Specifically,

  • ^x.y.z is not necessary, because ^ is default
  • ~ violates versioning contract. As a consequence Cargo will generally refuse to compile a project with two tilde dependencies with same major version.
  • >, >= do not make sense, because major version upgrade, by contract, can break anything.
  • = is in the wrong place, lockfiles should be used to pin dependencies.
  • <, <= again, version pining belongs to the lockfile.
  • * we already forbid. It actually makes sense for suerp-quick local development though!

The interesting not clear-cut use-case is version striding (compatibility with several major versions). For example, serde = ">= 2, < 4" means that you are source-compatible with both serde 2 and serde 3. However, looks like the proper solution here is for the upstream crate to do dtolnay's trick? That is, serde 2 should depend on serde 3, and reexport it's APIs.

There are several design decisions on how to actually implement it:

  • We can warn on cargo publish enforcing this for crate.io crates, or we can warn on parsing Cargo.toml, enforcing this for anyone.

  • We can add an opt-out mechanism. For example, we might make you to write a reason if you use non-standard requirement: foo = { version = "~9.2", reason = "this crate has weird rustc compatibility policy" }.

  • Finally, the most radical option, what if we flatly deprecate all forms of version specification, except for x, x.y, x.y.z? That way, we won't need to document and explain various forms of version requirements at all.

I personally am leaning towards the last option, because it makes the overall system radically simpler for the user, but I am curious if there are some real use-cases for exotic version requirements?

matklad avatar Apr 11 '18 07:04 matklad

@rust-lang/cargo: this is the issue for the idea we've discussed at all hands and on the last meeting.

matklad avatar Apr 11 '18 07:04 matklad

Thanks for opening an issue for this @matklad! I think the general idea here is a good one to basically raise awareness about the dangers of overly restrictive version requirements, but I do think we need to tread carefully as well. These sorts of version requirements are often useful in a pinch in some situations and can act as a form of relief rather than constantly feeling like Cargo is slapping you on the wrist.

I think this came up as well when publishing to crates.io awhile back as well. We considered rejecting any publishes with a version requirement that has an overly restrictive upper bound (aka in the middle of a semver range) and ended up only disallowing * dependencies.

I'd also want to make sure that the warning story here is something more sophisticated than "warn every time we see these version requirements", although I'm not sure how we'd best do that...

alexcrichton avatar Apr 11 '18 14:04 alexcrichton

I think the most important time to enforce this is on publish, right? You don't want your libraries with overly restrictive requirements causing you problems, but you might want to use = in your end project for several good reasons, as Alex notes.

We could, as sort of the most moderate thing, not upload something with non-open requirements unless you pass an additional flag (similar to --allow-dirty). That way we are "linting" you when you upload, but not totally stopping you the way we do with *.

withoutboats avatar Apr 11 '18 14:04 withoutboats

Agree with @withoutboats that that would a nice conservative first step, which actually will get us 90% of the most aggressive solution anyway.

We've also talked several times before about various publish-time checks (like filling fields in Cargo.toml, having basic docs, etc), so perhaps it's time to introduce some general CLI for such checks? Like

$ cargo publish
   ...
   ...
warning: license, repository and documentation fields of Cargo.toml are not filled
warning: `clap = "~2.10"` is an overly restrictive dependency, consider `clap = "2.10"` instead

return with `--suppress-preflight-checks` to proceed with publication

If that sounds great, I'll be happy to write some mentoring instructions!

matklad avatar Apr 11 '18 14:04 matklad

At least starting out with warnings on publish sounds great to me!

alexcrichton avatar Apr 11 '18 15:04 alexcrichton

it would probably be useful to be able to run these checks independently of publishing, such as in CI. cargo preflight-checks (or doctor/audit etc).

dwijnand avatar Apr 11 '18 16:04 dwijnand

@dwijnand If I'm not mistaken this exists as cargo publish --dry-run

withoutboats avatar Apr 11 '18 18:04 withoutboats

oh yeah.. easy :) 👍

dwijnand avatar Apr 11 '18 20:04 dwijnand

Mentoring instructions:

We already implement similar functionality for warning about missing Cargo.toml fields, this happens here: https://github.com/rust-lang/cargo/blob/master/src/cargo/ops/cargo_package.rs#L39-L41

To add checks for versions, the steps are:

  1. Add --non-standard-requirements flag here, to be able to suppress the warning (name is subject to bike-shedding).
  2. Actually check for versions here. To get the list of declared dependencies, use pkg.manifest().dependencies(). The version_req field of a dependency is a semver::VersionReq which needs to be checked. Looks like the semver crate does not expose API to inspect the precise form of version requirement, so first such an API needs to be implemented. Perhaps just a VersionReq::is_semver_open method should be added? Semver crate source is here
  3. Add a test. A similar one can be found here

matklad avatar Apr 13 '18 16:04 matklad

I will take this.

csmoe avatar Apr 14 '18 04:04 csmoe

@csmoe how is it going? Do you need any help?

Another special case came up in #5275: Cargo treats 0.x.0 versions like x.0.0. That is, 0.1.0 and 0.2.0 are treated semver-incompatible. So, if we see something like >= 0.5.0, < 0.6.0, we can suggest that this is just 0.5.0. Note that implementing this would be tricky, because it'll require exposing all of the semver internals (unless we add some kind of fn simplify(&self) -> Option<VersionReq> method to semver), so it should be handled separately from the main work here.

matklad avatar Apr 21 '18 08:04 matklad

I'm not a fan of warning against ~ and = dependencies.

In my gmp-mpfr-sys crate, I recommend using ~ dependencies when a dependent crate uses some C implementation details. No breaking public Rust API changes will happen between 1.1 and 1.2, but the internal implementation details may change. If a crate uses the bundled underlying C libraries directly, it should use "~1.1", not "1.1", as between 1.1 and 1.2 the version of the C libraries can change, and some C implementation details can change between 1.1 and 1.2. This allows crates that need to poke into the C libraries to get bug fixes by using "~1.1" instead of say "=1.1.3".

FFI also makes it difficult to use the semver trick, as different versions can have different incompatible C libraries. I don't think it's reasonable to expect a solution that merges two incompatible versions of the same C library. In such a case, it would be very useful to be able to make a dependent crate compatible with two different major versions of the -sys crate.

I also have a use case for an = dependency, where I used a non-documented implementation detail of a crate. To be specific, I wanted to test some serde special cases in my rug crate, and the easiest way I found was to use bincode version 1.0.0 and its undocumented non-public SliceReader. Since it is undocumented, and can only be used because of some implementation details, it can be removed in 1.0.1. This is a dev dependency of a library, so Cargo.lock is not in git and the easiest way for me to solve my problem was through setting my dev-dependency as bincode = "=1.0.0". So at least for dev dependencies, which will not hurt dependent crates but which are not solved with Cargo.lock, I think = dependencies should be allowed.

tspiteri avatar Apr 23 '18 10:04 tspiteri

Thanks a lot for the input @tspiteri !

I definitely agree we shouldn't emit any warnings for dev dependencies, as they don't affect the resolution for reverse-dependeices.

I think that we should make exception for the -sys crates as well.

The underlying reason why we want to discourage ~ dependencies is that they can lead to genuinely unsatisfiable dependencies. If one crate depends on foo = "~1.0.0", and another on foo = "~1.1.0", Cargo would just refuse to compile a project which includes both of them, because duplicated minor versions are forbidden. In contrast, if all deps are ^, then Cargo can always produce a valid resolution by duplicating major versions, if needed (which is allowed), and picking the latest minor for each major version.

However, the -sys crates already make it possible to create unsatisfiable dependencies, because of the links key: you can't have two crates with the same links, and so you can't duplicate major versions of the -sys crate.

The question is, how do we exclude -sys crates from linting? If you have a dependency specification like foo-sys = "~1.2.0", you can't reliably decide if foo is a sys-crate or not. In fact, the question is ill-formed: some versions of foo might include links key, and others might not.

Could we use a heuristic "if dependency name ends in -sys, don't warn about weird semver requirements"? This sounds extremely hacky, but this is for the warning only.

matklad avatar Apr 24 '18 12:04 matklad

About the hacky heuristic to detect FFI crates: if the warnings are issued by cargo publish, maybe that command can check whether the dependency has a links key in the version used during the verification stage. It is still a heuristic, but seems to me to be a bit less hacky than depending on the name ending in -sys.

tspiteri avatar Apr 25 '18 09:04 tspiteri

Late input and FWIW: I've been using version bounds like this frequently:

failure    = ">=0.1.0, <2"
fern       =  ">=0.5.5, <2"
futures    = ">=0.1.20, <0.2"

I check in Cargo.lock files so I have some additional control. Advantages of this from my perspective:

  • I can't seem to remember all the rules of the numerous other operator choices, particular for MAJOR version = 0 vs MAJOR >= 1, without constantly consulting the cargo guide. The use of explicit ranges is more clear.
  • Many crates are hopefully going to be stabilizing with a MAJOR version >= 1 eventually. I want to know about these, incl. via CI, and I expect these MAJOR version bumps (counter intuitive from a semver perspective) to be less breaking than a typical MINOR bump.

So I'm personally hoping I'm not going to run afoul of the proposed warnings for doing something that has at least a certain rationale, IMO.

Also this warning proposal seems to not consider the unfortunate reality that semver has some gray areas (see ongoing rust semver debates in various places) and otherwise well maintained crates break others all the time on MINOR releases. Crate authors need, from time to time, to indicate a maximum dependency bound, like futures < 0.2. Would they benefit from or appreciate a warning when doing that?

dekellum avatar Jun 11 '18 19:06 dekellum

@dekellum a dependency constraint of "0.1" is equivalent to ">= 0.1.0, < 0.2.0", so crate authors are already indicating a maximum dependency bound.

sfackler avatar Jun 11 '18 19:06 sfackler

Its not clear from the proposal that "0.1" wouldn't garner a warning. In any case I personally prefer the expanded notation for use is the Cargo.toml. Also there is the potential if unfortunate scenario of needing something like ">= 0.2.1, <= 0.2.7". I wouldn't expect crate authors to write that without good reason (e.g. 0.2.8 breaks something).

dekellum avatar Jun 11 '18 19:06 dekellum

"0.1" will not garner a warning; it is the form we are guiding people towards with that warning.

sfackler avatar Jun 11 '18 19:06 sfackler

@matklad, said:

I am curious if there are some real use-cases for exotic version requirements?

I think I have the above mentioned use cases. To give another concrete example:

Lets say crate foo has ongoing "3.3.x" releases targeting rust 2018 edition, but releases a "4.0.0" to use new features in rust 2019. Meanwhile the dependent crate bar has CI on all of these versions and its 1.3.24 release has CI showing its compatible with all of the above, so it specifies its foo dependency as:

foo = ">= 3.3.17, <5"

Does the author of bar want to see a constant warning about this?

Editorial Note: My original understanding of the aspirational purpose of Rust Editions feature has improved. If you replace "2018 edition" and "2019 edition" with "rust 1.43" and "rust 1.44" (coincidentally the first version to support the "2019 edition"), then the above example is more plausible and consistent.

dekellum avatar Jun 11 '18 20:06 dekellum

Another use case for “exotic” version dependencies: crates that must be updated in lockstep.

For instance, a x-derive crate that generates code that relies on an undocumented API of crate x. The best way to handle this currently without having to make a breaking release on each breaking change to said undocumented API appears to be setting x-derive to depend on =1.2.3 of crate x, and hope that the user doesn't specify =-based versions to both x and x-derive (in which case x would end up duplicated and everything would break).

So I think there are legitimate use cases for using non-standard version requirements. That said, warning against them by default completely does make sense (for the above-mentioned problem of having dependencies that must be updated in lockstep, for instance -- though that'd likely be best solved by https://github.com/rust-lang/cargo/issues/5920).

Ekleog avatar Aug 21 '18 16:08 Ekleog

As a practical example of your x-derive case, We recently saw some breakage with the failure 0.1.2 release (rust-lang-nursery/failure#234) where failure has a dependency on failure_derive written as "0.1.2" (meaning ">=0.1.2, < 0.2"). Both crates 0.1.2 version were released simultaneously but failure_derive 0.1.2 will not compile with failure 0.1.1. Cargo should IMO, be able to handle failure_derive specifying a more restrictive back-depdendency, e.g. ">= 0.1.2, < 0.2" for failure but currently cargo errors out because it views it as a dependency cycle.

: I think this is a misfeature, as cargo should be (heuristically?) solving the graph with preference to single versions, effectively merging broad to narrow dependency ranges, and not being bothered by cycles.

dekellum avatar Aug 21 '18 16:08 dekellum

@dekellum ">= 0.1.2, < 0.2" is the same as "^0.1.2" (or just "0.1.2").

EDIT: Fixed syntax typo.

joshtriplett avatar Aug 21 '18 17:08 joshtriplett

The human error rate for parsing and interpreting /[~\^=]?M(\.m(\.p)?)?/ is unacceptably high, isn't it? Thus I actually prefer the ">= 0.1.2, < 0.2" (or ">= 0.1.2, < 0.2.0") syntax in Cargo.toml (warnings be damned), which per the manual (I'm tired of consulting) is the same as "^0.1.2", same as "0.1.2". I believe for this specific range, "~0.1.2" is yet another synonym.

dekellum avatar Aug 21 '18 18:08 dekellum

Actually the tilde syntax is extra confusing as it differs from the rubygems I'm used to, and has another weird caveat that no mortal could possibly remember:

If you specify a major, minor, and patch version or only a major and minor version, only patch-level changes are allowed.

dekellum avatar Aug 21 '18 18:08 dekellum

@joshtriplett apparently didn't mean "~0.1.2" (human typo) and of course this lint doesn't want to keep the tilde operator either, so in that limited sense we agree. :-)

dekellum avatar Aug 21 '18 21:08 dekellum

What is the status of this issue? @matklad I need some clarifications on the issue...to create the API in semver

hbina avatar Oct 01 '19 16:10 hbina

I need to wait on https://github.com/steveklabnik/semver/issues/191 before I can proceed.

hbina avatar Oct 06 '19 00:10 hbina

@hbina I've been outside of Cargo dev for quite a while, so I am not sure what's the status here, but:

  • IIRC, there were some nice examples where you indeed do want to use = and ~ dependencies. So, while it's still true that plain foo = "x.y.z" deps are right most of the time, and that often people use ~, = and ^ sigils for no good reason, we somehow need to figure out how to avoid false-positives, and that seems hard.
  • I think one of the big motivations here was that ~ deps sometimes just make dep resolution slow, but @Eh2406 did a number of improvements to dep resolution since than, which might alleviate the problem.

matklad avatar Oct 07 '19 14:10 matklad

I have fixed all known (to me) real world slow cases in the resolver. There are still synthetic ones (#6258) but there pretty pathological. I don't remember a specific fix for ~, so if I missed a problem let me know. In general if anyone is seeing more than say 1sec in the resolver please file a bug. TLDR, don't need to make design decisions on resolution being slow.

Eh2406 avatar Oct 07 '19 16:10 Eh2406

Once #12115 is stabilized, we can add cargo lints with user control over them which would be a big help for when there are valid reasons for doing this (generally should be crate's depending on their proc macro and bins)

epage avatar May 24 '23 19:05 epage