opam
support for semver version constraint operator
Many people prefer semantic versioning (although perhaps fewer in the opam ecosystem than elsewhere). Even when not followed entirely faithfully, semver-style version constraints greatly reduce the number of completely unconstrained dependency edges (which I believe are currently the majority in the repository), and thus the downstream breakage and manual work for opam admin staff after a new major release of a popular package.
I propose that opam support the semver version constraint operator, ~. The semantics are as follows:
- ~ x.y.z where x > 0 expands to >= x.y.z & < x.y+1.0
- ~ x.y where x > 0 expands to >= x.y & < x+1.0
- ~ x where x > 0 expands to >= x & < x+1
- ~ 0.y.z expands to >= 0.y.z & < 0.y+1.0
- ~ 0.y expands to >= 0.y & < 0.y+1
- everything else is an error.
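The cases above can be sketched as a small expansion function. This is only an illustration of the proposed semantics, not opam code: the name expand_tilde and the use of ValueError for the "everything else" case are assumptions made for the example.

```python
import re

def expand_tilde(version: str) -> str:
    """Expand `~ version` following the five cases listed above (hypothetical helper)."""
    # Only purely numeric versions with one to three components are covered.
    if not re.fullmatch(r"[0-9]+(\.[0-9]+){0,2}", version):
        raise ValueError(f"unsupported version for ~: {version!r}")
    parts = [int(p) for p in version.split(".")]
    x = parts[0]
    if x > 0:
        if len(parts) == 3:
            upper = f"{x}.{parts[1] + 1}.0"   # ~ x.y.z
        elif len(parts) == 2:
            upper = f"{x + 1}.0"              # ~ x.y
        else:
            upper = f"{x + 1}"                # ~ x
    else:
        if len(parts) == 3:
            upper = f"0.{parts[1] + 1}.0"     # ~ 0.y.z
        elif len(parts) == 2:
            upper = f"0.{parts[1] + 1}"       # ~ 0.y
        else:
            # A bare `~ 0` is not one of the listed cases, so it is an error.
            raise ValueError("~ 0 is not covered by the proposal")
    return f">= {version} & < {upper}"

print(expand_tilde("1.2.3"))  # >= 1.2.3 & < 1.3.0
print(expand_tilde("0.3"))    # >= 0.3 & < 0.4
```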
You may notice that the behavior with major version 0 is not quite what the semver specification would naively imply (it says that at 0.y.z anything may change at any time). This deviation is intentional and legal (semver does not define any such constraint operator), and provides a meaningful way for a package still in major version 0 to release bugfixes.
You may also notice that there is no mention of alphanumeric versions (alpha, beta, etc.). As far as I know these basically do not get used in the opam repository, so it seems pointless to drag in their (considerably complex and, I think, inexpressible through simple expansion) semantics.
Duplicate of #2102 (although I suggest different / possibly cleaner semantics), feel free to close.
Sounds good, and thanks for suggesting this. I would really like it if the new operator was defined on any allowed version, rather than throwing errors: this needs some generalisations of the above definition, but maybe we can manage it.
For the first case, for example, in opam terms it would be more consistent to have
{ ~ x.y.z } ⇒ { >= x.y.z & < x.y+1 }
I think this is better because of the tilde semantics in our total ordering: 1.2.0~beta would match ~ 1.1.3 (⇒ >= 1.1.3 & < 1.2.0) otherwise. This also generalises the 0.x.y cases.
I don't know though how to handle the 2nd/6th cases:
{ ~ x.y } ⇒ { >= x.y & < x+1.0 } if x > 0
⇒ { >= x.y & < x.y+1 } if x = 0
Maybe it makes perfect sense with semver in mind, but I find such a specific-case rule quite distasteful, or is it just me? Now I can well understand that if the ~ has a well established definition everywhere else, implementing our own slightly incompatible variant of the same operator would be distasteful as well.
If we skip that last rule, one definition could be:
~ v is defined as the range of versions from v included to next(v) excluded. next(v) is defined as the version v with:
- everything after the one-before-last number (or first if there is only one) stripped, and that number incremented
- if the version doesn't contain any numbers, maybe just append 1?
A number is defined as a maximal sequence of contiguous digits, as used by our version ordering.
This may yield weird results with e.g. ~beta2 suffixes though (next(1.2.3~beta2) would be 1.2.4). We could further simplify the next() function by first stripping anything from:
- the first ~ character, as long as there is a number before it, or
- the first letter that happens after a number, or even
- the first non-dot character that happens after a number.
(the reason for the "after a number" part is that we want next(v) > v, and with our ordering, a > 0: it follows that we should never strip leading non-digit characters)
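As a rough illustration of the unsimplified next(v) definition above (before any of the optional suffix stripping), here is a minimal sketch. The function name next_version is hypothetical, and "append 1" for versions without numbers follows the tentative suggestion in the text.

```python
import re

def next_version(v: str) -> str:
    """next(v): bump the one-before-last number and strip everything after it."""
    # A number is a maximal run of contiguous digits.
    numbers = list(re.finditer(r"[0-9]+", v))
    if not numbers:
        # No numbers at all: tentatively just append 1.
        return v + "1"
    # One-before-last number, or the first one if there is only one.
    target = numbers[-2] if len(numbers) >= 2 else numbers[0]
    return v[: target.start()] + str(int(target.group()) + 1)

print(next_version("1.2.3"))        # 1.3
print(next_version("1.2.3~beta2"))  # 1.2.4, the "weird result" mentioned above
```

Under this sketch, ~ 1.2.3 would be the range >= 1.2.3 & < 1.3.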
Sounds good. But I think handling the major version 0 case is quite important, with 7.7% of packages in the repository (539/6997) having such a major version.
Let me rephrase in a way that you'll perhaps find more elegant. If there is a sequence of zeroes at the head of the version:
- the zeroes are stripped,
- the operator is expanded as usual,
- the zeroes are added back to the expanded form.
Does this sound better?
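One way to read the three steps above is sketched below. It assumes "expanded as usual" means the x > 0 rules from the original proposal, and it treats a version made only of zeroes as an error, which the thread does not specify. The helper names are made up for the example.

```python
import re

def upper_bound_nonzero(parts):
    """Upper bound for a version whose first component is > 0 (original proposal)."""
    if len(parts) == 3:
        return [parts[0], parts[1] + 1, 0]
    if len(parts) == 2:
        return [parts[0] + 1, 0]
    if len(parts) == 1:
        return [parts[0] + 1]
    raise ValueError("more than three significant components")

def expand_tilde_zero_strip(version: str) -> str:
    if not re.fullmatch(r"[0-9]+(\.[0-9]+)*", version):
        raise ValueError(f"unsupported version for ~: {version!r}")
    parts = [int(p) for p in version.split(".")]
    # Step 1: strip the leading zero components.
    n_zero = 0
    while n_zero < len(parts) and parts[n_zero] == 0:
        n_zero += 1
    rest = parts[n_zero:]
    if not rest:
        # Assumption: an all-zero version has nothing left to increment.
        raise ValueError("all-zero version")
    # Step 2: expand as usual; step 3: put the zeroes back in front.
    upper = [0] * n_zero + upper_bound_nonzero(rest)
    return f">= {version} & < {'.'.join(str(p) for p in upper)}"

print(expand_tilde_zero_strip("0.3.1"))  # >= 0.3.1 & < 0.4.0
print(expand_tilde_zero_strip("0.3"))    # >= 0.3 & < 0.4
```

Both results agree with the explicit ~ 0.y.z and ~ 0.y cases of the original proposal.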
Sounds more intuitive, indeed. The definition for next(v) could be rephrased accordingly:
next(v) is the version v with the "compatibility version number" incremented, and everything after it stripped. The "compatibility version number" of v is identified from the first dot-separated number sequence appearing in v (possibly of length 1 or 0, /([0-9]+(\.[0-9]+)*)?/):
- the one-before-last number in that sequence, if it exists and is not zero
- the last number in that sequence otherwise
- 0, assumed to be at the end of v, if v contains no numbers.
Would that make sense?
Let's see...
- next(1) = 2, so >= 1 & < 2, good;
- next(1.0) = 2, so >= 1.0 & < 2, good;
- next(1.1) = 2, so >= 1.1 & < 2, good;
- next(1.1.0) = 1.2.0, so >= 1.1.0 & < 1.2.0, good;
- next(1.1.1) = 1.2.0, so >= 1.1.1 & < 1.2.0, good;
- next(1.0.0) = 1.0.1 while it should be 1.1.0, oops.
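The walk-through above can be reproduced mechanically. The sketch below implements the "compatibility version number" rule exactly as stated (one-before-last number if it exists and is not zero, else the last number), with a hypothetical function name. Because it strips everything after the incremented number, it returns 1.2 where the list writes 1.2.0.

```python
import re

def next_version(v: str) -> str:
    """next(v) under the 'compatibility version number' rule as first stated."""
    # First dot-separated number sequence appearing anywhere in v.
    seq = re.search(r"[0-9]+(\.[0-9]+)*", v)
    if seq is None:
        # No numbers: a 0 is assumed at the end of v, then incremented.
        return v + "1"
    nums = list(re.finditer(r"[0-9]+", seq.group()))
    if len(nums) >= 2 and int(nums[-2].group()) != 0:
        target = nums[-2]   # one-before-last number, non-zero
    else:
        target = nums[-1]   # otherwise the last number
    prefix = v[: seq.start() + target.start()]
    return prefix + str(int(target.group()) + 1)

for v in ["1", "1.0", "1.1", "1.1.0", "1.1.1", "1.0.0"]:
    print(v, "->", next_version(v))
# The last line printed is "1.0.0 -> 1.0.1", the problematic case noted above.
```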
I also think that "the first dot-separated number sequence appearing in v" is far too lenient. For example, this will consider things such as dates or even parts of git hashes (!) as semver version numbers, which really doesn't seem good.
I suggest, at least, requiring a semver version to appear at the beginning of the package version, so /^([0-9]+(\.[0-9]+)*)?/. This does mostly the right thing for dates like 20170203 (>= 20170203 & < 20170204 constrains to that specific date so long as dates don't have further subdivisions), and not a particularly right thing for git versions like git1a2b3cd (>= 0 & < 1 constrains to a version 0* which hopefully would not coexist with git versions).
I suggest, at least, requiring a semver version to appear at the beginning of the package version
I am not so sure: for example, JS recently moved to have versions prefixed by v (presumably, to make sure they come after the 1xx.xx.xx versions of the old scheme, or to match their git tags). We may have versions such as just test1 or beta4, or even beta3.1, too. As for git hashes, they are unordered, so it won't make sense anyway (only = and != could make sense, >= already doesn't).
For the definition, well spotted. So we would need to be more specific:
- the one-before-last number in that sequence, if it exists and is not both zero and coming first in the sequence.
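To see what this more specific rule changes, here is a minimal sketch with the refined condition: the one-before-last number is used unless it is both zero and the first number of the sequence. The function name is hypothetical, and the unanchored search follows the position taken in this comment (prefixes such as v or beta are allowed).

```python
import re

def next_version(v: str) -> str:
    """next(v) with the refined 'compatibility version number' rule."""
    seq = re.search(r"[0-9]+(\.[0-9]+)*", v)
    if seq is None:
        return v + "1"  # no numbers: assume a trailing 0 and increment it
    nums = list(re.finditer(r"[0-9]+", seq.group()))
    use_before_last = len(nums) >= 2 and not (
        int(nums[-2].group()) == 0 and len(nums) == 2  # zero AND first in sequence
    )
    target = nums[-2] if use_before_last else nums[-1]
    prefix = v[: seq.start() + target.start()]
    return prefix + str(int(target.group()) + 1)

print(next_version("1.0.0"))   # 1.1
print(next_version("0.3"))     # 0.4
print(next_version("v1.2.3"))  # v1.3
```

With this rule, ~ 1.0.0 becomes >= 1.0.0 & < 1.1, while a leading major version 0 still bumps the last number.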
Dose has a semantic versioning comparison function that can be used in opam. Tested on npm, so pretty solid (or at least it was last year).
Discussed in opam dev meeting: putting this on the triage list for opam 2.2 to see if we can add this as a lightweight operator.
I think we should take into account the Debian versioning scheme when adding this feature. If ~ "1.1.1" is mapped to >= "1.1.1" & < "1.2.0", then this range will include version 1.2.0~beta1, which is likely incompatible. An approximate solution would be to replace the part after the incremented component with a few tildes, so that we get >= "1.1.1" & < "1.2~~~".
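The effect of the tilde on the ordering can be checked with a small comparison function. The sketch below follows the Debian-style algorithm (alternating non-digit and digit runs, with ~ sorting before everything, including the end of the string) and is assumed to approximate opam's ordering for ASCII versions; compare_versions and in_range are names invented for the example.

```python
DIGITS = "0123456789"

def _order(c: str) -> int:
    """Sort weight of one character; "" stands for the end of the string."""
    if c == "" or c in DIGITS:
        return 0
    if c == "~":
        return -1              # tilde sorts before everything, even the end
    if c.isalpha():
        return ord(c)          # letters sort before other symbols
    return ord(c) + 256

def compare_versions(a: str, b: str) -> int:
    """Return -1, 0 or 1 (Debian-style comparison, no epoch or revision)."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit runs character by character.
        while (i < len(a) and a[i] not in DIGITS) or (j < len(b) and b[j] not in DIGITS):
            oa = _order(a[i] if i < len(a) else "")
            ob = _order(b[j] if j < len(b) else "")
            if oa != ob:
                return -1 if oa < ob else 1
            i += 1
            j += 1
        # Compare the digit runs numerically (an empty run counts as 0).
        si, sj = i, j
        while i < len(a) and a[i] in DIGITS:
            i += 1
        while j < len(b) and b[j] in DIGITS:
            j += 1
        na, nb = int(a[si:i] or "0"), int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0

def in_range(v: str, lower: str, upper: str) -> bool:
    """True if lower <= v < upper."""
    return compare_versions(v, lower) >= 0 and compare_versions(v, upper) < 0

print(in_range("1.2.0~beta1", "1.1.1", "1.2.0"))   # True: the beta slips in
print(in_range("1.2.0~beta1", "1.1.1", "1.2~~~"))  # False: excluded by the tilde bound
print(in_range("1.1.5", "1.1.1", "1.2~~~"))        # True
```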
The version in the upper constraint to make the above precise would be the greatest lower bound of the sequence {1.2, 1.2~, 1.2~~, 1.2~~~, ...}. This cannot be expressed without changing the version syntax, but an alternative would be to introduce a "significantly-less" operator, say ~<, where, given a common longest prefix x, xy ~< xz iff z does not start with a tilde and y < z.
I like this proposal, but I think a plain next-major constraint, as provided by npm and PHP Composer as ^, is more useful. My reason is that I think a constraint up to the next major version is good practice for external packages, but there is an issue with using ~ for this when there have been important bug fixes which warrant a lower constraint at the patch level. E.g. {>= "1.2.3" & < "2~"} is {^ "1.2.3"} but cannot be expressed with ~.
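For comparison with the ~ expansions discussed earlier, a next-major expansion in the spirit of this comment could look like the sketch below. The name expand_caret is hypothetical, the version is assumed to start with a numeric major component, and any special handling of major version 0 (as some other ecosystems have) is deliberately left out because the comment does not specify it.

```python
import re

def expand_caret(version: str) -> str:
    """Expand `^ version` to a range ending just below the next major version."""
    major = re.match(r"[0-9]+", version)
    if major is None:
        raise ValueError(f"no leading major version in {version!r}")
    # The trailing ~ keeps pre-releases of the next major (e.g. 2~beta) out of range.
    return f'>= "{version}" & < "{int(major.group()) + 1}~"'

print(expand_caret("1.2.3"))  # >= "1.2.3" & < "2~"
```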