maven icon indicating copy to clipboard operation
maven copied to clipboard

[MNG-7559] Fix version comparison (master 4.0.x branch)

Open sultan opened this issue 3 years ago • 28 comments

This PR can be merged, If wanted to port (to 4.0.x) the fix for https://issues.apache.org/jira/browse/MNG-7559 intention is:

  • 1.0.0.RC1 < 1.0.0-RC2
  • ( edr, pfd, etc.) < final, ga, release
  • 9.4.1.jre16 > 9.4.1.jre16-preview

following semver rules should be encouraged, natural ordering is used without the need to hard code strings, except for hard coded qualifiers 'a', 'b', 'm', 'cr', 'snapshot', 'final', 'ga', 'release', '' and 'sp':

  • alpha = a < beta = b < milestone = m < rc = cr < 'snapshot' < '' = 'final' = 'ga' = 'release' < 'sp'

the documentation should discourage the usage of 'CR', 'final', 'ga', 'release' and 'SP' qualifiers. Maven Central should begin to reject new artifact using CR and SP qualifiers.


Following this checklist to help us incorporate your contribution quickly and easily:

  • [x] Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • [x] Each commit in the pull request should have a meaningful subject line and body.
  • [x] Format the pull request title like [MNG-XXX] SUMMARY, where you replace MNG-XXX and SUMMARY with the appropriate JIRA issue.
  • [x] Also format the first line of the commit message like [MNG-XXX] SUMMARY. Best practice is to use the JIRA issue title in both the pull request title and in the first line of the commit message.
  • [x] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • [ ] Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • [ ] You have run the Core IT successfully.

If your pull request is about ~20 lines of code you don't need to sign an Individual Contributor License Agreement if you are unsure please ask on the developers list.

To make clear that you license your contribution under the Apache License Version 2.0, January 2004 you have to acknowledge this by using the following check-box.

sultan avatar Dec 20 '22 12:12 sultan

I must honestly admit that I feel a lot of pain in my ass with this class because:

  • There are too many implications
  • No explicit ordering
  • No explicit statement/code what is before GA and after.

Edge case: 1.0-a (alpha) < 1.0 < 1.0-abc since the qualifier abc is not an alpha.

I think this needs to be split up again. Let's first focus on the . (dot) and - (hyphen) issue. From my PoV the PR addresses several issues which does not feel right.

michael-o avatar Dec 20 '22 15:12 michael-o

@hboutemy I know that you worked on this once in a while. What is your opinion?

michael-o avatar Dec 20 '22 16:12 michael-o

I must honestly admit that I feel a lot of pain in my ass with this class because:

  • There are too many implications
  • No explicit ordering
  • No explicit statement/code what is before GA and after.

Edge case: 1.0-a (alpha) < 1.0 < 1.0-abc since the qualifier abc is not an alpha.

I think this needs to be split up again. Let's first focus on the . (dot) and - (hyphen) issue. From my PoV the PR addresses several issues which does not feel right.

this can be split into two PRs if needed.

the ordering is dispatched into specific locations:

  • private static final List<String> QUALIFIERS = Arrays.asList("snapshot", "", "sp");
  • public static String comparableQualifier(String qualifier)
  • public static int compareQualifiers(String qualifier1, String qualifier2)

still open for a better approach to make things more readable

sultan avatar Dec 20 '22 16:12 sultan

I must honestly admit that I feel a lot of pain in my ass with this class because:

  • There are too many implications
  • No explicit ordering
  • No explicit statement/code what is before GA and after.

Edge case: 1.0-a (alpha) < 1.0 < 1.0-abc since the qualifier abc is not an alpha. I think this needs to be split up again. Let's first focus on the . (dot) and - (hyphen) issue. From my PoV the PR addresses several issues which does not feel right.

this can be split into two PRs if needed.

the ordering is dispatched into specific locations:

* private static final List QUALIFIERS = Arrays.asList("snapshot", "", "sp");

* public static String comparableQualifier(String qualifier)

* public static int compareQualifiers(String qualifier1, String qualifier2)

still open for a better approach to make things more readable

What would those two PRs contain?

michael-o avatar Dec 20 '22 16:12 michael-o

What would those two PRs contain?

one for dot/dash case ( .RC1 < -RC2 ) one for the qualifiers lexical order case ( abc < alpha < edr < m < pfd < rc )

but that would only separate 10 lines of code from the rest. i would personally go with one same PR. your call! if you find it better in two why not.

sultan avatar Dec 20 '22 17:12 sultan

What would those two PRs contain?

one for dot/dash case ( .RC1 < -RC2 ) one for the qualifiers lexical order case ( abc < alpha < edr < m < pfd < rc )

but that would only separate 10 lines of code from the rest. i would personally go with one same PR. your call! if you find it better in two why not.

Yes, please let's do the dot/hyphen issue with a separate JIRA ticket first.

michael-o avatar Dec 20 '22 17:12 michael-o

What would those two PRs contain?

one for dot/dash case ( .RC1 < -RC2 ) one for the qualifiers lexical order case ( abc < alpha < edr < m < pfd < rc ) but that would only separate 10 lines of code from the rest. i would personally go with one same PR. your call! if you find it better in two why not.

Yes, please let's do the dot/hyphen issue with a separate JIRA ticket first.

JIRA/PRs created:

  • JIRA https://issues.apache.org/jira/browse/MNG-7644
  • PR https://github.com/apache/maven/pull/932
  • PR https://github.com/apache/maven/pull/931
  • PR https://github.com/apache/maven/pull/930

sultan avatar Dec 20 '22 18:12 sultan

What I now understand is that arbitrary qualifiers are not considered after GA. Following example: I need to fork commons-io 2.12.0 for the company and it will be named 2.12.0-company-1 through 2.12.0-company-5. I would expect that those come after the GA because I had to apply custom patches.

I do this at work for various reasons and if you check libs bundled with Nexus they have SONATYPE qualifier.

you may want to use the + instead of - as in: 2.12.0+company-1

maybe a third PR to be merged before for this case ?

sultan avatar Dec 21 '22 09:12 sultan

What I now understand is that arbitrary qualifiers are not considered after GA. Following example: I need to fork commons-io 2.12.0 for the company and it will be named 2.12.0-company-1 through 2.12.0-company-5. I would expect that those come after the GA because I had to apply custom patches. I do this at work for various reasons and if you check libs bundled with Nexus they have SONATYPE qualifier.

you may want to use the + instead of - as in: 2.12.0+company-1

maybe a third PR to be merged before for this case ?

Right, according to SemVer my approach is wrong. Thanks for the clarification.

michael-o avatar Dec 22 '22 09:12 michael-o

@sultan Please rebase this PR on top of current master.

michael-o avatar Dec 22 '22 09:12 michael-o

@sultan Please rebase this PR on top of current master.

rebased with a minor change in javadoc and tests

sultan avatar Dec 22 '22 12:12 sultan

As far as I understand now this is a change in behavior? If so, I cannot apply to 3.8.x because it is not a bugfix. Is it?

michael-o avatar Dec 22 '22 15:12 michael-o

As far as I understand now this is a change in behavior? If so, I cannot apply to 3.8.x because it is not a bugfix. Is it?

still a bug fix but i am ok to have it only in Major update 4.0 if the change breaks behaviour

sultan avatar Dec 22 '22 18:12 sultan

Let me have a fresh look at this tomorrow.

michael-o avatar Dec 22 '22 18:12 michael-o

Let me have a fresh look at this tomorrow.

yes no hurry,

the Maven docs used to tell that all other qualifiers were considered later than release, and as of now: lesser than release. if the spec was not in error but require a change then its a major update.

what i wonder if its also the same for the dot/hyphen change and need revert 3.8 and 3.9 so the fix is only on 4.0 (the more i think the more i'm tempted to say yes)

sultan avatar Dec 22 '22 19:12 sultan

Let me have a fresh look at this tomorrow.

yes no hurry,

the Maven docs used to tell that all other qualifiers were considered later than release, and as of now: lesser than release. if the spec was not in error but require a change then its a major update.

Will check that.

what i wonder if its also the same for the dot/hyphen change and need revert 3.8 and 3.9 so the fix is only on 4.0 (the more i think the more i'm tempted to say yes)

Why? The docs say:

else ".qualifier" = "-qualifier" < "-number" < ".number"

so .X and -X are equal and this is what your fix does. It aligns an inconsistency with the specs/docs.

michael-o avatar Dec 22 '22 19:12 michael-o

I must admit that I partially lost overview. @sultan Can you again briefly summarize how the current implementation does not correspond to the spec regardless of pfd or preview qualifiers?

michael-o avatar Dec 22 '22 21:12 michael-o

Hi @michael-o if I may, I think the current implementation is in line with the specification as I have a feeling that the specification has been based on it 😄

The problem is that the spec deviates from SemVer in how it treats non-standard qualifiers, like the "pfd" here. According to SemVer, all qualifiers should be treated as less than -.number whereas with "the spec" all non-standard qualifiers are treated as later than -.number.

As is the case of "-pdf" which is later than the release.

So this:

The docs say:

else ".qualifier" = "-qualifier" < "-number" < ".number

is not exactly true.

    @Test
    public void testComparableVersionWithCustomQualifier()
    {
        assertThat( new DefaultArtifactVersion( "2.3" ).compareTo( new DefaultArtifactVersion( "2.3-pfd" ) ),
                greaterThan( 0 ) );
    }
java.lang.AssertionError: 
Expected: a value greater than <0>
     but: <-2> was less than <0>

So, adding "pfd" as a recognised qualifier would mean that the spec also needs to be updated. Which is why it's a breaking change and not a bugfix.

Frankly, to me it looks like this behaviour contradicts the statement that "-qualifier" < "-number", but I've long given up on that.

andrzejj0 avatar Dec 23 '22 06:12 andrzejj0

Hi @michael-o if I may, I think the current implementation is in line with the specification as I have a feeling that the specification has been based on it 😄

The problem is that the spec deviates from SemVer in how it treats non-standard qualifiers, like the "pfd" here. According to SemVer, all qualifiers should be treated as less than -.number whereas with "the spec" all non-standard qualifiers are treated as later than -.number.

As is the case of "-pdf" which is later than the release.

So this:

The docs say:

else ".qualifier" = "-qualifier" < "-number" < ".number

is not exactly true.

    @Test
    public void testComparableVersionWithCustomQualifier()
    {
        assertThat( new DefaultArtifactVersion( "2.3" ).compareTo( new DefaultArtifactVersion( "2.3-pfd" ) ),
                greaterThan( 0 ) );
    }
java.lang.AssertionError: 
Expected: a value greater than <0>
     but: <-2> was less than <0>

So, adding "pfd" as a recognised qualifier would mean that the spec also needs to be updated. Which is why it's a breaking change and not a bugfix.

Frankly, to me it looks like this behaviour contradicts the statement that "-qualifier" < "-number", but I've long given up on that.

I have just re-read the spec. It says:

If version strings are syntactically correct [Semantic Versioning 1.0.0](https://semver.org/spec/v1.0.0.html) version numbers, then in almost all cases version comparison follows the precedence rules outlined in that specification....Maven does not consider any semantics implied by that specification....Non-numeric tokens ("qualifiers") have the alphabetical order, except for the following tokens which come first in this order

So what it does is just use the syntax of SemVer 1.0 and not the semantics. It considers only a new qualifiers as pre-release and not all, e.g., sp for service pack is a post-GA qualifier. So to me this change is change of the specs since it wants add more qualifiers to the pre-release state. So my verdict based on my understanding the dot vs hyphen issue was real and I can see it with a few manual tests. Others are not.

I will remove this ticket from 3.x for obvious reasons. I think that the version scheme requires an update in 4.0, but that is a different discussion.

Opinions?

Side note: Looking at SemVer 2.0 I am not really happy with it:

  • It allows very complex qualifiers. Don't know wether that is really necessary
  • Our SNAPSHOT qualifier cannot be covered
  • Build qualfiiers aren't release qualifiers for me
  • I don't see how one can map custom qualifiers (post-release) for the case I have depicted earlier, those aren't build qualifiers.
  • I don't see how a build qualifier makes sense after post-release where a release is supposed to be immutable and identical according to SemVer

PS: I am not happy with the ambiguity/lack of precision of the current spec

michael-o avatar Dec 23 '22 07:12 michael-o

I must admit that I partially lost overview. @sultan Can you again briefly summarize how the current implementation does not correspond to the spec regardless of pfd or preview qualifiers?

As of now Javadoc:

  • only SP is considered after release
  • all qualifiers "other than Final/GA/Release/SP" are less than Snapshot

Previously, Javadoc used to be:

  • Unknown qualifiers are considered after known qualifiers, with lexical order (always case insensitive),

no trace of the old statement on maven site, but a major change in the public javadoc

the PR that should not be merged are already closed/unmerged

  • PR #925
  • PR #887

sultan avatar Dec 23 '22 10:12 sultan

what i wonder if its also the same for the dot/hyphen change and need revert 3.8 and 3.9 so the fix is only on 4.0 (the more i think the more i'm tempted to say yes)

Why? The docs say:

else ".qualifier" = "-qualifier" < "-number" < ".number"

so .X and -X are equal and this is what your fix does. It aligns an inconsistency with the specs/docs.

this equality is a change in the docs dated from Nov 14th on maven site

  • PR https://github.com/apache/maven-site/pull/331

it was previously

  • else "<<<.qualifier>>>" < "<<<-qualifier>>>" < "<<<-number>>>" < "<<<.number>>>"

thus a major change in the docs from maven site

that would mean revert change on 3.x for Jira #7644

  • PR #930
  • PR #931

sultan avatar Dec 23 '22 10:12 sultan

Bah, I am really about to puke.

So much complexity for something quite obvious, I guess.

Looking at "<<<.qualifier>>>" < "<<<-qualifier>>>" < "<<<-number>>>" < "<<<.number>>>" it does not make sense, why the dot before a hyphen. If now the equalty applies to qualifiers, why not to nubers as well? The problem ist how to diffentiate between 2.3.4 where 4 is a actually a numeric qualfier?

michael-o avatar Dec 23 '22 10:12 michael-o

Bah, I am really about to puke.

the numbers part gave me headaches as well ^^

So much completely for something quite obvious, I guess.

Looking at "<<<.qualifier>>>" < "<<<-qualifier>>>" < "<<<-number>>>" < "<<<.number>>>" it does not make sense, why the dot before a hyphen. If now the equalty applies to qualifiers, why not to nubers as well? The problem ist how to diffentiate between 2.3.4 where 4 is a actually a numeric qualfier?

the parser was and is treating numbers and string differently (else/if).

the previous parser

  • 2.3.4 parses to [ 2, 3, 4 ]
  • 2.3-4 parse to [ 2, 3, [4] ]
  • 2.3-RC4 parses to [ 2, 3, [ RC, [4] ] ]
  • 2.3.RC4 used to parse to [ 2, 3, RC, [4] ]

i can see a benefit of change with .RC4 = -RC4 and .RC1 < -RC2

but i cannot yet see a benefit of change for numbers, as you may want to keep 1.0.0-2 = 1-2 = 1.0-2

treat - as dot for numbers will lead to bad surprises. we might not want 1.0.0-2 = 1.0.0.2 < 1.0-2 = 1.0.2 < 1-2 = 1.2

we should discourage the use of numeric qualifiers -1, -2 -N in the docs i dont see what the projects maintainers would want for these numbers to mean and how they would compare with strings (< or >?)

sultan avatar Dec 23 '22 10:12 sultan

returning back to this PR feature. (qualifiers ordering)

  • I don't see how a build qualifier makes sense after post-release where a release is supposed to be immutable and identical according to SemVer

yeah i forgot that, sorry. 1.2.3 and 1.2.3+1 should be iso-feature and iso-fixes.

What I now understand is that arbitrary qualifiers are not considered after GA. Following example: I need to fork commons-io 2.12.0 for the company and it will be named 2.12.0-company-1 through 2.12.0-company-5. I would expect that those come after the GA because I had to apply custom patches.

the SemVer way to achieve this would be on the artifact id:

  • :commons-io:2.12.0
  • :commons-io-2.12.0-company:1.0.0
  • :commons-io-2.12.0-company:5.0.0 (ugly? yes...)

the choice of adding 'company' as a qualifier is ambiguous regarding to features and versions ordering

sultan avatar Dec 23 '22 12:12 sultan

returning back to this PR feature. (qualifiers ordering)

  • I don't see how a build qualifier makes sense after post-release where a release is supposed to be immutable and identical according to SemVer

yeah i forgot that, sorry. 1.2.3 and 1.2.3+1 should be iso-feature and iso-fixes.

the SemVer way to achieve this would be on the artifact id:

* :commons-io:2.12.0

* :commons-io-2.12.0-company:5.0.0 (ugly? yes...)

yet the other choice of adding 'company' as a qualifier is ambiguous regarding to features and versions ordering

This is quite ugly, if you can host the repo on corporate Nexus. I have the feeling that SemVer 2.0 is not ideally suited for Maven.

michael-o avatar Dec 23 '22 12:12 michael-o

This is quite ugly, if you can host the repo on corporate Nexus. I have the feeling that SemVer 2.0 is not ideally suited for Maven.

we might not be able to be fully SemVer 2.0 compatible because of legacy artifacts that were there before SemVer 1.0 was even defined. so if we cant be semver 2 compatible, why not borrow the + sign? or any other sign to distinguish between natural ordering from maven custom ordering.

there is surely a good reason for the expected behaviour, but i seem to lack the understanding and i'm not sure i can suggest a valid solution until the expected result is formalized

  • what is the purpose of the Sonatype qualifier?
  • how do we order things when several companies add their own name?
  • can a company do a release in an artifactid folder owned by another company?

there could be other approaches like minus is before GA and plus is after GA

1.0.0-x < 1.0.0 < 1.0.0+x

sultan avatar Dec 23 '22 12:12 sultan

why not put a pause on the pr?

i'll try to send questions to the mailing list after the holidays. there seems to be conflicting wishes and a lack of formalism on the expected behaviour.

i would not want to suggest a solution that put aside someone wishes.

sultan avatar Dec 23 '22 12:12 sultan

why not put a pause on the pr?

i'll try to send questions to the mailing list after the holidays. there seems to be conflicting wishes and a lack of formalism on the expected behaviour.

i would not want to suggest a solution that put aside someone wishes.

This is also my desire to pause until clarified.

michael-o avatar Dec 23 '22 13:12 michael-o