libyear-gradle-plugin
Ability to filter out pre-release dependency versions
I recently learned about the libyear metric and this plugin, and ran an analysis on one of our projects.
Problem
One issue I noticed in the output is that some dependencies are reported as outdated even when no newer stable version exists.
Example line from the report:
-> 1.7 years from jakarta.persistence:jakarta.persistence-api (3.1.0 => 3.2.0-M1)
However, currently the released versions look like this:
| VERSION NUMBER | DATE PUBLISHED |
|---|---|
| 3.2.0-M1 | 2023-11-23 |
| 3.2.0-B02 | 2023-11-06 |
| 3.2.0-B01 | 2023-08-28 |
| 3.1.0 | 2022-02-25 |
| ... | ... |
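To make the number in the report line concrete, here is a minimal sketch that recomputes it from the two publication dates in the table. It assumes a libyear is counted as 365 days and that whole dates are precise enough; the plugin itself may work with exact timestamps, and the class and method names are made up for illustration.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

class LibyearFromDates {

    // Assumption: one libyear equals 365 days; the plugin may use a different convention.
    static double libyears(LocalDate usedVersionPublished, LocalDate newestVersionPublished) {
        long days = ChronoUnit.DAYS.between(usedVersionPublished, newestVersionPublished);
        return days / 365.0;
    }

    public static void main(String[] args) {
        LocalDate stable = LocalDate.parse("2022-02-25");    // 3.1.0
        LocalDate milestone = LocalDate.parse("2023-11-23"); // 3.2.0-M1
        // 636 days apart, prints "1.7 years"
        System.out.println(String.format(Locale.ROOT, "%.1f years", libyears(stable, milestone)));
    }
}
```

So the whole 1.7 years reported for this dependency comes from the gap between the last stable release and a milestone build.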
Given that using unstable/non-final dependency versions in production is considered to be bad practice, I think this plugin could either automatically exclude non-final versions, or at least allow the user to somehow configure which newer versions to consider.
Impact
For a project that had 79 outdated dependencies, 16 of them (i.e., ~20%) were compared against non-final versions:
-> 1.7 years from jakarta.persistence:jakarta.persistence-api (3.1.0 => 3.2.0-M1)
-> 1.5 years from jakarta.validation:jakarta.validation-api (3.0.2 => 3.1.0-M1)
-> 1.4 years from jakarta.annotation:jakarta.annotation-api (2.1.1 => 3.0.0-M1)
-> 1.2 years from net.sf.jopt-simple:jopt-simple (5.0.4 => 6.0-alpha-3)
-> 10 months from org.apache.logging.log4j:log4j-api (2.20.0 => 3.0.0-beta1)
-> 10 months from org.apache.logging.log4j:log4j-to-slf4j (2.20.0 => 3.0.0-beta1)
-> 6.2 months from org.jetbrains.kotlin:kotlin-stdlib-common (1.8.22 => 2.0.0-Beta2)
-> 6.2 months from org.jetbrains.kotlin:kotlin-reflect (1.8.22 => 2.0.0-Beta2)
-> 6.2 months from org.jetbrains.kotlin:kotlin-stdlib-jdk8 (1.8.22 => 2.0.0-Beta2)
-> 6.2 months from org.jetbrains.kotlin:kotlin-stdlib (1.8.22 => 2.0.0-Beta2)
-> 6.2 months from org.jetbrains.kotlin:kotlin-stdlib-jdk7 (1.8.22 => 2.0.0-Beta2)
-> 3.8 months from org.slf4j:jul-to-slf4j (2.0.9 => 2.1.0-alpha0)
-> 3.8 months from org.slf4j:slf4j-api (2.0.9 => 2.1.0-alpha0)
-> 28 days from org.apache.httpcomponents.client5:httpclient5 (5.2.3 => 5.4-alpha1)
-> 25.9 days from org.apache.httpcomponents.core5:httpcore5-h2 (5.2.4 => 5.3-alpha1)
-> 25.9 days from org.apache.httpcomponents.core5:httpcore5 (5.2.4 => 5.3-alpha1)
This results in either:
- Falsely reported dependencies - e.g., for jakarta.persistence:jakarta.persistence-api, version 3.1.0 that is used is actually the latest stable release
- Incorrect libyear values - e.g., for org.apache.httpcomponents.client5:httpclient5, a libyear value of 28 days was reported (5.2.3 => 5.4-alpha1), but if we compared against the latest stable version (5.2.3 => 5.3), then the libyear value would be just 5 days
Collectively this:
- Results in a reported libyear value that is higher than the actual one
- Makes the analysis results more difficult to interpret, as they require additional post-processing by a person
Potential solutions
General solution
Looking at semver, it seems that any pre-release version would contain a hyphen:
A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version. . . . Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92, 1.0.0-x-y-z.--.
And looking at the anecdotal evidence from this one project, it seems that:
- all pre-release versions did indeed contain a hyphen
- the only dependency whose version contained a hyphen and was still a stable release was Guava (com.google.guava:guava (32.1.3-jre => 33.0.0-jre))
Therefore, maybe the general rule could be: "if the current dependency version contains a hyphen, then consider all available dependency versions; if it does not, only look at versions without hyphens".
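A minimal sketch of that rule, to show how it would behave on two of the cases above. The class and method names are hypothetical and not part of the plugin, and the candidate list is assumed to contain only newer versions, ordered oldest first.

```java
import java.util.List;
import java.util.Optional;

class HyphenRule {

    // Heuristic from the rule above: a hyphen marks a potential pre-release.
    static boolean hasHyphen(String version) {
        return version.contains("-");
    }

    // candidates: versions newer than `current`, oldest first (assumed ordering).
    static Optional<String> newestCandidate(String current, List<String> candidates) {
        boolean considerAll = hasHyphen(current);
        return candidates.stream()
                .filter(v -> considerAll || !hasHyphen(v))
                .reduce((first, second) -> second); // last element that survived the filter
    }

    public static void main(String[] args) {
        // jakarta.persistence-api: nothing stable is newer than 3.1.0, prints Optional.empty
        System.out.println(newestCandidate("3.1.0", List.of("3.2.0-B01", "3.2.0-B02", "3.2.0-M1")));
        // Guava: the current version has a hyphen, so hyphenated versions stay eligible,
        // prints Optional[33.0.0-jre]
        System.out.println(newestCandidate("32.1.3-jre", List.of("33.0.0-android", "33.0.0-jre")));
    }
}
```

One consequence of the rule as stated: once the current version contains a hyphen (as with Guava), real pre-releases of that dependency would be considered again.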
User-configurable solution
Maybe there could be a configuration parameter that allows the user to specify what versions to include or exclude:
libyear {
  configurations = ['compileClasspath']
  // v-- new parameter
  ignoreNewerArtifactsWithVersionsMatching = "<regex that matches specific suffixes>"
  failOnError = true
  validator = allArtifactsCombinedMustNotBeOlderThan(days(5))
}
An example of such a regex could be -(?!jre), which would ignore any version containing a hyphen unless the hyphen is immediately followed by jre.
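To check what that regex would actually filter, here is a small sketch. It assumes the pattern may match anywhere in the version string (find() rather than a full match); how the proposed parameter would apply the regex is an open design question, and the class name is made up.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class IgnorePatternDemo {

    // Hypothetical value for the proposed ignoreNewerArtifactsWithVersionsMatching parameter.
    static final Pattern IGNORE = Pattern.compile("-(?!jre)");

    // Assumption: a version is ignored if the pattern matches anywhere in it.
    static boolean isIgnored(String version) {
        return IGNORE.matcher(version).find();
    }

    static List<String> eligible(List<String> versions) {
        return versions.stream().filter(v -> !isIgnored(v)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [33.0.0-jre, 5.3]
        System.out.println(eligible(List.of("33.0.0-jre", "33.0.0-android", "5.4-alpha1", "5.3", "2.0.0-Beta2")));
    }
}
```

Note that this particular pattern also drops Guava's -android variant, so a project using that variant would need a different regex.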
Thank you very much @grimsa for your detailed report and your interest in this plugin!
From a surface-level reading, I think the plugin could do better for the general case of semver. If semver describes what a "pre-release" version number looks like, a configuration option to filter out pre-release versions looks reasonable, and may even default to "true".
But at the same time, relying more on semver for artifact ordering may be a significant departure from the existing approach, in which the repository tells us which release is the most "recent" (aka "last published"). In many cases this strategy has been very reliable, and it also works with projects that do not version with semver, though it has other drawbacks, such as this one:
https://github.com/f4lco/libyear-gradle-plugin/blob/7849052ddbd5f6562fdc08b289e12bddf0d55936/libyear-gradle-plugin/src/main/kotlin/com/libyear/sourcing/SolrSearchAdapter.kt#L107-L111
We'll have to give it more thought, for implementation, as well as on the question "what is the best possible 'default' behavior for the plugin". Any input is appreciated :)
About multiple dependency versions being maintained in parallel - I noticed that as well with Spring projects.
I did not consider it to be a problem in my case, because, for example, Spring Security maintains 3 versions in parallel (https://spring.io/projects/spring-security/#support), at the time of writing this it is 6.2.x, 6.1.x, and 5.8.x. As far as I can tell, they publish releases for all 3 versions within minutes of each other (starting with the oldest and finishing with the latest).
So if we were running the latest 5.8.x release, we would observe:
- A tiny amount of libyears reported for this dependency (because the 6.2.x release has been published a few minutes after 5.8.x)
- No indication that we're multiple significant releases behind
But I think it is acceptable, because:
- While each release line is being maintained, as long as we're on the latest release of the same major version (even if it is not the latest release line), we're still using a maintained version, so maybe libyear showing close-to-zero is meaningful. Once maintenance of the 5.8.x line stops, we'd naturally see an increasing number of libyears accumulating, and then we'd have a clear signal to upgrade.
- The fact that other release lines exist would still be visible in the report as a minutes-large amount of libyears for this dependency (because if the 5.8.x line were the latest one, its release would be published last, resulting in 0 libyears and no entry). So this is also good, though it depends on the Spring policy of publishing newer releases later (even if by minutes), which does not seem to be the case with Tomcat.
--
As for how to determine the version.
I tried sending a request to Solr search (GET https://search.maven.org/solrsearch/select?q=g:"org.apache.tomcat" AND a:"tomcat") to see how, given a version, it can return a timestamp.
As for determining which versions are published - maybe it would be possible to leverage the published Maven metadata? For example, for Tomcat: https://repo1.maven.org/maven2/org/apache/tomcat/tomcat/maven-metadata.xml
It looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <groupId>org.apache.tomcat</groupId>
  <artifactId>tomcat</artifactId>
  <versioning>
    <latest>11.0.0-M15</latest>
    <release>11.0.0-M15</release>
    <versions>
      <version>7.0.35</version>
      <!-- ... -->
      <version>7.0.109</version>
      <version>8.0.0-RC1</version>
      <version>8.0.0-RC3</version>
      <version>8.0.0-RC5</version>
      <version>8.0.0-RC10</version>
      <version>8.0.1</version>
      <!-- ... -->
      <version>9.0.84</version>
      <version>10.0.0-M1</version>
      <version>10.0.0-M3</version>
      <version>10.0.0-M4</version>
      <version>10.0.0-M5</version>
      <version>10.0.0-M6</version>
      <version>10.0.0-M7</version>
      <version>10.0.0-M8</version>
      <version>10.0.0-M9</version>
      <version>10.0.0-M10</version>
      <version>10.0.0</version>
      <!-- ... -->
      <version>10.1.17</version>
      <version>11.0.0-M1</version>
      <version>11.0.0-M3</version>
      <version>11.0.0-M4</version>
      <version>11.0.0-M5</version>
      <version>11.0.0-M6</version>
      <version>11.0.0-M7</version>
      <version>11.0.0-M9</version>
      <version>11.0.0-M10</version>
      <version>11.0.0-M11</version>
      <version>11.0.0-M12</version>
      <version>11.0.0-M13</version>
      <version>11.0.0-M14</version>
      <version>11.0.0-M15</version>
    </versions>
    <lastUpdated>20231212142015</lastUpdated>
  </versioning>
</metadata>
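Reading the version list out of such a file needs nothing beyond the JDK's built-in XML parser. The following is only a sketch: the class and method names are made up, the input is a shortened excerpt of the metadata above, and fetching the file over HTTP is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class MetadataVersions {

    // Returns the text of all <version> elements in document order.
    static List<String> parseVersions(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("version");
            List<String> versions = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                versions.add(nodes.item(i).getTextContent().trim());
            }
            return versions;
        } catch (Exception e) {
            // Wrapped so callers do not have to deal with the parser's checked exceptions.
            throw new IllegalStateException("Could not parse Maven metadata", e);
        }
    }

    public static void main(String[] args) {
        // Shortened excerpt of the Tomcat metadata
        String xml = "<metadata><versioning><versions>"
                + "<version>10.1.17</version>"
                + "<version>11.0.0-M1</version>"
                + "<version>11.0.0-M15</version>"
                + "</versions></versioning></metadata>";
        System.out.println(parseVersions(xml)); // prints [10.1.17, 11.0.0-M1, 11.0.0-M15]
    }
}
```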
I also checked the metadata file for one of the Spring Security artifacts, and I see that releases are ordered by version (and not by release date).
And this is what the metadata for Guava looks like, with its -android and -jre variants.
Maybe then the logic could be something like (pseudocode):
getMavenMetadata("org.apache.tomcat:tomcat").streamVersions()
  // skip versions older than the one used in the current project, e.g. "10.1.3"
  .dropWhile(version -> !version.equals(currentVersion))
  // v-- This filter step would deal with the logic requested in this issue
  .filter(version -> !isPreRelease(version)) // as defined by semver or some other possibly customizable logic
  .reduce((first, second) -> second) // i.e. take the last remaining version
This would then result in 10.1.17 being returned, because all 11.0.0 versions are pre-release versions.
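For illustration, the same steps as a runnable sketch over a shortened version list. The isPreRelease check simply looks for a hyphen, which is an assumption and not a full semver parser; the list is assumed to be ordered oldest to newest as in the metadata file, and 10.1.3 is assumed to sit in the part elided from the listing above.

```java
import java.util.List;
import java.util.Optional;

class NewestStableVersion {

    // Assumption: any hyphen marks a pre-release; a real implementation would likely be configurable.
    static boolean isPreRelease(String version) {
        return version.contains("-");
    }

    // versions: as listed in maven-metadata.xml, assumed to be ordered oldest to newest.
    static Optional<String> newestStable(List<String> versions, String current) {
        return versions.stream()
                .dropWhile(v -> !v.equals(current))   // skip everything older than the version in use
                .filter(v -> !isPreRelease(v))        // the step requested in this issue
                .reduce((first, second) -> second);   // keep the last remaining version
    }

    public static void main(String[] args) {
        List<String> tomcat = List.of(
                "9.0.84", "10.0.0-M10", "10.0.0", "10.1.3", "10.1.17", "11.0.0-M1", "11.0.0-M15");
        System.out.println(newestStable(tomcat, "10.1.3")); // prints Optional[10.1.17]
    }
}
```

If the current version is already the newest stable one, the result is the current version itself, which corresponds to 0 libyears.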
And then Solr search could be used to look up the release dates of the 10.1.3 and 10.1.17 releases (to calculate the libyear value).
So overall, it seems that combining the use of Maven metadata with Solr search might make it possible to have a better solution for cases where multiple release lines are maintained in parallel (like Tomcat or Spring do), and it would also make it possible to exclude pre-release versions (because in Maven metadata we have access to all versions, not just the latest).
--
Now that I'm writing this, it seems to me that Maven metadata would also make it quite easy to implement version-distance-based metric calculation (I think the original paper argued that it has more benefits over the date-based metric). That could be interesting and useful too.
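As a rough sketch of what such a metric could look like, the snippet below counts how many releases appear after the current one in the metadata list. This is just one possible reading of "version distance" and not necessarily the definition used in the paper; the class name is made up, and the input is a contiguous run of versions from the Tomcat metadata above.

```java
import java.util.List;

class VersionDistance {

    // Number of versions listed after `current`; -1 if `current` is not in the list.
    // Pre-releases (assumed to be anything with a hyphen) are counted only on request.
    static int releasesBehind(List<String> versions, String current, boolean includePreReleases) {
        int index = versions.indexOf(current);
        if (index < 0) {
            return -1;
        }
        return (int) versions.subList(index + 1, versions.size()).stream()
                .filter(v -> includePreReleases || !v.contains("-"))
                .count();
    }

    public static void main(String[] args) {
        List<String> excerpt = List.of(
                "7.0.109", "8.0.0-RC1", "8.0.0-RC3", "8.0.0-RC5", "8.0.0-RC10", "8.0.1");
        System.out.println(releasesBehind(excerpt, "7.0.109", false)); // prints 1
        System.out.println(releasesBehind(excerpt, "7.0.109", true));  // prints 5
    }
}
```

Like the date-based variant, this simple count does not distinguish between parallel release lines.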