rdf4j
rdf4j copied to clipboard
SemVer changes
We are currently following SemVer by using japicmp.
Propose changing our guarantees to the following
We follow SemVer as closely as we can.
- patch releases are binary and source compatible [1]
- minor releases are source compatible [2] [3]
- major release will struggle to only contain breaking changes that are previously deprecated
1: Except in the case of an urgent security concern. 2: Except that we allow new interface methods with default implementations. 3: Except that we allow removal of superclass declaration where the superclass is to be considered a helper class and users are expected to either rely on the actual class, a different superclass in the hierarchy or an interface.
Reasoning
The main reason to use SemVer is to maintain a consistency between releases for our users and especially for users who release their own tools built with RDF4J. SemVer is also useful for aligning our release process with the expectations of the Eclipse Foundation.
Binary compatibility is useful because it means that a user can change out a JAR file without recompiling other jar files that depend on it. Consider a fictional end user who has bought Product X which uses RDF4J 4.0.1 as one of many libraries. We discover and fix a data corruption issue and release a fix with 4.0.2. The end user can then swap out all the existing RDF4J JARs with the new ones without waiting for a new build of Product X.
Backwards compatibility can sometimes be cumbersome and costly, restricting innovation. SemVer details what should or shouldn't be considered a breaking change striking a good balance between compatibility and innovation. A few of the SemVer rules implemented in japicmp are still sometimes overly restrictive.
All new interface methods are considered a major change, even if they have a default implementation. We have fairly frequently broken with this restriction in order to add new interface methods for more specific use cases, for instance adding a removeAll(...) method with a default implementation that instead calls remove(...) from within a loop. Or adding two methods where one signals that the other is implemented, for instance supportsBulkAdd() and bulkAdd(...).
Our codebase relies on many shared classes with sensible implementations of interfaces. In most cases these classes exist for our own sanity in order to reduce duplicate code. Naturally we want our users to take advantage of these shared classes too and we take care to ensure backwards compatibility.
SemVer considers it a major change when a class stops extending a superclass because a user could have relied on that superclass in their code. An example BaseSheep sheep = new NorwegianSpottedMountainSheep(); vs Sheep sheep = new NorwegianSpottedMountainSheep();. From our own codebase EmptyIteration could be a decent example.
EmptyIteration extends AbstractCloseableIteration which implements CloseableIteration and some code for making sure that the iteration is only closed once. EmptyIteration is a final class, so users aren't allowed to extend it with their own code. It makes sense for EmptyIteration to implement CloseableIteration directly since there isn't any need for the code from AbstractCloseableIteration that makes sure that the iteration is only closed once. A question then is if users have relied on EmptyIteration viewed as AbstractCloseableIteration or if they call any of the inherited methods?
public final class EmptyIteration<E, X extends Exception> extends AbstractCloseableIteration<E, X> {
public EmptyIteration() {
}
@Override
public boolean hasNext() {
return false;
}
@Override
public E next() {
throw new NoSuchElementException();
}
@Override
public void remove() {
throw new IllegalStateException("Empty iterator does not contain any elements");
}
}
AbstractCloseableIteration exposes two methods that are not declared in the CloseableIteration interface. We can copy those methods into EmptyIteration which ensures source compatibility when EmptyIteration is viewed as an EmptyIteration.
public final class EmptyIteration<E, X extends Exception> implements CloseableIteration<E, X> {
public EmptyIteration() {
}
@Override
public boolean hasNext() {
return false;
}
@Override
public E next() {
throw new NoSuchElementException();
}
@Override
public void remove() {
throw new IllegalStateException("Empty iterator does not contain any elements");
}
public boolean isClosed() {
return true;
}
public void close() throws X {
}
protected void handleClose() throws X {
}
}
We would need to consider this on a case by case basis. For EmptyIteration it could very well make sense for a users to rely on AbstractCloseableIteration so that all iterations can be tracked in a list to make sure that they all have been correctly closed. Then again we could assume that users would implement their own wrappers if they want this sort of functionality since we could very well create new iteration classes that don't extend AbstractCloseableIteration which they then wouldn't be able to track.
Tasks
- [ ] Update pom.xml
- [ ] Update website
- [ ] Consider publishing a news item
@eclipse/technology-rdf4j-committers
We have started discussing this on the rdf4j dev mailing list and I've opened an issue here to track our decision. Comments are very welcome :)
We follow SemVer as closely as we can.
* patch releases are binary and source compatible [1] * minor releases are source compatible [2] [3] * major release will struggle to only contain breaking changes that are previously deprecated1: Except in the case of an urgent security concern.
Good call out. Somewhat on the fence if/when a concern is serious enough to do this, but it's not likely to happen often anyway.
2: Except that we allow new interface methods with default implementations.
We may want to explain a little why, generally speaking, this is not source compatible, to illustrate that these are rare cases and when handled with care, it won't be a problem. There's a good discussion here: https://stackoverflow.com/questions/31188231/do-java-8-default-methods-break-source-compatibility
3: Except that we allow removal of superclass declaration where the superclass is to be considered a helper class and users are expected to either rely on the actual class, a different superclass in the hierarchy or an interface.
I find this one hard to justify. I understand what you're trying to say, but taking a cynical view it sounds a little too much like: "We have this specific thing we want to fix in a minor release and we need to twist our definitions of compatibility to make that possible".
Perhaps we should at the very least pledge that we would only make such changes in rare circumstances, favoring marking things deprecated / internaluseonly before addressing in a major release, and that if we do this we will always ensure that the LSP is observed in these cases (use of the helper class can be safely swapped to use of the subclass, without changing behavior).