Versioning and stability policy after 1.0
The major feature of 1.0 is better API stability. Let's help our users understand what we mean by this. I wrote it up in a VERSIONING.md file. I treat this as informational rather than strict policy.
@ntamas I need your input here. What's missing:
- Do we promise not to remove deprecated functions except in major releases? Or do we go for a "continuous improvement" approach where deprecated functions can be removed even in a minor release, provided that they were kept in a deprecated state for a while?

  With the first alternative (which is currently in these notes) we change functions in major versions only, but this may force major versions more often than ideal. With the second alternative we add new variants of functions with different names, deprecate the old variant, and eventually remove it. This way the library evolves gradually, and a major breaking release is very rarely necessary.
- What about the SOVERSION? We should have a policy for this, and include information for package maintainers. I never fully understood how SOVERSION works and how the linker uses it. There are differences in shared library versioning between platforms (Linux/macOS), so we shouldn't base this solely on Linux practices.
- Do we want to promise zero changes to experimental functions in patch releases?
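To make the second alternative in the first question concrete, here is a minimal sketch of the "add a renamed variant, deprecate the old one, remove it later" pattern in C. All names (`MYLIB_DEPRECATED`, `mylib_mean`, `mylib_mean2`) are hypothetical and are not igraph functions; the `deprecated` attribute is assumed to be available only on GCC-compatible compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical deprecation marker; expands to nothing on other compilers. */
#if defined(__GNUC__)
#define MYLIB_DEPRECATED(msg) __attribute__((deprecated(msg)))
#else
#define MYLIB_DEPRECATED(msg)
#endif

/* New variant (hypothetical): reports failure through its return value. */
int mylib_mean2(const double *values, int n, double *result);

/* Old variant (hypothetical): kept as a thin wrapper until it is removed. */
MYLIB_DEPRECATED("use mylib_mean2() instead")
double mylib_mean(const double *values, int n);

int mylib_mean2(const double *values, int n, double *result) {
    if (values == NULL || result == NULL || n <= 0) {
        return 1; /* invalid input */
    }
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += values[i];
    }
    *result = sum / n;
    return 0;
}

double mylib_mean(const double *values, int n) {
    double result = 0.0;
    int status = mylib_mean2(values, n, &result);
    assert(status == 0); /* the old signature cannot report errors */
    (void) status;
    return result;
}
```

Callers of `mylib_mean()` get a compile-time deprecation warning but keep working. Whether the wrapper may then be dropped in a minor release or only in a major one is exactly the policy question above.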
> Do we promise not to remove deprecated functions except in major releases?

I would definitely do that. First, this is consistent with semver. Second, we shouldn't expect users to keep up with every single minor release of igraph. If we deprecate a function and then remove it a few releases later, this could confuse people who upgrade directly from a minor version where the function was not yet deprecated to one where it has already been removed.
> What about the SOVERSION? We should have a policy for this, and include information for package maintainers.
This is what I have in my personal notes about the SOVERSION:
"The first component has to be bumped for any kind of ABI-breaking change (i.e. when an already compiled and linked piece of code would not run correctly when replacing libigraph.so.X silently behind its back). The second and third components have no strict meaning; I propose using them like semver minor and patch components when possible."
The only thing that's unclear to me is whether we should bump the first number even if we change an experimental function only. Distro maintainers might have automated checks for that.
> Do we want to promise zero changes to experimental functions in patch releases?

No; the very point of having experimental functions is to make it easier for us to evolve these functions as we please, without unnecessary management overhead. I think we can allow ourselves to change these even in patch releases.
OK, it sounds like we're on the same page. The only thing left to figure out is the SOVERSION. I would not bump the major SOVERSION for changes to experimental functions. For example, I'm pretty sure Debian will have a libigraph4 package for igraph 1.0. We don't want them to do a libigraph5 for 1.0.3 ...
Something we could do is increment the second component for additions (just like MINOR), and the third component for any incompatible changes to the experimental interface, but without resetting the third component to zero when the second component increments. If the interpretation of the 2nd and 3rd part is flexible, we can do whatever we want for as long as we're consistent. This way the 2nd part is an indicator for people who don't use experimental functions (to be compared using >=), and the 3rd for those who do (to be compared using ==).
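The comparison rule in the previous paragraph can be sketched in C as follows. This is only an illustration of the proposal under discussion, not an existing igraph or CMake facility; the type and function names (`lib_version_t`, `lib_version_is_compatible`) are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-component library VERSION, as proposed above. */
typedef struct {
    int abi;          /* 1st part: equals SOVERSION, bumped on ABI breaks */
    int additions;    /* 2nd part: bumped when functions are added */
    int experimental; /* 3rd part: bumped on incompatible experimental changes, never reset */
} lib_version_t;

/* Can a client built against `built` run with the `installed` library? */
bool lib_version_is_compatible(lib_version_t built, lib_version_t installed,
                               bool uses_experimental) {
    if (installed.abi != built.abi) {
        return false; /* different SOVERSION */
    }
    if (installed.additions < built.additions) {
        return false; /* 2nd part is compared using >= */
    }
    if (uses_experimental && installed.experimental != built.experimental) {
        return false; /* 3rd part is compared using == */
    }
    return true;
}

/* Worked example of the rule. */
void lib_version_selfcheck(void) {
    const lib_version_t built = {4, 2, 5};
    const lib_version_t more_additions = {4, 3, 5};
    const lib_version_t experimental_changed = {4, 3, 6};
    assert(lib_version_is_compatible(built, more_additions, true));
    assert(lib_version_is_compatible(built, experimental_changed, false));
    assert(!lib_version_is_compatible(built, experimental_changed, true));
}
```

Not resetting the third part matters here: if it were reset whenever the second part increments, two different experimental interfaces could end up with the same third component.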
Or we can keep it very simple and make the 2nd and 3rd parts be the same as MINOR and PATCH.
An alternative is to decide that this won't serve any practical purpose (if the linker is not using it automatically) and not bother ever updating the 2nd and 3rd parts.
Or we can solicit feedback from Linux and BSD maintainers (two sufficiently different systems) about what is useful for them.
@jgmbenoit @limburgher @yurivict Do you have any input on this topic? What sort of versioning is useful for you, as package maintainers on Debian, Fedora and FreeBSD?
Let's make the question simple and practical. We need to decide how to manage the two version numbers we set for the igraph shared library in CMake (VERSION is 3-component, SOVERSION is 1-component):
https://github.com/igraph/igraph/blob/997f59ad742892fff199824a248fab382b40f526/src/CMakeLists.txt#L332-L334
SOVERSION and the first digit of VERSION will be the same, and will be incremented for any major release, in accordance with the document I added in this PR. They will both be 4 for igraph 1.0, i.e. one greater than the current value of 3.
Do your operating systems use the other two components of VERSION in any way, and what would be the most useful for you?
Please see the VERSIONING.md document in this PR for an explanation of "experimental" functions in igraph. These are functions with a non-stable API, released for testing purposes, with the intention to eventually stabilize them. Changes to these (which may be frequent) won't affect the SOVERSION, but they could be indicated somehow with the 2nd and 3rd component of VERSION if this is useful.
Thank you for asking! In Fedora, with 0.10.12 on my system, the igraph package contains, among other things:

    lrwxrwxrwx. 1 root root      14 May 6 19:00 /usr/lib64/libigraph.so -> libigraph.so.3
    lrwxrwxrwx. 1 root root      18 May 6 19:00 /usr/lib64/libigraph.so.3 -> libigraph.so.3.1.7
    -rwxr-xr-x. 1 root root 2751096 May 6 19:00 /usr/lib64/libigraph.so.3.1.7

The .so is used for compilation, and is a symlink to a symlink to .so.3.1.7
The package's metadata Provides include: libigraph.so.3()(64bit)
Consuming packages built against igraph, in this case python3-igraph, Require libigraph.so.3()(64bit)
So I think what you've described should work for us.
I have just had a quick look at VERSIONING.md. The following statement will be a source of trouble unless the SONAME is incremented each time an experimental or internal function is removed from the ABI:

> The high-level interfaces of igraph do use both experimental and internal functions. Each high-level interface release is only guaranteed to be compatible with one specific release of C/igraph. As of this writing, this is a concern only for the Python interface, as the other interfaces (R and Mathematica) cannot link dynamically to C/igraph.

An obviously easy way is to keep all experimental and internal functions until the next increment of the SONAME. However, I would encourage you to follow the tradition: a clean ABI with neither experimental nor internal functions. For experimentation, maybe you can play with release candidate (RC) versions.
I was asked in the MacPorts IRC channel to post some information about macOS library versioning here. It wasn't clear to me what exactly needs explaining, so I guess I'll start by linking to Apple's documentation: https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/VersionInformation.html (The title is Framework Versions but most of the information applies equally to dylibs.)
To expand on how CMake on macOS uses the SONAME: it will become part of the library's filename libsomething.$SONAME.dylib, and will also be referenced in the library's ID load command (typically an absolute path that tells the linker what to insert into other libraries when linking against the library).
So let's say the igraph library is installed as /usr/local/lib/libigraph.3.dylib, then any binary that links against that library would copy /usr/local/lib/libigraph.3.dylib (with a compatibility version of 3.0.0 and a current version of 3.1.8) into its list of required libraries.
Any time you change the SOVERSION, programs that linked against the old one will stop working (if the old library is removed) or require rebuilding against the new version. As long as the SOVERSION is kept the same the loader will load an updated version of the library.
In other words: Keep the SOVERSION stable unless you remove or change symbols in an incompatible way.
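As a rough illustration of that rule, the sketch below compares the exported symbol names of two library builds and reports whether any symbol disappeared. The function names are hypothetical, and the check is deliberately incomplete: it only detects removals, not a symbol whose signature or behaviour changed incompatibly, which would also require a new SOVERSION.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` occurs in the list of `n` symbol names. */
static bool has_symbol(const char *const *symbols, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(symbols[i], name) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: true if some symbol exported by the old build is
 * missing from the new build, i.e. the SOVERSION must change. */
bool removal_requires_soversion_bump(const char *const *old_symbols, size_t n_old,
                                     const char *const *new_symbols, size_t n_new) {
    assert(old_symbols != NULL && new_symbols != NULL);
    for (size_t i = 0; i < n_old; i++) {
        if (!has_symbol(new_symbols, n_new, old_symbols[i])) {
            return true;
        }
    }
    return false;
}
```

Adding symbols leaves the result `false`, which matches the point above: an updated library with the same SOVERSION can be loaded by programs linked against the older one.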
@szhorvat What if we add a switch at compilation time that removes all experimental functions from the library? Package maintainers can then decide not to include these in the built library so there would be no API breakages. People can still use the experimental functions if they build the library themselves or if they use a higher-level interface that bundles the library statically.
I am not at all in favour of this. I'd hope to keep things simple for our users, and I'd hope to be able to refer users to distros like Ubuntu or Debian. Having different configurations of igraph float around is confusing.
I want people to actually use experimental functions as soon as possible so that we can get the feedback necessary to finalize them.
For the same reason, we absolutely should expose such functions in high-level interfaces. That means that distros should link python-igraph statically to C/igraph. They can do the same with any other package that uses experimental functions, and manage this manually, without having to have a cumbersome enforcement mechanism that disables experimental functions.
> I want people to actually use experimental functions as soon as possible so that we can get the feedback necessary to finalize them.

Is this really a good idea? You may create a grey zone here that may confuse users. I would rather encourage experienced volunteers to play with an experimental (git) branch.
Yes, given how people use igraph, I'm convinced that this is the right way to go. The primary userbase is researchers who use igraph for analysis, not developers who build software that expects a stable API. If we don't make this functionality easily accessible (i.e. expose it in high-level interfaces), there will be no feedback at all. The expertise needed for useful feedback is in network science / graph theory, and the specifics will often differ from function to function. People who have this expertise are frequently less comfortable and less willing to compile packages from source.
I have experts in network science in mind. The software part should be transparent for network scientists. If a grey zone is introduced, this transparency may disappear and the software part may become a burden for users who are less comfortable with building packages ("Which version may I play with?"). Distributions such as Debian are meant for stability, with a life cycle counted in years. What you want is quick changes, which is quite the opposite. As a scientist I see your point, and I would make my own build of the package and install it in the `/usr/local` hierarchy; `/usr` is meant to be managed by the distribution. I do that for some TeX packages and I have done it for some scientific libraries.

If you are looking for quick changes and quick responses, you may have a look at how the graph-tool maintainer (whom you may know) does it (maybe for a different purpose): basically, he maintains several repositories (Debian, Ubuntu, ...). I guess that everything is automated. Actually, I came to maintain a `private` repository to shortcut the relatively long Debian life cycle: I did it for igraph for a while, but now I just maintain a stable-backports version for the benefit of everyone. There is no ready-to-use way to set up your own Debian repository (I have no clue about non-Debian-derived distributions). The setup may take some time, but it will give you what you want. End users would just have to add your repository to their apt sources.list(5) (or the equivalent for non-Debian-derived distributions). In short, all the burden would be on your side, at least for the setup.
What practical problem do you encounter in Debian specifically that you want to solve by excluding experimental functions? Let's find a reasonable compromise that alleviates that problem sufficiently.
We need this setup to be able to produce quality software in a dynamic way, without too much red tape. We are already very low on resources and can't afford imposing restrictions that create additional work. Experimental functions reduce workload and reduce stress during development as we don't have to get things perfect the first time.
It is possible to add an `__attribute__((warning("Message here")))` to function prototypes, although only GCC and Clang support it. This makes it possible for you to identify packages that use experimental functions in igraph, and ensure that they link to igraph statically. python-igraph will be one of these. Does this suffice for your purposes?
If yes, I propose something along these lines:
    #if !defined(IGRAPH_USE_EXPERIMENTAL) && defined(__GNUC__)
    #define IGRAPH_EXPERIMENTAL __attribute__((warning("This function is experimental.")))
    #else
    #define IGRAPH_EXPERIMENTAL /* empty */
    #endif

IGRAPH_EXPERIMENTAL is then used in public headers to mark experimental functions.
What this does:
- Structured way to mark experimental functions (not just the docs)
- Easy way for users to know if they are accidentally using experimental functions
- Consistency: Same library with same function on all systems (i.e. experimental functions are not forcibly removed)
- Opt-in to eliminate "experimental" warnings by defining `IGRAPH_USE_EXPERIMENTAL` before `#include <igraph.h>`
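
For illustration, this is roughly how the macro would be used from both sides. The function `example_experimental_degree_sum` is a made-up placeholder, not an igraph function, and the header part is inlined so that the sketch is self-contained; in practice the macro and the prototype would live in the public headers and the `#define` would precede `#include <igraph.h>`.

```c
#include <assert.h>
#include <stddef.h>

/* Consumer side: opt in, so that no "experimental" warning is emitted.
 * Removing this line makes GCC-compatible compilers warn at each call site. */
#define IGRAPH_USE_EXPERIMENTAL

/* Header side: the macro as proposed above. */
#if !defined(IGRAPH_USE_EXPERIMENTAL) && defined(__GNUC__)
#define IGRAPH_EXPERIMENTAL __attribute__((warning("This function is experimental.")))
#else
#define IGRAPH_EXPERIMENTAL /* empty */
#endif

/* Hypothetical experimental function, marked in its prototype. */
IGRAPH_EXPERIMENTAL int example_experimental_degree_sum(const int *degrees, int n);

int example_experimental_degree_sum(const int *degrees, int n) {
    assert(degrees != NULL || n == 0);
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += degrees[i];
    }
    return sum;
}
```

A packager who builds a dependent package without the opt-in would see the warning in the build log, which is the signal that the package should link to igraph statically.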
> What practical problem do you encounter in Debian specifically that you want to solve by excluding experimental functions?

Dynamic libraries must have an ever-increasing table of functions. That is, if a function is removed or modified, the SO major version must be incremented. This allows smooth upgrades. An easy way would be to keep the names of the experimental functions and to remove the dead ones at the next major release of the library. This could be the library part of your approach.

The igraph Python Debian package is currently linked against the C igraph library: this is highly encouraged by Debian but not mandatory. We can reverse this, but I would not encourage doing so.

Having said that, my guess is that you are not the first to come up with this approach. On my side I have no example to offer: my "model" when I create a C library for my own usage is GSL, and I write a lot of tests to check the design of the implemented algorithm and of the API. But I am dealing with well-established mathematics, not with mathematics on the edge. I agree that this model might be somewhat obsolete, in the sense that it was created in a pioneering time, long before the collaborative git age. Via Google I could see that Boost has an experimental part, as does the GCC libstdc++ library. This leads me to think that there is some kind of traditional practice for managing experimental features, and that this practice is certainly supported by the Debian machinery (dh(1)). Do you have examples to offer on your side?