mdanalysis
Add upper pins for some core dependencies on deployment
Is your feature request related to a problem?
One of the common issues with MDA is that it can be rather difficult to know which dependency versions are likely to work with a given historical version of the library.
Now that we have a defined pyproject.toml, ABI issues aren't likely to happen; however, deprecated features can still be a problem.
With MDAnalysis now on a three-monthly release schedule, and given that we test weekly against the nightly builds of scipy and numpy, would it make sense to be a little stricter about how versions are defined in setup.py?
Describe the solution you'd like
In the first instance I would advocate solely setting the maximum versions of numpy and scipy to latest + 1 minor at release time. In my opinion, these are the only two dependencies we have where we regularly have to check for deprecations (hence the weekly runner using their nightly builds).
They are also the libraries that folks tend to set pins around, so this will lead to an implicit maximum pin for other packages.
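As a rough sketch of what this could look like (the version numbers below are purely illustrative, not MDAnalysis' actual pins), the "latest + 1 minor" cap on numpy and scipy might be expressed in pyproject.toml like so:

```toml
# Hypothetical pyproject.toml fragment. Suppose numpy 1.24.x and
# scipy 1.10.x are the newest releases at MDAnalysis release time;
# allowing one additional minor release gives these upper bounds.
# All version numbers here are illustrative assumptions.
[project]
dependencies = [
    "numpy>=1.21.0,<1.26",  # allows up to 1.25.x (latest 1.24 + one minor)
    "scipy>=1.5.0,<1.12",   # allows up to 1.11.x (latest 1.10 + one minor)
]
```

The lower bounds would stay as whatever the oldest supported versions are; only the upper caps are new.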
Describe alternatives you've considered
Alternatively, we leave things as they are; the issue isn't severe enough that we suffer much from it.
I'd be rather interested to hear @tylerjereddy's views here.
So, SciPy does place upper bounds on most dependencies in tagged releases, including the allowed version of Python itself. We allow the current NumPy plus some constant number of additional NumPy releases, up to the point where the normal deprecation cycle would turn into a feature removal (I think it's 2 releases, after which removal is allowed by convention?).
I won't lie, though: this has caused considerable controversy. For example, there have been arguments that newer versions of Python are rarely incompatible with SciPy, so we're imposing unnecessary burdens on developers in various scenarios (some of which I understand, some of which are a bit confusing with resolvers).
For some (probably confusing) issue discussions about pip-related version pins see, e.g.:
- https://github.com/scipy/scipy/issues/14738
- https://github.com/scipy/scipy/pull/12862
I guess for MDAnalysis, when dealing with other/smaller dependencies, one could consider something similar to semantic versioning, where you allow the current version at release time plus any patch-level releases, but not any feature releases. Of course, I don't think the Python ecosystem is too strict about semver policies. The Rust ecosystem seems quite organized with cargo and lock files.
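The "current version + patch releases only" idea maps onto PEP 440's compatible-release operator: `somedep~=2.3.1` is equivalent to `somedep>=2.3.1, ==2.3.*`, so patch bumps are allowed but feature (minor) releases are not. A minimal sketch of that behavior for three-component versions (an illustration only, not a substitute for the real `packaging` library):

```python
# Minimal sketch of PEP 440 "compatible release" (~=) semantics for
# X.Y.Z versions: "~=X.Y.Z" means ">=X.Y.Z and ==X.Y.*".
def compatible(candidate: str, pin: str) -> bool:
    """Return True if `candidate` satisfies `~=pin` (X.Y.Z form)."""
    cand = tuple(int(part) for part in candidate.split("."))
    base = tuple(int(part) for part in pin.split("."))
    # Same major.minor series, and at least the pinned patch level.
    return cand[:2] == base[:2] and cand >= base

print(compatible("2.3.5", "2.3.1"))  # patch bump -> True
print(compatible("2.4.0", "2.3.1"))  # minor (feature) bump -> False
```

Real specifiers also have to handle pre-releases, epochs, and two-component pins, which is why the `packaging` library should be used in practice.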
I'm hesitant to offer specific advice, but just wanted to mention that there are tradeoffs, of course, which you probably realize. Some kind of bound on NumPy/SciPy as a start may be good. One side effect could be that, after a few releases, if someone tries to install MDAnalysis with, e.g., a bleeding-edge NumPy, pip might go looking for an ancient version of MDAnalysis from before you added the pins. I think we did something to address that in SciPy, like an sdist release somewhere without the pin, to prevent pip from looking backwards too far, or something like that.
Anyway, that's the "spirit" of the complexity, I think, even if what I said isn't quite fully accurate.
So if the plane carrying all the mda devs crashed, it would be annoying if the package stopped working because we decided that numpy 1.(n+1) wouldn't work without ever having seen it.
I think trusting semver and pinning to a major version is sane though.
I'm putting this down as something to target for 2.4.0.
It's a good opportunity to review our pins and also complete this: https://github.com/MDAnalysis/mdanalysis/pull/3528
I'm re-targeting this for 2.5.0
Closing as we already do this on deployment