mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

MAINT: shims for array copy semantics

Open tylerjereddy opened this issue 1 year ago • 7 comments
trafficstars

  • Fixes #4481

  • this gets rid of segfaults/many test failurse against latest NumPy main while still passing the full suite against NumPy 1.26.4; it is pretty hard to test since needs special branch of SciPy (for the same reason, upstream hasn't fully resolved copy semantics)

  • I still see 2-3 test failures against main locally, but likely still a large improvement; may want to let this sit a little bit while the upstream storm settles, but I suspect this is the gist of the shims we'd want

One thing I didn't try was building against NumPy main and testing against 1.26.4, just did each completely separately (that's a good thing to try though). Upstream pre-release wheels will probably be delayed reflecting the related issues for a bit.

Getting late, so short explanation--copy=False got stricter, copy=None is the "try your best not to copy, but you can if you need to" old NumPy way.


📚 Documentation preview 📚: https://mdanalysis--4482.org.readthedocs.build/en/4482/

tylerjereddy avatar Mar 02 '24 04:03 tylerjereddy

Linter Bot Results:

Hi @tylerjereddy! Thanks for making this PR. We linted your code and found the following:

There are currently no issues detected! 🎉

github-actions[bot] avatar Mar 02 '24 04:03 github-actions[bot]

Codecov Report

Attention: Patch coverage is 69.23077% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 89.52%. Comparing base (2c1aa4b) to head (ff93d70). Report is 14 commits behind head on develop.

Files Patch % Lines
package/MDAnalysis/lib/util.py 20.00% 4 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4482      +/-   ##
===========================================
- Coverage    93.38%   89.52%   -3.86%     
===========================================
  Files          171      180       +9     
  Lines        21744    22333     +589     
  Branches      4014     3619     -395     
===========================================
- Hits         20305    19993     -312     
- Misses         952     1891     +939     
+ Partials       487      449      -38     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Mar 02 '24 04:03 codecov[bot]

@hmacdope can you please have a look? Keeping on top of scipy/numpy changes is important. Thanks!

orbeckst avatar Mar 11 '24 20:03 orbeckst

I should update this soon to reduce code duplication and test it against the beta release if it comes out today or tomorrow.

tylerjereddy avatar Mar 11 '24 20:03 tylerjereddy

I think Sebastian also told me to use asarray instead of array in most of these cases. There used to be a good performance reason to use plain np.array() but I think that is long gone now.

tylerjereddy avatar Mar 11 '24 20:03 tylerjereddy

Hello @tylerjereddy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 519:80: E501 line too long (80 > 79 characters)

Comment last updated at 2024-04-07 20:50:51 UTC

pep8speaks avatar Mar 12 '24 19:03 pep8speaks

I pushed in an update to reduce code duplication and use a more formally correct __array__ signature.

I checked the status of the updated PR vs. latest develop for full testsuite against the pre-release wheels for NumPy/the ecosystem. The segfaults are gone even for develop (NumPy made things a bit more backcompat friendly in time for the beta), but there are 9 test failures for develop on x86_64 Linux at the moment. On this branch, that drops down to 2 failures:

FAILED MDAnalysisTests/analysis/test_helix_analysis.py::TestHELANAL::test_residue_gaps_split - assert 295 == 1
FAILED MDAnalysisTests/analysis/test_helix_analysis.py::TestHELANAL::test_residue_gaps_no_split - assert 99 == 1

From a quick scan of the diff again, the only other place where I think I'm misbehaving a bit is in using a raw string version check instead of using np.lib.NumpyVersion for better future proofing/maintainability.

tylerjereddy avatar Mar 12 '24 19:03 tylerjereddy

@tylerjereddy is the plan to move ahead here before addressing ^ test failures?

hmacdope avatar Mar 19 '24 21:03 hmacdope

Possibly yeah, I didn't have immediate plans to tackle those, and may be quite different if they're real failures we need to worry about vs. some third-party issues (I didn't check).

tylerjereddy avatar Mar 19 '24 21:03 tylerjereddy

Note that NumPy 2.0.0rc1 (ABI stable) is now imminent since the new pybind11 release is out. I'll be under a fair bit of pressure to get the supporting SciPy 1.13.0 RC2 and "final" release out within a few days (at most) of the NumPy RC1 posting to PyPI. I don't think MDA faces the same pressure, but we should retest this branch against NumPy RC1 as well once it is posted very soon, since that version has stability guarantees that the beta didn't (and can be used to build wheels to post on PyPI for NumPy 2 support).

tylerjereddy avatar Mar 29 '24 17:03 tylerjereddy

Ecosystem support summary table is here: https://github.com/numpy/numpy/issues/26191

I added us at the bottom below OpenCV. I need to circle back here soon to check against RC1, ping me if I don't.

tylerjereddy avatar Apr 06 '24 00:04 tylerjereddy

Thanks @tylerjereddy, much appreciated.

hmacdope avatar Apr 06 '24 04:04 hmacdope

Update here:

  • The latest MDAnalysis develop branch has 96 failures and 430 errors when built and tested against the ABI-stable NumPy 2.0.0rc1 on x86_64 Linux locally.
  • With the commit I just pushed in, the feature branch here passes the full MDA testsuite when built and tested against NumPy 2.0.0rc1 on x86_64 Linux locally.
  • I also confirmed that the full suite passes on this feature branch when we build MDA against NumPy 2.0.0rc1, then downgrade at run/test time to NumPy 1.23.2, so backwards compat seems to be working as expected for us.

I believe I've addressed both my own comments and the reviewer comments so far, and we appear to be in good shape for NumPy 2.x series, at least on x86_64 Linux.

tylerjereddy avatar Apr 07 '24 20:04 tylerjereddy

I think codecov was removed from a number of upstream projects because of flakiness/security concerns that kept cropping up. MDA might want to consider doing the same (as much as I hate to see removal of a method of measuring test coverage).

tylerjereddy avatar Apr 11 '24 17:04 tylerjereddy

I think codecov was removed from a number of upstream projects because of flakiness/security concerns that kept cropping up. MDA might want to consider doing the same (as much as I hate to see removal of a method of measuring test coverage).

Yeah it's been a bit annoying lately - issue has been opened and temporary fix PRed. I'm going to go ahead and merge this, CI passed fine outside of report uploading. Thanks @tylerjereddy !

IAlibay avatar Apr 21 '24 14:04 IAlibay

NumPy 2.0 is out now, and it looks like there is no new release of MDAnalysis with this PR included yet. Will the next release be 2.7.1 or 2.8.0?

rgommers avatar Jun 18 '24 07:06 rgommers

@rgommers the plan is to do a v2.8.0 release of MDAnalysis that is numpy 2.0 compatible somewhere towards the end of July. Due to limitations in available developers, it's the earliest we can do this.

IAlibay avatar Jun 18 '24 14:06 IAlibay