mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

Cache creation of compound masks

Open PicoCentauri opened this issue 1 year ago • 6 comments

Changes made in this Pull Request:

Add a cache for creating the compound masks in a group. This speeds up the calculations of accumulate other methods that use _split_by_compound_indices:

u = mda.Universe(TPR, XTC)

Running a simple timing test with caching gives on my machine 397 µs ± 1.55 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit 
u.atoms.accumulate("masses", compound="residues")

and with "disabled" cache 1.16 ms ± 11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%%timeit 
u.atoms.accumulate("masses", compound="residues")
u.atoms._cache.pop("residues_masks")

PR Checklist

  • [x] Tests?
  • [ ] Docs?
  • [ ] CHANGELOG updated?
  • [ ] Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4612.org.readthedocs.build/en/4612/

PicoCentauri avatar Jun 05 '24 17:06 PicoCentauri

Hello @PicoCentauri! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2024-06-05 17:36:21 UTC

pep8speaks avatar Jun 05 '24 17:06 pep8speaks

Linter Bot Results:

Hi @PicoCentauri! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/9388722067/job/25854613962


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

github-actions[bot] avatar Jun 05 '24 17:06 github-actions[bot]

This is related to #3169 but does different caching as proposed in #3169.

PicoCentauri avatar Jun 05 '24 17:06 PicoCentauri

@richardjgowers could you please have a look at this PR, looks like your wheelhouse.

If you don't have the bandwidth please un-assign yourself and let me know. Thanks!

orbeckst avatar Jun 11 '24 04:06 orbeckst

this failure here:

Mismatched elements: 2 / 9 (22.2%)
  Max absolute difference: 55.42303448
  Max relative difference: 26.25439738
   x: array([[21.286, 41.664, 40.465],
         [44.528, 43.426, 23.248],
         [57.534, 27.871, 53.767]])
   y: array([[21.286, 41.664, 40.465],
         [44.528, 43.426, 78.671],
         [ 2.111, 27.871, 53.767]], dtype=float32)

the x coordinate is off by 55.423 where the boxlength on that dimension is 55.423, so that just looks like a rounding error. Those two x coordinates are virtually identical, and this test doesn't seem to be trying to establish which image gets picked, it's just bad luck that this regression test could fall into either. (arguably there could be some sort of periodic assert coordinates equal function for things like this...)

The moment of inertia error is a little more tricky to pin down in a similar way, but I'd be suspicious that it is also a rounding error

I got these tests to fail locally, and then they started passing locally, so it looks a bit twitchy...

richardjgowers avatar Jun 25 '24 13:06 richardjgowers

Thanks for looking into this @richardjgowers. I can also try to debug why the .center_of_mass() doens't seem to create the cache entry.

PicoCentauri avatar Jun 26 '24 07:06 PicoCentauri