mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

Integrate the result of the Distopia project into MDAnalysis

Open hmacdope opened this issue 3 years ago • 3 comments
trafficstars

Over in the Distopia project https://github.com/MDAnalysis/distopia we have created a bunch of code for calculating distances efficiently under periodic boundary conditions using SIMD. We should take the fruits of our labour and incorporate it into MDA!

As such this issue is a long term reminder to integrate with the result. My current thoughts are to have distopia as an optional dependency for MDAnalysis as a drop in monkey-patch for some of the distance code.

This is also a good place to discuss requirements for distopia integration on both the MDA and distopia side. Currently distopia uses some C++ 17 features which I think we will have to trim. It also requires a CMake build system. I am not so sure how well that integrates with Python packaging but perhaps scikit-build can help with this aspect. I also suspect we will only be able to support x86_64 SIMD at least in the short term

@richardjgowers and myself have also been discussing a potential re-work or partial fork of distopia to encourage simplicity and reduce the overall size and complexity of the package. I would love to hear peoples opinions and will keep this thread updated with any important discussion points.

hmacdope avatar Aug 22 '22 08:08 hmacdope

So my main questions are:

  • Are there any OS / compiler limitations? (i.e. will distopia work on Windows?)
  • Where are you planning on running CI for this?

The former doesn't matter too much. The latter is most important for me. We've already got a pretty heavy CI workflow, so any additional complexity needed here needs to be well scoped. I'm happy to get CI/CD and packaging done on the side of the core MDA library, but you'll get more complaints from me if it means having to rework CI/CD from scratch.

Currently distopia uses some C++ 17 features which I think we will have to trim.

If it's optional then it's not a problem to use newer C++ features imho. You're looking at what, gcc >= 7.0?

It also requires a CMake build system. I am not so sure how well that integrates with Python packaging

That's fine if you're deploying via conda-forge. PyPi will be harder but also at the same time that's probably not your normal target for a C++ orientated project. I can help with that step when you get to it.

IAlibay avatar Aug 22 '22 14:08 IAlibay

I should note, running CI on a cron job on the distopia side of things might not be feasible, iirc the current gh actions limitations means that CI will stop running after 3 weeks if the code hasn't been touched.

IAlibay avatar Aug 22 '22 14:08 IAlibay

Distopia supports gcc and clang but not MSVC (https://github.com/MDAnalysis/distopia/issues/41), so no windows support as of yet unless done on MinGW or some other trickery.

Im imagining an optional dependency so similar CI/CD burden to things like h5py, pytng etc etc. Distopia has its own CI in its own repo that tests for a match with mdanalysis.lib.distances so no need for super extensive validation tests IMHO. Does that answer the question?

Distopia does have a setup.py (currently not functional) so I imagine python packaging won't be the worst thing ee\ver.

hmacdope avatar Aug 23 '22 02:08 hmacdope

Update on the above Distopia is at 0.1.0 now and is probably ready to be incorporated into MDA. Just currently waiting on a conda recipe to be built conda-forge/staged-recipes#20804

hmacdope avatar Oct 26 '22 05:10 hmacdope

Thanks to @IAlibay for all the help with CI.

hmacdope avatar Oct 26 '22 05:10 hmacdope

We now have conda packages! https://github.com/conda-forge/distopia-feedstock

hmacdope avatar Nov 02 '22 21:11 hmacdope

@MDAnalysis/coredevs more open question is that should distopia provide a **drop-in replacement ** for the distance functions if installed or a selectable backend for the distance functions. The more I'm thinking about it the more a selectable backend makes sense.

hmacdope avatar Nov 13 '22 22:11 hmacdope

Selectable makes sense for reproducibility, given that distopia is only x86 IIRC — so if you want to be 100% sure that you're running the same code on x86 and whatever else (ARM M1?) then you should have a choice.

orbeckst avatar Nov 14 '22 16:11 orbeckst