atomate2 icon indicating copy to clipboard operation
atomate2 copied to clipboard

Atomate2 OpenMM integration & broader classical MD framework

Open orionarcher opened this issue 11 months ago • 12 comments

Developed by @xperrylinn and @orionarcher

Summary

This PR builds out support for OpenMM with a framework that could be extended to support other MD codes. Namely LAMMPS, Amber, and Gromacs. Rough visual example here

Core ideas:

  • Use the openff.interchange.Interchange object as a core engine-agnostic representation of an MD simulation.
  • Define a generic ClassicalMDTaskDocument and generic Interchange creation functions, then hand off responsibility for evolving the simulation to workflows for each MD engine. Currently only OpenMM support is implemented.
  • Support low-throughput workflows by making workflows that can easily output to local directories.
  • Use prev_task to pass meta-data between jobs. Since OpenMM is programmatic, the core information is not stored in the local directory.
  • No InputSets. Since OpenMM is programmatic, InputSets and InputGenerators don't make conceptual sense.

What's implemented:

  • Integration of Atomate2 and OpenFF for classical MD simulations
  • Support for energy minimization, NPT, NVT, and temperature change jobs
  • Implementation of annealing and production workflows
  • Serialization and deserialization of OpenFF objects (Molecule, Topology, Interchange, Quantity) with monty
  • Utility functions for generating OpenFF Interchange objects and creating molecule specifications

To do / open questions:

  • [x] Need to add create conditional imports that fail nicely, where are some examples of this?

Future PR

  • [x] Store trajectories in additional_stores, if available
  • [x] The core openff.interchange.Interchange object can be larger than 16MB MongoDB doc limit. Need to implement a workaround by zipping the object and/or putting it on S3.
  • [x] migrate schema dependencies to emmet
  • [x] migrate utility dependencies to pymatgen

Related PRs:

  • PR to pymatgen would move most of atomate2.classical_md.utils upstream. PR #3729
  • PR to emmet would move atomate2.classical_md.schemas upstream. PR #975

Example usage

mol_specs_dicts = [
    {"smile": "CCO", "count": 10, "name": "ethanol"},
    {"smile": "O", "count": 200, "name": "water"},
]
inter_job = generate_interchange(mol_specs_dicts, mass_density=1)

production_maker = ProductionMaker(
    energy_maker=EnergyMinimizationMaker(),
    npt_maker=NPTMaker(steps=300000, pressure=1),
    anneal_maker=AnnealMaker.from_temps_and_steps(
        steps=1000000, anneal_temp=400, final_temp=300
    ),
    nvt_maker=NVTMaker(steps=5000000),
)

production_flow = production_maker.make(
    inter_job.output.interchange, 
    output_dir="my/directory/"
    prev_task=inter_job.output
)

wf = Flow([inter_job, production_flow])

run_locally(wf)

# will put outputs for the whole flow in "my/directory/"

Additional dependencies introduced

  • openmm
  • openff-toolkit
  • openff-interchange
  • packmol
  • openbabel

These are all necessary for the classical MD setup and execution workflow.

Checklist

Before a pull request can be merged, the following items must be checked:

  • [x] Code is in the standard Python style. The easiest way to handle this is to run the following in the correct sequence on your local machine. Start with running ruff and ruff format on your new code. This will automatically reformat your code to PEP8 conventions and fix many linting issues.
  • [x] Doc strings have been added in the Numpy docstring format. Run ruff on your code.
  • [x] Type annotations are highly encouraged. Run mypy to type check your code.
  • [x] Tests have been added for any new functionality or bug fixes.
  • [x] All linting and tests pass.

Note that the CI system will run all the above checks. But it will be much more efficient if you already fix most errors prior to submitting the PR. It is highly recommended that you use the pre-commit hook provided in the repository. Simply run pre-commit install and a check will be run prior to allowing commits.

orionarcher avatar Mar 20 '24 22:03 orionarcher

Hey @utf, this PR supercedes #717 and is now ready for feedback. Testing is passing locally but I need to fix some issues with the CI. I also need to flesh out the documentation/tutorials.

@rkingsbury might be of interest to you as well!

orionarcher avatar Mar 20 '24 22:03 orionarcher

Codecov Report

Attention: Patch coverage is 91.50943% with 45 lines in your changes missing coverage. Please review.

Project coverage is 77.71%. Comparing base (256b39a) to head (8ed3ab1). Report is 48 commits behind head on main.

:exclamation: Current head 8ed3ab1 differs from pull request most recent head f669907

Please upload reports for the commit f669907 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #782      +/-   ##
==========================================
+ Coverage   77.03%   77.71%   +0.68%     
==========================================
  Files         122      133      +11     
  Lines        9108     9638     +530     
  Branches     1429     1499      +70     
==========================================
+ Hits         7016     7490     +474     
- Misses       1663     1702      +39     
- Partials      429      446      +17     
Files Coverage Δ
src/atomate2/classical_md/__init__.py 100.00% <100.00%> (ø)
src/atomate2/classical_md/openmm/flows/__init__.py 100.00% <100.00%> (ø)
src/atomate2/classical_md/openmm/jobs/__init__.py 100.00% <100.00%> (ø)
...c/atomate2/classical_md/openmm/schemas/__init__.py 100.00% <100.00%> (ø)
src/atomate2/classical_md/openmm/schemas/tasks.py 100.00% <100.00%> (ø)
src/atomate2/classical_md/core.py 93.33% <93.33%> (ø)
src/atomate2/classical_md/openmm/jobs/core.py 91.80% <91.80%> (ø)
src/atomate2/classical_md/schemas.py 84.84% <84.84%> (ø)
src/atomate2/classical_md/openmm/flows/core.py 84.09% <84.09%> (ø)
src/atomate2/classical_md/openmm/jobs/base.py 94.06% <94.06%> (ø)
... and 1 more

... and 1 file with indirect coverage changes

codecov[bot] avatar Mar 22 '24 23:03 codecov[bot]

The OpenForceField toolkit can only be installed via conda, making the current testing framework unsuitable.

I initially tried migrating the whole test suite to use micromamba instead of pip. I got almost all of the tests passing but could not debug the tests for forcefields. Since I wasn't sure you'd want to migrate even if it worked, I decided to drop that line of work. Instead, I decided to exclude the classical_md tests from the test action and create a new action called test_classical_md that uses micromamba to install dependencies and then only executes the classical_md tests. This seems to be working.

orionarcher avatar Mar 22 '24 23:03 orionarcher

I am currently working on a builder for the classical_md code. I plan to stage the builder in atomate2 for now. Long term, I plan to migrate most of the code in utils.py to pymatgen and the builder code and TaskDoc code to emmet, mirroring the organization for the rest of the MP codes.

That said, I'd love to get this merged soon so that the builder code can be reviewed separately.

EDIT: I've decided to develop the builder code in an emmet PR instead. For now, I've made a PR for the schemas only here.

orionarcher avatar Mar 23 '24 20:03 orionarcher

As far as I can tell the failing tests are upstream and not due to this PR.

orionarcher avatar Mar 29 '24 22:03 orionarcher

Looks like a great contribution @orionarcher . I'm especially glad you included a tutorial as part of the PR. Is there any aspect you or @utf would find helpful for me to review specifically?

rkingsbury avatar Apr 05 '24 21:04 rkingsbury

From @rkingsbury, I'd love it if you could take a look at the pydantic schemas (now moved to emmet, here). Your MD experience would add a useful perspective.

@utf, it'd be very helpful to get your feedback on the changes to testing.yml. I added a new action and want to make sure it looks good to you.

Overall, I think this is pretty much done and could be merged. The outstanding features aren't critical and could be added in a later PR.

orionarcher avatar Apr 07 '24 18:04 orionarcher

From @rkingsbury, I'd love it if you could take a look at the pydantic schemas (now moved to emmet, here). Your MD experience would add a useful perspective.

Done! I left a review on the already merged PR here - https://github.com/materialsproject/emmet/pull/975#pullrequestreview-1994562429

rkingsbury avatar Apr 11 '24 15:04 rkingsbury

@utf would you mind taking a look at this?

I broke compatibility between emmet and atomate2 in the most recent commits but I will resolve it soon. If you'd prefer to look at something that is passing, please consider commit d0683c9 which has all of the core functionality.

I am starting to write builders in emmet and I'd like to make that no structural changes are required.

orionarcher avatar May 10 '24 18:05 orionarcher

Hi @utf, I've addressed several of the points you mentioned and am still mulling over the best solution for others. In the meantime, I'd appreciate your guidance on three issues.

  1. I migrated the testing to use micromamba and uv. This cuts the "Install dependencies" step from ~150 seconds to ~50 seconds. Further, the migration only broke a single cclib-related-test. If you could take a look or tag the appropriate code owner, that'd be much appreciated, it's outside my area of expertise. For now, I am ignoring the classical_md tests because they are dependent on upstream changes in emmet that have not been merged in.

  2. Currently, calcs_reversed is being used to store all calculations that preceded the task, which is not the intended behavior. I want to change that while preserving the job's history, ideally by storing all previous job_uuid. Is that possible? Can I access the job_uuid from inside the maker and store it in the TaskDoc? If not, how should I link the job to previous jobs?

  3. Where would be the best place to include a tutorial? I put it in /tutorials along with blob_storage.ipynb. That directory is also being tested in the CI. If there is a better place, or if the whole tutorials directory should be moved, lmk.

orionarcher avatar Jul 26 '24 20:07 orionarcher

I've now addressed all the points you brought up in your review. Most significantly:

  • splitting classical_md into openmm and openff.
  • replacing history tracking using calcs_reversed with a final collect_jobs job that gathers uuids and calcs
  • reworking the CI and dependencies

Once emmet is adjusted and tests are passing, this will be ready to merge! Guidance on the failing test and tutorial would still be appreciated.

orionarcher avatar Jul 30 '24 22:07 orionarcher

The failing test is very unusual - I wonder if @Andrew-S-Rosen has any ideas as it's related to cclib.

Have you tested installing the dependencies with pip rather than uv?

Uh... weird! Both main and this PR are using the same version of cclib, which hasn't been updated since March. I'm trying to find the source of the issue but am not successful thus far. [I recognize that the cclib interface isn't really used by anything despite my best intentions there, so it would be a shame for that to hold up a PR...].

Andrew-S-Rosen avatar Jul 31 '24 16:07 Andrew-S-Rosen

@utf is the best way forward to just disable the test here?

orionarcher avatar Sep 10 '24 02:09 orionarcher

And thank you for looking into it @Andrew-S-Rosen

orionarcher avatar Sep 10 '24 17:09 orionarcher

~@utf I believe I've addressed all comments and this is ready to merge.~

EDIT: hold off actually, I forgot that I disabled the openmm tests because a new emmet release is needed. Pinging the MP team.

orionarcher avatar Sep 11 '24 13:09 orionarcher

I've pinned the emmet pre-release to get tests passing. @utf are you good to merge this once the new emmet version is released and tested against? If so, maybe @janosh can merge once that happens so you don't need to circle back.

orionarcher avatar Sep 20 '24 15:09 orionarcher

This is now pinned to the most recent version of emmet and ready to merge @utf.

Your previous suggestion to split out OpenMM and OpenFF was very good. They are now independent and the OpenMM workflows support MLFFs and can return structures to better interoperate with the rest of Atomate2.

orionarcher avatar Sep 22 '24 17:09 orionarcher

This really is fantastic. Thank you very much @orionarcher. @janosh, once you're happy I think we can merge.

utf avatar Sep 23 '24 18:09 utf

Amazing work, thanks for sticking with this @orionarcher !

rkingsbury avatar Sep 24 '24 19:09 rkingsbury