atomate2
atomate2 copied to clipboard
Atomate2 OpenMM integration & broader classical MD framework
Developed by @xperrylinn and @orionarcher
Summary
This PR builds out support for OpenMM with a framework that could be extended to support other MD codes. Namely LAMMPS, Amber, and Gromacs. Rough visual example here
Core ideas:
- Use the
openff.interchange.Interchange
object as a core engine-agnostic representation of an MD simulation. - Define a generic
ClassicalMDTaskDocument
and genericInterchange
creation functions, then hand off responsibility for evolving the simulation to workflows for each MD engine. Currently only OpenMM support is implemented. - Support low-throughput workflows by making workflows that can easily output to local directories.
- Use
prev_task
to pass meta-data between jobs. Since OpenMM is programmatic, the core information is not stored in the local directory. - No
InputSets
. Since OpenMM is programmatic,InputSets
andInputGenerators
don't make conceptual sense.
What's implemented:
- Integration of Atomate2 and OpenFF for classical MD simulations
- Support for energy minimization, NPT, NVT, and temperature change jobs
- Implementation of annealing and production workflows
- Serialization and deserialization of OpenFF objects (Molecule, Topology, Interchange, Quantity) with
monty
- Utility functions for generating OpenFF Interchange objects and creating molecule specifications
To do / open questions:
- [x] Need to add create conditional imports that fail nicely, where are some examples of this?
Future PR
- [x] Store trajectories in
additional_stores
, if available - [x] The core
openff.interchange.Interchange
object can be larger than 16MB MongoDB doc limit. Need to implement a workaround by zipping the object and/or putting it on S3. - [x] migrate schema dependencies to emmet
- [x] migrate utility dependencies to pymatgen
Related PRs:
- PR to
pymatgen
would move most ofatomate2.classical_md.utils
upstream. PR #3729 - PR to
emmet
would moveatomate2.classical_md.schemas
upstream. PR #975
Example usage
mol_specs_dicts = [
{"smile": "CCO", "count": 10, "name": "ethanol"},
{"smile": "O", "count": 200, "name": "water"},
]
inter_job = generate_interchange(mol_specs_dicts, mass_density=1)
production_maker = ProductionMaker(
energy_maker=EnergyMinimizationMaker(),
npt_maker=NPTMaker(steps=300000, pressure=1),
anneal_maker=AnnealMaker.from_temps_and_steps(
steps=1000000, anneal_temp=400, final_temp=300
),
nvt_maker=NVTMaker(steps=5000000),
)
production_flow = production_maker.make(
inter_job.output.interchange,
output_dir="my/directory/"
prev_task=inter_job.output
)
wf = Flow([inter_job, production_flow])
run_locally(wf)
# will put outputs for the whole flow in "my/directory/"
Additional dependencies introduced
- openmm
- openff-toolkit
- openff-interchange
- packmol
- openbabel
These are all necessary for the classical MD setup and execution workflow.
Checklist
Before a pull request can be merged, the following items must be checked:
- [x] Code is in the standard Python style.
The easiest way to handle this is to run the following in the correct sequence on
your local machine. Start with running
ruff
andruff format
on your new code. This will automatically reformat your code to PEP8 conventions and fix many linting issues. - [x] Doc strings have been added in the Numpy docstring format. Run ruff on your code.
- [x] Type annotations are highly encouraged. Run mypy to type check your code.
- [x] Tests have been added for any new functionality or bug fixes.
- [x] All linting and tests pass.
Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply run
pre-commit install
and a check will be run prior to allowing commits.
Hey @utf, this PR supercedes #717 and is now ready for feedback. Testing is passing locally but I need to fix some issues with the CI. I also need to flesh out the documentation/tutorials.
@rkingsbury might be of interest to you as well!
Codecov Report
Attention: Patch coverage is 91.50943%
with 45 lines
in your changes missing coverage. Please review.
Project coverage is 77.71%. Comparing base (
256b39a
) to head (8ed3ab1
). Report is 48 commits behind head on main.
:exclamation: Current head 8ed3ab1 differs from pull request most recent head f669907
Please upload reports for the commit f669907 to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## main #782 +/- ##
==========================================
+ Coverage 77.03% 77.71% +0.68%
==========================================
Files 122 133 +11
Lines 9108 9638 +530
Branches 1429 1499 +70
==========================================
+ Hits 7016 7490 +474
- Misses 1663 1702 +39
- Partials 429 446 +17
Files | Coverage Δ | |
---|---|---|
src/atomate2/classical_md/__init__.py | 100.00% <100.00%> (ø) |
|
src/atomate2/classical_md/openmm/flows/__init__.py | 100.00% <100.00%> (ø) |
|
src/atomate2/classical_md/openmm/jobs/__init__.py | 100.00% <100.00%> (ø) |
|
...c/atomate2/classical_md/openmm/schemas/__init__.py | 100.00% <100.00%> (ø) |
|
src/atomate2/classical_md/openmm/schemas/tasks.py | 100.00% <100.00%> (ø) |
|
src/atomate2/classical_md/core.py | 93.33% <93.33%> (ø) |
|
src/atomate2/classical_md/openmm/jobs/core.py | 91.80% <91.80%> (ø) |
|
src/atomate2/classical_md/schemas.py | 84.84% <84.84%> (ø) |
|
src/atomate2/classical_md/openmm/flows/core.py | 84.09% <84.09%> (ø) |
|
src/atomate2/classical_md/openmm/jobs/base.py | 94.06% <94.06%> (ø) |
|
... and 1 more |
The OpenForceField toolkit can only be installed via conda, making the current testing framework unsuitable.
I initially tried migrating the whole test suite to use micromamba
instead of pip. I got almost all of the tests passing but could not debug the tests for forcefields
. Since I wasn't sure you'd want to migrate even if it worked, I decided to drop that line of work. Instead, I decided to exclude the classical_md
tests from the test
action and create a new action called test_classical_md
that uses micromamba
to install dependencies and then only executes the classical_md
tests. This seems to be working.
I am currently working on a builder for the classical_md code. I plan to stage the builder in atomate2
for now. Long term, I plan to migrate most of the code in utils.py
to pymatgen
and the builder code and TaskDoc code to emmet
, mirroring the organization for the rest of the MP codes.
That said, I'd love to get this merged soon so that the builder code can be reviewed separately.
EDIT: I've decided to develop the builder code in an emmet PR instead. For now, I've made a PR for the schemas only here.
As far as I can tell the failing tests are upstream and not due to this PR.
Looks like a great contribution @orionarcher . I'm especially glad you included a tutorial as part of the PR. Is there any aspect you or @utf would find helpful for me to review specifically?
From @rkingsbury, I'd love it if you could take a look at the pydantic schemas (now moved to emmet, here). Your MD experience would add a useful perspective.
@utf, it'd be very helpful to get your feedback on the changes to testing.yml
. I added a new action and want to make sure it looks good to you.
Overall, I think this is pretty much done and could be merged. The outstanding features aren't critical and could be added in a later PR.
From @rkingsbury, I'd love it if you could take a look at the pydantic schemas (now moved to emmet, here). Your MD experience would add a useful perspective.
Done! I left a review on the already merged PR here - https://github.com/materialsproject/emmet/pull/975#pullrequestreview-1994562429
@utf would you mind taking a look at this?
I broke compatibility between emmet and atomate2 in the most recent commits but I will resolve it soon. If you'd prefer to look at something that is passing, please consider commit d0683c9 which has all of the core functionality.
I am starting to write builders in emmet and I'd like to make that no structural changes are required.
Hi @utf, I've addressed several of the points you mentioned and am still mulling over the best solution for others. In the meantime, I'd appreciate your guidance on three issues.
-
I migrated the testing to use micromamba and uv. This cuts the "Install dependencies" step from ~150 seconds to ~50 seconds. Further, the migration only broke a single cclib-related-test. If you could take a look or tag the appropriate code owner, that'd be much appreciated, it's outside my area of expertise. For now, I am ignoring the
classical_md
tests because they are dependent on upstream changes inemmet
that have not been merged in. -
Currently,
calcs_reversed
is being used to store all calculations that preceded the task, which is not the intended behavior. I want to change that while preserving the job's history, ideally by storing all previousjob_uuid
. Is that possible? Can I access thejob_uuid
from inside the maker and store it in theTaskDoc
? If not, how should I link the job to previous jobs? -
Where would be the best place to include a tutorial? I put it in
/tutorials
along withblob_storage.ipynb
. That directory is also being tested in the CI. If there is a better place, or if the wholetutorials
directory should be moved, lmk.
I've now addressed all the points you brought up in your review. Most significantly:
- splitting
classical_md
intoopenmm
andopenff
. - replacing history tracking using calcs_reversed with a final
collect_jobs
job that gathers uuids and calcs - reworking the CI and dependencies
Once emmet is adjusted and tests are passing, this will be ready to merge! Guidance on the failing test and tutorial would still be appreciated.
The failing test is very unusual - I wonder if @Andrew-S-Rosen has any ideas as it's related to cclib.
Have you tested installing the dependencies with
pip
rather thanuv
?
Uh... weird! Both main
and this PR are using the same version of cclib, which hasn't been updated since March. I'm trying to find the source of the issue but am not successful thus far. [I recognize that the cclib interface isn't really used by anything despite my best intentions there, so it would be a shame for that to hold up a PR...].
@utf is the best way forward to just disable the test here?
And thank you for looking into it @Andrew-S-Rosen
~@utf I believe I've addressed all comments and this is ready to merge.~
EDIT: hold off actually, I forgot that I disabled the openmm tests because a new emmet release is needed. Pinging the MP team.
I've pinned the emmet pre-release to get tests passing. @utf are you good to merge this once the new emmet version is released and tested against? If so, maybe @janosh can merge once that happens so you don't need to circle back.
This is now pinned to the most recent version of emmet and ready to merge @utf.
Your previous suggestion to split out OpenMM and OpenFF was very good. They are now independent and the OpenMM workflows support MLFFs and can return structures to better interoperate with the rest of Atomate2.
This really is fantastic. Thank you very much @orionarcher. @janosh, once you're happy I think we can merge.
Amazing work, thanks for sticking with this @orionarcher !