[WIP] Generalize forcefields for generic ASE calculator support
This is a refactor to better support generic ASE calculators in atomate2. Major changes:
-
schemas/utilshave been ported from theforcefieldslibrary to a separateaselibrary - Forcefield-specific jobs now inherit from ASE base classes, such as the
ForceFieldTaskDocumentfrom the baseAseTaskDocumentclass - Example ASE jobs using the Lennard-Jones and GFNn-xTB (TBLite) calculators
- The
ForcefieldResultandForcefieldObjectclasses alias ASE parent classes + deprecation warning
Most of the other moved classes are internals for atomate2, so I don't expect many important consequences there
To do:
- Replicating all "standard" flows (elastic, eos, gruneisen, phonon) with a base ASE calculator class?
- Tests for the ASE calculators currently in
atomate2.ase.jobs(having some trouble getting TBLite working on an M1 mac)
@esoteric-ephemera Great!
Even though the outside usage of ForcefieldResult might be limited. Let's maybe still alias - just in case.
Hi @JaGeo, restored the ForcefieldObejct and ForcefieldResult objects for now with a deprecation warning
Hi @esoteric-ephemera, this is really nice to see in atomate2. Your implementation looks great to me. Would you be also happy to add static makers?
In terms of your todos:
Replicating all "standard" flows (elastic, eos, gruneisen, phonon) with a base ASE calculator class?
I'm not sure if this makes sense? We'd still need to make subclasses for specific implementations anyway, so I don't think a base implementation would add anything?
Tests for the ASE calculators currently in atomate2.ase.jobs (having some trouble getting TBLite working on an M1 Mac)
This would be great if you can figure it out.
I'm not sure if this makes sense? We'd still need to make subclasses for specific implementations anyway, so I don't think a base implementation would add anything?
Sounds good, I'll just port the MD maker over to ase from forcefields and call it
Static makers have been added + tests
Right now, I'm getting completely different behavior from tblite when running in CI vs. running locally (as in SCF doesn't converge in CI, works completely fine locally). I'm not sure if it's due to the pytorch-tblite incompatibility - see here for example.
For now, I've made the tblite dependence optional but left the relax / static / MD jobs + tests there. Happy to remove if preferable but I think it's good to have basic implementations of these in atomate2
Thanks @esoteric-ephemera. Would it be much effort to mock the tblite call so it doesn't have to run in CI? I'll also tag @Andrew-S-Rosen as maybe he has got tblite working for the quacc CI?
Eitherway, if this won't be possible, I'm happy to merge as is.
It's due to the PyTorch incompatability. Not much you can do there except to split up the tests so you don't have both packages installed in the same env.
@Andrew-S-Rosen I feel this might be a general problem in the future, especially with all the force fields🫠
@utf maybe, we should think about installing and testing the forcefields separately as well if further conflicts show up.
Good suggestions, I split off the tests and everything (incl. tblite) seems to be passing now. Not sure that the way I split the tests makes the most sense (like do the code-cov reports automatically merge?)
Is this still WIP or can it be merged now?
@utf should be ready for review!
@esoteric-ephemera @utf what's the timeline for merging this PR? with latest numpy, scipy and pymatgen (https://github.com/materialsproject/pymatgen/pull/4009) now being 3.10+, would make sense imo for atomate2 to follow suit. given the size of this PR, that would likely result in merge conflicts so better the get this into main first
question from looking at the changes in this PR: is the long-term goal to prevent further proliferation of Makers for each new MLFF that comes along (worthy goal imo)? having a single ASE(MD|Static|Relax)Maker which take a model name and a dict of model specific kwargs or (maybe simpler still) just a ready-to-go ASE-compatible Calculator seems much more maintainable. if that's the goal, i think this PR should deprecate CHGNetRelaxMaker and friends
https://github.com/materialsproject/atomate2/blob/808de7da064c488511326256552c3a7076f58d73/src/atomate2/forcefields/jobs.py#L202
@janosh I would be happy about a deprecation period in such a case. We are using these Makers quite a lot and adapting codes/documentation will take time.
@esoteric-ephemera @utf what's the timeline for merging this PR?
This is ready to go as is. I'd like to add NEB workflows to this (started for VASP here) if I had a few more days, but no need if we'd like to merge sooner.
is the long-term goal to prevent further proliferation of Makers for each new MLFF that comes along (worthy goal imo)? having a single ASE(MD|Static|Relax)Maker which take a model name and a dict of model specific kwargs or (maybe simpler still) just a ready-to-go ASE-compatible Calculator seems much more maintainable.
Yes exactly! The Calculator solution (+ model kwargs) is implemented in this PR. Having a LibXC-style lookup for forcefields is worthy of its own project, and is probably covered by MatGL for the graph-based MLFFs right?
It's early enough that we could deprecate the existing MLFF makers but I don't see a strong reason to because they're in high use, as @JaGeo said. Think it's better to focus on preventing future bloat with more MLFF-specific calculators.
@esoteric-ephemera @janosh I am fine with deprecation. I would just prefer a deprecation period in this case.
Having a LibXC-style lookup for forcefields is worthy of its own project, and is probably covered by MatGL for the graph-based MLFFs right?
i don't think so. matGL has chgnet and m3gnet but is unlikely to implement new models like 7net, ORB, Nequip, etc. at the rate they appear. many of these models have their own code bases. MatGL afaik aims to implement models with a consistent API internally, not act as as registry mapping model names to implementations in external code bases.
but that's all we need and it already exists in atomate2 in the function ase_calculator
https://github.com/materialsproject/atomate2/blob/fdd2c9e2eed3f1fc631a3061f84d93e0898b2347/src/atomate2/forcefields/utils.py#L386-L458
I don't see a strong reason to because they're in high use
the strong reason imo would be to get a leaner/cleaner atomate2 code base and offer a single idiomatic way of using MLFFs through atomate2. to quote "The Zen of Python":
“There should be one—and preferably only one—obvious way to do it.”
@janosh Ok! I see now - so yes we'll still maintain ase_calculator and add models to that as needed, but we won't necessarily make subclasses where the MLFF is pre-assigned. That's totally OK with me, let me get the deprecation warnings in there
Also re: NEB, do we want to wait on that for now or push forwards? It could easily go in a separate NEB-related PR so no rush on that
It could easily go in a separate NEB-related PR
i think a separate PR makes a lot of sense
Sounds good on NEB. I've added deprecation warnings to all the classes, and incorporated a suggestion from @orionarcher about the force_field_name attr of the static, relax and MD makers: the user can input these as either str(MLFF.<attr>) (e.g., "MLFF.CHGNet"), MLFF.<attr> as an enum (MLFF.CHGNet), or the forcefield name as a string (e.g., "CHGNet"). Makes sense to have that now since we're deprecating these classes, and it's easier for the user this way
Codecov Report
Attention: Patch coverage is 83.83234% with 108 lines in your changes missing coverage. Please review.
Project coverage is 19.36%. Comparing base (
8d57884) to head (763bcc6). Report is 20 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #940 +/- ##
===========================================
- Coverage 75.70% 19.36% -56.35%
===========================================
Files 147 160 +13
Lines 10925 11732 +807
Branches 1613 1734 +121
===========================================
- Hits 8271 2272 -5999
- Misses 2173 9319 +7146
+ Partials 481 141 -340
| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/atomate2/forcefields/utils.py | 90.00% <66.66%> (-1.63%) |
:arrow_down: |
| src/atomate2/forcefields/schemas.py | 94.44% <92.30%> (-4.41%) |
:arrow_down: |
| src/atomate2/forcefields/jobs.py | 93.26% <92.45%> (-0.42%) |
:arrow_down: |
| src/atomate2/forcefields/md.py | 83.82% <87.09%> (-0.84%) |
:arrow_down: |
| src/atomate2/ase/jobs.py | 75.00% <75.00%> (ø) |
|
| src/atomate2/ase/schemas.py | 85.51% <85.51%> (ø) |
|
| src/atomate2/ase/utils.py | 82.50% <82.50%> (ø) |
|
| src/atomate2/ase/md.py | 83.14% <83.14%> (ø) |
@esoteric-ephemera might be cleanest and fastest to branch off a pytest-split branch from this PR and open that in a separate PR, then git reset --hard to back out the splitting commits from here so we can review the force field overhaul and CI splitting stuff separately
@janosh sure happy to do that - the tests were failing just because I forgot to ignore ase tests in the main tests. Would you rather the pytest stuff go in a separate PR?
would give better git history but i'm happy either way (if it saves time)
Hi @esoteric-ephemera. Thanks again for this. Is it ready to merge?
Hi @utf should be ready to merge ~provided the tests pass (need to make sure these still run OK after resolving merge conflicts with OpenMM)~
Thanks @esoteric-ephemera, this is very exciting to have!
In that context: should we document this and a few other of our new workflows that are not VASP-based more clearly? I think a lot of people have interest in this general interface but might not discover it easily
@JaGeo agreed, this should probably go in the docs for clarity. Also happy to add to the working paper
@JaGeo agreed, this should probably go in the docs for clarity. Also happy to add to the working paper
Personally, I think it should be also in the paper. Thoughts, @utf ?