atomate2 icon indicating copy to clipboard operation
atomate2 copied to clipboard

[WIP] Generalize forcefields for generic ASE calculator support

Open esoteric-ephemera opened this issue 1 year ago • 3 comments

This is a refactor to better support generic ASE calculators in atomate2. Major changes:

  • schemas / utils have been ported from the forcefields library to a separate ase library
  • Forcefield-specific jobs now inherit from ASE base classes, such as the ForceFieldTaskDocument from the base AseTaskDocument class
  • Example ASE jobs using the Lennard-Jones and GFNn-xTB (TBLite) calculators
  • The ForcefieldResult and ForcefieldObject classes alias ASE parent classes + deprecation warning

Most of the other moved classes are internals for atomate2, so I don't expect many important consequences there

To do:

  • Replicating all "standard" flows (elastic, eos, gruneisen, phonon) with a base ASE calculator class?
  • Tests for the ASE calculators currently in atomate2.ase.jobs (having some trouble getting TBLite working on an M1 mac)

esoteric-ephemera avatar Aug 02 '24 22:08 esoteric-ephemera

@esoteric-ephemera Great! Even though the outside usage of ForcefieldResult might be limited. Let's maybe still alias - just in case.

JaGeo avatar Aug 03 '24 04:08 JaGeo

Hi @JaGeo, restored the ForcefieldObejct and ForcefieldResult objects for now with a deprecation warning

esoteric-ephemera avatar Aug 05 '24 18:08 esoteric-ephemera

Hi @esoteric-ephemera, this is really nice to see in atomate2. Your implementation looks great to me. Would you be also happy to add static makers?

In terms of your todos:

Replicating all "standard" flows (elastic, eos, gruneisen, phonon) with a base ASE calculator class?

I'm not sure if this makes sense? We'd still need to make subclasses for specific implementations anyway, so I don't think a base implementation would add anything?

Tests for the ASE calculators currently in atomate2.ase.jobs (having some trouble getting TBLite working on an M1 Mac)

This would be great if you can figure it out.

utf avatar Aug 06 '24 12:08 utf

I'm not sure if this makes sense? We'd still need to make subclasses for specific implementations anyway, so I don't think a base implementation would add anything?

Sounds good, I'll just port the MD maker over to ase from forcefields and call it

Static makers have been added + tests

esoteric-ephemera avatar Aug 07 '24 04:08 esoteric-ephemera

Right now, I'm getting completely different behavior from tblite when running in CI vs. running locally (as in SCF doesn't converge in CI, works completely fine locally). I'm not sure if it's due to the pytorch-tblite incompatibility - see here for example.

For now, I've made the tblite dependence optional but left the relax / static / MD jobs + tests there. Happy to remove if preferable but I think it's good to have basic implementations of these in atomate2

esoteric-ephemera avatar Aug 12 '24 17:08 esoteric-ephemera

Thanks @esoteric-ephemera. Would it be much effort to mock the tblite call so it doesn't have to run in CI? I'll also tag @Andrew-S-Rosen as maybe he has got tblite working for the quacc CI?

Eitherway, if this won't be possible, I'm happy to merge as is.

utf avatar Aug 14 '24 08:08 utf

It's due to the PyTorch incompatability. Not much you can do there except to split up the tests so you don't have both packages installed in the same env.

Andrew-S-Rosen avatar Aug 14 '24 12:08 Andrew-S-Rosen

@Andrew-S-Rosen I feel this might be a general problem in the future, especially with all the force fields🫠

@utf maybe, we should think about installing and testing the forcefields separately as well if further conflicts show up.

JaGeo avatar Aug 14 '24 12:08 JaGeo

Good suggestions, I split off the tests and everything (incl. tblite) seems to be passing now. Not sure that the way I split the tests makes the most sense (like do the code-cov reports automatically merge?)

esoteric-ephemera avatar Aug 14 '24 22:08 esoteric-ephemera

Is this still WIP or can it be merged now?

utf avatar Aug 20 '24 09:08 utf

@utf should be ready for review!

esoteric-ephemera avatar Aug 20 '24 16:08 esoteric-ephemera

@esoteric-ephemera @utf what's the timeline for merging this PR? with latest numpy, scipy and pymatgen (https://github.com/materialsproject/pymatgen/pull/4009) now being 3.10+, would make sense imo for atomate2 to follow suit. given the size of this PR, that would likely result in merge conflicts so better the get this into main first

janosh avatar Sep 07 '24 16:09 janosh

question from looking at the changes in this PR: is the long-term goal to prevent further proliferation of Makers for each new MLFF that comes along (worthy goal imo)? having a single ASE(MD|Static|Relax)Maker which take a model name and a dict of model specific kwargs or (maybe simpler still) just a ready-to-go ASE-compatible Calculator seems much more maintainable. if that's the goal, i think this PR should deprecate CHGNetRelaxMaker and friends

https://github.com/materialsproject/atomate2/blob/808de7da064c488511326256552c3a7076f58d73/src/atomate2/forcefields/jobs.py#L202

janosh avatar Sep 07 '24 16:09 janosh

@janosh I would be happy about a deprecation period in such a case. We are using these Makers quite a lot and adapting codes/documentation will take time.

JaGeo avatar Sep 07 '24 16:09 JaGeo

@esoteric-ephemera @utf what's the timeline for merging this PR?

This is ready to go as is. I'd like to add NEB workflows to this (started for VASP here) if I had a few more days, but no need if we'd like to merge sooner.

is the long-term goal to prevent further proliferation of Makers for each new MLFF that comes along (worthy goal imo)? having a single ASE(MD|Static|Relax)Maker which take a model name and a dict of model specific kwargs or (maybe simpler still) just a ready-to-go ASE-compatible Calculator seems much more maintainable.

Yes exactly! The Calculator solution (+ model kwargs) is implemented in this PR. Having a LibXC-style lookup for forcefields is worthy of its own project, and is probably covered by MatGL for the graph-based MLFFs right?

It's early enough that we could deprecate the existing MLFF makers but I don't see a strong reason to because they're in high use, as @JaGeo said. Think it's better to focus on preventing future bloat with more MLFF-specific calculators.

esoteric-ephemera avatar Sep 09 '24 17:09 esoteric-ephemera

@esoteric-ephemera @janosh I am fine with deprecation. I would just prefer a deprecation period in this case.

JaGeo avatar Sep 09 '24 17:09 JaGeo

Having a LibXC-style lookup for forcefields is worthy of its own project, and is probably covered by MatGL for the graph-based MLFFs right?

i don't think so. matGL has chgnet and m3gnet but is unlikely to implement new models like 7net, ORB, Nequip, etc. at the rate they appear. many of these models have their own code bases. MatGL afaik aims to implement models with a consistent API internally, not act as as registry mapping model names to implementations in external code bases. but that's all we need and it already exists in atomate2 in the function ase_calculator

https://github.com/materialsproject/atomate2/blob/fdd2c9e2eed3f1fc631a3061f84d93e0898b2347/src/atomate2/forcefields/utils.py#L386-L458

I don't see a strong reason to because they're in high use

the strong reason imo would be to get a leaner/cleaner atomate2 code base and offer a single idiomatic way of using MLFFs through atomate2. to quote "The Zen of Python":

“There should be one—and preferably only one—obvious way to do it.”

janosh avatar Sep 09 '24 17:09 janosh

@janosh Ok! I see now - so yes we'll still maintain ase_calculator and add models to that as needed, but we won't necessarily make subclasses where the MLFF is pre-assigned. That's totally OK with me, let me get the deprecation warnings in there

Also re: NEB, do we want to wait on that for now or push forwards? It could easily go in a separate NEB-related PR so no rush on that

esoteric-ephemera avatar Sep 09 '24 18:09 esoteric-ephemera

It could easily go in a separate NEB-related PR

i think a separate PR makes a lot of sense

janosh avatar Sep 09 '24 18:09 janosh

Sounds good on NEB. I've added deprecation warnings to all the classes, and incorporated a suggestion from @orionarcher about the force_field_name attr of the static, relax and MD makers: the user can input these as either str(MLFF.<attr>) (e.g., "MLFF.CHGNet"), MLFF.<attr> as an enum (MLFF.CHGNet), or the forcefield name as a string (e.g., "CHGNet"). Makes sense to have that now since we're deprecating these classes, and it's easier for the user this way

esoteric-ephemera avatar Sep 09 '24 18:09 esoteric-ephemera

Codecov Report

Attention: Patch coverage is 83.83234% with 108 lines in your changes missing coverage. Please review.

Project coverage is 19.36%. Comparing base (8d57884) to head (763bcc6). Report is 20 commits behind head on main.

Files with missing lines Patch % Lines
src/atomate2/ase/md.py 83.14% 20 Missing and 10 partials :warning:
src/atomate2/ase/utils.py 82.50% 16 Missing and 12 partials :warning:
src/atomate2/ase/schemas.py 85.51% 17 Missing and 4 partials :warning:
src/atomate2/ase/jobs.py 75.00% 15 Missing and 3 partials :warning:
src/atomate2/forcefields/jobs.py 92.45% 2 Missing and 2 partials :warning:
src/atomate2/forcefields/md.py 87.09% 2 Missing and 2 partials :warning:
src/atomate2/forcefields/schemas.py 92.30% 2 Missing :warning:
src/atomate2/forcefields/utils.py 66.66% 1 Missing :warning:
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #940       +/-   ##
===========================================
- Coverage   75.70%   19.36%   -56.35%     
===========================================
  Files         147      160       +13     
  Lines       10925    11732      +807     
  Branches     1613     1734      +121     
===========================================
- Hits         8271     2272     -5999     
- Misses       2173     9319     +7146     
+ Partials      481      141      -340     
Files with missing lines Coverage Δ
src/atomate2/forcefields/utils.py 90.00% <66.66%> (-1.63%) :arrow_down:
src/atomate2/forcefields/schemas.py 94.44% <92.30%> (-4.41%) :arrow_down:
src/atomate2/forcefields/jobs.py 93.26% <92.45%> (-0.42%) :arrow_down:
src/atomate2/forcefields/md.py 83.82% <87.09%> (-0.84%) :arrow_down:
src/atomate2/ase/jobs.py 75.00% <75.00%> (ø)
src/atomate2/ase/schemas.py 85.51% <85.51%> (ø)
src/atomate2/ase/utils.py 82.50% <82.50%> (ø)
src/atomate2/ase/md.py 83.14% <83.14%> (ø)

... and 116 files with indirect coverage changes

codecov[bot] avatar Sep 09 '24 23:09 codecov[bot]

@esoteric-ephemera might be cleanest and fastest to branch off a pytest-split branch from this PR and open that in a separate PR, then git reset --hard to back out the splitting commits from here so we can review the force field overhaul and CI splitting stuff separately

janosh avatar Sep 16 '24 18:09 janosh

@janosh sure happy to do that - the tests were failing just because I forgot to ignore ase tests in the main tests. Would you rather the pytest stuff go in a separate PR?

esoteric-ephemera avatar Sep 16 '24 20:09 esoteric-ephemera

would give better git history but i'm happy either way (if it saves time)

janosh avatar Sep 16 '24 20:09 janosh

Hi @esoteric-ephemera. Thanks again for this. Is it ready to merge?

utf avatar Sep 24 '24 14:09 utf

Hi @utf should be ready to merge ~provided the tests pass (need to make sure these still run OK after resolving merge conflicts with OpenMM)~

esoteric-ephemera avatar Sep 24 '24 16:09 esoteric-ephemera

Thanks @esoteric-ephemera, this is very exciting to have!

utf avatar Sep 25 '24 14:09 utf

In that context: should we document this and a few other of our new workflows that are not VASP-based more clearly? I think a lot of people have interest in this general interface but might not discover it easily

JaGeo avatar Sep 25 '24 14:09 JaGeo

@JaGeo agreed, this should probably go in the docs for clarity. Also happy to add to the working paper

esoteric-ephemera avatar Sep 25 '24 16:09 esoteric-ephemera

@JaGeo agreed, this should probably go in the docs for clarity. Also happy to add to the working paper

Personally, I think it should be also in the paper. Thoughts, @utf ?

JaGeo avatar Sep 25 '24 16:09 JaGeo