openff-interchange
openff-interchange copied to clipboard
Reference data for tests
It is not currently clear to me what should be used as a reference state for testing interoperability. There is currently no reliable way to get system energies in an arbitrary format, i.e. there is no reference set of GROMACS files that new development can be compared to. Maybe a workaround is to compare energies across engines, i.e. use OpenMM energies, from the OpenFF Toolkit, as the reference, even to engines besides OpenMM?
Example:
Sections of two .top
files, written to different precisions, product (slightly) different energies
With the current internal writer (having copied precision from InterMol, which writes to 8 digits: https://github.com/shirtsgroup/InterMol/blob/dbbdd8207c1dc8218ba4b9f026746a88414a867d/intermol/gromacs/gromacs_parser.py#L644)
[ bonds ]
1 2 1 1.09288838e-01 3.17186185e+05
1 3 1 1.09288838e-01 3.17186185e+05
1 4 1 1.09288838e-01 3.17186185e+05
1 5 1 1.09288838e-01 3.17186185e+05
as compared to the output of ParmEd (which uses different logic: https://github.com/ParmEd/ParmEd/blob/7152537b3478c67cb34f7a0a89d32a5397d60295/parmed/gromacs/gromacstop.py#L1819-L1820)
[ bonds ]
1 2 1 0.10929 317186.185379
1 3 1 0.10929 317186.185379
1 4 1 0.10929 317186.185379
1 5 1 0.10929 317186.185379
These are numerically close, but they produce slightly different results when InterMol is used to evaluate the energies:
ipdb> pmd_energy
OrderedDict([('Bond', Quantity(value=0.092649519444, unit=kilojoule/mole)), ...
ipdb> internal_energy
OrderedDict([('Bond', Quantity(value=0.092514954507, unit=kilojoule/mole)), ...
These may or may not matter in terms of an actual simulation, but they do fail the default arguments of np.isclose
.
Related:
https://github.com/openforcefield/openff-system/pull/90#issuecomment-764852584
Temporary solution: play a game of telephone
omm_sys = forcefield.create_openmm_system(top)
struct = pmd.load_topology(top.to_openmm, omm_sys, positions)
struct.save('telephone.gro')
struct.save('telephone.top')
There's no way this is a sufficient solution for generating reference data, but it should be close enough to get things started.
I've also dodged an issue by loosening atol
, which is not a suitable long-term solution
2560b177ecf3c10e01f998cc85e006fb89741fed
A more serious issue is that we're not completely sure dihedrals (of a few forms) are process properly in an ForceField.create_openmm_system
-> pmd.openmm.load_topology
workflow. See a significant amount of confusion at this issue and also a minimal breaking case below https://github.com/openforcefield/openff-toolkit/issues/303
import parmed as pmd
from openff.toolkit.topology import Molecule
from openff.toolkit.typing.engines.smirnoff import ForceField
from parmed.gromacs import GromacsTopologyFile
mol = Molecule.from_smiles("OC=O")
mol.name = "FOO"
mol.generate_conformers(n_conformers=1)
top = mol.to_topology()
parsley = ForceField("openff_unconstrained-1.0.0.offxml")
openmm_sys = parsley.create_openmm_system(top)
struct = pmd.openmm.load_topology(
topology=top.to_openmm(), system=openmm_sys, xyz=mol.conformers[0]
)
num_pmd_impropers = len([d for d in struct.dihedrals if d.improper])
num_pmd_gmx_impropers = len(
[d for d in GromacsTopologyFile.from_structure(struct).dihedrals if d.improper]
)
num_openff_impropers = len(parsley["ImproperTorsions"].find_matches(top))
assert (
num_openff_impropers == num_pmd_impropers
), f"{num_openff_impropers} != {num_pmd_impropers}"
# AssertionError: 1 != 0
assert (
num_openff_impropers == num_pmd_gmx_impropers
), f"{num_openff_impropers} != {num_pmd_gmx_impropers}"
# AssertionError: 1 != 0
This is more serious as it limits how much testing we can actually do.
Some well-trodden territory: https://www.nist.gov/programs-projects/nist-standard-reference-simulation-website