openff-interchange
openff-interchange copied to clipboard
Missing PBC in bonded forces when using to_openmm_simulation causes large energies in polymers
Description and Output
I want to preface this by saying that I'm not sure if this issue is interesting to OpenFF, but I wanted to raise it here in case it is.
I'm using interchange to transfer a polymer sample (13337 atoms with component molecules of sizes 70, 80, 80, 147, 224, 425, 435, 800, and 11076) from GROMACS into OpenMM. When using Interchange to generate the simulation, I found very large initial potential energy (~1.6 billion kJ/mole). Breaking down the energy contributions by forces and whether or not OpenMM recognizes them as periodic:
Nonbonded force IS PERIODIC and has an associated potential of -8243.501361666567 kJ/mol
PeriodicTorsionForce IS NOT PERIODIC and has an associated potential of 32323.733276367188 kJ/mol
HarmonicAngleForce IS NOT PERIODIC and has an associated potential of 626883.4873046875 kJ/mol
HarmonicBondForce IS NOT PERIODIC and has an associated potential of 1606060598.0 kJ/mol
The total potential energy is 1606711561.7192194 kJ/mol
This suggested to me that bonded interactions spanning periodic boundaries are not being correctly assigned their true bond distances. When I manually added in forces in accordance with SMARTS strings in the Sage 2.2.1 XML file, I made sure to add force.setUsesPeriodicBoundaryConditions(True) for all three bonded forces, and it led to the following reasonable behavior:
HarmonicBondForce IS PERIODIC and has an associated potential of 3501.7182302474976 kJ/mol
HarmonicAngleForce IS PERIODIC and has an associated potential of 53255.296142578125 kJ/mol
PeriodicTorsionForce IS PERIODIC and has an associated potential of 31368.817489624023 kJ/mol
NonbondedForce IS PERIODIC and has an associated potential of -7977.904557150272 kJ/mol
The total potential energy is 80147.92730529938 kJ/mol
When trying to reproduce this issue for a smaller system, I found it interesting that the bad energy due to PBCs wasn't as bad. Only 516 harmonic bonds span the periodic box in the geometry above, leading to a contribution of around 3.1 million kJ/mol for each bond. But when I compare the same issue for a single ethane molecule in a box of the same size, this leads to a much smaller difference between the case when the carbon-carbon bond spans the boundary and when it does not.
For the spanning case using interchange:
Nonbonded force IS PERIODIC and has an associated potential of 2.367626820928623 kJ/mol
PeriodicTorsionForce IS NOT PERIODIC and has an associated potential of 2.384521451403998e-07 kJ/mol
HarmonicAngleForce IS NOT PERIODIC and has an associated potential of 0.18148180842399597 kJ/mol
HarmonicBondForce IS NOT PERIODIC and has an associated potential of 2.6079044342041016 kJ/mol
The total potential energy is 5.157013302008865 kJ/mol
For the centered (non-spanning) case using interchange:
Nonbonded force IS PERIODIC and has an associated potential of 2.372286502870795 kJ/mol
PeriodicTorsionForce IS NOT PERIODIC and has an associated potential of 2.384521451403998e-07 kJ/mol
HarmonicAngleForce IS NOT PERIODIC and has an associated potential of 0.20093707740306854 kJ/mol
HarmonicBondForce IS NOT PERIODIC and has an associated potential of 0.4211483895778656 kJ/mol
The total potential energy is 2.9943722083038744 kJ/mol
This suggests to me that the effects of the missing PBCs for bonded forces might be exacerbated in larger molecules.
Reproduction
To reproduce the above, I've attached three YAML files containing the molecular geometry and some code:
Large polymer: xlink_offmol.yaml Ethane centered in periodic box: ethane_centered_offmol.yaml Ethane C-C bond spanning periodic boundary: ethane_uncentered_offmol.yaml
import yaml
import numpy as np
import openff.toolkit as tk
from openff.interchange import Interchange
from openmm import *
from openmm.app import *
from openmm.unit import *
filename = 'offmol.yaml' # set as one of the three above
# read file into offmol
with open(filename, 'r') as file:
data = yaml.safe_load(file)
offmol = tk.Molecule.from_dict(data)
sage = tk.ForceField('openff_unconstrained-2.2.1.offxml')
# generate topology and interchange
topology = tk.Topology.from_molecules(offmol)
topology.box_vectors = tk.Quantity(np.diag([50.3391, 50.3391, 50.3391]), 'angstrom')
topology.is_periodic = True
topology.set_positions(offmol.conformers[0])
interchange = Interchange.from_smirnoff(
force_field = sage,
topology = topology,
charge_from_molecules = [offmol],
allow_nonintegral_charges = True
)
integrator = LangevinMiddleIntegrator(
300 * kelvin,
1/picosecond,
1 * femtosecond
)
omm_simulation = interchange.to_openmm_simulation(integrator = integrator)
# move each force into a different force group for accounting
for index, force in enumerate(omm_simulation.system.getForces()):
force.setForceGroup(index)
omm_simulation.context.reinitialize(preserveState=True) # reinitialize context with existing atom positions
# print information
total_potential_energy = 0.0 * kilojoule * mole ** -1
for index, force in enumerate(omm_simulation.system.getForces()):
state = omm_simulation.context.getState(getEnergy=True,groups={index})
to_print = f'{force.getName()}'
if not force.usesPeriodicBoundaryConditions():
to_print += ' IS NOT PERIODIC '
else:
to_print += ' IS PERIODIC '
to_print += f'and has an associated potential of {state.getPotentialEnergy()}'
print(to_print)
total_potential_energy += state.getPotentialEnergy()
print('The total potential energy is', total_potential_energy)
Software versions
- OpenFF and OpenMM through conda.
- OpenFF Interchange 0.3.22
- OpenMM 8.1.2
- Python 3.10.16
Interesting edge case! I didn't know that OpenMM needed to be told these forces are periodic. I assume that this causes bad physics, but you didn't say you were seeing explosions or surprising behavior from your polymers.
Here's the code that creates and populates openmm.HarmonicBondForce: https://github.com/openforcefield/openff-interchange/blob/0513b82da56d310dd14be08be10c200246fa2fc5/openff/interchange/interop/openmm/_valence.py#L50-L99. In the same file you'll see similarly-structured functions for angle and torsion terms.
The obvious idea for a fix is just setting the bonded forces' periodicity like you do, conditional on the box actually being periodic. (If this is the case, your stop-gap of setting these yourself in your code should be equivalent.) Does that sound reasonable to you?
The corresponding code for non-bonded forces is more complicated because of optional splitting of the NonbondedForce and some other features, but I notice in that file the option is not being set. It seems like OpenMM updates that based on the non-bonded method being used, something for which there is no analog in bonded forces?
>>> import openmm
>>> force = openmm.NonbondedForce()
>>> force.usesPeriodicBoundaryConditions()
False
>>> force.setNonbondedMethod(4)
>>> force.getNonbondedMethod()
4
>>> force.usesPeriodicBoundaryConditions()
True
Hey Matt, thank you very much for your response! I apologize in advance for the long reply because I wanted to address everything.
I do see bad physics after doing this :). Energy minimization takes ~30 minutes when using Interchange to create the OpenMM.System object, and only takes 3 seconds using my custom script. This is because the (already well-equilibrated) topology is drastically rearranged so that no bonds exit the periodic box. This also manifests in the actual production segments as well: the long polymeric chains refuse to exit the periodic box.
I find it very interesting that the effect is so minimal for small molecules. Although the bond energy is indeed still different depending on whether it crosses the periodic box (0.42 kJ/mol versus 2.6 kJ/mol in my example above for ethane), this is certainly not as influential as 3.1 million kJ/mol per bond. Some of my collaborators utilize Interchange.to_openmm_simulation to simulate diffusive phenomena in small molecule systems, where molecules cross periodic boundaries often in long production runs—and they have not noticed non-physicality due to the lack of PBCs for bonded forces. This further suggests to me that this might be a more significant problem in polymeric systems.
I think your suggested fix is very reasonable. In my manual version, I do the following:
harmonic_bond_force = HarmonicBondForce()
for bond_tuple in features['Bonds']:
smarts = features['Bonds'][bond_tuple] # read smarts feature descriptor
k_quantity = string_to_openmm_unit(ff_params['Bonds'][smarts]['k']).in_units_of(kilojoule/(nanometer**2*mole))
length_quantity = string_to_openmm_unit(ff_params['Bonds'][smarts]['length']).in_units_of(nanometer)
harmonic_bond_force.addBond(bond_tuple[0],bond_tuple[1], length_quantity, k_quantity)
harmonic_bond_force.setUsesPeriodicBoundaryConditions(True)
omm_system.addForce(harmonic_bond_force)
making sure to add the line `harmonic_bond_force.setUsesPeriodicBoundaryConditions(True)'.
Regarding the nonbonded forces, I think you are correct. I manually checked the 5 different methods listed API on NonbondedForce under NonbondedMethod, and the methods CutoffPeriodic, Ewald, and PME all automatically apply PBCs when you use them. My manual version has the following
nonbonded_cutoff = string_to_openmm_unit(ff_metadata['vdW']['cutoff'])
vdw_switching = string_to_openmm_unit(ff_metadata['vdW']['switch_width'])
electrostatic_14 = float(ff_metadata['Electrostatics']['scale14'])
vdw_14 = float(ff_metadata['vdW']['scale14'])
nonbonded_force = NonbondedForce()
for atom_index in range(len(features['vdW'])):
vdw_smarts = features['vdW'][atom_index]
charge = features['Electrostatics'][atom_index]
epsilon = string_to_openmm_unit(ff_params['vdW'][vdw_smarts]['epsilon']).in_units_of(kilojoule_per_mole)
sigma = rmin_half_to_sigma(string_to_openmm_unit(ff_params['vdW'][vdw_smarts]['rmin_half']).in_units_of(nanometer))
nonbonded_index = nonbonded_force.addParticle(charge, sigma, epsilon)
nonbonded_force.setCutoffDistance(nonbonded_cutoff)
nonbonded_force.setSwitchingDistance(vdw_switching)
nonbonded_force.setNonbondedMethod(NonbondedForce.PME)
omm_system.addForce(nonbonded_force)
bond_list = []
for bond_tuple in features['Bonds']:
bond_list += [list(bond_tuple)]
nonbonded_force.createExceptionsFromBonds(bond_list, electrostatic_14, vdw_14)
making sure to put in one of the three aforementioned periodic methods when calling nonbonded_force.setNonbondedMethod(NonbondedForce.PME).
I suspect you'd also want to flip that toggle for angle and torsion forces? Have you looked at these forces?
Below is the patch I intend to apply when I am able to block out some time for this - unfortunately this won't be for at least a few days - but most of the work will be curating a good set of tests. The starting point would probably be a polymer, like yours, which crosses the periodic boundaries.
diff --git a/openff/interchange/interop/openmm/_valence.py b/openff/interchange/interop/openmm/_valence.py
index 77a4374f..96954072 100644
--- a/openff/interchange/interop/openmm/_valence.py
+++ b/openff/interchange/interop/openmm/_valence.py
@@ -98,6 +98,8 @@ def _process_bond_forces(
k=k,
)
+ if interchange.box is not None:
+ harmonic_bond_force.setUsesPeriodicBoundaryConditions()
def _process_angle_forces(
interchange,
@@ -178,6 +180,9 @@ def _process_angle_forces(
k=k,
)
+ if interchange.box is not None:
+ harmonic_angle_force.setUsesPeriodicBoundaryConditions()
+
def _process_torsion_forces(interchange, openmm_sys, particle_map):
if "ProperTorsions" in interchange.collections:
@@ -232,6 +237,9 @@ def _process_proper_torsion_forces(interchange, openmm_sys, particle_map):
k / idivf,
)
+ if interchange.box is not None:
+ torsion_force.setUsesPeriodicBoundaryConditions()
+
def _process_rb_torsion_forces(interchange, openmm_sys, particle_map):
"""
@@ -268,6 +276,9 @@ def _process_rb_torsion_forces(interchange, openmm_sys, particle_map):
c5,
)
+ if interchange.box is not None:
+ rb_force.setUsesPeriodicBoundaryConditions()
+
def _process_improper_torsion_forces(interchange, openmm_sys, particle_map):
"""
@@ -306,6 +317,9 @@ def _process_improper_torsion_forces(interchange, openmm_sys, particle_map):
k / idivf,
)
+ if interchange.box is not None:
+ torsion_force.setUsesPeriodicBoundaryConditions()
+
def _is_constrained(
constrained_pairs: set[tuple[int, ...]],
Yes, I also made sure to set setUsesPeriodicBoundaryConditions(True) for both angle forces and torsion forces. This definitely created an energy difference when they were set to false.
Through Interchange:
HarmonicBondForce IS NOT PERIODIC and has an associated potential of 1606060598.0 kJ/mol HarmonicAngleForce IS NOT PERIODIC and has an associated potential of 626883.4873046875 kJ/mol PeriodicTorsionForce IS NOT PERIODIC and has an associated potential of 32323.733276367188 kJ/mol
With the manual method:
HarmonicBondForce IS PERIODIC and has an associated potential of 3501.7182302474976 kJ/mol HarmonicAngleForce IS PERIODIC and has an associated potential of 53255.296142578125 kJ/mol PeriodicTorsionForce IS PERIODIC and has an associated potential of 31368.817489624023 kJ/mol
All energy contributions from the bonded forces are significantly smaller.
Just wanted to weigh in here, especially given the proposed patch from: https://github.com/openforcefield/openff-interchange/issues/1335#issuecomment-3328738234
I didn't know that OpenMM needed to be told these forces are periodic.
For normal bonded force, OpenMM does not need to be told that they are periodic. Here is a good explainer on this situation: https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#how-do-periodic-boundary-conditions-work
Specifically the following:
Periodic boundary conditions are applied only to nonbonded forces, not bonded ones.
Consider a bond between two particles. Remember that each of them really represents an infinite grid of particles, but that doesn't mean every copy of the first particle is bonded to every copy of the second one. Each copy of the first particle is bonded to one specific copy of the second particle and no other. If a different copy of the second particle somehow manages to come closer than the copy it is bonded to, that doesn't mean the bond should suddenly "break" and transfer to a different copy.
(Actually, it is possible to apply periodic boundary conditions to bonded forces if you really want it. Just call setUsesPeriodicBoundaryConditions() on the force. But this is an unusual thing to do, and is only appropriate in special situations. For example, if you want to simulate an infinite molecule where one end is bonded to the next periodic copy of the other end, you would do it.)
There's also a great summary by Ivy Zhang here: https://github.com/openmm/openmm/issues/3367#issuecomment-987273859
Thanks for the reference. The below was especially helpful:
A consequence is that you must never break a molecule between two different periodic copies. I said it didn't matter whether a particle was at (9, 0, 0) or (-1, 0, 0). But if it is bonded to a particle at (1, 0, 0) then it does matter. One indicates a bond length of 8, while the other indicates a bond length of 2.
In the polymer topology I provided above, the large molecule with 11076 atoms was bonded to itself across periodic boundaries, which meant that there was no way to resolve the bonds without breaking it apart at certain sites.
I guess the question now is whether or not there is a significant disadvantage to declaring setUsesPeriodicBoundaryConditions as true for bonds, angles, and torsions in the appropriate cases. Re:
(Actually, it is possible to apply periodic boundary conditions to bonded forces if you really want it. Just call setUsesPeriodicBoundaryConditions() on the force. But this is an unusual thing to do, and is only appropriate in special situations. For example, if you want to simulate an infinite molecule where one end is bonded to the next periodic copy of the other end, you would do it.)
A good next step would be to see what happens when those are all turned on
if you want to simulate an infinite molecule where one end is bonded to the next periodic copy of the other end
IIUC this is common in polymer science so I disagree that it's a special case (there are lots of non-biological polymers out there that don't fit tidily into a <10 nm box!)
Looking through historical issues, Peter Eastman highlights two things:
- Performance issues
- The bond calculations being incorrect if they shouldn't be periodic.
Probably a question for the OpenMM issue tracker though.
IIUC this is common in polymer science so I disagree that it's a special case (there are lots of non-biological polymers out there that don't fit tidily into a <10 nm box!)
This is certainly true for me, and I'm sure many others who would like to simulate polymers in MD.
A good next step would be to see what happens when those are all turned on
My manual version does this, but I haven't seen the performance issues or nonphysicality mentioned by @IAlibay.
@galioguo Thanks again for bringing this to our attention - I don't expect to soon have the time to work on this (with the amount of research and testing it would necessitate). For now I'll recommend you continue making this modification in your code.
If this is turns out to be a frequent issue with others studying these sorts of periodic polymers, I'm happy to revisit