yank
yank copied to clipboard
FIRE integrator does not fully minimize this system; ASE does
Hi,
I am helping a labmate to minimize a rather tricky dimer system that has been built out of the separate monomers by homology modelling (against PDB 3C7N
),and some manual refinement. The result is a structure with a handful of clashes that the builtin OpenMM minimizer fails to optimize: after three days, some regions of structure begins to collide and, obviously, the MD does not run.
So I was given the starting PRMTOP/INPCRD files and asked to try with another algorithm. (I cannot upload the files publicly, but I can email them to you if needed.)
I first tried to use the minimizers included in the ase
library by coupling it with a custom Calculator
designed to obtain the energy and forces with OpenMM (code below):
from simtk import openmm
class OpenMMCalculator(object):
def __init__(self, system, topology, integrator=None):
self.system = system
self.topology = topology
self._state = None
self._simulation = None
self.integrator = integrator or openmm.VerletIntegrator(0.001)
@property
def simulation(self):
if self._simulation is None:
simulation = openmm.app.Simulation(self.topology, self.system, self.integrator)
self._simulation = simulation
return self._simulation
def state(self, positions):
self.simulation.context.setPositions(positions * pmd.unit.angstrom)
self._state = self.simulation.context.getState(getEnergy=True, getForces=True)
return self._state
def get_potential_energy(self, atoms=None, force_consistent=False):
"""Return total energy.
Both the energy extrapolated to zero Kelvin and the energy
consistent with the forces (the free energy) can be
returned."""
return np.array(self.state(atoms.positions).getPotentialEnergy()._value)
def get_forces(self, atoms):
"""Return the forces."""
return np.array((self.state(atoms.positions).getForces() / 10)._value) # per angstrom!
def get_stress(self, atoms):
"""Return the stress."""
return 0.0
def calculation_required(self, atoms, quantities):
return True
With this calculator, I then called ase.optimize.FIRE
:
import os
import ase
import ase.optimize
import parmed as pmd
def minimize(prmtop, inpcrd, optimizer=ase.optimize.FIRE, fmax=0.05, steps=100000000,
ase_options=None, openmm_options=None):
"""
Minimize an biological structure built with AmberTools. Topology and parameters must be provided in
PRMTOP, coordinates preferrably in INPCRD, but others might work as well.
The optimization trajectory will be stored in a ``*.traj`` file. A restart file with the hessian
and more stuff will be saved to ``*.pckl``. If PCKL file already exists, it will be used as the
starting point of the hessian, so you might want to delete it across different attempts if the
filename is the same.
Also, a ``*.start.pdb`` and ``*.min.pdb`` will be generated with the original and optimized
states, respectively, so you can compare them. An INPCRD will be generated as well.
Parameters
----------
prmtop : str
Path to PRMTOP file (Amber topology + parameters)
inpcrd : str
Path to coordinates file (normally, INPCRD), in Angstrom.
optimizer : callable
Which ASE optimizer to use. Defaults to FIRE.
fmax : float
Max average force for convergence
steps : int
Max number of minimization steps before stopping
ase_options : dict
Keyword arguments that will be passed onto the optimizer initialization.
openmm_options : dict
Keyword arguments that will be passed on OpenMM's system creation (``prmtop.createSystem``).
See https://parmed.github.io/ParmEd/html/structobj/parmed.structure.Structure.html#parmed.structure.Structure.createSystem
Returns
-------
opt
The optimizer instance (so you can call opt.run() again, if needed)
ase_atoms
The ASE structure, with updated positions, for interactive inspection
"""
if ase_options is None:
ase_options = {}
if openmm_options is None:
openmm_options = {}
basename, ext = os.path.splitext(prmtop)
struct = pmd.load_file(prmtop, xyz=inpcrd)
struct.save(basename + '.start.pdb', overwrite=True)
system = struct.createSystem(**openmm_options)
ase_atoms = ase.Atoms(numbers=[a.element for a in struct.atoms],
positions=struct.coordinates,
calculator=OpenMMCalculator(system, struct.topology))
opt = optimizer(atoms=ase_atoms, trajectory=basename + '.traj',restart=basename + '.pckl', **ase_options)
opt.run(fmax=fmax, steps=steps)
struct.coordinates = ase_atoms.positions * pmd.unit.angstrom
struct.save(basename + '.min.pdb', overwrite=True)
struct.save(basename + '.min.inpcrd', overwrite=True)
return opt, ase_atoms
And called it with default arguments:
opt, atoms = minimize("my.prmtop", "my.inpcrd")
The code works, slowly but surely, and in a matter of hours found an acceptable minimum with almost no clashes (original one had 420+ bad contacts):
Step Time Energy fmax
*Force-consistent energies used in optimization.
FIRE: 0 16:54:59 1365569479182351.500000* 106934872482181840.0000
FIRE: 1 16:55:10 590213847001.369751* 17296490224396.2578
FIRE: 2 16:55:19 267267645210.039886* 5091715449409.0625
FIRE: 3 16:55:28 111928043775.791519* 1178028006299.3647
FIRE: 4 16:55:38 56106389020.855469* 316187642544.1710
FIRE: 5 16:55:47 30444412153.592461* 144719585812.3805
FIRE: 6 16:55:55 17118991257.313477* 60182437230.0057
FIRE: 7 16:56:04 10077047616.547144* 26912213062.6401
FIRE: 8 16:56:13 6223857399.995024* 12717554493.1037
FIRE: 9 16:56:22 4066335433.148547* 5194503618.2014
FIRE: 10 16:56:31 2798536494.639191* 3695926848.3093
FIRE: 11 16:56:40 2001396047.852563* 2604284094.3127
FIRE: 12 16:56:49 1480023127.285435* 1572184698.6121
FIRE: 13 16:56:59 1134604545.295926* 1273793133.0214
FIRE: 14 16:57:09 916346526.171279* 1393729622.3804
FIRE: 15 16:57:18 866011451.974753* 4000822992.0231
FIRE: 16 16:57:27 525944578.346811* 359840483.0135
FIRE: 17 16:57:36 384395422.179283* 182973989.3622
FIRE: 18 16:57:44 280328366.963216* 128135244.0095
FIRE: 19 16:57:53 207690106.871335* 84754781.1432
FIRE: 20 16:58:02 157205478.730950* 97735933.7759
FIRE: 21 16:58:12 120696159.140697* 88362144.9791
FIRE: 22 16:58:26 93282713.623994* 61992599.7514
FIRE: 23 16:58:36 72651462.505971* 36500942.0208
FIRE: 24 16:58:45 57246023.437164* 20437869.7095
FIRE: 25 16:58:54 45662582.876512* 12099065.5792
FIRE: 26 16:59:05 36822586.265060* 7870923.0160
FIRE: 27 16:59:16 29980243.758955* 5978081.3436
FIRE: 28 16:59:26 24620308.002318* 4625243.5480
FIRE: 29 16:59:36 20374927.156667* 3760195.3303
FIRE: 30 16:59:47 16976843.260064* 3383586.2006
FIRE: 31 16:59:57 14232645.791040* 2844068.3251
FIRE: 32 17:00:08 12002135.594942* 2246132.9324
FIRE: 33 17:00:16 10181591.290781* 1686825.0257
FIRE: 34 17:00:27 8691181.801153* 1223212.5462
FIRE: 35 17:00:37 7467313.904709* 869799.0612
FIRE: 36 17:00:47 6458267.971686* 687065.9220
FIRE: 37 17:00:56 5621706.450192* 646957.7375
FIRE: 38 17:01:07 4923195.983598* 699277.9362
FIRE: 39 17:01:18 4334927.815135* 710919.8298
FIRE: 40 17:01:28 3834969.030372* 677466.7610
[...]
FIRE: 10659 08:28:29 -161944.308658* 86.5190
FIRE: 10660 08:28:34 -161944.698209* 23.9075
FIRE: 10661 08:28:38 -161944.723300* 16.2688
FIRE: 10662 08:28:43 -161944.747467* 3.4779
FIRE: 10663 08:28:47 -161944.738042* 9.6879
FIRE: 10664 08:28:51 -161944.738963* 8.9168
FIRE: 10665 08:28:56 -161944.741948* 7.4401
FIRE: 10666 08:29:01 -161944.750354* 5.3877
FIRE: 10667 08:29:05 -161944.756265* 2.9594
FIRE: 10668 08:29:10 -161944.761592* 2.1633
FIRE: 10669 08:29:14 -161944.769636* 2.1633
FIRE: 10670 08:29:19 -161944.769154* 3.7667
Seeing that this implementation is written in pure Python/numpy I wondered if somebody had already implemented that for GPU (I remembered reading an OpenMM issue where @jchodera talked about it), and found the FIREMinimizationIntegrator you have developed.
I grabbed the file, pasted the class in a Notebook and ran it like this:
from simtk.openmm import app
from simtk import openmm as mm
from simtk import unit
from sys import stdout
prmtop = app.AmberPrmtopFile("my.prmtop")
inpcrd = app.AmberInpcrdFile("my.inpcrd")
system = prmtop.createSystem()
integrator = FIREMinimizationIntegrator(tolerance=0.0001 * unit.kilojoules_per_mole / unit.nanometers)
sim = app.Simulation(prmtop.topology, system, integrator)
sim.context.setPositions(inpcrd.positions)
print('Starting energy:', sim.context.getState(getEnergy=True).getPotentialEnergy())
sim.reporters.append(app.StateDataReporter(stdout, 100, step=True,
potentialEnergy=True, speed=True, separator='\t'))
while integrator.getGlobalVariableByName('converged') < 1:
integrator.step(50)
The performance is impressive but stops minimizing very soon:
Starting energy: 251733888466944.0 kJ/mol
#"Step" "Potential Energy (kJ/mole)" "Speed (ns/day)"
100 2694168969216.0 0
200 2694168969216.0 8
300 2694168969216.0 7.9
400 2694168969216.0 7.94
500 2694168969216.0 7.94
600 2694168969216.0 7.95
700 2694168969216.0 7.94
800 2694168969216.0 7.94
900 2694168969216.0 7.96
1000 2694168969216.0 7.96
1100 2694168969216.0 7.95
If the system is created with nonbondedMethod=app.CutoffNonPeriodic, nonbondedCutoff=1.0*unit.nanometers
options, then the energies decrease better, but far from ideal:
Starting energy: 251734224011264.0 kJ/mol
#"Step" "Potential Energy (kJ/mole)" "Speed (ns/day)"
100 2424489902080.0 0
200 1742063271936.0 48.9
300 1424046292992.0 49.5
400 849651040256.0 50.2
500 631096934400.0 47.1
600 447762726912.0 46.2
700 361159458816.0 46.7
800 251577466880.0 46.6
900 177901469696.0 46.8
1000 131495706624.0 47.1
1100 131495706624.0 47
1200 131495706624.0 46.8
1300 131495706624.0 45.9
1400 131495706624.0 45.7
1500 131495706624.0 45.8
1600 131495706624.0 45.7
1700 131495706624.0 45.7
1800 131495706624.0 45.6
1900 131495706624.0 45.6
2000 131495706624.0 45.7
2100 131495706624.0 46.7
I first thought that maybe the units are not comparable, so I checked the structure for bad contacts and, while it had decreased from 400 to 200, it still was not comparable to ase
's result. I tried to couple it with OpenMM minimizer afterwards to see if the L-BFGS was able to get the energy down, and it got it to 54491.28125 kJ/mol
during the night, but then FIRE was unable to improve it further. I also tried with different platforms (CUDA, OpenCL, CPU), but the result is the same.
I am not very familiar with the FIRE algorithm so I then thought it had to do with the chosen parameters, but apparently they are the same in both implementations (besides units):
# ASE
class FIRE(Optimizer):
def __init__(self, atoms, restart=None, logfile='-', trajectory=None,
dt=0.1, maxmove=0.2, dtmax=1.0, Nmin=5, finc=1.1, fdec=0.5,
astart=0.1, fa=0.99, a=0.1, master=None, downhill_check=False,
position_reset_callback=None, force_consistent=None):
[...]
# YANK
class FIREMinimizationIntegrator(openmm.CustomIntegrator):
def __init__(self, timestep=1.0 * unit.femtoseconds, tolerance=None, alpha=0.1,
dt_max=10.0 * unit.femtoseconds, f_inc=1.1, f_dec=0.5, f_alpha=0.99, N_min=5):
[...]
So that's it, I guess I am doing something wrong implementation-wise, but I really want to use the FIRE minimizer because it looks impressive! Thanks for your help, Jaime.
I'm not sure, but I think that Yank is converged when the average force of the system is lower than a tolerance, while ASE checks the force for all atoms individually. I think yank also converges when the timestep becomes sufficiently small, but I can't spot the same check in the ASE code. If you're happy to email me your input I can check what the issue is!
Thanks for your answer! Yes, I can send the files via email. Can you send me your address?
In the meantime, I will check the average forces, but I'd say that 0.0001 * unit.kilojoules_per_mole / unit.nanometer
is low enough to avoid that stopping criterion. I think I have also tried with default tolerance
(0
, basically), and the results were similar (convergence with high energy).
Thanks again, Jaime.
Hmm ok maybe it's not the forces then, it's just the only difference I spotted at first look. If you send me over the files I will have more of a hunt!
Thanks!
Thanks for the reports! The FIRE minimizer implementation we use in YANK is still very crude and not at all fully validated yet, so please use it with caution. We should be able to look into this sometime soon! For now, I'd suggest using the OpenMM LocalEnergyMinimizer
even if it's slow.
@hannahbrucemacdonald: [email protected]
Hmm ok maybe it's not the forces then, it's just the only difference I spotted at first look. If you send me over the files I will have more of a hunt!
Thanks!
Got it! Mail sent! Thanks!
@jchodera: Thanks for the reports! The FIRE minimizer implementation we use in YANK is still very crude and not at all fully validated yet, so please use it with caution. We should be able to look into this sometime soon! For now, I'd suggest using the OpenMM
LocalEnergyMinimizer
even if it's slow.
Thanks for the suggestion! We have been using LocalEnergyMinimizer
successfully, but it is not working properly in this system (too bad clashes, probably). I was crossing my fingers and trying to see if maybe another minimizer was able to do at least something with the most severe clashes, and then go back to LocalEnergyMinimizer
, but to my surprise ase
's FIRE did a great work by itself. Having a minimizer implemented as an Integrator
is also very useful for energy reporting, something people keep requesting for some reason (at least here in our lab). 😅