openff-toolkit icon indicating copy to clipboard operation
openff-toolkit copied to clipboard

Example for putting a ligand into a box of water and equilibrating it

Open j-wags opened this issue 4 years ago • 9 comments

@slochower shared the following script with me, which seems to touch on some very useful areas.

We should polish in into a full-fledged example!

Some complications toward making this into an example are:

  • It requires openff-evaluator, which is a non-standard dependency, for the packmol functionality. So either we'll need to decide on a pattern for specifying non-standard deps in examples, or we could make this an openff-evaluator example. I'd go with the former unless we get explicit approval from @SimonBoothroyd.
  • released versions of openff-evaluator are currently only compatible with the OLD openforcefield namespace, instead of openff-toolkit, though this will probably be resolved in the next release (so this will resolve itself in a few days/weeks)
  • It uses test_forcefields/tip3p.offxml, which isn't "approved for production", but it's used in internal code so frequently that may as well be. So someone will need to lead the charge on how we approve it, and an clear versioning scheme so we can make updates to it if necessary.
from openff.evaluator.protocols import coordinates
from openff.evaluator.utils import packmol


from openforcefield.topology import Molecule, Topology
from openforcefield.typing.engines.smirnoff import ForceField
from simtk import openmm
from simtk.openmm import app
from openff.evaluator import unit


mol = Molecule.from_smiles("CCO")
water = Molecule.from_smiles("O")

molecules = [mol, water]
n_molecules = [1, 1000]
trajectory, residue_names = packmol.pack_box(
    molecules=molecules,
    number_of_copies=n_molecules,
    mass_density=0.95 * unit.grams / unit.milliliters,
    verbose=True,
    working_directory="packmol",
    retain_working_files=True,
)

from simtk.openmm.app import PDBFile
pdb = PDBFile("packmol/packmol_output.pdb")



openmm_topology = trajectory.topology.to_openmm()
openff_topology = Topology.from_openmm(openmm_topology, unique_molecules=[mol, water])


from simtk import unit
trajectory.unitcell_vectors * unit.nanometer
vectors = [trajectory.unitcell_vectors[0][0], trajectory.unitcell_vectors[0][1], trajectory.unitcell_vectors[0][2]] * unit.nanometer
openff_topology.box_vectors = vectors


forcefield = ForceField("openff-1.2.0.offxml", 'test_forcefields/tip3p.offxml')
system = forcefield.create_openmm_system(openff_topology)

barostat = openmm.MonteCarloBarostat(1.00*unit.bar, 293.15*unit.kelvin, 25)
system.addForce(barostat)

print(system.usesPeriodicBoundaryConditions())


from simtk import openmm, unit

time_step = 2*unit.femtoseconds
temperature = 300*unit.kelvin
friction = 1/unit.picosecond
integrator = openmm.LangevinIntegrator(temperature, friction, time_step)

num_steps = 10000

trj_freq = 50
data_freq = 10

simulation = openmm.app.Simulation(openmm_topology, system, integrator)
simulation.context.setPositions(trajectory.openmm_positions(0))

simulation.context.setVelocitiesToTemperature(temperature)

pdb_reporter = openmm.app.PDBReporter('trajectory.pdb', trj_freq)
state_data_reporter = openmm.app.StateDataReporter('data.csv', data_freq, step=True,
                                                   potentialEnergy=True, temperature=True,
                                                   density=True)
simulation.reporters.append(pdb_reporter)
simulation.reporters.append(state_data_reporter)


import time

print("Starting simulation")
start = time.process_time()

for i in range(num_steps):
    simulation.step(1)
    if i%200 == 0:
        print(simulation.context.getState().getPeriodicBoxVectors())

end = time.process_time()
print("Elapsed time %.2f seconds" % (end-start))

j-wags avatar Feb 18 '21 22:02 j-wags

I'd love to see the PACKMOL wrapper moved somewhere else (maybe split off into its own tiny package) if it's going to be used outside of evaluator long-term (I think it will be).

#716 is at least one blocker for moving test_forcefields/tip3p.offxml into some pseudo-approved state (which is a topic that should be spun out into another issue if we want to do that).

mattwthompson avatar Feb 19 '21 18:02 mattwthompson

I'd say putting TIP3P parameters somewhere "official" (or, you know, just providing a little more polished way of accessing them) ought to be a high priority. It would make rolling out this sort of script in industry a lot more frictionless (avoiding a lot of 🤔🙄 and associated thinking that things "aren't ready yet"...)

slochower avatar Feb 19 '21 18:02 slochower

Pain points were:

  1. Running packmol on its own and reading the PDB output was not handled directly by the toolkit, leading to using Simon's really nice solvation wrapper that returns an mdtraj.trajectory.
  2. Having the OpenFF topology complain either about the first ligand atom "C is an unusual molecule..." or the first atom in the first water molecule "O is an unusual molecule..."
  3. Inadvertently running with SMIRNOFF water the first time around.
  4. Determining the right way to implement box vectors.
  5. Forgetting to apply a barostat (this became apparent quickly and is more of an OpenMM thing than anything, but most people probably want to be running NPT).
  6. (Soon to be irrelevant) Having to step back to an earlier version of the toolkit to use openff-evaluator.

slochower avatar Feb 19 '21 18:02 slochower

  1. Trying to do some downstream analysis after having atom names automatically assigned. It would be super handy to be able to select atoms in an OpenFF Molecule or Topology by SMARTS pattern and then get those atom names or indices. If I go through this workflow and then, for example, want to monitor hydrogen bonds made by a particular atom, I'm not sure if there is a programmatic way of finding/selecting it by atom name or index (e.g., "what are the assigned atom names for the atoms matching hydroxyl SMARTS [OX2H]?")

slochower avatar Feb 19 '21 20:02 slochower

Hi @slochower, thanks for sharing the code. BTW, when I tried to reproduce the code, I couldn't import evaluator.packmol. I'm using openff0.9.0. Which version of openff do you use? Thanks,

iwatobipen avatar Feb 20 '21 12:02 iwatobipen

Hi @iwatobipen and thanks for sharing your code, which I've benefited from multiple times. Exactly right, this won't (yet) work with the latest OpenFF toolkit. The next version of openff-evaluator will support the latest version of openff-toolkit, but for this, I had to step back to toolkit version 0.7.0. I think @SimonBoothroyd is going to post a new release of openff-evaluator in the next few days.

slochower avatar Feb 20 '21 16:02 slochower

Hi @iwatobipen -- You should be able to make an environment that can run this by calling

conda create -n evaluator -c conda-forge -c omnia openff-evaluator
conda activate evaluator

Apologies for the rough edges -- We'll clear up the dependencies and versioning once this gets packaged as a full example notebook!

j-wags avatar Feb 20 '21 17:02 j-wags

Hi @j-wags, Thanks for your suggestion! I could reproduce the code on my PC after installing openff-evaluator and cmiles!!!! It's cool. Looking forward to update example notebook ;) Thanks

iwatobipen avatar Feb 21 '21 05:02 iwatobipen

I have yoinked this into an example in Interchange: https://github.com/openforcefield/openff-interchange/blob/v0.2.0-rc.2/examples/ligand_in_water/ligand_in_water.ipynb

I think this issue should be closed unless there are plans to duplicate that effort into an example that lives here as well.

mattwthompson avatar Jul 07 '22 15:07 mattwthompson