openmm icon indicating copy to clipboard operation
openmm copied to clipboard

Clarification on whether modeller.addHydrogens method preserves stereochemistry

Open SCMusson opened this issue 3 years ago • 11 comments

Looking at the python script it looks like the modeller.addHydrogens method simply places new hydrogens ~0.1 nm away from the parent atom and then does a energy minimization. Is it not possible for the stereochemistry to be lost if the hydrogen was placed in an unlucky position? If true It renders the function useless to me.

SCMusson avatar Feb 14 '22 18:02 SCMusson

Can you clarify what you mean by stereochemistry being lost? Since it's only adding new hydrogens that aren't already there, it shouldn't lose anything. If the force field is sufficient to define the hydrogens' positions, energy minimization will put them in the correct place. And if it isn't, then the initial positions aren't very important because they'll move around during the simulation.

peastman avatar Feb 14 '22 19:02 peastman

All amino acids except glycine are chiral if the hydrogen are placed incorrectly it could lead to a mirror image.

SCMusson avatar Feb 14 '22 19:02 SCMusson

The force field should take care of that.

peastman avatar Feb 14 '22 20:02 peastman

@peastman : I don't know of any force fields that include terms to deal with stereochemistry. Improper torsions are often used for sp2 centers like nitrogens, but are symmetric. The only thing that preserves stereochemistry is the barrier to stereochemical inversion in most force fields.

We do need to be careful to preserve the stereochemistry when placing hydrogens, and we've seen stereochemistry invert.

Presumably, the non-hydrogen atoms are kept rigid as the hydrogen atoms are minimized. One possible solution would be to attempt multiple additions of a proton and keep the one that ends up with the lowest strain energy to reduce the probability that we invert stereochemistry at a site. We don't have an easy way of computing just the hydrogen's energy in the field of neighboring atoms, but there may be a simple clever solution to this.

jchodera avatar Feb 14 '22 20:02 jchodera

Presumably, the non-hydrogen atoms are kept rigid as the hydrogen atoms are minimized.

Correct.

peastman avatar Feb 14 '22 20:02 peastman

Correct.

Is there any way this could "trap" the hydrogen on the wrong (inverted stereochemistry) side of, say, an sp3 carbon (like an alpha carbon of an amino acid)? 1-2 LJ and electrostatics interactions would be excluded, so the atom in principle can pass through to the correct side, but since we're only using a local energy minimization, the presence of any ruggedness in the landscape would cause this to fail.

Another possibility could be to use a more robust minimizer that can deal with ruggedness, or one that introduces some noise (like Langevin dynamics) as we quench?

jchodera avatar Feb 14 '22 20:02 jchodera

I think it's best to test it out. Can you find a case where the minimization produces the wrong result in practice?

peastman avatar Feb 14 '22 20:02 peastman

I believe we have some scripts that helped us identify these cases in structures generated by comparative modeling. I'll see if I can dig those up.

jchodera avatar Feb 14 '22 20:02 jchodera

pdbfixer uses templates to add missing atoms. I presume that It is desirable to keep openmm modeller more generally applicable than to just proteins but could a similar method defined with general templates for sp, sp2, sp3 atoms work here?

SCMusson avatar Feb 15 '22 11:02 SCMusson

I don't know if this is a related issue but I get some odd results with the below code testing the forcefields with addHydrogens:

from openmm.app import *
from openmm import *
from openmm.unit import *

def test_addhydrogen(test_specify_ff = True, filename = 'example.pdb'):
    forcefield = Forcefield('amber14-all.xml') 
    pdb = PDBFile(filename)
    mod = Modeller(pdb.topology, pdb.positions)
    if test_specify_ff:
        mod.addHydrogens(forcefield)
    else:
        mod.addHydrogens()
    system = forcefield.createSystem(mod.topology)
    integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
    platform = Platform.getPlatformByName('CUDA')
    simulation = Simulation(mod.topology, system, integrator, platform)
    simulation.context.setPositions(mod.positions)
    return(simulation.context.getState(getEnergy=True).getPotentialEnergy()._value)
    
print(f'Specify forcefield: {str(test_addHydrogen(test_specify_ff = True))}')
print(f'Dont Specify forcefield: {str(test_addHydrogen(test_specify_ff = False))}')

gives outputs of:

Specify forcefield: -25359.3806 Dont specify forcefield: 195447.6351

SCMusson avatar Feb 15 '22 14:02 SCMusson

That's not surprising. If you specify the force field, it minimizes the energy for that particular force field. So it makes sense that leads to lower energy if you then evaluate the energy of that force field.

peastman avatar Feb 15 '22 16:02 peastman