blues icon indicating copy to clipboard operation
blues copied to clipboard

Refactor NCMC code ot eliminate code duplication?

Open jchodera opened this issue 8 years ago • 15 comments

We've moved all of the nonequilibrium switching functionality from perses into NonequilibriumLangevinIntegrat.

All of the nonequilibrium integrators can be moved out of ncmc_switching.py and replaced by calls to openmmtools.integrators.NonequilibriumLangevinIntegrator with the appropriate integrators splittings:

  • For g-BAOAB, this would be R V O H O V R, where the Hamiltonian update happens in the middle to make the protocol symmetric. We currently recommend this, neglecting shadow work and only including protocol work in the NCMC acceptance probability---this will be the topic of a paper we're trying to tackle ASAP).
  • For VVVR, this is O V R H R V O
  • For velocity Verlet, this is V R H R V
  • For GHMC (Metropolized VVVR) with alchemical updates, you want O { V R H R V } O

@maxentile, @bas-rustenburg, and @patrickgrinaway were integral in putting this together, and can help you if you need more info.

jchodera avatar Apr 23 '17 18:04 jchodera

Since we have to worry about external protocol work (external as in work from outside the lambda perturbations) we can't use the NonequilibriumLangevinIntegrat class directly. I think that there were some discussions about modifying the base code so that it can handle the external protocol work case, but it's not implemented as of yet. My temporary work around for this was to create a class that inherits from the above integrator so that it handles external protocol work. I could look to put it up once we are able to specify the integrator in the refactored code (which I'll open up another issue for).

sgill2 avatar Apr 24 '17 17:04 sgill2

@jchodera @pgrinaway -- thoughts on Sam's response? I would love to avoid code duplication but I'm not seeing an obvious way to do this here.

And @sgill2 - have you brought in your work-around yet or is that a "to do" item?

davidlmobley avatar Apr 24 '17 18:04 davidlmobley

My temporary work around for this was to create a class that inherits from the above integrator so that it handles external protocol work

This is what I would do. You can take a look at the ExternalPerturbationLangevinIntegrator if that is helpful. If you have trouble with that, I could help.

pgrinaway avatar Apr 24 '17 19:04 pgrinaway

Thanks for the input @pgrinaway! @davidlmobley I haven't brought the integrator into github yet. I'll go about putting up a WIP for it now.

sgill2 avatar Apr 24 '17 20:04 sgill2

Wait, where is your external work coming from?

jchodera avatar Apr 25 '17 13:04 jchodera

The external work should be zero in the case of a rotation/translation of a given whole fully non-interacting ligand. However, we're also looking to apply other moves, such as rotational side chain sampling, in which the angle/torsion energy changes would need to be accounted for by the external work.

sgill2 avatar Apr 25 '17 18:04 sgill2

However, we're also looking to apply other moves, such as rotational side chain sampling, in which the angle/torsion energy changes would need to be accounted for by the external work.

One idea would be to use Metropolis Monte Carlo on the lambda = 0 (or softened lambda) state for the sidechain. If you do this, you don't need to include the protocol work contributions since you Metropolize them out. It would also allow you to rapidly decorrelate the sidechain in the softened state. Then you continue your integration back from lambda = 0 -> 1 and just use the total protocol work from NonequilibriumIntegrator for acceptance.

jchodera avatar Apr 25 '17 18:04 jchodera

For the BLUES paper, however, you can probably just get away with using NonequilibriumIntegrator throughout.

jchodera avatar Apr 25 '17 18:04 jchodera

@jchodera - where is NonequilibriumIntegrator? Or do you mean NonequilibriumLangevinIntegrator?

@sgill2 - I think a key question is whether we can replace the ncmc_switching.py (with code which we'd then be responsible to maintain which (entirely? partially?) duplicates stuff the Chodera lab has elsewhere) with code directly from openmmtools so we can just introduce that as a dependency. This would be for the BLUES paper, which is just on rotational moves. Unless there is something we need that is not there, it strikes me that this would be preferable because we want to avoid unnecessary code duplication. Is there any functionality we need that's NOT in openmmtools?

Thanks.

davidlmobley avatar Apr 25 '17 18:04 davidlmobley

Sam and I just discussed this. We are enthusiastic about the idea of doing Metropolis Monte Carlo on the sidechain (that may make certain other aspects a bit cleaner/easier to deal with as well) as per https://github.com/MobleyLab/blues/issues/50#issuecomment-297127960 so we think we'll proceed in that direction.

That also removes the need for the NonequilibriumExternalLangevinIntegrator that #53 introduces, so we can update that PR to note that this integrator is vestigial/not currently used by the code.

This also means we can do as this issue's title suggests and migrate the code to use NonequilibriumLangevinIntegrator as our default, replacing ncmc_switching.py. @sgill2 remarks that in his tests on lysozyme he's getting similar acceptance to what he was getting before (though presumably acceptance will be better than before with more relaxation) so at least that shouldn't break anything, though we're going to get @nathanmlim to look at it on another test system.

So, long story short, I think we can kill a couple birds with one stone here and move to NonequilibriumLangevinIntegrator to: (a) move to the better integrator (b) remove code duplication

And then separately, we'll use MMC for the sidechain moves in the tangential project.

davidlmobley avatar Apr 25 '17 19:04 davidlmobley

Upon further discussion we think it might still be relevant to test a metropolized and external protocol approach and see how they perform in different cases. If we consider the sidechain case again, say a side chain rotation is performed at lambda=0 (for the sterics, electrostatics) and that rotamer is 5 kcals/mol more unfavorable. If we metropolized the move we'd often reject it. However if it turns out that at lambda=1 the intermolecular interactions contribute 6 kcals/mol the move would actually be accepted in the 'external protocol work' scheme since the overall energy change would be favorable.

sgill2 avatar Apr 25 '17 19:04 sgill2

@jchodera - where is NonequilibriumIntegrator? Or do you mean NonequilibriumLangevinIntegrator?

Whoops! Yes, NonequilibriumLangevinIntegrator.

This also means we can do as this issue's title suggests and migrate the code to use NonequilibriumLangevinIntegrator as our default, replacing ncmc_switching.py. @sgill2 remarks that in his tests on lysozyme he's getting similar acceptance to what he was getting before (though presumably acceptance will be better than before with more relaxation) so at least that shouldn't break anything, though we're going to get @nathanmlim to look at it on another test system.

That's surprising!

  • How long are the NCMC trajectories you were using?
  • What timestep?
  • Are you using explicit solvent?
  • What constraint tolerance for the integrator and PME tolerance?

jchodera avatar Apr 25 '17 22:04 jchodera

If we consider the sidechain case again, say a side chain rotation is performed at lambda=0 (for the sterics, electrostatics) and that rotamer is 5 kcals/mol more unfavorable. If we metropolized the move we'd often reject it. However if it turns out that at lambda=1 the intermolecular interactions contribute 6 kcals/mol the move would actually be accepted in the 'external protocol work' scheme since the overall energy change would be favorable.

I believe you can still use NonequilibriumLangevinIntegrator in that case. You can accumulate protocol work associated with the lambda changes during integration, and manually include the protocol work from the sidechain rotamer flip using the initial and final potential energies (from state.getPotentialEnergy()) before and after the Monte Carlo move. NonequilibriumLangevinIntegrator will correctly accumulate the protocol work from the lambda context parameter updates in this case (and even the shadow work too, if desired).

jchodera avatar Apr 25 '17 22:04 jchodera

Update: As noted in https://github.com/choderalab/openmmtools/issues/201#issuecomment-302725904, I got the step order wrong, and BAOAB based methods should have the order V R O R V and not the other way around!

jchodera avatar May 19 '17 15:05 jchodera

For the record, I believe the g-BAOAB scheme noted above is not represented correctly, per John in this thread: https://github.com/choderalab/openmmtools/issues/201#issuecomment-302738799

davidlmobley avatar May 19 '17 15:05 davidlmobley