BioSimSpace
BioSimSpace copied to clipboard
Are atomtypes meant to be updated when using repartitionHydrogenMass?
Hi,
I'm attempting to use HMR with FEP calculations for longer sampling times. I noticed when using the repartitionHydrogenMass method that the hydrogen masses in the [atomtypes] directive of the top file are not updated to reflect the scaled hydrogen masses (i.e., they are labelled as ~1Da rather than 4Da), whereas in the [atoms] directive they are labelled as ~4Da.
Example:
; Gromacs Topology File written by Sire
; File written 02/06/23 16:36:15
[ defaults ]
; nbfunc comb-rule gen-pairs fudgeLJ fudgeQQ
1 2 yes 0.5 0.833333
[ atomtypes ]
; name at.num mass charge ptype sigma epsilon
c 6 12.010700 0.000000 A 0.331521 0.413379
c3 6 12.010700 0.000000 A 0.339771 0.451035
ca 6 12.010700 0.000000 A 0.331521 0.413379
cl 17 35.453000 0.000000 A 0.346595 1.103739
h4 1 1.007940 0.000000 A 0.253639 0.067362
ha 1 1.007940 0.000000 A 0.262548 0.067362
hc 1 1.007940 0.000000 A 0.260018 0.087027
hn 1 1.007940 0.000000 A 0.110650 0.041840
n 7 14.006700 0.000000 A 0.318086 0.684502
nb 7 14.006700 0.000000 A 0.338417 0.393714
o 8 15.999400 0.000000 A 0.304812 0.612119
[ moleculetype ]
; name nrexcl
MOL 3
[ atoms ]
; nr type resnr residue atom cgnr charge mass
1 ca 1 MOL C 1 -0.143679 12.010000
2 ca 1 MOL C 2 0.052321 12.010000
3 ca 1 MOL C 3 -0.129079 8.986000
4 ca 1 MOL C 4 0.127521 12.010000
5 c 1 MOL C 5 0.691621 12.010000
6 ca 1 MOL C 6 -0.303379 8.986000
7 ca 1 MOL C 7 0.444121 8.986000
8 ca 1 MOL C 8 0.052321 12.010000
9 ca 1 MOL C 9 -0.096079 8.986000
10 ca 1 MOL C 10 -0.129079 8.986000
11 cl 1 MOL Cl 11 -0.059979 35.450000
12 o 1 MOL O 12 -0.550179 16.000000
13 n 1 MOL N 13 -0.460179 10.986000
14 nb 1 MOL N 14 -0.737079 14.010000
15 ca 1 MOL C 15 0.550121 12.010000
16 ca 1 MOL C 16 -0.320379 8.986000
17 n 1 MOL N 17 -0.551479 10.986000
18 c 1 MOL C 18 0.669021 12.010000
19 o 1 MOL O 19 -0.589179 16.000000
20 cl 1 MOL Cl 20 -0.059979 35.450000
21 ha 1 MOL H 21 0.156921 4.032000
22 ha 1 MOL H 22 0.182921 4.032000
23 h4 1 MOL H 23 0.025021 4.032000
24 ha 1 MOL H 24 0.147921 4.032000
25 ha 1 MOL H 25 0.156921 4.032000
26 hn 1 MOL H 26 0.330421 4.032000
27 ha 1 MOL H 27 0.177921 4.032000
28 hn 1 MOL H 28 0.330421 4.032000
29 c3 1 MOL C 29 -0.132779 8.986000
30 c3 1 MOL C 30 -0.091179 2.938000
31 c3 1 MOL C 31 -0.091179 2.938000
32 hc 1 MOL H 32 0.066621 4.032000
33 hc 1 MOL H 33 0.047121 4.032000
34 hc 1 MOL H 34 0.047121 4.032000
35 hc 1 MOL H 35 0.047121 4.032000
36 hc 1 MOL H 36 0.047121 4.032000
37 hc 1 MOL H 37 0.047121 4.032000
38 hc 1 MOL H 38 0.047121 4.032000
It's not clear to me whether this is intended, or whether this is a problem, hence why I'm raising this issue.
Thanks, Noah
Thanks, @noahharrison64. I'll take a look. The code just adjusts the mass
property, so I'll see how this is used by the AMBER topology parser. It may be pre-filling some info from the atomic number, e.g. just using the internal element information. If the entries in the [atoms]
section take precedence, then all should be okay, I'm just not 100% certain if this is the case in all circumstances.
(Sorry, meant to say the GROMACS topology parser.)
Yes, the mass is coming from the mass of the element, which isn't modified. I think this is okay, though, since the mass in the [atoms]
section takes precedence. For example, this would be the case for an FEP simulation, since the mass in the [atoms]
section gives the modified mass in the given end state, whereas the [atomtypes]
section gives you the mass of the unmodified atom type. I'll see if I can think of something simple that will check that this is the case.
Hi Lester,
Thanks for getting back to me and checking this. Reassuring that [atoms] directive takes precedence over [atomtypes] and this shouldn't make a difference to FEP calcs.
Just to run something else by you - I was having instability issues with running a certain transformation (Methyl -> Propyl R group transformation) and this only evolved when I was using HMR and long timestep (i.e. was fine with 2fs and no HMR). This happens during initial equilibration, following energy minimisation. Do you think I could solve this by running a EM and short EQ with no HMR / short timestep, followed by repartitioning hydrogens and then a longer equilibration?
Hi there, It's hard to say, but it certainly wouldn't hurt to try. I could certainly imagine that HMR might cause issues if the system was poorly equilibrated to begin with.
@annamherz may be able to comment further. I would suggest a gentle equilibration protocol that raises temperature gradually with restraints on ligand and protein atoms to begin with, followed by gradual release. The following node may be useful https://github.com/michellab/BioSimSpace/blob/devel/nodes/playground/BSSEqBoundNode.ipynb
I would suggest a gentle equilibration protocol that raises temperature gradually with restraints on ligand and protein atoms to begin with, followed by gradual release. The following node may be useful https://github.com/michellab/BioSimSpace/blob/devel/nodes/playground/BSSEqBoundNode.ipynb
Hi Julien, thanks for weighing in.
Are you suggesting gentle equilibration with HMR and 4fs timestep? Or gentle equilibration followed by repartitioning and further equilibration?
Thanks, Noah
Equilibration with 4 fs HMR
I noticed when using the repartitionHydrogenMass method that the hydrogen masses in the [atomtypes] directive of the top file are not updated to reflect the scaled hydrogen masses (i.e., they are labelled as ~1Da rather than 4Da), whereas in the [atoms] directive they are labelled as ~4Da.
I think this is okay, though, since the mass in the
[atoms]
section takes precedence.
I think this is also true for the other atom types as well, eg c3 - the mass at the top cannot be the mass used as the mass used changes depending on how many H are attached and how it is repartitioned.
I was having instability issues with running a certain transformation (Methyl -> Propyl R group transformation) and this only evolved when I was using HMR and long timestep (i.e. was fine with 2fs and no HMR)
I have also been having instability issues for this type of perturbation as well, and it is able to proceed at a lower timestep with no HMR. I'm still not sure entirely what is causing the instabilities, but am currently investigating!
Hi @annamherz
I have also been having instability issues for this type of perturbation as well, and it is able to proceed at a lower timestep with no HMR. I'm still not sure entirely what is causing the instabilities, but am currently investigating!
Yep same here - no HMR and 2fs timestep is no problem. I've tried using gentle equilibration as suggested by Julien - restraint on the ligand and protein, followed by 500ps slow warming but I get crashes within the first 30ps (at 0K...) of simulation. I'm going to try with a HMR scaling factor of 3 and see if I can get anywhere.
Sounds good! I'm currently trying 2fs with HMR to see if it is the HMR itself that is the issue.
What do the trajectories look like when they crash? Are there any weird fluctuations or steric clashes?
I noticed that the default hydrogen mass with perses has been changed from 4 to 3 amu fairly recently due to stability issues. The example given in the issue looks very similar - a CH3 to CH2 perturbation resulting in both carbon and Hs of ~ 4 amu.
Okay good news - switching from HMR scaling factor of 4 to 3 prevents the solvent leg of the simulation breaking, while retaining the 4fs timestep! Feeling optimistic that this will be the same for the complex side as it was the ligand that was causing the issues - not the protein.
What do the trajectories look like when they crash? Are there any weird fluctuations or steric clashes?
My trajectories were crashing so quickly I actually wasn't producing any frames. The starting structures looked reasonable other than the fact it's a hybrid topology and so strange bond connectivity.
Okay good news - switching from HMR scaling factor of 4 to 3 prevents the solvent leg of the simulation breaking, while retaining the 4fs timestep! Feeling optimistic that this will be the same for the complex side as it was the ligand that was causing the issues - not the protein.
Hello! Doing this also meant my solvent leg was no longer failing, however my bound leg is still failing for the CH3 -> CH2-cyclopropyl with a factor of 3. I was wondering if you had any more luck?
My trajectories were crashing so quickly I actually wasn't producing any frames. The starting structures looked reasonable other than the fact it's a hybrid topology and so strange bond connectivity.
This is the same for me.
Hello! Doing this also meant my solvent leg was no longer failing, however my bound leg is still failing for the CH3 -> CH2-cyclopropyl with a factor of 3. I was wondering if you had any more luck?
Hey, the complex side was fine when using a re-scaling factor of 3. Odd that it's failing for you when the problematic part is the ligand, not the protein (at least this was the case for me). I am using simulated annealing in NVT to initially equilibrate my systems, followed by NPT at constant temperature.