PyAutoFEP icon indicating copy to clipboard operation
PyAutoFEP copied to clipboard

Solvation issues and enhancements

Open insukjoung opened this issue 4 years ago • 6 comments

Hi, First of all, thank you for sharing your precious source code.

I am having trouble in topology generation regarding solvation.

https://github.com/luancarvalhomartins/PyAutoFEP/blob/fcc52e914fac9edebc1675dd032421915a2986f4/prepare_dual_topology.py#L1648 In the line, None is passed to guess_water_box, but passing solvate_data['water_model'] might be better. In case guess_water_box fails to guess the type of water, the former cannot help but the latter can resolve the issue as far as the user provides water model.

https://github.com/luancarvalhomartins/PyAutoFEP/blob/fcc52e914fac9edebc1675dd032421915a2986f4/prepare_dual_topology.py#L3364-L3366 Here, ion_conentration: 0 should be ion_concentration: arguments.buildsys_ionconcentration to be consistent with complex solvation.

Please let me know if the issues can be resolved without changing the source code.


Checklist for closing this (Added by @luancarvalhomartins so GitHub can track completion, hope you don't mind me editing your issue):

  • [ ] Allowing for setting ionic strength of the water leg (independent of complex leg)
  • [x] Passing arguments.buildsys_water to guess_water_box (1074ad4cb569c56b3a9b1f4fbcbbf05e15a4b2c6)
  • [x] Allow for arbitrary water molecule names (Added 2022-02-09) (189fc2f1dbc044cef0dc34dbfcd105ed862e0089)
  • [ ] Run gmx genion for solvated systems
  • [ ] Rewrite parts of prepare_dual_topolopy.py to avoid assuming the molecule name of the protein (how exactly?)

insukjoung avatar Dec 29 '21 01:12 insukjoung

I found other related issues. For pre-solvated systems, gmx genion must be run to neutralize the complex system. Currently, this step is skipped if the system is pre-solvated. When the ligand is charged, it has an effect. The required argument the run must have is -conc 0 -neutral.

insukjoung avatar Jan 01 '22 13:01 insukjoung

Here is another related problem.

The topology file generated in the protein leg. The order of LIG is misplaced like:

[ molecules ]
; Compound        #mols
LIG                 1
system1             1
system2             1
SOL         11092
K                   40
CL                  30

The correct order should be like:

[ molecules ]
; Compound        #mols
system1             1
system2             1
LIG                 1
SOL         11092
K                   40
CL                  30

The order is correct in the pdb file though. I put presolvated_protein_chains = 2 in the input configuration file, but I am not sure if the parameter is actually utilized.

insukjoung avatar Jan 01 '22 14:01 insukjoung

Here is another related problem.

The topology file generated in the protein leg. The order of LIG is misplaced like:

[ molecules ]
; Compound        #mols
LIG                 1
system1             1
system2             1
SOL         11092
K                   40
CL                  30

The correct order should be like:

[ molecules ]
; Compound        #mols
system1             1
system2             1
LIG                 1
SOL         11092
K                   40
CL                  30

The order is correct in the pdb file though. I put presolvated_protein_chains = 2 in the input configuration file, but I am not sure if the parameter is actually utilized.

https://github.com/luancarvalhomartins/PyAutoFEP/blob/fcc52e914fac9edebc1675dd032421915a2986f4/prepare_dual_topology.py#L1900 The correct molecule name should starts with 'Protein' according to the line. Then, please ignore my previous comment.

insukjoung avatar Jan 01 '22 14:01 insukjoung

Thank you for your detailed report. Sorry for the tardiness, I was taking some days off for the holidays.

Here, ion_conentration: 0 should be ion_concentration: arguments.buildsys_ionconcentration to be consistent with complex solvation.

The use of ion_conentration: 0 was added on purpose, as the water system is often very small and achieving meaningful ionic strengths may be problematic. But I agree it would be useful to allow for selecting ions in the water leg, as long as this is somehow independent from the complex leg ionic strength.

In the line, None is passed to guess_water_box, but passing solvate_data['water_model'] might be better. In case guess_water_box fails to guess the type of water, the former cannot help but the latter can resolve the issue as far as the user provides water model.

Good catch, I am changing that in the next commit.

For pre-solvated systems, gmx genion must be run to neutralize the complex system. Currently, this step is skipped if the system is pre-solvated. When the ligand is charged, it has an effect. The required argument the run must have is -conc 0 -neutral.

Yep, I noted that in: https://github.com/luancarvalhomartins/PyAutoFEP/blob/fcc52e914fac9edebc1675dd032421915a2986f4/prepare_dual_topology.py#L3319 It is not something I would expect to be very common (ie, equilibrating a system with ligand bearing a different charge to the ligand series). But I fixing that.

The correct molecule name should starts with 'Protein' according to the line. Then, please ignore my previous comment.

Yes, that's a current limitation of the prepare_dual_topology.py. I hope I can remove that eventually, but I am not sure how exactly. May be the LIG line could be added before the SOL line (which would require rewriting of some code for pre_solvated systems and some testing for systems with other small molecules/DNA/cofactors). In case the protein name is not Protein, does make_ndx create a Protein group? If so, maybe using the same logic as make_ndx may solve problem. How does is sound? What do you think?

luancarvalhomartins avatar Jan 04 '22 16:01 luancarvalhomartins

Yes, that's a current limitation of the prepare_dual_topology.py. I hope I can remove that eventually, but I am not sure how exactly. May be the LIG line could be added before the SOL line (which would require rewriting of some code for pre_solvated systems and some testing for systems with other small molecules/DNA/cofactors). In case the protein name is not Protein, does make_ndx create a Protein group? If so, maybe using the same logic as make_ndx may solve problem. How does is sound? What do you think?

I am new to gromacs so I don't know if the molecule type can affect the behaviour of make_ndx. I think using pre-defined name is fine as far as it is documented. Maybe you can print out error messages instead of warnings if the name of molecule type is unknown. (I think is is indeed an error :grin:)

Quick test on make_ndx: Make_ndx uses gro file. I find [ Protein ] in the index file if I run gmx make_index -f FullSystem.gro -o index.ndx

insukjoung avatar Jan 05 '22 00:01 insukjoung

Checklist for closing this:

  • [ ] Allowing for setting ionic strength of the water leg (independent of complex leg)
  • [x] Passing arguments.buildsys_water to guess_water_box (1074ad4cb569c56b3a9b1f4fbcbbf05e15a4b2c6)
  • [x] Allow for arbitrary water molecule names (Added 2022-02-09) (189fc2f1dbc044cef0dc34dbfcd105ed862e0089)
  • [ ] Run gmx genion for solvated systems
  • [ ] Rewrite parts of prepare_dual_topolopy.py to avoid assuming the molecule name of the protein (how exactly?)

luancarvalhomartins avatar Feb 03 '22 15:02 luancarvalhomartins