auto_martini
auto_martini copied to clipboard
Bead division problem
I found that the latest version is not the same as the old version of the beads. Such as the molecular:O(CCOC(=O)c1ccccc1)CO The new version divides:SN0 SN0 SN0 P1 P1 The old:SC5 SC5 SC5 P1 P1 That is,the new one divides benzene into SN0 SN0 SN0.
In addition,some of the beads were not 4-1 or 3-1 mapping,such as(SMILES:c1cc(ccc1)C(=O)OCCOCO): [atoms] ; id type resnr residu atom cgnr charge smiles 1 SC5 1 HCB2 S01 1 0 ; [H]c1cc([H])c([H])c([H])c1[H] 2 SC5 1 HCB2 S02 2 0 ; [H]c1cc([H])c([H])c([H])c1[H] 3 SC5 1 HCB2 S03 3 0 ; [H]c1cc([H])c([H])c([H])c1[H] 4 P1 1 HCB2 P01 4 0 ; OC=O 5 P1 1 HCB2 P02 5 0 ; [H]OC([H])([H])OC([H])([H])C([H])[H]
Another group [H]OC([H])([H])C([H])[H]OC=O was divided into P2,that is,there are 3-1, 4-1,5-1,6-1 in one molecule. I wonder if this division is possible.
Hi, The goal of this algorithm is to determine a Martini parameterization that minimizes a cost function with respect to the mapping and chooses bead types such that the octanol/water partition free energy of each corresponding fragment as predicted by the ALOGPS neural network matches as closely with the chosen bead type as possible. For the second issue, of the mapping of 5 and 6 heavy atom fragments to a single bead: When I downloaded the earlier version of the code and tested the fragments you pointed out, I saw that these mappings are still single-bead mappings, meaning that the updates to the code did not affect this mapping. For some fragments, single-bead mappings are allowed if the additivity check fails for mappings with higher numbers of beads. Also note that mappings coarser than 4:1 mappings are possible when using Martini (for example butanol: maps to a single bead, also see Advanced Parameterization of Martini by Manuel Nuno Melo from the martini lectures page http://cgmartini.nl/index.php/lectures). Regardless of the number of heavy atoms mapped, auto_martini assigns a bead type such that the partition free energy of the bead type matches as closely as possible with the ALOGPS prediction (which does take into account the fragment size). We are still working on further improvements to the algorithm to further minimize the number of fragments for which this occurs. For the first issue, regarding the bead typing for benzene: The approach for ring molecules is to use the entire set of atoms in the ring for each fragment and weight each bead’s contribution by a scaling factor. For all ring molecules, this was previously set to 2/3 so as to reproduce the Martini parameterizations for benzene and cyclohexane. However, we found, by parameterizing many multitudes of ring-containing molecules from the Generated Database, that a factor of 1/2 for 5-membered rings and 1/3 for six-membered rings yielded much better agreement with respect to the ALOGPS predictions for the ring molecules. This does, however, cause the benzene and cyclohexane parameterizations prescribed by Martini to no longer be the ones obtained from auto_martini. If you wish to use the old scaling factor that preserves the benzene and cyclohexane mappings, I recommend you try the refactor branch. Simply type:
git checkout refactor
and then running the code is slightly different:
python auto_martini.py --smi “O(CCOC(=O)c1ccccc1)CO” --mol MOL
You can also fine tune the results of the parameterization yourself by replacing certain assignments if you know that a certain bead type makes more sense for that specific fragment based on more relevant properties for your system (rather than the octanol/water partitioning free energy).
Thanks for your patient answer. I will try it.
Hi So I had similar query. I was trying to get parameters for dodecanol which is having 13 heavy atoms and what I got using auto_martini was
**;;;; GENERATED WITH auto-martini ; dodecanol.sdf ; Tristan Bereau (2014)
[moleculetype] ; molname nrexcl DODE 2
[atoms] ; id type resnr residu atom cgnr charge smiles 1 C1 1 DODE C01 1 0 ; [H]C([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])[H] 2 C5 1 DODE C02 2 0 ; [H]OC([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])[H]
[bonds] ; i j funct length force.c. 1 2 1 0.65 1250**
It is 13:2 (heavy atoms to bead ratio), I mean how reliable is this? Any comments? It will be helpful if you can also give some other examples (of-course butanol) where people used something similar.
Using the master
branch and dodecanol's smiles string ("OCCCCCCCCCCCC") I get 3 beads:
; GENERATED WITH auto_martini.py
; INPUT SMILES: [H]OC([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])[H]
; Tristan Bereau (2014)
[moleculetype]
; molname nrexcl
MOL 2
[atoms]
; id type resnr residue atom cgnr charge smiles
1 N0 1 MOL N01 1 0 ; CCCCO
2 C3 1 MOL C01 2 0 ; CCC
3 C1 1 MOL C02 3 0 ; CCCCC
[bonds]
; i j funct length force.c.
1 2 1 0.32 1250
2 3 1 0.32 1250
[angles]
; i j k funct angle force.c.
1 2 3 2 94.3 25.0
maybe try using smiles string rather than sdf files, as we've done more tests with the former.
What I am getting using smiles string is :
;;;; GENERATED WITH auto-martini ; INPUT SMILES: [H]OC([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])[H] ; Tristan Bereau (2014)
[moleculetype] ; molname nrexcl DODE 2
[atoms] ; id type resnr residu atom cgnr charge smiles 1 Nda 1 DODE N01 1 0 ; CCCCO 2 C3 1 DODE C01 2 0 ; CCC 3 C1 1 DODE C02 3 0 ; CCCCC
[bonds] ; i j funct length force.c. 1 2 1 0.32 1250 2 3 1 0.32 1250
[angles] ; i j k funct angle force.c. 1 2 3 2 94.3 25.0
Bead type 1 is different.
Please confirm which version of the code you are using. Tristan's result (which I also get) is from using the master
branch. I do get the same result as yours when using the refactor
branch. In addition to the parameters for ring molecules, the refactor
branch has a higher tolerance for assigning fragments to donor/acceptor types as compared to the master
branch. Overall, the main difference between the two branches is that the master
branch more strictly tries to match the water/octanol partition free energy of the CG molecule to the atomistic one as predicted by ALOGPS. On the other hand, the
refactor
branch uses different parameters that aim to preserve some of the mappings that Martini has prescribed for specific molecules, like the butanol fragment that is given in this case. For dodecanol the partition free energy value is -7.3573 kcal/mol. You can see the comparison yourself by adding the -v
flag like so ./auto_martini --smi "OCCCCCCCCCCCC" -v --mol MOL
. For dodecanol, using the master
branch, the CG partition free energy is -7.3327 kcal/mol, whereas the result is -6.9192 kcal/mol using the refactor
branch. (Note that, if using the refactor
branch and the -v
flag, the units are for octanol/water partitioning free energy in kJ/mol, so dividing by a factor of -4.184 will give you the same numbers as I have listed here, but this unit conversion is already implemented in the master
branch). Both results are within 10% of the target value, but the master branch is slightly more accurate. However, if the hydrogen bonding of the alcohol functional group is very relevant to your system, it may be worth keeping the Nda bead type versus the N0 bead type.
Sure. Thanks a lot for the explanation. I have understood the difference now.
Comment - The script is very useful :)