Can't kekulize mol due to what appears to be a Boost Error

Open shushanhe3 opened this issue 6 years ago • 3 comments

Hi,

I was trying to coarse-grain a small drug molecule tenofovir with SMILES

"CC(CN1C=NC2=C1N=CN=C2N)OCP(=O)(O)O"

For some reason it does not work and returns the following message:

[16:13:49] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4

Traceback (most recent call last):
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 1226, in <module>
    ringAtomsFlat, True)
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 848, in printAtoms
    molFrag = genMoleculeSMI(smiFrag)
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 93, in genMoleculeSMI
    mol = Chem.AddHs(mol)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdmolops.AddHs(NoneType)
did not match C++ signature:
    AddHs(RDKit::ROMol mol, bool explicitOnly=False, bool addCoords=False, boost::python::api::object onlyOnAtoms=None, bool addResidueInfo=False)

Any feedback would be greatly appreciated!

Thanks,

Shushan

shushanhe3, Oct 30 '18

Are you sure you only get this problem for this particular molecule? The problem looks similar to issue https://github.com/tbereau/auto_martini/issues/3.

tbereau, Nov 01 '18

> Are you sure you only get this problem for this particular molecule? The problem looks similar to issue #3.

Hi Tristan,

Yes, I'm sure this problem only appears when genMoleculeSMI fails to generate a mol object for some particular SMILES strings. I tried a little debugging myself and had the script print each SMILES string and the corresponding mol object. It turns out that for this particular molecule, with SMILES "CC(Cn1cnc2c1ncnc2N)OCP(=O)(O)O", genMoleculeSMI fails to generate a mol object when it gets to the double-ring part of the structure. The printed output from one call of:

auto_martini --smi "CC(Cn1cnc2c1ncnc2N)OCP(=O)(O)O" --mol TFV

is shown as follows:

smi is: CC(Cn1cnc2c1ncnc2N)OCP(=O)(O)O mol is: <rdkit.Chem.rdchem.Mol object at 0x7f7d6a1b02f0>
smi is: CC mol is: <rdkit.Chem.rdchem.Mol object at 0x7f7d6a1b0600>
smi is: Cn1ccnc1 mol is: <rdkit.Chem.rdchem.Mol object at 0x7f7d6a1b0520>
[14:26:08] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4

smi is: c1cncn1 mol is: None
Traceback (most recent call last):
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 1229, in <module>
    ringAtomsFlat, True)
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 851, in printAtoms
    molFrag = genMoleculeSMI(smiFrag)
  File "/home/shushan/opt/auto_martini-master/auto_martini", line 96, in genMoleculeSMI
    mol = Chem.AddHs(mol)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdmolops.AddHs(NoneType)
did not match C++ signature:
    AddHs(RDKit::ROMol mol, bool explicitOnly=False, bool addCoords=False, boost::python::api::object onlyOnAtoms=None, bool addResidueInfo=False)

Please let me know what you think might be causing the problem. Thank you so much for your help!

shushanhe3, Nov 05 '18
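
The Boost.Python.ArgumentError in the log is a secondary symptom: RDKit's `Chem.MolFromSmiles` returns `None` when it cannot parse a string (here, after the kekulization message), and that `None` is then passed to `Chem.AddHs`. A minimal sketch of a guard that reports the failing fragment instead is shown below. The names `parse_fragment` and `gen_molecule_checked` are hypothetical and are not part of auto_martini; the second function only mirrors the two calls visible in the traceback, not the full body of genMoleculeSMI.

```python
from typing import Any, Callable, Optional


def parse_fragment(smi: str, parser: Callable[[str], Optional[Any]]) -> Any:
    """Parse a SMILES fragment and raise a readable error if parsing fails.

    `parser` is any callable that returns None on failure, which is how
    RDKit's Chem.MolFromSmiles behaves.
    """
    mol = parser(smi)
    if mol is None:
        raise ValueError(f"could not parse SMILES fragment {smi!r}")
    return mol


def gen_molecule_checked(smi: str) -> Any:
    """Hypothetical stand-in for the parse + AddHs steps of genMoleculeSMI."""
    # Imported here so that parse_fragment itself does not depend on RDKit.
    from rdkit import Chem

    mol = parse_fragment(smi, Chem.MolFromSmiles)
    # mol is guaranteed not to be None at this point.
    return Chem.AddHs(mol)
```

With a guard like this, the failing call would name the fragment `c1cncn1` directly rather than surfacing as a C++ signature mismatch.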

I see. Yes, it sometimes happens that RDKit fails to parse certain chemical groups. I would recommend isolating the problematic (double-)ring part and first trying to parametrize a molecule that is very close but has a few atom substitutions to bypass the problem. I suppose that replacing some of these n by c might do the trick.

tbereau, Nov 07 '18
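
One way to explore the substitution idea is to test candidate fragment strings before running the full parametrization. The sketch below uses a hypothetical helper, `first_parsable`, that is not part of auto_martini. The candidate list is also an assumption: besides the failing fragment `c1cncn1` from the log, it tries `c1c[nH]cn1`, the same imidazole ring with the hydrogen on the aromatic nitrogen written explicitly, since a missing `[nH]` is a common reason for RDKit's "Can't kekulize mol" message on five-membered aromatic rings.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def first_parsable(
    candidates: Iterable[str],
    parser: Callable[[str], Optional[Any]],
) -> Optional[Tuple[str, Any]]:
    """Return (smiles, mol) for the first candidate the parser accepts.

    Returns None if no candidate parses. `parser` must return None on
    failure, as RDKit's Chem.MolFromSmiles does.
    """
    for smi in candidates:
        mol = parser(smi)
        if mol is not None:
            return smi, mol
    return None


def demo_with_rdkit() -> Optional[str]:
    """Try the failing fragment, then the variant with an explicit N-H."""
    from rdkit import Chem  # requires RDKit to be installed

    hit = first_parsable(["c1cncn1", "c1c[nH]cn1"], Chem.MolFromSmiles)
    return hit[0] if hit else None
```

This only checks whether a fragment string parses. Whether the resulting bead assignment is chemically appropriate for the original molecule still needs to be judged separately.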