RGD with tetrazole core yields a core that cannot be kekulized

Open ptosco opened this issue 3 years ago • 6 comments

To Reproduce

from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition

core = Chem.MolFromSmiles("n1nn[nH]c1Cc1ccccc1")
core

mol = Chem.MolFromSmiles("n1nnn(C)c1Cc1ccccc1")
mol

rgd = rdRGroupDecomposition.RGroupDecomposition(core)
rgd.Add(mol)
0

rgd.Process()
True

rgd.GetRGroupsAsRows(asSmiles=True)
[{'Core': 'c1ccc(Cc2[nH]nnn2[*:1])cc1', 'R1': 'C[*:1]'}]

Chem.MolFromSmiles('c1ccc(Cc2[nH]nnn2[*:1])cc1')
[16:31:47] Can't kekulize mol.  Unkekulized atoms: 5 7 8

rgd.GetRGroupsAsRows()[0]["Core"]

ptosco avatar Aug 16 '22 14:08 ptosco

Hi @ptosco, I have run into this problem before. RGD has trouble with cores that contain aromatic nitrogens. Our workaround is to convert the core to a SMARTS pattern and then set the explicit hydrogen count to 0: [image: 1662119531749]

RGD works with this modified pattern molecule. Hope this helps.

Hong-Rui avatar Sep 02 '22 11:09 Hong-Rui

@Hong-Rui,

I tried this on a tetrazole-containing molecule, Cc1cc(-c2nn[nH]n2)ccc1OS(=O)(=O)c1cccc(-c2nnnn2C)c1, and then called Draw.MolsToGridImage to check whether it could still be drawn. I get Unkekulized atoms: 4 5 6 7 8, which are the atoms of the tetrazole ring.

BJWiley233 avatar Jan 01 '23 03:01 BJWiley233

You can always have rgd return the molecule version of the core and just render that. Barring that, you may have to render the SMILES with sanitization off.

bp-kelley avatar Jan 01 '23 03:01 bp-kelley

rgd.GetRGroupsAsRows(asSmiles=False)

bp-kelley avatar Jan 01 '23 03:01 bp-kelley
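
The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not a fix for the underlying RGD behavior: it reuses the core and molecule from the original report, and the variable names (`rows`, `core_mol`, `core_smiles`, `unsanitized`) are chosen here for illustration only. The core SMILES is hard-coded from the output reported in this issue, since other RDKit versions may return something different.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition

# Inputs copied from the original report
core = Chem.MolFromSmiles("n1nn[nH]c1Cc1ccccc1")
mol = Chem.MolFromSmiles("n1nnn(C)c1Cc1ccccc1")

rgd = rdRGroupDecomposition.RGroupDecomposition(core)
rgd.Add(mol)
rgd.Process()

# Option 1: request Mol objects instead of SMILES, so no SMILES
# round trip (and no re-sanitization) is needed before rendering.
rows = rgd.GetRGroupsAsRows(asSmiles=False)
core_mol = rows[0]["Core"]

# Option 2: parse the core SMILES reported in this issue with
# sanitization switched off, so the kekulization step is skipped.
core_smiles = "c1ccc(Cc2[nH]nnn2[*:1])cc1"
unsanitized = Chem.MolFromSmiles(core_smiles, sanitize=False)

# Non-strict property update so valence information is available
# to later steps without raising on unusual valences.
unsanitized.UpdatePropertyCache(strict=False)
```

Either object can then be handed to the drawing code. Depending on the RDKit version, passing `kekulize=False` to the drawing call (for example `Draw.MolToImage`) may also be needed for the unsanitized molecule.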

Sorry, I didn't mention that I wasn't using RGD; I was just using the SMILES as a reference for AssignBondOrdersFromTemplate. Now I realize why it can't be drawn. When you remove the hydrogen from 1H-tetrazole with atom.SetNumExplicitHs(0), the result works as a template, and AssignBondOrdersFromTemplate returns my actual molecule correctly with the tetrazole intact (but should it, given that a hydrogen of the tetrazole is now missing from the reference molecule?). The reference molecule itself, however, obviously cannot be drawn because the valence of the nitrogen is 2.

Actual molecule before AssignBondOrdersFromTemplate and atom.SetNumExplicitHs(0) of reference SMILES: [image]

Actual molecule after: [image]

BJWiley233 avatar Jan 01 '23 03:01 BJWiley233
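
One way to see why the hydrogen-stripped reference can no longer be drawn is to repeat the step on plain 1H-tetrazole and then ask RDKit to sanitize the result. The sketch below is an illustration under assumptions: the SMILES `c1nnn[nH]1` and the names `ref` and `sanitized_ok` are introduced here and do not come from the thread, and it does not call AssignBondOrdersFromTemplate at all.

```python
from rdkit import Chem

# 1H-tetrazole, standing in for the tetrazole ring of the reference
ref = Chem.MolFromSmiles("c1nnn[nH]1")

# Remove the hydrogen from the protonated ring nitrogen, as described above
for atom in ref.GetAtoms():
    if atom.GetSymbol() == "N" and atom.GetTotalNumHs() == 1:
        atom.SetNumExplicitHs(0)

# Sanitization includes kekulization; without the N-H, the five aromatic
# ring atoms cannot be assigned alternating single and double bonds.
try:
    Chem.SanitizeMol(ref)
    sanitized_ok = True
except Exception as err:
    sanitized_ok = False
    print(f"Sanitization failed: {err}")
```

The modified molecule can still be useful as a matching template, because ordinary substructure matching between molecules does not compare hydrogen counts, but any step that needs a kekulized structure, such as default drawing, is likely to fail on it.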