Canonicalize shifts double bond out of conjugated system

Open pechersky opened this issue 9 months ago • 9 comments

Describe the bug

TautomerEnumerator.Canonicalize returns a tautomer in which a double bond has moved out of a conjugated system and further into the ring. Neither OEChem nor the Moka tautomer tools produce this tautomer.

To Reproduce

from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

#              v alpha-beta to carbonyl
smi = "O=CC1CCC=C(C(=O)c2ccccc2)C1"

mol = Chem.MolFromSmiles(smi)
print(Chem.MolToSmiles(mol))
assert Chem.MolToSmiles(mol) == smi

enumerator = rdMolStandardize.TautomerEnumerator()

canonical = enumerator.Canonicalize(mol)
print(Chem.MolToSmiles(canonical))
#                       v beta-gamma to carbonyl
canonical_smi = "O=CC1CC=CC(C(=O)c2ccccc2)C1"
# assert Chem.MolToSmiles(canonical) == canonical_smi

import openeye.oechem as oe
import openeye.oequacpac as oequacpac

# OE does not tautomerize this way

mol = oe.OEGraphMol()
oe.OESmilesToMol(mol, smi)
print(oe.OEMolToSmiles(mol))  # c1ccc(cc1)C(=O)C2=CCCC(C2)C=O
oequacpac.OEGetReasonableProtomer(mol)
print(oe.OEMolToSmiles(mol))  # c1ccc(cc1)C(=O)C2=CCCC(C2)C=O
#                                               ^ doesn't shift

mol = oe.OEGraphMol()
oe.OESmilesToMol(mol, canonical_smi)
print(oe.OEMolToSmiles(mol))  # c1ccc(cc1)C(=O)C2CC(CC=C2)C=O
oequacpac.OEGetReasonableProtomer(mol)
print(oe.OEMolToSmiles(mol))  # c1ccc(cc1)C(=O)C2CC(CC=C2)C=O
#                                                     ^ doesn't shift

Expected behavior

I expect the input molecule to stay the same in this Canonicalize call.
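One way to see why Canonicalize picks the shifted tautomer is to list every tautomer the enumerator generates together with its default score. The following is a diagnostic sketch, not a fix. The helper name `score_tautomers` is hypothetical, and the sketch assumes the static `TautomerEnumerator.ScoreTautomer` method is available in the installed RDKit build.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def score_tautomers(smiles):
    """Return (score, SMILES) pairs for every tautomer RDKit enumerates.

    Hypothetical helper for diagnosis only; it uses the default scoring
    scheme exposed by TautomerEnumerator.ScoreTautomer.
    """
    mol = Chem.MolFromSmiles(smiles)
    enumerator = rdMolStandardize.TautomerEnumerator()
    pairs = [
        (rdMolStandardize.TautomerEnumerator.ScoreTautomer(taut),
         Chem.MolToSmiles(taut))
        for taut in enumerator.Enumerate(mol)
    ]
    # Highest score first; equal scores are ordered alphabetically here
    # purely to keep the printed output stable and easy to read.
    return sorted(pairs, key=lambda pair: (-pair[0], pair[1]))


for score, taut_smi in score_tautomers("O=CC1CCC=C(C(=O)c2ccccc2)C1"):
    print(score, taut_smi)
```

If the input and the shifted tautomer turn out to have the same score, that would suggest the choice between them is made by a tie-break rather than by the scoring terms themselves.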

Configuration (please complete the following information):

  • RDKit version: 2022.9.5; 2023.9.1
  • OS: Ubuntu 20.04
  • Python version (if relevant): py38, py311
  • Are you using conda? no
  • If you are using conda, which channel did you install the rdkit from? n/a
  • If you are not using conda: how did you install the RDKit? pip install rdkit

Additional context

Possibly related to https://github.com/rdkit/rdkit/issues/5937, but this case is worse because the input here is a SMILES string rather than a 2D mol, and no aromatic ring is being created.

pechersky, Nov 16 '23 05:11