cis/trans Invariant Violation in TautomerEnumerator.Canonicalize

Open JochenSiegWork opened this issue 2 years ago • 1 comment

Hello,

I encountered an Invariant Violation error likely caused by a bug or an edge case that is not handled yet.

Describe the bug

TautomerEnumerator.Canonicalize fails with an Invariant Violation for the SMILES "CN1C=CC=C/C1=C\N=O". A minimal example SMILES is "C/C=C\N=O". The SMILES "CC=CN=O", with the cis/trans annotation removed, does not trigger the error. This seems to be a new issue in RDKit 2023.09.1: when I switch back to 2023.03.3, no error is thrown.

The traceback:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[31], line 10
      8 display(mol)
      9 enumerator = rdMolStandardize.TautomerEnumerator()
---> 10 enumerator.Canonicalize(mol)

RuntimeError: Invariant Violation
	could not find atom2
	Violation occurred on line 227 in file Code/GraphMol/Canon.cpp
	Failed Expression: firstFromAtom2
	RDKIT: 2023.09.1
	BOOST: 1_78

To Reproduce

from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

original_smiles = "CN1C=CC=C/C1=C\\N=O"
minimal_smiles = "C/C=C\\N=O"

mol = Chem.MolFromSmiles(minimal_smiles)
# display(mol)
enumerator = rdMolStandardize.TautomerEnumerator()
enumerator.Canonicalize(mol)
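
As a stopgap on the affected version, the observation above (the SMILES without the cis/trans annotation does not fail) can be turned into a small pre-processing step. This is a minimal sketch, not an RDKit API: strip_double_bond_stereo is a hypothetical helper name, and it assumes plain SMILES input in which "/" and "\" only ever occur as directional bond markers. Note that it discards the double-bond stereo information, so it changes what the input describes.

```python
def strip_double_bond_stereo(smiles: str) -> str:
    """Drop the '/' and '\\' directional bond markers from a plain SMILES string.

    Assumption: the input is a plain SMILES string, so these two characters
    appear only as cis/trans bond markers. The stereo information is lost.
    """
    return smiles.replace("/", "").replace("\\", "")


# The two SMILES from this report, with the cis/trans annotation removed.
for smi in ("C/C=C\\N=O", "CN1C=CC=C/C1=C\\N=O"):
    print(smi, "->", strip_double_bond_stereo(smi))
```

The stripped string can then be passed to Chem.MolFromSmiles in place of minimal_smiles in the reproduction script.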

Expected behavior

I expect TautomerEnumerator.Canonicalize to return the canonical tautomer for these SMILES without raising an error.

Screenshots

The original SMILES "CN1C=CC=C/C1=C\N=O"

image

The minimal example SMILES "C/C=C\N=O"

image

Configuration (please complete the following information):

  • RDKit version: 2023.09.1
  • OS: Ubuntu 22.04.2 LTS (using WSL)
  • Python version (if relevant): Python 3.8.16
  • Are you using conda? yes
  • If you are using conda, which channel did you install the rdkit from? pypi

Additional context

The output of conda list | grep rdkit is "rdkit 2023.9.1 pypi_0 pypi" for the failing environment and "rdkit 2023.03.3 py38h6c71e64_2 conda-forge" for the working environment.

JochenSiegWork, Nov 16 '23 15:11

I have the same problem with a nitro group next to a C=C double bond.

DavidSchallerNuvisan, Nov 23 '23 09:11