MolToSmiles fails with `Invariant Violation`
Configuration:
- RDKit Version: 2020.03.1
- Are you using conda? Yes
- If you are using conda, which channel did you install the rdkit from? -c rdkit
Description:
from rdkit import Chem
mol = list(Chem.ForwardSDMolSupplier('output.sdf', removeHs=False))[0]
Chem.MolToSmiles(mol)
Error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-90-0f5b2a163dca> in <module>
1 mol = list(Chem.ForwardSDMolSupplier('output.sdf', removeHs=False))[0]
----> 2 Chem.MolToSmiles(mol)
RuntimeError: Invariant Violation
inconsistent state
Violation occurred on line 306 in file Code/GraphMol/Canon.cpp
Failed Expression: ((firstFromAtom2->getBeginAtomIdx() == atom2->getIdx()) ^ (secondFromAtom2->getBeginAtomIdx() == atom2->getIdx()))
RDKIT: 2020.03.1
BOOST: 1_67
output.sdf:
0
RDKit 3D
47 50 0 0 0 0 0 0 0 0999 V2000
2.8094 5.5347 3.9549 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9803 5.7631 2.4518 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2057 4.7513 1.7096 N 0 0 0 0 0 0 0 0 0 0 0 0
2.8065 3.5331 1.3600 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2813 2.9697 0.2259 C 0 0 0 0 0 0 0 0 0 0 0 0
1.9852 2.6026 -1.1270 C 0 0 0 0 0 0 0 0 0 0 0 0
1.9766 2.4326 -2.5030 C 0 0 0 0 0 0 0 0 0 0 0 0
0.7497 2.4105 -3.1577 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.3271 3.2209 -3.0933 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.3880 2.8207 -4.1041 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.2755 3.7576 -5.2731 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.9456 5.0000 -5.1903 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6406 5.8135 -4.1576 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.5843 5.5847 -2.8251 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4862 4.3279 -2.2820 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.6119 4.4193 -0.7931 C 0 0 1 0 0 0 0 0 0 0 0 0
-1.4265 3.3176 -0.2402 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.2401 2.0491 0.0429 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.0161 1.3332 0.0094 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0021 -0.0041 0.0020 F 0 0 0 0 0 0 0 0 0 0 0 0
1.1113 2.0881 0.0027 C 0 0 0 0 0 0 0 0 0 0 0 0
0.5911 5.0028 -0.1419 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8009 5.0277 1.3433 C 0 0 1 0 0 0 0 0 0 0 0 0
-0.1316 4.1335 2.1455 C 0 0 0 0 0 0 0 0 0 0 0 0
3.1695 4.5391 4.2144 H 0 0 0 0 0 0 0 0 0 0 0 0
1.7550 5.6196 4.2178 H 0 0 0 0 0 0 0 0 0 0 0 0
3.3819 6.2826 4.5035 H 0 0 0 0 0 0 0 0 0 0 0 0
4.0347 5.6782 2.1889 H 0 0 0 0 0 0 0 0 0 0 0 0
2.6202 6.7587 2.1923 H 0 0 0 0 0 0 0 0 0 0 0 0
3.6105 3.1378 1.9058 H 0 0 0 0 0 0 0 0 0 0 0 0
2.9068 2.3033 -3.0078 H 0 0 0 0 0 0 0 0 0 0 0 0
0.6465 1.5861 -3.8982 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.3743 2.9119 -3.6532 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.2180 1.7981 -4.4327 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.4892 3.3616 -6.2513 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.9057 5.4941 -6.1477 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.4194 6.7231 -4.4192 H 0 0 0 0 0 0 0 0 0 0 0 0
-0.6137 6.4266 -2.1513 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.3690 5.2979 -0.7089 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.4993 3.6168 -0.0608 H 0 0 0 0 0 0 0 0 0 0 0 0
-2.1306 1.4606 0.3069 H 0 0 0 0 0 0 0 0 0 0 0 0
0.5873 6.0839 -0.4506 H 0 0 0 0 0 0 0 0 0 0 0 0
1.4908 4.6415 -0.6473 H 0 0 0 0 0 0 0 0 0 0 0 0
0.6018 6.0731 1.6827 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0208 3.0946 1.8532 H 0 0 0 0 0 0 0 0 0 0 0 0
-1.1654 4.4181 1.9498 H 0 0 0 0 0 0 0 0 0 0 0 0
0.0815 4.2470 3.2084 H 0 0 0 0 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
1 25 1 0 0 0 0
1 26 1 0 0 0 0
1 27 1 0 0 0 0
2 3 1 0 0 0 0
2 28 1 0 0 0 0
2 29 1 0 0 0 0
3 4 1 0 0 0 0
3 23 1 0 0 0 0
4 5 2 0 0 0 0
4 30 1 0 0 0 0
5 6 1 0 0 0 0
5 21 1 0 0 0 0
6 7 2 0 0 0 0
6 21 1 0 0 0 0
7 8 1 0 0 0 0
7 31 1 0 0 0 0
8 9 2 0 0 0 0
8 32 1 0 0 0 0
9 10 1 0 0 0 0
9 15 1 0 0 0 0
10 11 1 0 0 0 0
10 33 1 0 0 0 0
10 34 1 0 0 0 0
11 12 2 0 0 0 0
11 35 1 0 0 0 0
12 13 1 0 0 0 0
12 36 1 0 0 0 0
13 14 1 0 0 0 0
13 37 1 0 0 0 0
14 15 2 0 0 0 0
14 38 1 0 0 0 0
15 16 1 0 0 0 0
16 17 1 0 0 0 0
16 22 1 0 0 0 0
16 39 1 0 0 0 0
17 18 2 0 0 0 0
17 40 1 0 0 0 0
18 19 1 0 0 0 0
18 41 1 0 0 0 0
19 20 1 0 0 0 0
19 21 2 0 0 0 0
22 23 1 0 0 0 0
22 42 1 0 0 0 0
22 43 1 0 0 0 0
23 24 1 0 0 0 0
23 44 1 0 0 0 0
24 45 1 0 0 0 0
24 46 1 0 0 0 0
24 47 1 0 0 0 0
M END
> <id> (1)
0
> <smiles> (1)
CCN1C=C2C3=CC=C4CC=CNC=C4C(C=CC(F)=C32)CC1C
$$$$
A few notes about my investigation:
- The "removeHs=False" is not required to reproduce the failure.
- Open Babel converts this into a SMILES with a lot of stereochemistry:
>>> from openbabel import pybel
>>> for mol in pybel.readfile("sdf", "output.sdf"):
... print(mol.write("smi"))
...
CCN1/C=C\2/C/3=C/C=C\4/CC=CNC=C4[C@@H](/C=C\C(=C23)\F)C[C@H]1C 0
- RDKit can convert the structure if sanitize=False and isomericSmiles=False:
>>> for mol in Chem.ForwardSDMolSupplier('output.sdf', sanitize=False):
... print(Chem.MolToSmiles(mol, isomericSmiles=False))
...
[H]C1=C2C3=C(F)C([H])=C([H])C([H])(C4=C([H])N([H])C([H])=C([H])C([H])([H])C4=C1[H])C([H])([H])C([H])(C([H])([H])[H])N(C([H])([H])C([H])([H])[H])C([H])=C23
However, enabling even one of sanitize or isomericSmiles results in the Invariant Violation.
- This structure does not fail under 2016.09.3, though the result does not contain any stereochemistry either:
>>> import rdkit; rdkit.__version__
'2016.09.3'
>>> for mol in Chem.ForwardSDMolSupplier('output.sdf', removeHs=True):
... print(Chem.MolToSmiles(mol, isomericSmiles=True))
...
CCN1C=C2C3=CC=C4CC=CNC=C4C(C=CC(F)=C32)CC1C
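The sanitize / isomericSmiles observations above can be checked in one pass. This is a minimal sketch, not part of the original report: `probe_flag_combinations` and `rdkit_convert` are hypothetical helper names (not RDKit API), and it assumes the record is saved as output.sdf and that the Invariant Violation surfaces as a Python RuntimeError, as in the traceback above.

```python
from itertools import product


def probe_flag_combinations(convert):
    """Call convert(sanitize, isomeric) for all four flag combinations.

    Returns a dict mapping (sanitize, isomeric) to either the SMILES string
    or "FAILED: <first line of the error message>".
    """
    results = {}
    for sanitize, isomeric in product((False, True), repeat=2):
        try:
            results[(sanitize, isomeric)] = convert(sanitize, isomeric)
        except RuntimeError as err:
            # Keep only the first line, e.g. "Invariant Violation".
            lines = str(err).strip().splitlines() or ["RuntimeError"]
            results[(sanitize, isomeric)] = "FAILED: " + lines[0]
    return results


def rdkit_convert(path):
    """Build a converter for the first record of the SD file at `path`."""
    # Imported lazily so the helper above has no RDKit dependency.
    from rdkit import Chem

    def convert(sanitize, isomeric):
        supplier = Chem.ForwardSDMolSupplier(path, sanitize=sanitize)
        mol = next(iter(supplier))
        if mol is None:
            # The supplier yields None when a record cannot be parsed.
            raise RuntimeError("record could not be parsed")
        return Chem.MolToSmiles(mol, isomericSmiles=isomeric)

    return convert
```

Calling `probe_flag_combinations(rdkit_convert("output.sdf"))` and printing the items gives one line per flag pair. Going by the notes above, under 2020.03.1 only the (False, False) entry would be expected to hold a SMILES string.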
Was this ever resolved @danpol or @adalke? I'm getting much the same error in 2023.09.1:
Exception has occurred: PanicException
python function failed RuntimeError: Invariant Violation
inconsistent state
Violation occurred on line 306 in file Code/GraphMol/Canon.cpp
Failed Expression: ((firstFromAtom2->getBeginAtomIdx() == atom2->getIdx()) ^ (secondFromAtom2->getBeginAtomIdx() == atom2->getIdx()))
RDKIT: 2023.09.1
BOOST: 1_78
when calling Chem.MolToSmiles(mol).