rdkit
Chem.CanonSmiles failed to generate unique smiles
Describe the bug
Chem.CanonSmiles is not idempotent for this molecule: canonicalizing its own output yields a different SMILES string, so the result alternates depending on how many times the function is applied.
To Reproduce
from rdkit import Chem

s = 'COC(=O)[C@@]12[C@@H]3N([C@H]4[C@@]5([C@@H](N([C@H]1[C@]3(C5c1ccc(OC)cc1)C(=O)OC)C(=O)OCc1ccccc1)[C@]4([C@H]2c1ccc(OC)cc1)C(=O)OC)C(=O)OC)C(=O)OCc1ccccc1'
# Canonicalizing once and canonicalizing twice should give the same string
Chem.CanonSmiles(s) == Chem.CanonSmiles(Chem.CanonSmiles(s))
# returns False
Expected behavior
Chem.CanonSmiles should return the same SMILES string for a given molecule, no matter how many times it is applied.
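For anyone who wants to check other inputs for the same behavior, the comparison above can be extended to several rounds. This is only a sketch: `canonicalization_trace` and `is_idempotent` are hypothetical helper names, not part of RDKit, and the helpers accept any string-to-string function so they can be tried without RDKit installed. The RDKit call at the bottom runs only if the import succeeds.

```python
from typing import Callable, List


def canonicalization_trace(canon: Callable[[str], str], smiles: str, rounds: int = 4) -> List[str]:
    """Apply `canon` to its own output `rounds` times and collect each result."""
    outputs = []
    current = smiles
    for _ in range(rounds):
        current = canon(current)
        outputs.append(current)
    return outputs


def is_idempotent(canon: Callable[[str], str], smiles: str, rounds: int = 4) -> bool:
    """Return True if every later round reproduces the first canonical output."""
    trace = canonicalization_trace(canon, smiles, rounds)
    return all(out == trace[0] for out in trace[1:])


if __name__ == "__main__":
    try:
        from rdkit import Chem
    except ImportError:
        Chem = None  # RDKit not available; the helpers above still work

    if Chem is not None:
        # SMILES taken from the report above
        s = ('COC(=O)[C@@]12[C@@H]3N([C@H]4[C@@]5([C@@H](N([C@H]1[C@]3'
             '(C5c1ccc(OC)cc1)C(=O)OC)C(=O)OCc1ccccc1)[C@]4([C@H]2c1ccc'
             '(OC)cc1)C(=O)OC)C(=O)OC)C(=O)OCc1ccccc1')
        for i, out in enumerate(canonicalization_trace(Chem.CanonSmiles, s), start=1):
            print(i, out)
        print('idempotent:', is_idempotent(Chem.CanonSmiles, s))
```

Printing the whole trace rather than a single boolean makes it easier to see whether the output alternates between two strings or keeps changing.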
Configuration (please complete the following information):
- RDKit version 2024.03.1 Build py310h6f17f40_0
- OS: Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-73-generic x86_64)
- Python version (if relevant): 3.10
- Are you using conda? yes
- If you are using conda, which channel did you install the rdkit from? conda-forge
- If you are not using conda: how did you install the RDKit?
Hi there,
The problem seems to be related to this fragment: ...[C@@H]3N([C@H]4... If you use ...[C@H]3N([C@H]4... instead, it seems to work fine.
To be reviewed by @greglandrum.
Thanks @joseerlang,
It seems you found the solution. Since it is incorrect to use h in SMILES, it is probably better to raise an exception in that case.
Thanks once again for the quick feedback!
You're welcome @alkorolyov-selvita
This ticket can be closed