Chem.CanonSmiles failed to generate unique smiles

Open alkorolyov-selvita opened this issue 9 months ago • 3 comments

Describe the bug Chem.CanonSmiles is not stable for this molecule: canonicalizing the canonical SMILES again gives a different string, so the output alternates depending on how many times the function is applied.

To Reproduce

from rdkit import Chem

s = 'COC(=O)[C@@]12[C@@H]3N([C@H]4[C@@]5([C@@H](N([C@H]1[C@]3(C5c1ccc(OC)cc1)C(=O)OC)C(=O)OCc1ccccc1)[C@]4([C@H]2c1ccc(OC)cc1)C(=O)OC)C(=O)OC)C(=O)OCc1ccccc1'
# Canonicalizing once vs. twice should give the same string
Chem.CanonSmiles(s) == Chem.CanonSmiles(Chem.CanonSmiles(s))
# returns False

Expected behavior Chem.CanonSmiles should return the same SMILES string for a given molecule, no matter how many times it is applied.
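The expectation above amounts to saying that canonicalization should be idempotent. A minimal sketch of that check follows; the helper names (`canonicalization_trace`, `is_stable`, `rdkit_canon`) are hypothetical and not part of RDKit. The canonicalizer is passed in as a plain function so the helpers can be tried without RDKit installed.

```python
from typing import Callable, List


def canonicalization_trace(smiles: str,
                           canonicalize: Callable[[str], str],
                           max_rounds: int = 5) -> List[str]:
    """Feed the canonicalizer's output back into itself and record each result.

    Stops early as soon as two consecutive outputs agree (a fixed point).
    """
    trace = [canonicalize(smiles)]
    for _ in range(max_rounds - 1):
        nxt = canonicalize(trace[-1])
        if nxt == trace[-1]:
            break
        trace.append(nxt)
    return trace


def is_stable(smiles: str, canonicalize: Callable[[str], str]) -> bool:
    """True if canonicalizing the canonical form leaves it unchanged."""
    once = canonicalize(smiles)
    return canonicalize(once) == once


def rdkit_canon(smiles: str) -> str:
    """Hypothetical thin wrapper around Chem.CanonSmiles (imports RDKit lazily)."""
    from rdkit import Chem
    return Chem.CanonSmiles(smiles)
```

With RDKit available, `is_stable(s, rdkit_canon)` restates the comparison from the report, and `canonicalization_trace(s, rdkit_canon)` lists the strings it alternates between.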

Configuration (please complete the following information):

  • RDKit version 2024.03.1 Build py310h6f17f40_0
  • OS: Ubuntu 20.04.6 LTS (GNU/Linux 5.15.0-73-generic x86_64)
  • Python version (if relevant): 3.10
  • Are you using conda? yes
  • If you are using conda, which channel did you install the rdkit from? conda-forge
  • If you are not using conda: how did you install the RDKit?

alkorolyov-selvita avatar May 09 '24 09:05 alkorolyov-selvita

Hi there,

The problem is related to this fragment: ....[C@@H]3N([C@H]4.... If you use ....[C@H]3N([C@H]4.... instead, it seems to work fine.

To be reviewed by @greglandrum
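The substitution described above can be sketched as a guarded string replacement on the SMILES from the report. `replace_once` and `round_trips` are hypothetical helper names, and `round_trips` assumes RDKit is installed. Note that `@` and `@@` denote opposite chirality, so the variant describes a different stereoisomer; the swap is useful for narrowing down the problem rather than as a drop-in fix.

```python
REPORTED = (
    'COC(=O)[C@@]12[C@@H]3N([C@H]4[C@@]5([C@@H](N([C@H]1[C@]3(C5c1ccc(OC)cc1)'
    'C(=O)OC)C(=O)OCc1ccccc1)[C@]4([C@H]2c1ccc(OC)cc1)C(=O)OC)C(=O)OC)'
    'C(=O)OCc1ccccc1'
)


def replace_once(smiles: str, old: str, new: str) -> str:
    """Replace a fragment, refusing to proceed unless it occurs exactly once."""
    count = smiles.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one occurrence of {old!r}, found {count}")
    return smiles.replace(old, new)


# Flip the chirality tag on the atom singled out in the comment above.
VARIANT = replace_once(REPORTED, "[C@@H]3N([C@H]4", "[C@H]3N([C@H]4")


def round_trips(smiles: str) -> bool:
    """True if Chem.CanonSmiles gives the same string on a second pass.

    Requires RDKit; it is imported lazily so the rest of the sketch runs without it.
    """
    from rdkit import Chem
    once = Chem.CanonSmiles(smiles)
    return Chem.CanonSmiles(once) == once
```

Calling `round_trips(REPORTED)` and `round_trips(VARIANT)` side by side shows whether the instability follows that one stereocenter.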

joseerlang avatar May 16 '24 19:05 joseerlang

Thanks @joseerlang,

It seems you have found the solution. Since it is incorrect to use h in SMILES, it would probably be better to raise an exception in that case.

Thanks once again for the quick feedback!

alkorolyov-selvita avatar May 17 '24 05:05 alkorolyov-selvita

You're welcome @alkorolyov-selvita

joseerlang avatar May 17 '24 10:05 joseerlang

This ticket can be closed

joseerlang avatar May 20 '24 18:05 joseerlang