rdkit icon indicating copy to clipboard operation
rdkit copied to clipboard

Improved handling of SP/TB/OH reording in SMILES/SMARTS.

Open johnmay opened this issue 2 years ago • 0 comments

Reference Issue

n/a

What does this implement/fix? Explain your changes.

  • add a getMaxNbors(tag) utility function to avoid repeated logic in multiple places
  • tweak getChiralPermutation() for handling implicit/missing ligands (uses -1), allow inverse lookup
  • use the tweaked chiral ordering in the reading/writing from SMILES

Change how missing/implicit ligands on SP/TB/OH atoms are treated. This does not change the internal RDKit semantics only the to/from SMILES. The missing functionality was getChiralPermutation needed to support missing ligands and being able to do the inverse look up.

Cl[Co@OH1]Cl Cl-Co-Cl trans
Cl[Co@OH1H4]Cl Cl-Co-Cl trans
Cl[Co@OH1](*)(*)(*)(*)Cl Cl-Co-Cl trans

This is also useful for SMARTS matching, i.e. matching a partial substructure in a larger octahedral, but will add that another time.

SMILES: O=[N+]([O-])[Co@]([NH3])([NH3])([NH3])([NH3])[N+]([O-])(=O) trans-[Co(NH3)4(NO2)2]
SMARTS: N[Co@OH1]N

SMILES: O=[N+]([O-])[Co@]([NH3])([NH3])([NH3])([NH3])[N+]([O-])(=O) trans-[Co(NH3)4(NO2)2]
SMARTS: N[Co@OH3]N

There was also a minor oversight in the Canon.cp in that reordering was only done if (trueOrder.size() >= 3) which is not correct now for non-tetrahedral stereochemistry.

I didn't think with would be case originally (hence I did them first) but these changes are independent of #6730 and #6772.

Any other comments?

  • I think I got all the test cases but on my M1 Mac I get some segfaults on master so will see what azure says.
  • RDKit book examples will need updating (happy to do that too)

johnmay avatar Oct 04 '23 10:10 johnmay