ChEMBL_Structure_Pipeline

Consider alternate handling of chiral centers which have overlapping atoms or atoms on the same line

Open greglandrum opened this issue 1 year ago • 3 comments

The standardizer currently assigns a penalty of 5 when there are overlapping atoms or 6 if there are more than six overlapping atoms. This still allows molecules which have overlapping atoms around a chiral center to pass through.

We also don't have any checks for chiral centers whose neighboring atoms (the atoms around the chiral center) lie on the same line.
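To make the two conditions concrete, here is a minimal sketch in plain Python that works on 2D coordinates only. It is not the standardizer's or RDKit's implementation. The function names, the tolerance value, and the exact definition of "on the same line" (two neighbors whose bond vectors from the center are parallel or antiparallel) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def overlapping_pairs(neighbors, tol=1e-2):
    """Return index pairs of neighbor atoms whose 2D positions coincide.

    `neighbors` is a list of (x, y) tuples; `tol` is a hypothetical
    distance threshold, not a value taken from the pipeline.
    """
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(neighbors), 2)
        if math.dist(p, q) < tol
    ]


def collinear_pairs(center, neighbors, tol=1e-2):
    """Return index pairs of neighbors lying on one line through the center.

    Two neighbors count as collinear when the normalized 2D cross product
    of their bond vectors is close to zero (assumed criterion).
    """
    pairs = []
    for (i, p), (j, q) in combinations(enumerate(neighbors), 2):
        v = (p[0] - center[0], p[1] - center[1])
        w = (q[0] - center[0], q[1] - center[1])
        norm = math.hypot(*v) * math.hypot(*w)
        if norm == 0:
            # A neighbor sits exactly on the center; that case is not
            # handled by this sketch.
            continue
        cross = v[0] * w[1] - v[1] * w[0]
        if abs(cross) / norm < tol:
            pairs.append((i, j))
    return pairs


def stereo_is_ambiguous(center, neighbors, tol=1e-2):
    """Flag a center if any neighbors overlap or are collinear with it."""
    return bool(
        overlapping_pairs(neighbors, tol)
        or collinear_pairs(center, neighbors, tol)
    )
```

Note that this criterion is deliberately strict: it would also flag a four-coordinate center drawn with two opposite bonds at 180 degrees, which may well be interpretable, so a real check would probably need to take the number of neighbors and the wedge bonds into account.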

Here are some examples we found in which the atoms around a chiral center either overlap or lie on the same line:

CHEMBL3617051
CHEMBL3590586
CHEMBL3590585
CHEMBL3590584
CHEMBL3590587
CHEMBL3752539

Since it's not possible to correctly interpret the stereochemistry of those atoms, I believe the stereochemistry should either be removed or the structure itself should be rejected. I'm probably going to add something to remove the stereo at the RDKit level anyway, but I think that it's worth discussing doing something about this during structure import as well.

greglandrum, Dec 05 '22 17:12