ChEMBL_Structure_Pipeline
Consider alternate handling of chiral centers which have overlapping atoms or atoms on the same line
The standardizer currently assigns a penalty of 5 when there are overlapping atoms or 6 if there are more than six overlapping atoms. This still allows molecules which have overlapping atoms around a chiral center to pass through.
We also don't have any checks for molecules which have chiral centers with chiral atoms (atoms around a chiral center) which lie on the same line.
Here are some examples we found where the atoms around a chiral center either overlap or lie on the same line:
CHEMBL3617051
CHEMBL3590586
CHEMBL3590585
CHEMBL3590584
CHEMBL3590587
CHEMBL3752539
Since it's not possible to correctly interpret the stereochemistry of those atoms, I believe the stereo should either be removed or the structure itself should be rejected. I'm probably going to add something to remove the stereo at the RDKit level anyway, but I think it's worth discussing doing something about this during structure import as well.
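To make the proposal concrete, here is a minimal sketch of the kind of geometric check being discussed. It is not the standardizer's code: the function names, the `tol` threshold, and the reading of "on the same line" as "two neighbors lying along the same direction from the chiral center" are all assumptions made for illustration. It works on plain 2D coordinates so it stays independent of any toolkit.

```python
import math
from itertools import combinations


def has_overlap(center, neighbors, tol=1e-3):
    """Return True if any two of the center/neighbor points are closer than tol.

    `tol` is a hypothetical threshold in drawing units.
    """
    points = [center, *neighbors]
    return any(math.dist(p, q) < tol for p, q in combinations(points, 2))


def has_same_direction_neighbors(center, neighbors, tol=1e-3):
    """Return True if two neighbors lie along the same ray from the center.

    Assumption: "on the same line" means two bond vectors that point in the
    same direction. Opposite bonds (180 degrees apart) are not flagged.
    """
    unit_vectors = []
    for x, y in neighbors:
        dx, dy = x - center[0], y - center[1]
        norm = math.hypot(dx, dy)
        if norm == 0:
            continue  # a neighbor sitting on the center is caught by has_overlap
        unit_vectors.append((dx / norm, dy / norm))
    for (ax, ay), (bx, by) in combinations(unit_vectors, 2):
        cross = ax * by - ay * bx  # ~0 when the two vectors are parallel
        dot = ax * bx + ay * by    # >0 when they point the same way
        if abs(cross) < tol and dot > 0:
            return True
    return False


def stereo_is_ambiguous(center, neighbors, tol=1e-3):
    """Combine both checks for one chiral center and its neighbor coordinates."""
    return has_overlap(center, neighbors, tol) or has_same_direction_neighbors(
        center, neighbors, tol
    )
```

In practice the coordinates would come from the molfile's atom block (for example via an RDKit conformer's `GetAtomPosition`), and a positive result could either clear the stereo on that center or contribute to rejecting the structure, depending on what is decided here.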