AromaticityProcessor fails on certain molecules
While testing thousands of molecules from the COD, I found another issue with the AromaticityProcessor: aromaticity is not always assigned correctly.
As an example, I pasted two SD files at the link below: 1.) the structure in kekulized form, and 2.) the same structure in aromatized form, where only two of the four rings of the aromatic system are assigned the aromatic bond order.
http://pastebin.com/CNLs2PQ3
(P.S.: It seems it is not possible to attach files to an issue?)
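For anyone who wants to compare the two records without opening them in a viewer: in a V2000 molfile, a bond type of 4 in the bond block marks an aromatic bond. Below is a minimal sketch that lists those bonds. It assumes the records are plain V2000 mol blocks (fixed-width columns, fewer than 1000 atoms and bonds); the function name `aromatic_bonds` is my own and not part of BALL.

```python
from typing import Set, Tuple


def aromatic_bonds(mol_block: str) -> Set[Tuple[int, int]]:
    """Return the atom-index pairs of all bonds with V2000 bond type 4 (aromatic).

    Assumption: mol_block is a single V2000 record, i.e. three header lines,
    a counts line, the atom block, and then the bond block.
    """
    lines = mol_block.splitlines()
    counts = lines[3]                  # counts line follows the 3 header lines
    n_atoms = int(counts[0:3])         # columns 1-3: number of atoms
    n_bonds = int(counts[3:6])         # columns 4-6: number of bonds

    first_bond = 4 + n_atoms           # bond block starts after the atom block
    bonds = set()
    for line in lines[first_bond:first_bond + n_bonds]:
        a = int(line[0:3])             # first atom index (1-based)
        b = int(line[3:6])             # second atom index (1-based)
        bond_type = int(line[6:9])     # 1 single, 2 double, 3 triple, 4 aromatic
        if bond_type == 4:
            bonds.add((min(a, b), max(a, b)))
    return bonds
```

Running this on the aromatized record should show directly which ring bonds were left with bond types 1 and 2 instead of 4.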
I am not sure how important these issues with the processors and molecular descriptors are for the future of BALL, but here is one additional thought on this one:
Because the AromaticityProcessor builds upon the RingPerceptionProcessor, one should first fix the RingPerceptionProcessor (issue #536) and then check whether the aromaticity assignment is fixed as well.
I also found that this problem is non-deterministic and frequently occurs with ring systems that consist of four benzene rings.
In such a setting, two rings are aromatized and two are not (although they should be). Between multiple program starts with identical input, the algorithm assigns different pairs of benzene rings as aromatic and non-aromatic. By chance, the pairs can be the same in two runs.
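To pin down this kind of run-to-run variation, one can collect the set of bonds flagged as aromatic in each run and look at what differs. Here is a rough sketch of that comparison; the helper names `is_deterministic` and `unstable_bonds` are hypothetical, and the bond sets are assumed to have been extracted from the output of each run beforehand.

```python
from typing import Iterable, List, Set, Tuple

Bond = Tuple[int, int]


def is_deterministic(runs: Iterable[Iterable[Bond]]) -> bool:
    """True if every run flagged exactly the same set of aromatic bonds."""
    distinct = {frozenset(run) for run in runs}
    return len(distinct) <= 1


def unstable_bonds(runs: Iterable[Iterable[Bond]]) -> Set[Bond]:
    """Bonds that are aromatic in at least one run but not in all runs."""
    sets: List[Set[Bond]] = [set(run) for run in runs]
    if not sets:
        return set()
    in_any = set().union(*sets)                  # aromatic in some run
    in_all = sets[0].intersection(*sets[1:])     # aromatic in every run
    return in_any - in_all
```

If `unstable_bonds` comes back empty while rings are still missing, the remaining error is a deterministic one and has to be looked for in the aromaticity assignment itself.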
So fixing the non-deterministic RingPerceptionProcessor (commit 4ff21cbef928ea592dc7076d99011e123c991712) also removed the non-determinism from the AromaticityProcessor. Still, the aromaticity assignment is only partially correct for the given example: only two of the four aromatic rings are recognized as such.