openff-toolkit icon indicating copy to clipboard operation
openff-toolkit copied to clipboard

Raise exception when processing molecule with radical electrons

Open mattwthompson opened this issue 3 years ago • 2 comments

Here's a block to add to the changelog, keeping it out of the tree to make merges easier:

- [PR #1236](https://github.com/openforcefield/openff-toolkit/pull/1236) raises
  [`RadicalsNotSupportedError`](openff.toolkit.utils.exceptions.RadicalsNotSupportedError) when
  [`Molecule.from_openeye`](openff.toolkit.topology.Molecule.from_openeye),
  [`Molecule.from_rdkit`](openff.toolkit.topology.Molecule.from_rdkit),
  [`Molecule.from_smiles`](openff.toolkit.topology.Molecule.from_smiles), or similar methods
  detect that a molecule contains radical electrons.

Resolves https://github.com/openforcefield/openff-toolkit/issues/1075, assuming I haven't forgotten how chemistry works.

This is a follow-on to #1098 targeting the v0.11.x line.

Two remaining considerations may be:

  • Performance evaluation - this adds an O(number of atoms) check to two key conversion methods.
  • False positives - I have no idea if this will be too noisy and claim to have found radicals where none exist. Throwing this at a large dataset should give an indication of this.

mattwthompson avatar Mar 29 '22 19:03 mattwthompson

Codecov Report

Merging #1236 (63d2a48) into main (3bdc38f) will decrease coverage by 0.11%. The diff coverage is 100.00%.

:exclamation: Current head 63d2a48 differs from pull request most recent head 6dc41ca. Consider uploading reports for the commit 6dc41ca to get more accurate results

Additional details and impacted files

codecov[bot] avatar Mar 29 '22 20:03 codecov[bot]

Some InChi roundtrips are showing different behavior; I don't know if that's good or bad.

I've added a milestone for 0.11.1 because this is not a requirement for the 0.11.0 release.

mattwthompson avatar Mar 29 '22 20:03 mattwthompson

I'll look into these test failures, but to recap a statement I just made in a synchronous meeting with @mattwthompson:

There are three outcomes to a user loading a molecule with a radical into OFFTK:

  1. The molecule could be loaded with a cheminformatics toolkit "fixing" the radical by adding an H
  2. The user could get an error
  3. The user could get an OFFMol with a radical

Right now, trying to parameterize neutral CH3 gets us errors from quacpac and antechamber, so that's clearly an invalid input for our infrastructure. CCCC[S+] emitted an antechamber warning, but succeeded in parameterization and resulted in all-0 partial charges (!).

So, I'm coming around to the idea of totally forbidding radicals for now in the toolkit, at least until we have defined behavior specified, and we establish that we explicitly DO aim to have those in scope.

j-wags avatar Sep 16 '22 20:09 j-wags

Allllright, I think this should be fixed throughout but I'll wait for tests to pass before running a final review. Minidrugbank is such an awful test set, maybe things are finally quiet enough that I can sit down and make a new one.

j-wags avatar Sep 21 '22 00:09 j-wags

Humor shows up in unexpected places:

=================================== FAILURES ===================================
_ examples/check_dataset_parameter_coverage/check_parameter_coverage.ipynb::Cell 2 _
Notebook cell execution failed
Cell 2: Cell execution caused an exception

Input:
molecules = Molecule.from_file("example_molecules.smi", allow_undefined_stereo=True)

# We also provide a SMILES dataset of ~1000 problematic molecules
# molecules =  Molecule.from_file('problem_smiles.smi', allow_undefined_stereo=True)

print(f"Loaded {len(molecules)} molecules")

Traceback:

---------------------------------------------------------------------------
RadicalsNotSupportedError                 Traceback (most recent call last)
Cell In [4], line 1
----> 1 molecules = Molecule.from_file("example_molecules.smi",allow_undefined_stereo=True)
      3 # We also provide a SMILES dataset of ~1000 problematic molecules
      4 # molecules =  Molecule.from_file('problem_smiles.smi', allow_undefined_stereo=True)
      6 print(f"Loaded {len(molecules)} molecules")

mattwthompson avatar Sep 23 '22 01:09 mattwthompson

Humor shows up in unexpected places:

Laughing it up all the way to the deprecated folder. I don't feel like rewriting a large fraction of that notebook, and it was largely a early marketing tool to show groups "hey look, we COULD parameterize your molecules". We have more impressive things to show off now.

j-wags avatar Sep 23 '22 02:09 j-wags