openff-toolkit icon indicating copy to clipboard operation
openff-toolkit copied to clipboard

Topology molecules without topology

Open ijpulidos opened this issue 3 years ago • 3 comments

Describe the bug I create a new topology molecule specifying its topology and the specified topology does not have the topology molecule as part of its topology molecules. Should there even be the possibility to have topology molecules without being part of a topology?

To Reproduce

from openff.toolkit.topology import Topology, TopologyMolecule, Molecule
topology = Topology()
ethanol_molecule = Molecule.from_smiles('CCO')
topology_molecule = TopologyMolecule(ethanol_molecule, topology)
print(topology.topology_molecules)

Output This prints an empty list.

[]

Computing environment (please complete the following information):

  • OS: Linux
  • Output of running conda list https://gist.github.com/ijpulidos/7d140d3cf4f3285ae6e9d1c48a8ba3e3

Additional context I expect the specified topology to add the molecule to its topology_molecules list. This also create some troubles with the atom_start_topology_index method in TopologyMolecule https://github.com/openforcefield/openff-toolkit/blob/1bb07cfa4ceffee4a1df760e44fdbdb1d281d1c7/openff/toolkit/topology/topology.py#L965 since it would try to loop through an empty list (no loop).

ijpulidos avatar May 03 '21 20:05 ijpulidos

Should there even be the possibility to have topology molecules without being part of a topology?

I can't think of a use case for this. Nor any reason why a TopologyMolecule would be directly constructed (i.e. called by the user, like in your snippet, instead of constructed internally in Topology.add_molecule). Would that be reason enough to have it be a "private" class?

mattwthompson avatar May 03 '21 21:05 mattwthompson

This is a general problem in our object models, likely of my own making. The same thing happens when an Atom can optionally have a molecule set in its constructor, even if the molecule doesn't know the atom exists.

I think it's OK to keep the TopologyMolecule class and constructor exposed publicly. Maybe the pattern to follow is that, instead of running

class TopologyMolecule:
    def __init__(self, ..., topology=None, ...)
        ...
        self._topology = topology
        return self

we do

class TopologyMolecule:
    def __init__(self, ..., topology=None, ...)
        ...
        if topology is not None:
            topology.add_topology_molecule(self)
        return self

We could also ensure that we never implement a @topology.setter property, so that things can't fall out of sync.

j-wags avatar May 04 '21 18:05 j-wags

The problem with the suggested approach from @j-wags is that it can easily get circular, as noted in a separate discussion.

So if you call add_molecule you would end up using the same TopologyMolecule constructor, as in https://github.com/openforcefield/openff-toolkit/blob/1bb07cfa4ceffee4a1df760e44fdbdb1d281d1c7/openff/toolkit/topology/topology.py#L2912-L2914

Maybe the solution could be implementing some add_topology_molecule in the topology, such that it does not have to use the TopologyMolecule constructor on the fly, so we can then call that from the TopologyMolecule constructor.

ijpulidos avatar May 05 '21 20:05 ijpulidos

As TopologyMolecule is deprecated and will soon be removed (#1506) , leaving only a short window for this to possibly be fixed, which I don't think is likely to happen.

If I understand this behavior correctly, it is no longer an issue with the removal of TopologyMolecule.

mattwthompson avatar Jan 18 '23 17:01 mattwthompson