mdanalysis Implement fancy guessers (e.g., force-field aware)

So with #363 we removed guessing of various things in Parsers (ie whilst reading the files) and deferred this guessing to be after Universe init.

The guessing now needs to be attached to Universe. These need to create the appropriate Attribute and add it to Universe (u.add_TopologyAttr). So you should be able to call something like u.guess_masses().

Guessers required for:

masses
charges
types? (used to work off name)

PR onto the issue-363 branch

Dec 27 '15 17:12 richardjgowers

So as a signature for this, I was thinking..

from MDAnalysis.core.guessers import guess_masses

masses = guess_masses(u)  # should this add itself to Universe here?

u.topology.add_attribute(guess_masses(u))  # or like this so it's explicit

For now this can follow the old system, but we should leave room for specifying a direction to guess in

guess_masses(u, basis='martini')  # uses Martini names instead?

Aug 25 '16 13:08 richardjgowers

I'm not sure I follow some of the details: With your proposed u.topology.add_attribute(guess_masses(u)), at which point does the TopologyAttribute get created and added to the Universe? It can't be done from within topology because Topology objects should be Universe-agnostic, right?

Or do you mean to have the masses TopologyAttr created empty at Topology instantiation time, added like that to the Universe, and finally supplemented by guess_masses(u)?

And finally, does guess_masses actually have to have a Universe argument? Shouldn't it be just topology-dependent?

Aug 25 '16 14:08 mnmelo

Sorry, I think I got it wrong. We need to add it at Universe level (so it gets propagated into Groups)

https://github.com/MDAnalysis/mdanalysis/blob/issue-363/package/MDAnalysis/core/universe.py#L370

So the Guesser here is what creates the TopologyAttribute

So..

guess_masses(u)  # creates the TopologyAttribute

u.add_TopologyAttribute(guess_masses(u))  # creates and adds to Universe

I'm using u as the argument there as it's the most foolproof way of doing this, I guess here technically all that is needed is u.atoms.names.

Aug 25 '16 14:08 richardjgowers

I still don't fully understand why this is a Universe-level operation. The Topology has all that's needed for the guessing.

Aug 25 '16 14:08 mnmelo

For creating the TopologyAttribute you can change the arguments to whatever you want/need.

The attribute needs to be added to the Universe so that .masses gets added to AtomGroup

Aug 25 '16 14:08 richardjgowers

@mnmelo the Topology object only holds the information; it doesn't manage the AtomGroup, ResidueGroup, and SegmentGroup definitions. The Universe is ultimately what does this, so it make sense for the method to be at the Universe level, since it will modify its Topology object and correspondingly modify its *Group classes.

Aug 26 '16 02:08 dotsdl

I had a chat with @mnmelo about guessers a fews days ago. I would implement each guesser (mass, type, bonds...) as a method of a guesser class. The current implementation would go in some DefaultGuessers or BaseGuessers class. Specialised guessers would inherit from that DefaultGuessers class and overload the relevant methods. This way we could have a GromosGuessers or a MartiniGuessers that would be aware of united atoms or coarse-grained beads. This class model has benefits for both the end-users and the developers.

For the end-users, it allows to specify what kind of system he is about to work with and get the guess done accordingly. API-wise, I see something as follow:

# By default, guess with the default guessers as we currently do
u = Universe(top, traj)
# We can explicitly ask for guesses, assuming the default guessers
u = Universe(top, traj, guess=True)
# We can explicitly ask for the default guessers
u = Universe(top, traj, guess=DefaultGuessers)
# We can make guesses for a Martini system
u = Universe(top, traj, guess=MartiniGuessers)
# We can deactivate guessing
u = Universe(top, traj, guess=False)
u = Universe(top, traj, guess=None)

# Would we provide several sets of guessers, they would be easily
# discoverable as the following line does work well with autocompletion:
from MDAnalysis.topology.guessers import MartiniGuessers

For the developper, it allows to easily build on top of an existing collection of guesser by simply inherit from it. It also allows to have custom guessers in a different module. This last point is especially convenient as it allows to develop and maintain a collection of guessers separately from MDAnalysis. People developing force fields or parametrizing new molecules could keep their guessers in sync with their parameters regardless of MDAnalysis updates. In practice, it means for me that I can switch MDAnalysis branch, and still have the guessers for my force field developments.

One could want to guess bonds but not masses. Universe could have a keyword argument to_guess that would take a list like ['bonds', 'angles', 'mass'], and a keyword argument not_to_guess.

I most likely overlook some critical use cases. But this is what I came with after discussing with @mnmelo .

Sep 02 '16 13:09 jbarnoud

This ties in with #942. By default, we don't want all guessing to be active, right? Bonds/angles aren't guessed by default right now. This means that the default is roughly

u = Universe(top, traj, guess=DefaultGuessers, to_guess=['masses'])

Could we also allow the use of Guessers instances along with Guessers classes? We could then pass to the Guessers' __init__ the list of what to guess or not, and simplify Universe's kwarg treatment:

u = Universe(top, traj, guess=DefaultGuessers(to_guess=['masses']))

And what if I already have a Universe set up and want to guess bonds on it? Which method(s) should I call?

Sep 02 '16 13:09 mnmelo

Could we also allow the use of Guessers instances along with Guessers classes?

I see nothing against this.

And what if I already have a Universe set up and want to guess bonds on it? WHich method(s) should I call?

It could be something like u.guess(MySuperGuessers, to_guess=['masses', 'charges']).

Sep 02 '16 14:09 jbarnoud

We would probably want shortcuts, then, right?

class Universe(object):
    ...
    def guess_masses(self, guessers=MDAnalysis.guessers.DefaultGuessers):
        return self.guess(guessers, to_guess=['masses'])

Sep 02 '16 14:09 mnmelo

We could have shortcuts of course, but we still need the generic method. Indeed, with the new topology scheme we can have any arbitrary topology attribute attached to a system. One can therefore have a guesser that guesses the propensity of atoms toward chocolate cakes. We do not want a guess_chocolate_propensity method, but we want to be able to do u.guess(DefaultGuessers, to_guess=['chocolate_propensity']).

Sep 02 '16 14:09 jbarnoud

This all sounds good. I'd put them in mda.guessers rather than hidden in mda.topology

We could do another register trick with the Guesser class (like we do with Readers) so that the name of the Guesser gets added to mda, allowing something like

# in guess_martini.py
class MartiniGuesser(BaseGuesser):
    basis_name='martini'

# somewhere else
u.guess(basis='martini', to_guess=['masses', 'sigma', 'epsilon'])

But maybe this is isn't necessary.....

One little thought, it should be made easy to supplement an existing guesser, ie if I've added an atomtype, but maybe this is easily done if the variables are class attributes

g = mda.guessers.MartiniGuesser

g.masses['me'] = 100.0
g.charge['me'] = -1.5

u.guess(g, to_guess=['mass', 'charge'])

Sep 02 '16 14:09 richardjgowers

We could do the register trick assuming we can still pass a class directly. I am almost sure I will end up with 5 different versions of a martini guesser in some messy notebook.

I agree that the guessers should be as easy as possible to extend. I did not think about these details yet, but your exemple make sense.

Sep 02 '16 15:09 jbarnoud

I'm writing up an implementation of this API. Stay tuned...

Sep 03 '16 11:09 mnmelo

@mnmelo are you still free to do this or is this open to all?

Sep 21 '16 19:09 richardjgowers

I left this open pending the decisions in #942. If you have availability I'd say go ahead because mine will certainly be flaky.

Sep 21 '16 19:09 mnmelo

From the comments in 1524f75 we may have to think about format-aware guessing, in which guessing might depend on the topology being used. I'm not sure I like the loss of abstraction in this.

Sep 23 '16 12:09 mnmelo

@richardjgowers wasn't there some work about this already in the topology refactor branch?

Nov 27 '16 11:11 kain88-de

All I did was reimplement the existing guessers in topology/guessers. I didn't do any of the fancy and cool things discussed here

On Sun, 27 Nov 2016, 11:55 a.m. Max Linke, [email protected] wrote:

@richardjgowers https://github.com/richardjgowers wasn't there some work about this already in the topology refactor branch?

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/598#issuecomment-263118040, or mute the thread https://github.com/notifications/unsubscribe-auth/AI0jB5Rvu7Y27NJflMyCJZfMXTg0aR0Rks5rCW_CgaJpZM4G7m9M .

Nov 27 '16 12:11 richardjgowers

@richardjgowers aren't guessers implemented now?

May 26 '17 13:05 kain88-de

The guessers we've always had are still there, I think this issue refers to having cool ones like outlined here.

May 26 '17 13:05 richardjgowers

Can this issue be closed with PR #3704 merged? It would be cleaner to raise issues for individual contexts/guessers that experts could start working on. (ping @lilyminium @IAlibay )

Dec 18 '24 19:12 orbeckst

I'm in favour of closing -- I agree the base machinery is present now and FF-specific contexts should have separate issues.

Jan 07 '25 00:01 lilyminium

Sorry you pinged me whilst I was OOO @orbeckst

It would be cleaner to raise issues for individual contexts/guessers

I just had a quick look at our issue tracker and, as far as I can tell, there aren't any other issues for different guesser types. If we open some then this can be closed - that being said a "meta issue" is also kinda useful so if folks don't have the energy to open a separate one, this is a good as any.

Jan 07 '25 01:01 IAlibay