
Make further modifications for parameter manipulation in ForceField

Open davidlmobley opened this issue 8 years ago • 8 comments

See discussion in #84 and #86. #86 implemented very basic infrastructure for parameter manipulation. Some other things we will likely need:

  • GetParameterIDs, getting all parameter IDs for a particular section (needed for determining unique parameter IDs for new child parameters)

Presumably later we will also need the ability to introduce new parameters, such as something like:

  • CreateChildParameter(smirks, parentsmirks, section, paramdict)
  • DeleteChildParameter which would delete a specified parameter which is a child of another parameter

We will probably also need something which tracks the parameter hierarchy so we know which parameters can be deleted (at least without substantial work, only parameters which have no children can be deleted). We have a parent_id in the XML now which is used for that, so we'd just need functionality for retrieving this.
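As a rough illustration of the hierarchy tracking described above, here is a minimal sketch that finds the parameters with no children. It assumes each parameter is available as a dict with `id` and `parent_id` keys; that representation and the helper names `build_children_map` and `childless_ids` are hypothetical, not part of the existing ForceField code.

```python
from collections import defaultdict


def build_children_map(params):
    """Map each parameter id to the ids of its direct children."""
    children = defaultdict(list)
    for p in params:
        parent = p.get("parent_id")
        if parent is not None:
            children[parent].append(p["id"])
    return dict(children)


def childless_ids(params):
    """Return ids of parameters that have no children (deletion candidates)."""
    children = build_children_map(params)
    return [p["id"] for p in params if not children.get(p["id"])]


# Toy data (assumed ids, for illustration only).
params = [
    {"id": "b0001", "parent_id": None},
    {"id": "b0002", "parent_id": "b0001"},
    {"id": "b0003", "parent_id": "b0001"},
    {"id": "b0004", "parent_id": "b0002"},
]
candidates = childless_ids(params)
```

With this toy data, `b0003` and `b0004` are the only entries without children, so they would be the only candidates for deletion.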

davidlmobley avatar Aug 02 '16 20:08 davidlmobley

Additional things which were discussed on today's call that we will need:

  • siblingShuffle( args ) - propose shuffling the order of siblings (with no descendants) within the hierarchy
  • getRandomParameter( args ) - get a random parameter of a particular force type
  • getParents( args ) - get parents of a particular parameter
  • getChildren( args ) - get children of a particular parameter

Should we also have getHierarchy and getDescendants, for access to the full tree of parameters for a particular force type? I'm guessing the answer is "yes" as these will be useful at least for visualization purposes but may have functional use as well.
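To make the getDescendants / getHierarchy idea concrete, here is a small sketch of a depth-first walk. It assumes the hierarchy is already available as a dict mapping a parameter id to a list of child ids; the function names and the toy ids are placeholders, not existing API.

```python
def get_descendants(children, param_id):
    """Return all descendants of param_id in depth-first order."""
    found = []
    for child in children.get(param_id, []):
        found.append(child)
        found.extend(get_descendants(children, child))
    return found


def format_hierarchy(children, root, depth=0):
    """Return indented lines describing the tree below root."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(format_hierarchy(children, child, depth + 1))
    return lines


# Toy hierarchy (assumed ids, for illustration only).
children = {"n0001": ["n0002", "n0003"], "n0002": ["n0004"]}
descendants = get_descendants(children, "n0001")
tree_lines = format_hierarchy(children, "n0001")
```

Printing `"\n".join(tree_lines)` gives an indented readout that could serve as a simple diagnostic or as input to a visualization.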

Another key issue: How do we specify what parameters we want to work with?

Currently, for my prototyping, I've been using getParameter and setParameter which allow selection of a set of parameters associated with a particular SMIRKS and force term based on a (parameter ID OR smirks pattern) and, optionally a force type to search such as HarmonicBondForce, i.e. as described here:

    def getParameter(self, smirks=None, paramID=None, force_type='Implied'):
        """Get info associated with a particular parameter as specified by SMIRKS or parameter ID, and optionally force term.

        Parameters
        ----------
        smirks : str, optional
            Default None. If specified, will pull parameters on line containing this `smirks`.
        paramID : str, optional
            Default None. If specified, will pull parameters on line with this `id`
        force_type : str, optional
            Default "Implied". Optionally, specify a particular force type such as
            "HarmonicBondForce" or "HarmonicAngleForce" etc. to search for a
            matching ID or SMIRKS.

        Returns
        -------
        params : dict
            Dictionary of attributes (parameters and their descriptions) from XML
        """

There's a possible point of confusion in my terminology as the function is called getParameter but returns the parameterS associated with a particular entry in the XML file (i.e. for a bond it would be length, force constant, and any other info tied to that), so maybe I should pluralize the names.

I'm by no means attached to this particular approach, though it currently works well, e.g. params = newff.getParameter(smirks='[#6X4:1]') is a nice compact way to retrieve the relevant info. (This takes advantage of the fact that the SMIRKS pattern is currently unique in the XML, so no force type needs to be specified.)
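For readers who want to see the lookup idea in isolation, here is a standalone sketch using only the standard library. The XML below is a made-up toy fragment, and `get_parameters` is a hypothetical free function rather than the actual getParameter method; the real file layout and attribute names may differ.

```python
import xml.etree.ElementTree as ET

# Toy XML (assumed layout and values, for illustration only).
TOY_XML = """
<SMIRFF>
  <HarmonicBondForce>
    <Bond smirks="[#6X4:1]-[#6X4:2]" id="b0001" length="1.526" k="620.0"/>
    <Bond smirks="[#6X4:1]-[#1:2]" id="b0002" length="1.090" k="680.0"/>
  </HarmonicBondForce>
  <NonbondedForce>
    <Atom smirks="[#6X4:1]" id="n0001" rmin_half="1.908" epsilon="0.1094"/>
  </NonbondedForce>
</SMIRFF>
"""


def get_parameters(root, smirks=None, param_id=None, force_type=None):
    """Return the attribute dict of the first entry matching smirks or id."""
    if smirks is None and param_id is None:
        raise ValueError("Specify smirks or param_id")
    for section in root:
        # Skip sections of other force types when one is requested.
        if force_type is not None and section.tag != force_type:
            continue
        for entry in section:
            if smirks is not None and entry.get("smirks") == smirks:
                return dict(entry.attrib)
            if param_id is not None and entry.get("id") == param_id:
                return dict(entry.attrib)
    return None


root = ET.fromstring(TOY_XML)
params = get_parameters(root, smirks="[#6X4:1]")
bond = get_parameters(root, param_id="b0002", force_type="HarmonicBondForce")
```

Because the returned dict holds every attribute on the matching line, a plural name such as get_parameters arguably describes the behavior better.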

@jchodera - Is there another framework which would be better? This will be key for determining which arguments the above functions ought to take.

I'm happy to implement, I'd just like to know what you see as the best way of specifying which parameter set to work on.

davidlmobley avatar Aug 10 '16 00:08 davidlmobley

@jchodera , input on this?

davidlmobley avatar Aug 16 '16 00:08 davidlmobley

It would help if we first ask: "What will our main activities be for this API?" and then see if the proposed API satisfies all these needs.

We'll want to code up several kinds of proposal engines, but the most basic ones need to do the following:

Propose adding a new parameter, potentially restricted to a given class (nonbonded, bond, angle, torsion)

In this case, we want to

  • select a parameter, possibly from a given class, presumably at random, to be the parent of a new child
  • evaluate the probability the new parameter will be selected by the "propose deleting a parameter" call so we can compute the Metropolis-Hastings acceptance probability

Propose deleting a parameter

In this case, we want to:

  • select a parameter, possibly from a given class, to delete
  • evaluate the probability this parameter's parent would be selected to add a child parameter by the "propose adding a new parameter" call

Propose a parameter to perturb

In this case, we want to:

  • select a parameter, possibly from a given class, to perturb
  • evaluate the probability this parameter was selected by the same method
  • set the parameter with new values

Walk the hierarchy to generate diagnostic readouts

We may want some methods to help us generate diagnostic info.

Propose the exchange of two or more exchangeable parameters

I think this requires we

  • select a parent parameter at random
  • get a list of all children that can be reordered
  • set a new order for these children
  • evaluate the probability the parent was selected and the children were reordered (this last part would be computed by whatever code does the reordering)
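The steps above could be sketched roughly as follows. This assumes a uniform choice among parents with at least two reorderable children and a uniform random permutation of those children; both choices, along with the function name and the dict-based input, are assumptions for illustration only.

```python
import math
import random


def propose_sibling_shuffle(reorderable, rng=random):
    """Pick a parent and a new order for its reorderable children.

    reorderable maps a parent id to a list of reorderable child ids.
    Returns (parent, new_order, probability) or None if nothing can move.
    """
    candidates = sorted(p for p, kids in reorderable.items() if len(kids) >= 2)
    if not candidates:
        return None
    parent = rng.choice(candidates)
    new_order = list(reorderable[parent])
    rng.shuffle(new_order)
    # Uniform parent choice times uniform permutation of k children.
    prob = (1.0 / len(candidates)) / math.factorial(len(new_order))
    return parent, new_order, prob


# Toy data (assumed ids, for illustration only).
reorderable = {"b0001": ["b0002", "b0003", "b0004"], "b0005": ["b0006"]}
parent, new_order, prob = propose_sibling_shuffle(reorderable, random.Random(0))
```

Under these assumptions the proposal is symmetric, since the reverse shuffle has the same probability, so the proposal ratio in the acceptance criterion would be 1.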

Does that cover this?

jchodera avatar Aug 16 '16 03:08 jchodera

@jchodera - I think I agree with the basic "main activities" you summarize, and what those would involve.

So we would need to implement:

  • Pick a parameter, possibly from a given class, at random (for deleting, giving a child, or perturbing)
  • Get children of a selected parent which can be re-ordered (children with no descendants)
  • Shuffle order of children which can be re-ordered given a selected parent

We'll also need, as you note, these to return the probabilities of making the specific choices made.

Plus some utility tools for diagnostics, etc.:

  • Walk the hierarchy and report full hierarchy
  • Get parent of a particular parameter
  • Get all children of a selected parent
  • Get a specific parameter (not just a random one), e.g. as in my getParameter example above (for debugging, I might want to explore only what happens to a specific vdW parameter in a specific short test)
  • Set a specific parameter (not just a random one)

Proposed names and ideas on arguments

  • Pick a random parameter: getRandomParameter(args); just takes an optional force_type as an argument...?
  • Get children of selected parent which can be re-ordered (or all children): getChildren(args); this would need a unique identifier of what parameter we're talking about. I'd suggest the parameter id for conciseness (with the intent that the parameter id be a unique identifier in the forcefield, so it specifies a parameter line and force type), though we could optionally use SMIRKS and force_type instead. It could optionally take a reorderable argument which would toggle behavior between giving back all children (for hierarchy retrieval) and just those which are reorderable.
  • Shuffle order of re-orderable children: childShuffle(args) which takes a parent and shuffles re-orderable children.
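As one possible shape for getRandomParameter, here is a sketch that returns both the choice and the probability of making it. It assumes a uniform draw and a plain dict of parameter ids keyed by force type; the function name, the input structure, and the returned (force_type, id) pair are all placeholders pending the key question below.

```python
import random


def get_random_parameter(ids_by_force, force_type=None, rng=random):
    """Pick a parameter uniformly; return ((force_type, id), probability)."""
    if force_type is None:
        # Pool every parameter across all force types.
        pool = [(ft, pid)
                for ft, ids in sorted(ids_by_force.items())
                for pid in ids]
    else:
        pool = [(force_type, pid) for pid in ids_by_force[force_type]]
    if not pool:
        raise ValueError("No parameters to choose from")
    return rng.choice(pool), 1.0 / len(pool)


# Toy data (assumed ids, for illustration only).
ids_by_force = {
    "HarmonicBondForce": ["b0001", "b0002"],
    "NonbondedForce": ["n0001"],
}
(chosen_force, chosen_id), prob_any = get_random_parameter(
    ids_by_force, rng=random.Random(1))
(_, bond_id), prob_bond = get_random_parameter(
    ids_by_force, force_type="HarmonicBondForce", rng=random.Random(1))
```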

I'll get to the utility functions after we work through these.

Key question

A key question for finalizing the arguments is how we want to identify what parameter line we're working on from SMIRFF, i.e. once we pick something via getRandomParameter we have to pass our choice around. Do we use SMIRKS and force_type everywhere (these will always uniquely specify a parameter line and are the basic functional unit) or do we use the parameter_id (which we can make so it always uniquely specifies a parameter line), or something else?

davidlmobley avatar Aug 17 '16 04:08 davidlmobley

@jchodera - one other question. I'm not an expert on Metropolis-Hastings yet, but it looks to me like you swapped your proposal probabilities above. Specifically, you say that for adding a new parameter, we need to evaluate the probability that this would be selected by the "propose deleting a parameter" call, and for deleting a parameter we need to "evaluate the probability this parameter's parent would be selected to add a child parameter by the 'propose adding a new parameter' call". Isn't this backwards?

Though, perhaps you're just saying that we need both g(x'|x) and g(x|x') for move evaluation, i.e. for adding a new parameter we need both the probability of proposing that specific addition, and the probability of proposing a deletion which would have gotten us here. Is that right?

davidlmobley avatar Aug 17 '16 21:08 davidlmobley

> Isn't this backwards?

The Metropolis-Hastings acceptance criterion is

P_accept = min { 1, [P(old | new) / P(new | old)] [P(new) / P(old)] }

When proposing new given old, we need to compute the ratio [P(old | new) / P(new | old)].

Suppose we pick a parameter to delete. We need to know BOTH the probability we selected it to be deleted P(new | old) AND the probability that we could propose creating this parameter again from its parent after we've deleted it P(old | new).

Same deal for the case of creating a parameter.
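A small numerical sketch of the criterion above, for a deletion move. The specific numbers (4 deletable parameters, 5 possible parents afterwards, a 0.1 chance of regenerating the same child, and a posterior ratio of 2) are invented purely for illustration.

```python
def mh_acceptance(p_forward, p_reverse, posterior_ratio):
    """Metropolis-Hastings acceptance probability.

    p_forward       = P(new | old), probability of proposing this move
    p_reverse       = P(old | new), probability of proposing the reverse move
    posterior_ratio = P(new) / P(old)
    """
    if p_forward <= 0.0:
        raise ValueError("Forward proposal probability must be positive")
    return min(1.0, (p_reverse / p_forward) * posterior_ratio)


# Deletion: one of 4 deletable parameters is chosen uniformly.
p_delete = 1.0 / 4.0
# Reverse (creation): pick the right parent out of 5, then regenerate
# the same child with an assumed probability of 0.1.
p_recreate = (1.0 / 5.0) * 0.1
acceptance = mh_acceptance(p_delete, p_recreate, posterior_ratio=2.0)
```

With these made-up numbers the acceptance probability comes out near 0.16. For the creation move, the roles of the two proposal probabilities are simply swapped.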

jchodera avatar Aug 17 '16 22:08 jchodera

That makes sense. It's where I ultimately ended up, though it wasn't what I got when I first read your note. Thanks for the clarification.

davidlmobley avatar Aug 17 '16 22:08 davidlmobley

Offline I asked David if there was a way to extract smirks or parameter IDs from a forcefield object. It would be useful to be able to extract parameter information without knowing about the parameter. Perhaps a method that will return lists of parameter objects if given a forcetype?
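Something along these lines might work as a starting point. This is only a sketch against a made-up toy XML fragment; `list_parameters` is a hypothetical helper, and it returns (id, smirks) pairs rather than real parameter objects.

```python
import xml.etree.ElementTree as ET

# Toy XML (assumed layout and values, for illustration only).
TOY_XML = """
<SMIRFF>
  <HarmonicBondForce>
    <Bond smirks="[#6X4:1]-[#6X4:2]" id="b0001" length="1.526" k="620.0"/>
    <Bond smirks="[#6X4:1]-[#1:2]" id="b0002" length="1.090" k="680.0"/>
  </HarmonicBondForce>
  <NonbondedForce>
    <Atom smirks="[#6X4:1]" id="n0001" rmin_half="1.908" epsilon="0.1094"/>
  </NonbondedForce>
</SMIRFF>
"""


def list_parameters(root, force_type):
    """Return (id, smirks) pairs for every entry under force_type."""
    section = root.find(force_type)
    if section is None:
        return []
    return [(entry.get("id"), entry.get("smirks")) for entry in section]


root = ET.fromstring(TOY_XML)
bond_params = list_parameters(root, "HarmonicBondForce")
bond_ids = [pid for pid, _ in bond_params]
```

The same listing would also cover the GetParameterIDs need from the top of this issue, since the ids for a section are what is required to pick a unique id for a new child parameter.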

bannanc avatar Sep 23 '16 23:09 bannanc