gmso
gmso copied to clipboard
Comparison between equivalent potentials with different functional forms (e.g., k vs. k/2)
Some potentials are the same but can be expressed in different functional forms. E.g., a harmonic bond can be expressed as V(r) = 0.5*k*(r-req)**2
or V(r) = k*(r-req)**2
. Another example is the OPLS vs. Ryckaert-Bellemans forms of dihedral potentials. I think topology should support conversions between the different forms.
One method of handling this is to define different potential types for each functional form (e.g., a HarmonicBondPotential_k
and a HarmonicBondPotential_k2
) and then write conversion functions (e.g., a function that accepts a HarmonicBondPotential_k
and returns a HarmonicBondPotential_k2
, with the updated value for the force constant). Writers for the various engines could then support both forms (accepted_potentials = [ HarmonicBondPotential_k(), HarmonicBondPotential_k2()]
), and call the conversion functions to get the form needed for the engine in question.
Thoughts? If this idea has traction I'll start working on the code for a few cases.
I think we could do something like this:
>>> import sympy
>>> eq1 = sympy.sympify("k*(r-r_eq)**2")
>>> eq2 = sympy.sympify("0.5*k*(r-r_eq)**2")
>>> eq3 = sympy.sympify("k*(r-r_eq)")
>>> eq1 / eq2
2.00000000000000
>>> type(eq1/eq2)
<class 'sympy.core.numbers.Float'>
>>> eq1 / eq3
r - r_eq
>>> type(eq1/eq3)
<class 'sympy.core.mul.Mul'>
So if we divide two expressions and it gives us a sympy float, then they are equivalent. But if it yields a Mul expression, then the two expressions are not equivalent. This might be better than having multiple expressions for all of the possible leading coeffs.
Smart. That is a nice way of handling cases where the functional forms differ by a multiplicative factor. I like the idea of avoiding 'duplicate' potential definitions. In reality, it wouldn't be too much duplication, as most cases are just going to be k
or k/2.
. But avoiding it is still cleaner.
I'm trying to think about how this gets implemented in a writer ... maybe it will come to me as I work on it.
Could we internally store things in one form (call it f1()
), and compare the functional forms in the FF XML file (call it f2()
) to the internal forms when we read in the FF XML file and do the required f2() -> f1()
conversion at that time perhaps?
I think this warrants discussion, I'm not sure the best approach to these issues.
-
A practical approach could be to survey the different ways expressions, like
k
vsk/2
, can be effectively the same but strictly different, and handle each of those cases carefully. I'm not sure how long this list would be, probably not very long. The obvious downside is that it may not be flexible to corners of the molecular simulation world we're not familiar with, and it may therefore be hard (or not?) to hard-code solutions to these conflicts in the future. -
On the other hand, we could think of elegant ways to handle several ways in which this can happen, but in a manner more agnostic to the functional form (Parashara's snippet above would be one example of this). The downside here is may not be much bang for our buck; if we only have the
k
vsk/2
discrepancy then it wouldn't make much sense to build out a ton of mathematical elegance. I think the counter-argument here is that we don't necessarily need to build a ton of stuff out, we can just deal with them as they come.
What are other ways in which expressions in our domain can be similar?
Re: at the level of implementation, I think the k
vs k/2
conflict is often addressed by storing one of them as k'
, which is either half or twice the other force constant.
Another related question: How can we compare potentials with the same expression, but with different variable names? Like if someone use K
or ks
instead of k
for a force constant. Or sig
/eps
instead of sigma
/epsilon
? Just subtracting the expressions wouldn't work:
>>> import sympy
>>> func = sympy.sympify("x + y")
>>> funky = sympy.sympify("ex + why")
>>> func - funky
-ex - why + x + y
>>> sympy.simplify(func - funky) == 0
False
I think this does it https://docs.sympy.org/latest/modules/core.html#sympy.core.basic.Basic.replace
$ python
Python 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 22:33:48)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sig, eps, sigma, epsilon = sympy.symbols('sig eps sigma epsilon')
>>> lj = sig * eps
>>> lj2 = lj.replace(sig, sigma)
>>> lj, lj2
(eps*sig, eps*sigma)
So should we clean up expressions so the variable names are consistent? Then the difficult part would become identifying which variable names need to be replaced with what.
We should probably make an effort to use similar variable names through this codebase
re: parsing and identifying variables that are means to be the same, I don't know if we can avoid a hardcoded dict that maps lists of things a variable could be to what we want it to be stored as
What if we just renamed each parameter c_1
, c_2
, c_3
, ... in the order that they occur in the expressions while initializing a potential? It we similarly rename the independent variables x1
, x2
, x3
, ... then the two equations would be equivalent. However, this might make the expressions harder to read for the user.
what is an expression is written in a different order? i.e. r12 vs r6 term being first, would that break this?
Yeah. that would break it.
Then this becomes pretty messy. We could do expr.args
, convert them to strings and sort them. That will ensure a deterministic order.
Edit: A better way might be to do expr.simplify()
at the start. I think the result of that is also deterministic.
>>> import sympy
>>> x = sympy.sympify("4 * eps * ((sig / r)**6 - (sig / r)**12)")
>>> x = sympy.sympify("4 * eps * (-(sig / r)**6 + (sig / r)**12)")
>>> y = sympy.sympify("4 * eps * ((sig / r)**12 - (sig / r)**6)")
>>> z = sympy.sympify("-4 * eps * (-(sig / r)**12 + (sig / r)**6)")
>>> x.simplify()
-4*eps*sig**6*(r**6 - sig**6)/r**12
>>> y.simplify()
-4*eps*sig**6*(r**6 - sig**6)/r**12
>>> z.simplify()
-4*eps*sig**6*(r**6 - sig**6)/r**12
One other possible solution is to have one of the sympy solutions already mentioned with an optional id
or similar variable in the tag which specifies common potentials (e.g. "LJ12-6"). This would allow us to use the second where available but default to the first when necessary.