aiida-core RFC: permit class member functions as calcfunctions

It is already possible to specify a class member function as a "calcfunction" and use it when not running via the daemon (and not with caching enabled):

class MyWorkChain(WorkChain):
    @calcfunction
    def mysum(a, b):
        return a+b

    def workfunc(self):
        self.ctx.sum = MyWorkChain.mysum(self.inputs.x, self.inputs.y)

Here I simply extend the loading to unbreak caching and running via the daemon, while brushing up error handling a bit (catch one exception more and cleanup the traceback).

May 26 '21 12:05 dev-zero

As far as I can see, this doesn't bind either the class or the instance in any way - should there maybe be a @staticmethod somewhere to indicate that? From the call site it seems clear enough, but looking only at the function definition was somewhat puzzling to me.

May 26 '21 13:05 greschd

@greschd at first I thought I'd have to do that too. Indeed my first working attempt was this (since an unbound staticmethod is not a callable you have to call it via the descriptor protocol):

class MyWorkChain(WorkChain):
    @staticmethod
    def mysum(a, b):
        return a + b

MyWorkChain.mysum = calcfunction(MyWorkChain.mysum)

But the @calcfunction decorator wraps/replaces the function with a new type anyway, hence it doesn't seem necessary.

May 26 '21 13:05 dev-zero

Right, I understand the @staticmethod isn't actually necessary. What I'm a bit concerned about is that the decorated function looks very similar to a plain method (that would bind self), but doesn't act that way. I'm wondering if there's some way we could make that distinction more clear.

May 26 '21 13:05 greschd

Good point, yes.

Maybe some more things:

while doing testing related to this I encountered some weird database errors when a process could not be loaded (e.g. when loading from caching). My guess is that the loading failed but the in the error path the exception would have been added to the incompletely loaded process node.
I started messing with this since I wanted to see whether it would be possible to have class member calcfunctions which pull their arguments from the context or input automatically, such that

    @calcfunction
    def mysum(a, b):
        return a+b

    def workfunc(self):
        self.ctx.sum = MyWorkChain.mysum(self.ctx.x, self.ctx.y)

could be written as

    @calcfunction
    @args_from_context  # automatically passes members x, y from self.ctx
    def sum(x, y):
        return x+y

In the sense that such functions could be used in the outline, but also directly act as calcfunctions. So instead of having explicit delayed/async function calls as calcfunctions those class member calcfunctions in the outline would then more or less push/pop from the ctx "stack".

May 26 '21 14:05 dev-zero

* while doing testing related to this I encountered some weird database errors when a process could not be loaded (e.g. when loading from caching). My guess is that the loading failed but the in the error path the exception would have been added to the incompletely loaded process node.

Happy to take a look at the caching case if you can create a reproducing example.

I like the idea of a calcfunction that operates directly on the ctx.

May 29 '21 10:05 greschd

Was just experimenting with something similar and turning a class member method of a Data node also works:

from .int import Int
from aiida.engine import calcfunction

class CustomInt(Int):
    """`Data` sub class to represent an integer value."""

    @calcfunction
    def add(self, b):
        return self + b

when running it

In [1]: from aiida.orm.nodes.data.customint import CustomInt

In [2]: i = CustomInt(1)

In [3]: i.add(Int(2))
Out[3]: <Int: uuid: 9da61dda-71ea-4dbf-9d98-8f46676cea8d (pk: 346) value: 3>

In [5]: !verdi process list -a -p1
  PK  Created    Process label    Process State    Process status
----  ---------  ---------------  ---------------  ----------------
 345  8s ago     add              ⏹ Finished [0]

Total results: 1

It is a bit surprising after reading the analysis of having a calcfunction within a process class definition. You seem to be saying that self doesn't get bound, but in this examples, it clearly is. So not sure what the difference is.

Feb 10 '22 14:02 sphuber

My guess would be that the self here is serializable, while a workchain class instance is not.

Feb 10 '22 15:02 dev-zero

@dev-zero as you may have noticed, I have been going through old PRs to clean house a bit. This one I think can still be merged, so I took the opportunity to rebase your branch, hope you don't mind. I have also added tests that test the proposed functionality. I am testing two ways of declaring the calcfunction; as a proper staticmethod of the workchain and as a class attribute. The tests validate that it can be run as well as submitted and also that caching works. The question is now whether we should officially recommend these approaches in the documentation. Maybe we should only recommend the staticmethod approach as it seems to make explicit why self is not getting bound and passed the the calcfunction when invoked. Thoughts?

Nov 02 '22 23:11 sphuber

aiida-core aiida-core copied to clipboard

RFC: permit class member functions as calcfunctions

aiida-core
aiida-core copied to clipboard