Currently all processes (calcfunctions) have to be toplevel (importable) in modules, otherwise uncachable
Since is_valid_cache assumes that calcfunctions (all process classes/functions) are top-level (importable) in modules, calcfunctions which are not, become uncacheable. (If I understood this right)
I had classes of the following design to keep the provenance:
XY.py
def class XY:
def function_x(self, *args, **kwargs):
do something
call function_for_provenance(AiiDA_datatypes)
@calcfunction
def function_for_provenance(AiiDA_datatypes):
do stuff
Caching does not seem to work in these cases and will throw errors of the type:
File "/aiida-core/aiida/orm/nodes/process/process.py", line 485, in is_valid_cache
process_class = self.process_class
File "/aiida-core/aiida/orm/nodes/process/process.py", line 126, in process_class
process_class = getattr(module, class_name)
AttributeError: module '....XY' has no attribute 'function_for_provenance'
A solution is of course to put calcfunctions out of classes and make them top level and importable, which should be always possible I guess. If this enforcement is wanted, this should be documented in the developer guide and caching sections.
I have not thought about if is_valid_cache implementation can be changed to allow for this, do we even want to? @greschd what do you think?
I'm not entirely sure if this is just a documentation issue, or if there is a fixable bug (or several) hidden here somewhere. A few observations:
- The immediate reason this does not work is because the
.process_classproperty can not load thecalcfunction. This is simply a consequence of the node / class split, and the fact that the calcfunction is not importable. However:- it seems wrong for the
.process_classproperty to raiseAttributeError(due to this line, I assume) -- all other ways in which getting the process class can fail result inValueError. - even if the
process_classaccess works, we don't actually do anything with it in the case of process functions - because they don't have theis_valid_cachehook, it is just skipped.
- it seems wrong for the
- It would actually be possible to import and access the functions nested in a class: This is one of the reasons for the
__qualname__(since 3.3) attribute. It gives "the name relative to the top-level of the module", e.g.XY.function_for_provenance. The relevant code in creating the process type string is here: https://github.com/aiidateam/aiida-core/blob/develop/aiida/engine/processes/process.py#L486
Possible things to do:
-
I would propose to catch the
AttributeErrorinprocess_classand re-raise asValueError. That makesis_valid_cachejust returnFalse, instead of straight up excepting. -
Also, I think replacing
__name__with__qualname__would be good, since we no longer need to run on Python versions that don't support it. -
I'm torn on whether we should just move the "hooking into the
process_class" part of theis_valid_cacheimplementation to theCalcJobNode. Right now this is the only place we actually use it -- but in principle you could just define anis_valid_cache(function) attribute on acalcfunctionand it would work as expected.
Tagging @sphuber for comment.