Improve plumpy integration in `aiida-core`
This is a collection of issues regarding plumpy that are not well fleshed out, but I think they should at least be mentioned somewhere to keep track of them. I will edit this issue to be more precise in language once I start working on them.
plumpy WorkChain not correctly used in aiida-core
There is plumpy.workchains.WorkChain, but it is not really used in aiida.engine.WorkChain. It is only used as a type hint in the Stepper classes, therefore the plumpy WorkChain should be an Interface or Protocol that the aiida WorkChain implements.
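As a rough sketch of that direction (assumed names, not the existing plumpy/aiida-core classes), the part of the contract the steppers actually need could be expressed as a `typing.Protocol`:

```python
from typing import Any, Protocol


class WorkChainLike(Protocol):
    """Illustrative subset of the contract; the method name is a placeholder."""

    def run_step(self) -> Any: ...


def advance(workchain: WorkChainLike) -> Any:
    # A stepper annotated against the protocol accepts any conforming object,
    # e.g. aiida.engine.WorkChain, without inheriting from plumpy's WorkChain.
    return workchain.run_step()


class DummyWorkChain:
    def run_step(self) -> str:
        return "done"


print(advance(DummyWorkChain()))  # structural typing: no explicit inheritance needed
```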
Issues with nest-asyncio
By switching from nest-asyncio to greenlet (see https://github.com/unkcpz/plumpy/pull/39) we can solve the test_disconnect timeout problem (see aiida-core tests/manage/test_manager.py::test_disconnect) and the Python recursion limit issue, and most importantly it allows us to use calcfunctions in a non-blocking manner.
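For context, here is a minimal, self-contained sketch (illustrative names, not aiida-core code) of the nested event loop problem that nest-asyncio patches around: synchronous code that tries to drive the event loop while already running inside that loop fails.

```python
import asyncio


async def inner():
    return 1


def sync_wrapper():
    # Without nest_asyncio patching asyncio, driving the loop from code that is
    # already executing inside that loop raises a RuntimeError. This is why a
    # synchronous call site currently has to block instead of yielding to the loop.
    return asyncio.get_event_loop().run_until_complete(inner())


async def outer():
    return sync_wrapper()


try:
    asyncio.run(outer())
except RuntimeError as exc:
    print(exc)  # "This event loop is already running" (or similar)
```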
Improved save and load interface
Refactoring save and load to work better with pickled data (https://github.com/unkcpz/plumpy/pull/33) would let us support some of workgraph's pickling functionality more natively in aiida-core.
> plumpy WorkChain not correctly used
Why do we have plumpy.workchains.WorkChain if it is not really used in aiida.engine.WorkChain? It is only used as a type hint in the Stepper classes, so it seems to me that the plumpy WorkChain should be an Interface or Protocol that the aiida WorkChain implements.
> Refactoring save and load to work better with pickled data (https://github.com/unkcpz/plumpy/pull/33) […]
I guess there were historical security considerations behind using YAML for serializing/deserializing what is stored in the checkpoint (pinging @giovannipizzi @sphuber for confirmation). Since the move to pyyaml 5.4 we use UnsafeLoader anyway to support all kinds of user data, so from a security point of view it is similarly not very secure. But since the checkpoint is cleaned up when the node is sealed it is "fine". It was mentioned in https://github.com/aiidateam/aiida-core/issues/3709#issuecomment-775873136 that we should strip the checkpoint when importing data from an archive, but I believe that is not yet implemented in aiida-core?
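As a concrete illustration (not aiida-core code, do not run on untrusted input) of why UnsafeLoader puts YAML roughly on par with pickle security-wise: a `!!python/object/apply` tag in a checkpoint-like document can invoke arbitrary callables on load.

```python
import yaml

# UnsafeLoader (pyyaml >= 5.1) resolves the python-specific tags, so loading an
# untrusted document can call arbitrary Python callables, much like unpickling.
doc = "!!python/object/apply:os.getcwd []"
print(yaml.load(doc, Loader=yaml.UnsafeLoader))  # prints the current directory
```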
When I tried to move the checkpoint storage from YAML to pickle, I noticed some differences:
- Pickle also stores some environment information, so a full benchmark is needed to see how much the size increases when using pickle.
- The YAML format stores a pointer to where the class or function can be imported, plus the parameters needed to recreate it later. This means code changes to workchains/calcjobs take effect after restarting the daemon, whereas if the objects are pickled, restarting the daemon will not help. Actually this is quite good, since once a workchain is launched it is natural to expect its behavior to be unchanged (I am not so sure about this part, correct me if I am wrong).
After all, pickling objects like the process itself and run_fn makes it possible to run a process without adding it to the Python path, which is one of the powerful features demonstrated by workgraph. It deserves native support in aiida-core, using the same mechanism to store the checkpoint.
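To make the difference concrete, here is a minimal sketch (illustrative names, not the plumpy/aiida-core serialization API) contrasting a reference-style checkpoint entry with a pickled one:

```python
import importlib
import pickle


def do_step():
    """Stand-in for a user-defined workchain step."""
    return "result"


# YAML-style checkpoint: only an importable reference is stored, so the code is
# re-imported on load and edits to the module take effect after a daemon restart.
ref = {"module": do_step.__module__, "name": do_step.__qualname__}
reloaded = getattr(importlib.import_module(ref["module"]), ref["name"])

# Pickle-style checkpoint: the object graph is serialized directly. Plain pickle
# still stores module-level functions by reference, but by-value serializers
# (e.g. cloudpickle) embed the code itself, which is what enables restoring a
# process that is not importable from the Python path.
blob = pickle.dumps(do_step)
restored = pickle.loads(blob)
assert restored() == "result"
```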
> By switching from nest-asyncio to greenlet (see https://github.com/unkcpz/plumpy/pull/39)
I also want to mention that the idea here is partially inspired by https://github.com/aiidateam/aiida-core/issues/4876#issuecomment-2242728244, which uses greenlet to run synchronous code inside asynchronous code:
```diff
- result = process.execute()
+ result = await_or_new_loop(coro_executor(process.execute))
```
What we actually need is the reverse: running the asynchronous code inside the synchronous code, i.e. inside the execute() call. That would fully solve the nested event loop problem and bridge the two colors of functions.
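A minimal sketch of that bridging, assuming the greenlet package (the helper names `await_` and `run_sync_in_async` are made up here, not the plumpy API); the same trick underlies SQLAlchemy's asyncio support:

```python
import asyncio
import greenlet


def await_(awaitable):
    """Call from synchronous code running under run_sync_in_async(): hand the
    awaitable to the async side and resume with its result once it is done."""
    current = greenlet.getcurrent()
    # The parent greenlet is suspended inside the running event loop.
    return current.parent.switch(awaitable)


async def run_sync_in_async(fn, *args):
    """Run synchronous-looking code on a child greenlet; every await_() inside
    it switches back here so the awaitable is awaited on the real event loop."""
    child = greenlet.greenlet(fn)
    result = child.switch(*args)
    while asyncio.iscoroutine(result) or isinstance(result, asyncio.Future):
        value = await result           # awaited on the actual loop, no nesting
        result = child.switch(value)   # resume the synchronous code
    return result


async def fetch_answer():
    await asyncio.sleep(0.01)
    return 42


def sync_step():
    # Looks blocking, but transparently awaits a coroutine via the bridge.
    return await_(fetch_answer()) + 1


print(asyncio.run(run_sync_in_async(sync_step)))  # -> 43
```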
> It was mentioned in https://github.com/aiidateam/aiida-core/issues/3709#issuecomment-775873136 that we should strip the checkpoint when importing data from an archive, but I believe that is not yet implemented in aiida-core?
At the very least there is an implicit guarantee, because only terminated processes can be exported and when a process terminates, its checkpoint is removed. But this is just a soft guarantee that is easily circumvented. That said, I vaguely remember that there might be an explicit check in the export code to drop the checkpoint attribute when exporting. Should be easy to find.
> At the very least there is an implicit guarantee, because only terminated processes can be exported
That is true, but if someone really wants to put malicious code in the data node, it is easy to get around that.
> I vaguely remember that there might be an explicit check in the export code to drop the checkpoint attribute when exporting.
Right! I thought it was only implemented in the export code and not for import, but actually it is already there for imports: https://github.com/aiidateam/aiida-core/blob/d2fbf214ad2fcfe5e39f9ebe2982f05557196397/src/aiida/tools/archive/imports.py#L525-L526
Hi @unkcpz, @agoscinski, @sphuber!
Hope you are doing well!
I would like to work on this one. The plan in my head is to refactor plumpy.workchains.WorkChain to be an interface that aiida.engine.WorkChain implements, switch to greenlet for better async handling, and refactor the save/load functionality for pickled data. Let me know if I have proposed anything wrong; I'd be open to discussing it. Could you please assign this issue to me?
Thank you!