aiida-core
Process is only stored after it fires the `ProcessListener.on_process_finished` event, resulting in an unfinished process in the next step
Bug report from a question on Discourse (@t-reents): https://aiida.discourse.group/t/workchain-continues-before-finishing-the-pervious-step/472. Also observed by @superstar54 in the workgraph development.
Describe the bug
Already well described here https://aiida.discourse.group/t/workchain-continues-before-finishing-the-pervious-step/472
Here are my results from investigating it: the workchain starts the next step before the finished process is stored in the database, and thus loads the process before its process state has been updated to Finished, resulting in the nonzero exit code (`is_finished_ok` returns False because the stored process state is still running).
So on the plumpy side the event is fired before the process is stored in the database. Below is a simplified backtrace of the process that fires the event announcing that it has finished (I guess by broadcasting), and thereby continues the next process before its own `process_state` has been updated:
In the `aiida.engine.processes.process.Process.on_entered` function
https://github.com/aiidateam/aiida-core/blob/c7c289d3892bf76894714f53f58b7ce5b0761178/src/aiida/engine/processes/process.py#L422
the parent method in plumpy is invoked
https://github.com/aiidateam/plumpy/blob/b3837fc9dbf7dc5aca0785e93b94cf5b89d04a91/src/plumpy/processes.py#L701
Much later, this invokes
https://github.com/aiidateam/plumpy/blob/b3837fc9dbf7dc5aca0785e93b94cf5b89d04a91/src/plumpy/processes.py#L837-L840
self._fire_event(ProcessListener.on_process_finished, self.future().result())
that broadcasts the event to all processes, resulting in the next process being continued. The update of the process state to Finished in the database (the object's process state has been updated, but not in the database!) happens later in the `aiida.engine.processes.process.Process.on_entered` function
https://github.com/aiidateam/aiida-core/blob/c7c289d3892bf76894714f53f58b7ce5b0761178/src/aiida/engine/processes/process.py#L442
I will try to switch the order of these two steps (firing the event and storing the state) and see what breaks.
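To make the ordering problem concrete, here is a minimal toy model of it. It is only a sketch: `FakeDatabase`, `ToyProcess`, `store_first`, and `run` are hypothetical names made up for illustration and are not part of aiida-core or plumpy. It assumes a listener that reads the stored state as soon as the finished event fires, which is roughly what the parent workchain does when it continues.

```python
# Toy model only: these classes are hypothetical stand-ins,
# not aiida-core or plumpy code.


class FakeDatabase:
    """Stands in for the stored process state of a node."""

    def __init__(self):
        self.process_state = "running"


class ToyProcess:
    def __init__(self, database, listener, store_first=False):
        self.database = database
        self.listener = listener        # called when the finished event fires
        self.store_first = store_first  # hypothetical switch for the two orderings
        self.state = "running"          # in-memory state of the process object

    def _fire_finished_event(self):
        # Plays the role of _fire_event(ProcessListener.on_process_finished, ...)
        self.listener(self.database)

    def _store_state(self):
        # Plays the role of writing the process state to the database
        self.database.process_state = self.state

    def on_entered_finished(self):
        self.state = "finished"  # the object is updated first
        if self.store_first:
            self._store_state()
            self._fire_finished_event()
        else:
            # Ordering described in this report: event first, store later
            self._fire_finished_event()
            self._store_state()


def run(store_first):
    """Return (state seen by the listener, state stored at the end)."""
    seen = []
    database = FakeDatabase()
    process = ToyProcess(database, lambda db: seen.append(db.process_state), store_first)
    process.on_entered_finished()
    return seen[0], database.process_state


print(run(store_first=False))  # listener reads the stale state
print(run(store_first=True))   # listener reads the updated state
```

With the event fired first, the listener still sees the stale stored state even though the process object itself is already finished. Storing before firing removes the stale read in this toy model; whether the same swap is safe in the real engine is exactly what needs to be tested.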
Environment
I think it happens on all AiiDA versions (I tried the newest, 2.6.2, and 2.2) and all backends, since this looks like a bug in the engine.
Supplementary
Backtrace log up to the `_fire_event` call
backtrace.log