Add resume suggestion to parallel runner

Open jmholzer opened this issue 2 years ago • 0 comments

Description

See #1795 for the original discussion and context surrounding this issue.

Currently, when Kedro is run with SequentialRunner and a Node raises an exception, Kedro suggests resuming the run from the nearest Nodes with persisted inputs. This saves the user a great deal of time that would otherwise be wasted re-running earlier Nodes whose outputs were already saved successfully. No such suggestion is made in the same situation when Kedro is run with ParallelRunner.
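To make "nearest Nodes with persisted inputs" concrete, here is a minimal sketch of that search. It does not use Kedro and is not Kedro's actual implementation. The `Pipeline` mapping, `suggest_resume_nodes`, and the node and dataset names are all hypothetical, and a dataset is assumed to be reloadable only if it is persisted and the node that produces it finished.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical pipeline description: node name -> (input datasets, output datasets).
Pipeline = Dict[str, Tuple[Set[str], Set[str]]]


def suggest_resume_nodes(pipeline: Pipeline, persisted: Set[str], done: Set[str]) -> List[str]:
    """Return the nodes closest to the failure whose inputs can all be reloaded."""
    # Map each dataset to the node that produces it (free inputs have no producer).
    producer = {out: name for name, (_, outs) in pipeline.items() for out in outs}
    start: Set[str] = set()
    seen: Set[str] = set()
    # Begin with every node that did not finish, then walk upstream as needed.
    stack = [name for name in pipeline if name not in done]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        inputs, _ = pipeline[name]
        # An input blocks a restart if it lived only in memory, or if the node
        # that should have produced it never finished.
        blockers = [
            producer[ds]
            for ds in inputs
            if ds in producer and (ds not in persisted or producer[ds] not in done)
        ]
        if blockers:
            stack.extend(blockers)  # the producers must be re-run as well
        else:
            start.add(name)  # every input is on disk, so this is a valid restart point
    return sorted(start)


# Illustrative four-node pipeline in which "train" raised an exception.
pipeline: Pipeline = {
    "load": (set(), {"raw"}),
    "clean": ({"raw"}, {"cleaned"}),
    "build_features": ({"cleaned"}, {"features"}),
    "train": ({"features"}, {"model"}),
}
done = {"load", "clean", "build_features"}

resume_when_features_saved = suggest_resume_nodes(pipeline, {"raw", "features"}, done)
resume_when_only_raw_saved = suggest_resume_nodes(pipeline, {"raw"}, done)
```

If `features` was saved, the run can restart at `train`. If only `raw` was saved, the search walks back to `clean`. The resulting names are what a user would pass to `kedro run --from-nodes`.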

This is because, when a Node raises an exception under ParallelRunner, other Nodes may still be running. The 'end state', that is, the set of Nodes that have finished running when the exception is raised, is not guaranteed, because the order in which Nodes are executed in a parallel scheme is inherently non-deterministic. Therefore, if the 'resume-suggestion' logic were applied to ParallelRunner as is, the suggestion would be inconsistent between runs, and the correct resume scenario would be suggested only sporadically.

Context

This change will suggest a resume scenario to users of ParallelRunner, saving them a great deal of time that would otherwise be wasted running Nodes unnecessarily.

Possible Implementation

Ensure that all Nodes that can still be run are run before the 'end state' is reached and the exception is raised, using (for example) joins.
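One way to picture the join-based idea is the rough sketch below. It is not a patch against ParallelRunner. `run_until_drained`, `funcs`, and `deps` are hypothetical names, and a `ThreadPoolExecutor` stands in for the runner's worker pool so that the example stays self-contained. The scheduler keeps submitting every Node whose dependencies succeeded, waits for all in-flight work, and reports failures only once nothing else can run.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Dict, Set, Tuple


def run_until_drained(
    funcs: Dict[str, Callable[[], None]],
    deps: Dict[str, Set[str]],
    max_workers: int = 2,
) -> Tuple[Set[str], Dict[str, BaseException]]:
    """Run every node that does not depend on a failed node, then report."""
    done: Set[str] = set()
    failed: Dict[str, BaseException] = {}
    todo = set(funcs)
    running = {}  # future -> node name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            # Schedule every node whose dependencies have all succeeded.
            ready = {n for n in todo if deps.get(n, set()) <= done}
            for name in sorted(ready):
                running[pool.submit(funcs[name])] = name
            todo -= ready
            if not running:
                break  # nothing in flight and nothing left that can be scheduled
            # Join on in-flight work instead of aborting at the first failure.
            finished, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in finished:
                name = running.pop(fut)
                exc = fut.exception()
                if exc is None:
                    done.add(name)
                else:
                    failed[name] = exc  # descendants stay in `todo` and never run
    return done, failed


def ok() -> None:
    pass


def boom() -> None:
    raise ValueError("node failed")


# Illustrative graph: a -> b -> d and a -> c -> e, where b fails.
funcs = {"a": ok, "b": boom, "c": ok, "d": ok, "e": ok}
deps = {"b": {"a"}, "c": {"a"}, "d": {"b"}, "e": {"c"}}
done, failed = run_until_drained(funcs, deps)
```

Because the loop drains everything that is independent of the failed Node, `done` no longer depends on which worker happened to finish first. The first entry in `failed` can then be re-raised together with a resume suggestion computed from `done`. The trade-off is that the run takes longer to report the failure.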

jmholzer, Sep 05 '22 15:09