flux-core

Run administrative epilog even if job is canceled before starting

Open jameshcorbett opened this issue 1 year ago • 1 comment

If the prolog action described in https://github.com/flux-framework/flux-coral2/issues/166 goes into production, it will make changes to the compute nodes that must be undone by a matching epilog action. However, if the job is canceled or fails before the application begins to execute, the epilog action doesn't run. A node can then be left in a bad state, with the prolog's changes never undone.
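To make the gap concrete, here is a toy model of the ordering problem. It is only a sketch: `ToyJob`, `run_job`, and the `always_run_epilog` switch are hypothetical names invented for illustration and are not part of flux-core or flux-coral2. It assumes the prolog has already run by the time the job is canceled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJob:
    """Hypothetical stand-in for a job; not a flux-core type."""
    canceled_before_start: bool = False
    log: list = field(default_factory=list)


def run_job(job: ToyJob, always_run_epilog: bool) -> list:
    """Simulate prolog -> application -> epilog for a single job."""
    # Assumption: the prolog runs first and changes node state.
    job.log.append("prolog")

    started = not job.canceled_before_start
    if started:
        job.log.append("application")

    # Behavior described in this issue: the epilog only follows an
    # application that started. Requested behavior: the epilog follows
    # any prolog that ran, so its changes are always undone.
    if started or always_run_epilog:
        job.log.append("epilog")
    return job.log
```

With `always_run_epilog=False`, a job canceled before starting ends with only `prolog` in its log, which is the bad node state described above. With `always_run_epilog=True`, the same job ends with `prolog` followed by `epilog`.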

jameshcorbett · Jun 25 '24 01:06