Automatic restarting of tasks that are in the failed state - janitor

Open shots47s opened this issue 5 years ago • 0 comments

Now that we are running large datasets all at once, there is a greater chance of a job failing for reasons unrelated to CBRAIN (e.g. machine failure). It would be good to give users the option of having their tasks restarted automatically after a failure.

Questions:

  1. Which failed state(s) should this depend on?
  2. Should we let users set the number of restart attempts, or should we hard-code it?
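
To make the two questions concrete, here is a minimal sketch of what a janitor pass could look like. It is written in Python purely for illustration and is not CBRAIN code: the `Task` fields, the status strings, the `hard_cap` parameter, and both function names are assumptions. It combines both options from question 2, with a per-task limit set by the user that is bounded by a hard-coded cap.

```python
from dataclasses import dataclass

# Hypothetical status names, used for illustration only.
# Which failed states qualify is exactly what question 1 asks.
RESTARTABLE_STATUSES = {"Failed On Cluster"}


@dataclass
class Task:
    id: int
    status: str
    restart_count: int = 0  # restarts already attempted
    max_restarts: int = 0   # user-chosen limit; 0 means the user did not opt in


def should_restart(task: Task, hard_cap: int = 3) -> bool:
    """Return True if the task is in a restartable state and has attempts left."""
    if task.status not in RESTARTABLE_STATUSES:
        return False
    # The user's limit is honoured, but never beyond the hard-coded cap.
    limit = min(task.max_restarts, hard_cap)
    return task.restart_count < limit


def janitor_pass(tasks: list[Task], hard_cap: int = 3) -> list[int]:
    """Mark eligible failed tasks for restart and return their ids."""
    restarted = []
    for task in tasks:
        if should_restart(task, hard_cap):
            task.restart_count += 1
            task.status = "Restarting"  # placeholder for the real restart action
            restarted.append(task.id)
    return restarted
```

With this shape, a task that keeps failing stops being restarted once `restart_count` reaches the smaller of the user's limit and the cap, so a systematically broken job cannot loop forever.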

shots47s, Aug 13 '19 18:08