
Better recovery after running out of memory.

Open · DailyDreaming opened this issue 1 year ago · 1 comment

From Glenn H: "Say my job requests 10 GB but uses 16 GB of RAM. Slurm will rightfully evict it, but then what? I need to fix Cactus to estimate more memory, then rerun the entire workflow. I've recently hacked around this a bit by going through environment variables in Cactus, but it'd be nice to have an option to restart evicted jobs with more memory."

┆Issue is synchronized with this Jira Story ┆Issue Number: TOIL-1519

DailyDreaming, Mar 11 '24 17:03

From Adam: "This might mean more documentation/a better version of the --doubleMemory thing I think we already have."

DailyDreaming, Mar 11 '24 17:03
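
To make the request concrete, here is a minimal sketch of the "restart evicted jobs with more memory" policy described above. It is plain Python and does not use Toil, Cactus, or Slurm APIs, and it is not a description of how any existing Toil option behaves. The names `JobEvictedForMemory`, `run_with_memory_escalation`, `growth`, and `ceiling_bytes` are hypothetical, chosen only for illustration. It assumes the caller can tell a memory eviction apart from other failures and can resubmit a single job without rerunning the whole workflow.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


class JobEvictedForMemory(Exception):
    """Hypothetical signal that the scheduler killed the job for exceeding its memory request."""


def run_with_memory_escalation(
    run_job: Callable[[int], T],
    requested_bytes: int,
    max_retries: int = 3,
    growth: float = 2.0,
    ceiling_bytes: Optional[int] = None,
) -> Tuple[T, int]:
    """Run a job, retrying with a larger memory request after each memory eviction.

    run_job takes a memory request in bytes and either returns a result or
    raises JobEvictedForMemory. Returns (result, memory_request_that_succeeded).
    """
    memory = requested_bytes
    for attempt in range(max_retries + 1):
        try:
            return run_job(memory), memory
        except JobEvictedForMemory:
            # Give up once the retry budget is spent.
            if attempt == max_retries:
                raise
            # Give up if we are already at the largest request allowed.
            if ceiling_bytes is not None and memory >= ceiling_bytes:
                raise
            memory = int(memory * growth)
            if ceiling_bytes is not None:
                memory = min(memory, ceiling_bytes)
    raise AssertionError("unreachable")
```

With the numbers from the report (a 10 GB request for a job that needs 16 GB), the first attempt is evicted and the second attempt, at double the request, has enough memory. A ceiling matters in practice because a job with a genuine leak would otherwise keep growing its request until it no longer fits on any node.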