Jim Garlick
Kind of a gray area, but generally file system archive programs like tar do not do that unless you specifically request it, and if backing up a system as root...
Is slurm enforcing a 2 hour time limit? Perhaps we are handling a signal poorly? The mcast error is not fatal, but it may indicate that rank 0 lost contact...
Now I'm wondering if that "resource temporarily unavailable" is from an older libzmq. Please post output of `flux version` when you have a chance.
This doesn't help you right now, but to be clear, this is the current intended behavior. The exec system handles "a node crashed that was allocated to a job" by...
No plans in the near term, although our priorities will no doubt shift around as we get feedback during rollout.
(b) is already possible by setting `--broker-opts="-Stbon.fanout=N"` on a batch job. (c) is IMHO too challenging to work on in the El Capitan time frame, given everything else that is...
Would `nohup sudo -u \#${FLUX_JOB_USERID} ...` get the job done?
A couple of quick answers: If the flux reactor is the inner reactor loop, the only straightforward way to embed it, IMHO, is to register a periodic timer in the...
Meh, I think this is probably enough on this. If we have issues down the road we can deal with it then?
Oops, broken && chain! I force pushed a fix and I'll set MWP. Thanks!