Mark Grondona
> the implementation gets the URI of the remote instance and tries to communicate with it, which requires ssh if I remember correctly?) Yes, currently the remote URI for jobs...
If this requires a call to `unshare()` then we may have to implement this as an IMP "plugin" - a facility that unfortunately does not yet exist. Do you know...
While transitioning to `__func__`, perhaps we could also reduce many uses of this predefined macro in debug and error messages in favor of explicit strings. Many times I've tried to...
Users are seeing this error again in a slightly different scenario. Example: 1. user submits batch job with 30m timelimit 2. job is allocated an R with expiration of starttime...
Shouldn't the watch be canceled by the disconnect when the shell exits and closes its handle?
Ah, ok. That makes sense now :-) I wonder if there's a way to fix the disconnect case. There's probably lots of cases of that lingering around...
Note that in this particular case, we had to kill off `flux module remove sched-fluxion-qmanager` which was hanging due to the leaked alloc requests issue (can't find the issue right...
@milroy - Is this branch ready for any testing? I've got @garlick's housekeeping PR (flux-framework/flux-core#5818) installed on a test cluster, but with sched-simple for now since it supports partial release....
There is no option in `wait-event` to wait on multiple jobs (a good idea though), but it would be trivial to write a Python script to do it.
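As a rough illustration of the "wait on several jobs, return on the first" idea, here is a minimal sketch using only the Python standard library. It is not the script referred to in this thread. The helper `wait_for_first` and the `wait_one` callable are hypothetical names: `wait_one(jobid)` is assumed to block until the event of interest has been seen for that job and to raise if it never occurs. In a Flux setting it might wrap something like `flux.job.event_wait`, but that wiring is an assumption and is not shown here.

```python
import concurrent.futures as cf
import time


def wait_for_first(jobids, wait_one, timeout=None):
    """Return the first jobid whose wait_one(jobid) call finishes without raising.

    wait_one is a hypothetical, caller-supplied blocking function (an assumption
    of this sketch). Returns None if every call raised or jobids is empty.
    Raises concurrent.futures.TimeoutError if timeout expires first.
    """
    # One worker thread per job so all waits run concurrently.
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(jobids)))
    futures = {pool.submit(wait_one, jobid): jobid for jobid in jobids}
    try:
        for fut in cf.as_completed(futures, timeout=timeout):
            if fut.exception() is None:
                return futures[fut]
        return None
    finally:
        # Do not block on the remaining waits; drop any that have not started.
        pool.shutdown(wait=False, cancel_futures=True)


if __name__ == "__main__":
    # Stand-in for a real event wait: each fake job "starts" after a delay.
    delays = {"job-a": 0.3, "job-b": 0.05, "job-c": 0.2}

    def fake_wait(jobid):
        time.sleep(delays[jobid])

    print(wait_for_first(list(delays), fake_wait))
```

Threads keep this sketch short. A script built on an event loop with asynchronous watches would avoid one thread per job, at the cost of more setup. Note that waits still in progress are not interrupted when the first one succeeds, and `cancel_futures` requires Python 3.9 or later.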
Since this was also asked in Slack, here's a small script to wait for the `start` event for any number of jobs and exit with success on the first: ```python...