Jim Garlick
@chu11, FYI I added this issue to our milestone for next week's release. LMK if that's not feasible. Seems like at least we could get the performance improvements in.
> I don't think this is going to be an optimization option. Darn, I didn't notice that either. Well there are still the other optimizations and the mystery of the...
> Could we add a new function to disable the existing callbacks and register a new one that is passed output as soon as it is ready, including the data...
That reminds me - I have a subprocess cleanup branch in which I consolidated the stdout, stderr, and channel output callback ops into one (since the stream name is passed...
`flux queue stop`, which ran earlier in the cleanup sequence, should have ensured that no alloc requests are pending to the scheduler. Although dmesg shows that it ran successfully: ```...
This also shows the problem: ``` $ flux queue status -v batch: Job submission is enabled batch: Scheduling is stopped debug: Job submission is enabled debug: Scheduling is stopped 0...
Closing. The problem as stated in this issue's description was resolved by flux-framework/flux-sched#1209.
For background, the protocol was simplified in #5004, and that is when the flags were added as a replacement for three explicit booleans that told whether some of the callbacks...
We do set `RemainAfterExit` on all units now: https://github.com/flux-framework/flux-core/blob/master/src/common/libsdexec/start.c#L302 As I recall, successful units do not remain but failed ones do, until `ResetFailedUnit` is called: https://github.com/flux-framework/flux-core/blob/master/src/modules/sdexec/sdexec.c#L283
Just wanted to be sure we're all on the same page since some comments indicate otherwise: the subprocess _server_ in play here is the one in the `sdexec` module (indicated...