Mark Grondona

517 comments by Mark Grondona

The job-exec module can already send notifications to the rank 0 shell, so maybe it could send an RPC when the critical-ranks set is modified? The job shell could add a...

Unless I'm missing something, the starting value for critical-ranks is the set of all ranks. This is only updated if the job manually issues the critical-ranks RPC to the job-exec...

Hm, I thought of a counter-example here that suggests ignoring exit-timeout for non-critical ranks should be opt-in only. If a batch script runs a series of full-size jobs, and a...

Now that #6652 has been merged, the point in the last comment above has been addressed, so maybe now we can make `-o exit-timeout=none` the default in `flux alloc` and...

Example of housekeeping errors:
```
job-manager.err[0]: housekeeping: tuolumneXXX (rank XXX) fALfUWKpRdZ: No route to host
job-manager.err[0]: housekeeping: tuolumneYYY (rank YYY) fALfUW3WZXm: No route to host
job-manager.err[0]: housekeeping: tuolumneZZZ (rank ZZZ)...
```

> The housekeeping errors are probably to be expected and shouldn't be related to the running job count since housekeeping starts when the job transitions to INACTIVE.

Ok, I only...

> hypothetically flux job attach just stops outputting after N lines and outputs to the user "too many lines, stopping output" or whatever.

But due to what @garlick said above:...

I like @garlick's idea of limiting output data in the shell output plugin. RFC 24 even has a [Redirect Event](https://flux-framework.readthedocs.io/projects/flux-rfc/en/latest/spec_24.html#redirect-event), so the shell could potentially post a redirect event and...

I had one other idea here that might be simpler than some of the lower level KVS changes. What if we allowed eventlog rotation, e.g. for standard io once an...

Eh, like I said, the naming isn't critical. By default the Flux job shell will load `*.so` from the plugin search path, so either `fluxspindle.so` or `libfluxspindle.so` would work.
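To illustrate why either name works, here is a minimal sketch of the glob match only. It is not the job shell's actual loading code: the helper `matches_plugin_glob` and the candidate file list are hypothetical, and the only thing taken from the comment is the `*.so` pattern.

```python
from fnmatch import fnmatchcase

# Hypothetical helper: models only the "*.so" filename match described
# in the comment, not how the Flux job shell really loads plugins.
def matches_plugin_glob(filename, pattern="*.so"):
    return fnmatchcase(filename, pattern)

# Hypothetical directory contents, for illustration.
candidates = ["fluxspindle.so", "libfluxspindle.so", "fluxspindle.lua"]

# Both .so names match the pattern; the "lib" prefix makes no difference.
matched = [name for name in candidates if matches_plugin_glob(name)]
print(matched)
```

The pattern constrains only the suffix, so the choice of prefix is a naming convention and not a loading requirement.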