flux-core icon indicating copy to clipboard operation
flux-core copied to clipboard

job-exec: cleanup comments / files / functions of `bulk-exec` implementation

Open chu11 opened this issue 9 months ago • 5 comments

The primary bulk-exec implementation is largely handled in a set of files and functions called exec and its config in exec_config while the heavy lifting is done in bulk-exec.c. It appears that exec and exec_config were initially designed under the assumption of multiple "exec implementations", but that didn't appear to come to fruition, so the naming of files and functions is a little confusing. The split out of exec_config is from cfd961eacdfd4f069bffc3fa50b057ed4300a464 when it was used by bulk-exec and the initial sdexec prototype.

In addition, there are several comments that are sort of not super clear

Initialize common configurations for use by job-exec exec modules.` (there's only the bulk-exec one not counting testexec)

Flux job-exec configuration common code (there's only one bulk-exec using it)

(some discussion from #5913)

This could be cleaned up a bit, although I'm not entirely sure how at the moment. Minimally exec_config can be re-absorbed back into exec. And then perhaps a layer of abstraction can be removed and everything would be in bulk-exec?

chu11 avatar May 07 '24 16:05 chu11

under the assumption of multiple "exec implementations", but that didn't appear to come to fruition,

Well, there are already 2 implementations: a mock execution or testexec implementation, that doesn't actually execute job shells and is also used to implement "exec override" (which also happened to be the first implementation), and also the "exec" implementation, which actually executes job shells.

This could be cleaned up a bit, although I'm not entirely sure how at the moment. Minimally exec_config can be re-absorbed back into exec. And then perhaps a layer of abstraction can be removed and everything would be in bulk-exec?

I wouldn't suggest that. The bulk-exec.c code is a standalone bulk execution module that actually produces a bulk-exec executable (which is useful for manual testing purposes).

Maybe the problem is that the name of the implementation shouldn't be bulk-exec?

grondo avatar May 07 '24 17:05 grondo

There's also #3346 and the need to rework job-exec to support restart with running jobs. I'm wondering if we should defer minor cleanup of the prototype-y nature of the current design until we know where we're going with all that.

grondo avatar May 07 '24 17:05 grondo

There's also #3346 and the need to rework job-exec to support restart with running jobs. I'm wondering if we should defer minor cleanup of the prototype-y nature of the current design until we know where we're going with all that.

Fair enough. It was just getting confusing. Lemme clean up some of the comments mentioned in #5913 atleast.

chu11 avatar May 07 '24 17:05 chu11

Maybe the problem is that the name of the implementation shouldn't be bulk-exec?

Possibly. It is just a little weird to open up exec.c and see

struct exec_implementation bulkexec = {
    .name =     "bulk-exec",
    .config =   exec_config,
    .init =     exec_init,
    .exit =     exec_exit,
    .start =    exec_start,
    .kill =     exec_kill,
    .cancel =   exec_cancel,
    .stats =    exec_stats,
};

So there's a "why isn't this just called exec?" and a decent chunk of the callbacks just call bulk-exec directly anyways.

chu11 avatar May 07 '24 17:05 chu11

i guess we could absorb exec_config.c back into exec.c. Those wrapper functions may not be providing much nowadays. Although it's somewhat harmless to leave it split out.

chu11 avatar May 07 '24 17:05 chu11