flux-core icon indicating copy to clipboard operation
flux-core copied to clipboard

libsubprocess: support FLUX_SUBPROCESS_FLAGS_LOCAL_UNBUF

Open chu11 opened this issue 9 months ago • 1 comments

Per discussion in #5919, some notes

  • built on top of my cleanups in #5974
  • should the use of this flag in job-exec be a different PR?
  • in job-exec subprocess flags are passed in via the API to bulk-exec, so it felt wrong to just assume that flag is always set. So that led to adding a new flux_subprocess_get_flags() function and calling an appropriate "read" function depending on flags. This could be done differently of course, perhaps the flags aren't passed in from the API and are internalized within bulk-exec?
  • haven't tested to ensure that the performance profile changes as expected, dunno if we have a simple reproducer laying around?

chu11 avatar May 16 '24 21:05 chu11

re-pushed given the comments above, did an additional profiling run and it showed some smalll incremental improvements.

chu11 avatar May 21 '24 21:05 chu11

re-pushed w/ tweaks per comments above. Only notable thing is kept EPERM vs EINVAL for the "can't use these functions" error case.

chu11 avatar May 29 '24 15:05 chu11

Sounds good. I'll get this on my test system that is configured for sdexec and see how it goes.

One concern is the lack of output buffering in sdexec, which in combination with this flag means no buffering at all. I'll try it and we should also audit job-exec with that in mind to see if we can spot any problem areas. We may need to add output buffering to sdexec before switching job-exec over if there are problems.

garlick avatar May 29 '24 19:05 garlick

The shell barrier protocol is fine since I believe it's the only user of stdout, and the protocol consists of the shell sending a line of output and then job exec sending one line back. There is not much opportunity for buffering (or lack thereof) to screw that up.

@grondo suggested I print something from a userrc to check how job-exec handles shell output on stderr.

-- foo.lua
io.stderr:write("foo bar\n")
io.stderr:write("foo bar\n")
io.stderr:write("foo bar\n")
io.stderr:write("foo bar\n")

With master

$ flux run -o userrc=foo.lua true
0.040s: flux-shell[0]: stderr: foo bar
0.040s: flux-shell[0]: stderr: foo bar
0.040s: flux-shell[0]: stderr: foo bar
0.040s: flux-shell[0]: stderr: foo bar
$ flux job eventlog -H -p exec $(flux job last) | grep log
[  +0.017012] log component="flux-shell" stream="stderr" rank="0" data="foo bar\n"
[  +0.017024] log component="flux-shell" stream="stderr" rank="0" data="foo bar\n"
[  +0.017031] log component="flux-shell" stream="stderr" rank="0" data="foo bar\n"
[  +0.017038] log component="flux-shell" stream="stderr" rank="0" data="foo bar\n"

With this PR there is evidence that lines are not being buffered. Surprisingly, this happens with and without sdexec. I thought we had remote line buffering with the normal subprocess server.

$ flux run -o userrc=foo true
0.064s: flux-shell[0]: stderr: foo bar
foo bar
foo bar
foo bar
$ flux job eventlog -H -p exec $(flux job last) | grep log
[  +0.033395] log component="flux-shell" stream="stderr" rank="0" data="foo bar\nfoo bar\nfoo bar\nfoo bar\n"

garlick avatar May 29 '24 21:05 garlick

Oh! Even though the remote subprocess is line buffered (the default), we are using flux_subprocess_read() in the server to consume data from the buffer and put it into messages. We need to use the line oriented read functions so that we are sending only one line per message or else the benefits of line buffering are lost. I guess the server output callback needs to check the stream's LINE_BUFFER option setting and then use the appropriate read function.

garlick avatar May 29 '24 22:05 garlick

This seems to fix that problem.

diff --git a/src/common/libsubprocess/server.c b/src/common/libsubprocess/server.c
index fca01444e..139923f23 100644
--- a/src/common/libsubprocess/server.c
+++ b/src/common/libsubprocess/server.c
@@ -246,7 +246,10 @@ static void proc_output_cb (flux_subprocess_t *p, const char *stream)
     const char *buf;
     int len;
 
-    if ((len = flux_subprocess_read (p, stream, &buf)) < 0) {
+    len = flux_subprocess_getline (p, stream, &buf);
+    if (len < 0 && errno == EPERM) // not line buffered
+        len = flux_subprocess_read (p, stream, &buf);
+    if (len < 0) {
         llog_error (s,
                     "error reading from subprocess stream %s: %s",
                     stream,

garlick avatar May 29 '24 22:05 garlick

This seems to fix that problem.

Hmmmm. There's a part of me wondering if I originally messed up and had intended to NOT line buffer on the server side, i.e. all line buffer handling occurs on the client side.

But now that we have LOCAL_UNBUF, perhaps that no longer matters and the approach you have is what needs to be done.

chu11 avatar May 29 '24 23:05 chu11

I guess it's a trade off. The client has turned out to be pretty heavy with the buffering on that end. OTOH if we buffer on the server end then we can end up with more message framing overhead when there are bursts of messages. But since it's a common idiom to have many clients in one place spread out across many servers, moving the buffering to the server side probably makes sense.

Should we add the server side change in this PR and also maybe a test to show that we get one callback per line in UNBUF mode with the libsubprocess server? Then we could follow up with a PR for sdexec to add some buffering to it. I can look at that.

garlick avatar May 30 '24 00:05 garlick

re-pushed, adding commit to update server side to send each line buffered chunk and added a multiline test.

chu11 avatar May 30 '24 14:05 chu11

hmmm alpine builder failed

Error: not ok 13 - job-exec: can specify default-shell on cmdline
Error: not ok 16 - job-exec: update default shell via config reload
Error: not ok 17 - job-exec: cmdline default shell takes priority
Error: not ok 18 - job-exec: can specify imp path on cmdline
Error: not ok 21 - job-exec: update imp path via config reload
Error: not ok 22 - job-exec: cmdline imp path takes priority
Error: not ok 23 - job-exec: can specify exec service on cmdline
Error: not ok 25 - job-exec: update exec service via config reload
Error: not ok 26 - job-exec: cmdline exec service takes priority
Error: not ok 28 - job-exec: update exec service override via config reload
Error: not ok 32 - job-exec: update sdexec properties via config reload
Error: ERROR: t2403-job-exec-conf.t - exited with status 1

these are all job-exec reload module tests. Wondering if the recent additions kill-timeout, etc. have issues when the module is reloaded. on alpine, the globals may not be reset.

Here's one troubling failure

  flux-module: job-exec.stats-get: Out of memory
  not ok 13 - job-exec: can specify default-shell on cmdline

I'll restart the builder, this obviously passed on other runs allowing this to be merged.

Edit: failed again same way

chu11 avatar May 30 '24 22:05 chu11

without any better ideas, re-pushed with an extra commit that re-initializes the new signaling globals. I'm guessing my changes to job-exec tickled things enough to make things fail.

chu11 avatar May 30 '24 23:05 chu11

Ah, yeah, that makes sense. Probably we were only lucky to get away with it previously.

grondo avatar May 30 '24 23:05 grondo

oh whoops, i should have removed merge-when-passing until that new commit was reviewed .... hopefully it's ok :-) here it is

https://github.com/flux-framework/flux-core/pull/5975/commits/a35f615a06693c23a838adf804e93f993e32faa1

chu11 avatar May 30 '24 23:05 chu11

Well that change overwrites the previous setting on any configuration reload with the defaults. However, probably not a big deal since these are mostly used only for testing at this point...

grondo avatar May 31 '24 00:05 grondo

Well that change overwrites the previous setting on any configuration reload with the defaults. However, probably not a big deal since these are mostly used only for testing at this point...

Gah ... that's right, should have passed the saved argc/argv into it like we do with the exec_config ... I'll do a fix.

    if (job_exec_set_config_globals (h, conf, 0, NULL, &err) < 0)
        goto error;

    while ((impl = implementations[i]) && impl->name) {
        if (impl->config) {
            if ((*impl->config) (h, conf, ctx->argc, ctx->argv, &err) < 0)
                goto error;
        }
        i++;
    }

chu11 avatar May 31 '24 03:05 chu11

Codecov Report

Attention: Patch coverage is 76.11940% with 16 lines in your changes missing coverage. Please review.

Project coverage is 83.26%. Comparing base (9e84f0f) to head (a35f615). Report is 500 commits behind head on master.

Files with missing lines Patch % Lines
src/common/libsubprocess/remote.c 62.50% 15 Missing :warning:
src/modules/job-exec/bulk-exec.c 66.66% 1 Missing :warning:
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5975   +/-   ##
=======================================
  Coverage   83.26%   83.26%           
=======================================
  Files         518      518           
  Lines       83468    83513   +45     
=======================================
+ Hits        69502    69541   +39     
- Misses      13966    13972    +6     
Files with missing lines Coverage Δ
src/common/libsubprocess/server.c 78.86% <100.00%> (+0.19%) :arrow_up:
src/common/libsubprocess/subprocess.c 89.56% <100.00%> (+0.63%) :arrow_up:
src/modules/job-exec/job-exec.c 76.99% <100.00%> (+0.09%) :arrow_up:
src/modules/job-exec/bulk-exec.c 77.86% <66.66%> (ø)
src/common/libsubprocess/remote.c 76.62% <62.50%> (-0.13%) :arrow_down:

... and 12 files with indirect coverage changes

codecov[bot] avatar Aug 31 '24 14:08 codecov[bot]