flux-core
flux job attach: support option to not output whole output history
I began looking at #4869 briefly and realized it can be annoying that flux job attach (and the tool proposed in #4869) outputs ALL stdout/stderr when a job is being watched.
There should be an option to only output new information. Like a user outputs some status to stdout and doesn't need to see the last 10 hours of status, only the new stuff coming in. I'm thinking something equivalent to tail -f.
I tried to add a basic flag to flux_job_event_watch() to deal with this but realized that wouldn't work. We really need something with knowledge of the RFC 24 standard I/O format, so it knows that the header/redirect/etc. type information should be forwarded but not the data.
Luckily, since eventlogs can never be zero length per RFC 18, that eliminates some corner case handling, and some trickery in job-info may work. Otherwise we need to get the eventlog's initial length. That could be a costly operation; perhaps a flux_kvs_lookup_length() or flux_kvs_lookup_stats() kind of function would be less costly.
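To make the "forward header/redirect but not data" idea concrete, here is a minimal Python sketch, not flux-core code. It assumes the eventlog is newline-delimited JSON with a "name" key per event (as in RFC 18) and that the metadata events are named "header" and "redirect" (as in RFC 24). The function name and the `initial_length` parameter are hypothetical; how that length would be obtained is exactly the open question above.

```python
import json

# Assumption: event names that carry stream metadata rather than job output,
# following the RFC 24 "header" and "redirect" events.
METADATA_EVENTS = {"header", "redirect"}


def skip_backlog_data(eventlog_text, initial_length):
    """Return the events to forward to the user.

    Events with index < initial_length form the backlog: only metadata
    events are kept from it. Everything after the backlog is kept as-is.
    """
    lines = [line for line in eventlog_text.splitlines() if line.strip()]
    forwarded = []
    for index, line in enumerate(lines):
        event = json.loads(line)
        in_backlog = index < initial_length
        if in_backlog and event["name"] not in METADATA_EVENTS:
            continue  # old output: drop it
        forwarded.append(event)
    return forwarded
```

With `initial_length=0` nothing is treated as backlog, so this degrades to the current "output everything" behavior.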
I don't think this is high priority. Specifically in #4869 we do want all the job output, since that is the point of the use case (to not lose any job output or errors when the instance ends)
Understood. I guess when the idea of a potential flux wait tool was mentioned in #4869, it did suggest to me that a user may want to wait for a job and see its output in a "general" way.
I'm not sure I follow what is meant by seeing output in a general way, sorry! I think without a use case, it would be difficult to design such a thing.
Oh, do you mean give me output starting from now until the job finishes? That might be neat, but not sure there is an actual need for it?
Yeah this is what I was thinking. Like someone submitted a job, walked away. Then it started later and they are like "oh, I wonder what the status of it is right now", so they would want to do flux wait --watch --follow on it.
Dunno if there's a huge need, just something I was thinking of. B/c I'm imagining the hours and hours of stdout that someone may not want to output.
Yeah, maybe at first it could just gobble up lines without printing until a timestamp. (kind of like I assume tail does when reading stdin?)
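A rough Python sketch of the "gobble until a timestamp" approach, under a few assumptions: events have already been decoded into dicts with "timestamp" and "name" keys (as in RFC 18), output is carried by events named "data" (as in RFC 24), and the function name `follow_from` is made up for illustration. Non-data events such as the header are always passed through so the consumer can still interpret the stream.

```python
def follow_from(events, start_time):
    """Yield events, silently dropping data events older than start_time.

    `events` is any iterable of decoded eventlog entries (dicts).
    `start_time` is a timestamp in the same units as event["timestamp"].
    """
    for event in events:
        is_old_data = event["name"] == "data" and event["timestamp"] < start_time
        if not is_old_data:
            yield event
```

Note that all of the old data is still sent to the client and only discarded there, which is why this would be a first-round solution rather than the long-term one.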
That is a simple and very good idea on how to solve it for a round 1 implementation!! I initially went down the path of flux wait --watch --follow TONS OF JOB IDS and thinking it would be good to eliminate all of that data being sent.
Yes, long term that would be good. However, given there isn't a requirement at this time for a solution to this issue, and in fact in most cases we do not want to drop any output, that might be premature optimization for now.
Now we do have a couple user requests to add this option (e.g. #6843, which I'll close in favor of this issue). So it requires a priority bump! Maybe we can at first add the timestamp based support and later replace that with something using a KVS feature to only send the latest data?
Per some offline discussion with @grondo, we'd like to have some tail-like behavior. For example: flux job attach --TBD-option-name JOBID might output the last 10 lines on an inactive job. On an active job, it would output the last 10 lines and then any new lines after that.
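The intended semantics could look roughly like the Python sketch below. It works on already-decoded output lines rather than on the eventlog, and the names `tail_then_follow`, `history`, and `live` are hypothetical. For an inactive job `live` is simply empty.

```python
from collections import deque
from itertools import chain


def tail_then_follow(history, live=(), count=10):
    """Yield the last `count` lines of `history`, then every line from `live`.

    `history` is the output produced so far; `live` is an iterable of lines
    that arrive afterwards (empty for an inactive job).
    """
    # A bounded deque keeps only the most recent `count` items.
    last_lines = deque(history, maxlen=count)
    return chain(last_lines, live)
```

The bounded deque only keeps `count` lines in memory, but it still has to read the whole history, so it does not address the cost of transferring the data.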
Unfortunately, anything involving this would (right now) require us to get KVS treeobj data, i.e. if we were to add some mechanism to kvs-watch to start at a specific index in the valref array, we would need to know the length of the array first. Getting this treeobj data would happen outside of the job-info module and the security checks in there. That, or we update job-info to return KVS treeobjs.
So just brainstorming a bit, I thought maybe we could have an option like flux job attach --start-index-percent=90 which means "start 90% of the way through the valref array", and stream starting data from that point. --start-index-percent=0 is the same as just reading from the beginning. Internally kvs-watch can just calculate where that starting point is based on the percent number passed in.
Pros: I don't believe we have to get any KVS treeobj data to calculate starting indexes and things like that. Cons: the "tail" is going to be coarse-grained. It's totally dependent on the output from the job and how much "chunking" of output goes into blobrefs, so it could output a varying number of lines.
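For illustration, the internal calculation could be as small as the Python sketch below. This is not existing kvs-watch behavior; the function name is hypothetical, and clamping 100% to the final entry (rather than returning nothing) is an assumption made here.

```python
def start_index_from_percent(array_length, percent):
    """Map an integer percentage (0-100) to a starting index in a valref array.

    0 means "start at the beginning". The result is clamped so that at
    least the final entry is returned (an assumption of this sketch).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if array_length <= 0:
        return 0
    return min(array_length * percent // 100, array_length - 1)
```

For a valref array of 200 blobrefs, `start_index_from_percent(200, 90)` gives 180, so the last 20 blobrefs would be streamed. How many lines that amounts to depends on how the output was chunked.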