flux-core
kvs-watch/job-info: support a sentinel value indicating "end of initial values"
It would be convenient if kvs-watch could inform watchers that all presently available data has already been streamed to them (more data may appear in the future, but we don't know that yet).
This could be useful for #5035. We could cache the last N lines of data until we reach the current "end of initial value" and then could output those N lines via a "tail" option. Because we don't have a sentinel or flag indicating "we're at the end of the initial value", we can't implement this algorithm.
How could this be done?
1. A new RPC parameter on responses indicating this condition.
   - Pro: makes the most sense; a `flux_kvs_get_sentinel()` function could be used to retrieve the flag
   - Con: sometimes multiple events are returned in "one blob", so gotta calculate "is this the last eventlog entry in this blob" (or count entries before we parse) in the `job-info` module
2. Some "sentinel" RPC, like maybe an empty value or something; works sort of like the previous option with some sort of "is a sentinel" function.
   - Pro: easier than above b/c I don't have to count all the eventlog entries, just "forward" it in the `job-info` module
   - Con: sort of breaks the idea of "streaming events"
3. An errno other than ENODATA could indicate this condition.
   - Pro: no need for new helper functions
   - Con: a bit dangerous b/c it could break most existing code, so the errno would only be sent if the behavior is requested by the caller
Edit: Doh, option 1 above won't work b/c we need something for "no initial value". I'm thinking option 2 is best now.
Option 3 wouldn't work since any error response terminates the stream per RFC 6 (and this is relied on for RPC tracking in the broker).
Since a sentinel value presumably would require a flag in the watch request anyway (since it would potentially break other users if not), maybe a better flag would be one to request that the watch start at the last blob, or wait for the next one?
> maybe a better flag would be one to request that the watch start at the last blob, or wait for the next one?
This is similar to the idea in #5035 to simply ignore all events with timestamps less than "now", i.e. only output new events / data. But it ended up that we do need to "tail" some of the last bits of data before watching "newer events".
We have no way right now to know when the "end" of the stream of events is, thus the idea of this sentinel value.
The two use cases I was thinking of for this feature would still require getting all history first, so ignoring some portion of events already in the eventlog wouldn't work. (Unless I'm missing the intent of the comment)
For example, if a subinstance is going to watch its own job eventlog for exceptions, it will perhaps want to synchronously process all existing events, then start asynchronous processing once it "catches up" via the sentinel. This would potentially avoid race conditions in startup where jobs could start on resources that should have been marked down due to prolog or startup errors.
Similarly, as @chu11 says above, we may want to capture output events in a circular buffer then dump the last N lines once we've "caught up" via the sentinel.
This is inspired by, and very similar to, the event journal interfaces.
Sorry, I misunderstood the intent. I guess that would have to happen at the job-info level then, since kvs-watch doesn't grok eventlogs.
It'll have to be at both levels: kvs-watch says "this is the last blobref", and job-info (which will split up an eventlog into events) will indicate to the caller the last "initial" event.