flux-core
kvs-watch/job-info: support a sentinel value indicating "end of initial values"
It would be convenient if kvs-watch could inform watchers that all presently available data has already been streamed to them (more data may appear in the future, but we don't know that yet).
This could be useful for #5035. We could cache the last N lines of data until we reach the current "end of initial value" and then could output those N lines via a "tail" option. Because we don't have a sentinel or flag indicating "we're at the end of the initial value", we can't implement this algorithm.
How could this be done?
1. A new RPC parameter on responses indicating this condition.
   - Pro: makes the most sense; a `flux_kvs_get_sentinel()` function could be used to retrieve the flag
   - Con: sometimes multiple events are returned in "one blob", so gotta calculate "is this the last eventlog entry in this blob" (or count entries before we parse) in the `job-info` module
2. Some "sentinel" RPC, like maybe an empty value or something; works sort of like the previous option with some sort of "is a sentinel" function.
   - Pro: easier than above b/c I don't have to count all the eventlog entries, just "forward" it in the `job-info` module
   - Con: sort of breaks the idea of "streaming events"
3. An errno other than ENODATA could indicate this condition.
   - Pro: no need for new helper functions
   - Con: a bit dangerous b/c it could break most existing code, so the errno would only be sent if the behavior is requested by the caller
Edit: Doh, option 1 above won't work b/c we need something for "no initial value". I'm thinking option 2 is best now.
Option 3 wouldn't work since any error response terminates the stream per RFC 6 (and this is relied on for RPC tracking in the broker).
Since a sentinel value presumably would require a flag in the watch request anyway (since it would potentially break other users if not), maybe a better flag would be one to request that the watch start at the last blob, or wait for the next one?
> maybe a better flag would be one to request that the watch start at the last blob, or wait for the next one?
This is similar to the idea in #5035 to simply ignore all events with timestamps less than "now", i.e. only output new events / data. But it ended up that we do need to "tail" some of the last bits of data before watching "newer events".
We have no way right now to know when the "end" of the stream of events is, thus the idea of this sentinel value.
The two use cases I was thinking of for this feature would still require getting all history first, so ignoring some portion of events already in the eventlog wouldn't work. (Unless I'm missing the intent of the comment)
For example, if a subinstance is going to watch its own job eventlog for exceptions, it will perhaps want to synchronously process all existing events, then start asynchronous processing once it "catches up" via the sentinel. This would potentially avoid race conditions in startup where jobs could start on resources that should have been marked down due to prolog or startup errors.
Similarly, as @chu11 says above, we may want to capture output events in a circular buffer then dump the last N lines once we've "caught up" via the sentinel.
This is inspired by, and very similar to, the event journal interfaces.
Sorry, I misunderstood the intent. I guess that would have to happen at the job-info level then, since kvs-watch doesn't grok eventlogs.
It'll have to be at both levels: kvs-watch says "this is the last blobref", and job-info (which will split up an eventlog into events) will indicate to the caller the last "initial" event.