flux-core
idea: stream status updates from flux-resource or other tools
Partly from a hallway conversation I had with @morrone.
I added a whatsup option called --monitor a long time ago. The idea is you run
> whatsup --monitor
at the end of the day, and it'll output things like "node123 (10/22/22 9:00PM): down" (can't remember the exact format, but that's the basics) when things go down/up during the night. You come in in the morning and you get a nice mini status in your terminal for what happened when you were gone.
A similar option could be useful with flux-resource or some other tools.
BUT, the additional benefit is that if we add this, the events stream that implements this underneath could then also be used as a more friendly events streaming service for #4569.
We do have a resource eventlog in the KVS that can be watched, e.g. in raw form:
$ sudo flux kvs eventlog get -w resource.eventlog
1669817412.474601 resource-init {"restart":true,"drain":{},"online":"","exclude":"0"}
1669817412.476516 resource-define {"method":"configuration"}
1669817414.620857 online {"idset":"0"}
1669914360.522472 online {"idset":"1"}
1669914360.672504 online {"idset":"2"}
1669914360.986879 online {"idset":"3-5"}
1669914361.136419 online {"idset":"6"}
1669914417.560495 offline {"idset":"6"}
1669914538.094525 drain {"idset":"7","reason":"testing drain","overwrite":0}
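For illustration, here is a rough Python sketch of how those raw lines could be turned into whatsup-style status lines. It is not an existing flux tool. The function name `format_event`, the `STATE_EVENTS` set, and the output format are all made up for this example, and it assumes each line is `timestamp name json-context` as in the output above. The idset values are printed as ranks, since the eventlog records ranks rather than hostnames.

```python
import json
from datetime import datetime, timezone

# Event names treated as state changes. This set is an assumption based
# only on the sample output above; other event names are skipped.
STATE_EVENTS = {"online", "offline", "drain"}


def format_event(line):
    """Turn one raw eventlog line into a short status line.

    Returns None for events that are not in STATE_EVENTS.
    (Hypothetical helper, not part of flux-core.)
    """
    # Split into timestamp, event name, and the JSON context.
    # maxsplit=2 keeps any spaces inside the JSON context intact.
    timestamp, name, context = line.strip().split(" ", 2)
    if name not in STATE_EVENTS:
        return None
    data = json.loads(context)
    # Use UTC so the output does not depend on the local timezone.
    when = datetime.fromtimestamp(float(timestamp), tz=timezone.utc)
    status = f"rank {data['idset']} ({when:%m/%d/%y %H:%M} UTC): {name}"
    if "reason" in data:
        status += f" ({data['reason']})"
    return status


sample = [
    '1669817412.476516 resource-define {"method":"configuration"}',
    '1669914417.560495 offline {"idset":"6"}',
    '1669914538.094525 drain {"idset":"7","reason":"testing drain","overwrite":0}',
]
for raw in sample:
    formatted = format_event(raw)
    if formatted is not None:
        print(formatted)
```

Something like this could read the watched eventlog from stdin line by line. A real --monitor option would presumably also map ranks to hostnames, which this sketch does not attempt.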
This has nothing to do with #4569 though, since that issue deals with job events. Were you thinking a utility or service that would aggregate all known eventlogs into a single event stream for a consumer? (would need to be instance owner only)
> Were you thinking a utility or service that would aggregate all known eventlogs into a single event stream for a consumer?
Ahh, I forgot that #4569 was specific to job events. @morrone and I were talking about the potential of other event streams as well. We didn't discuss aggregating them all into one, just the general availability of them.