Streams have no concept of RESTORING status
We have implemented custom health check actuators for our Streams, which report "healthy" when Stream Status is RUNNING.
However, I have noticed that a Task Status can be in a RESTORING state (while it loads from RocksDB or from a changelog topic), but this is not reflected in the Stream Status, which shows RUNNING instead. Is there a reason for this? We have encountered very occasional issues where a task is stuck in the RESTORING state but continues to report "healthy" because Stream Status is RUNNING. This has led to pods that are effectively dead but are not killed by Kubernetes, resulting in vast backlogs of unprocessed data.
Would it be possible to expose the RESTORING state of the Task at all?
This is something that we can investigate for sure. Give me some time to review how it can be done.
Great! Thank you again for all your hard work :)
@LGouellec Hi there! I'd like to add that it would be great to be able to trigger a StateListener (or something like that) when the state store restoration is completed. It would also be helpful to have more information about that event (i.e., which state store was restored?)
Hi @xdave ,
100%, the plan is to expose an event when the store starts restoring and when it's finished.
It's the same in Java, as far as I remember.
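Until such an event exists, a health check could track restorations itself. The sketch below is only an illustration in plain Java with no Kafka dependency: `RestoreAwareHealth`, its method names, and the timeout parameter are all hypothetical. It assumes the application can call `onRestoreStart` / `onRestoreEnd` from whatever restoration callbacks the library ends up exposing (Java Kafka Streams offers `StateRestoreListener` for this). It reports unhealthy only when a restoration has run longer than a configurable limit, so a normal restore at startup does not get the pod killed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of any library. It combines the stream-level
// state with per-store restoration tracking so that a stuck restore can be
// detected even while the stream reports RUNNING.
final class RestoreAwareHealth {
    // Key: "<storeName>-<partition>", value: time the restoration started (ms).
    private final Map<String, Long> restoreStartedAt = new ConcurrentHashMap<>();
    private volatile String streamState = "CREATED";

    // Call from the stream state-change callback.
    void onStreamStateChanged(String newState) {
        streamState = newState;
    }

    // Call when restoration of a store partition begins (assumed event).
    void onRestoreStart(String storeName, int partition, long nowMillis) {
        restoreStartedAt.putIfAbsent(storeName + "-" + partition, nowMillis);
    }

    // Call when restoration of a store partition finishes (assumed event).
    void onRestoreEnd(String storeName, int partition) {
        restoreStartedAt.remove(storeName + "-" + partition);
    }

    // Healthy means: the stream is RUNNING and no restoration has been in
    // progress for longer than maxRestoreMillis.
    boolean isHealthy(long nowMillis, long maxRestoreMillis) {
        if (!"RUNNING".equals(streamState)) {
            return false;
        }
        for (long startedAt : restoreStartedAt.values()) {
            if (nowMillis - startedAt > maxRestoreMillis) {
                return false;
            }
        }
        return true;
    }
}
```

In a real service, `nowMillis` would come from `System.currentTimeMillis()`; passing it in keeps the class easy to test. The limit should be comfortably longer than the slowest expected restore, since what counts as "stuck" depends on the size of the changelog.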