Support displaying past invocations
To better support workflow use cases, we believe it is important to store the history of past invocation runs and to make it easily accessible. The assumption is that the completion or failure of an invocation run is itself valuable information that the user wants to access. Moreover, a stored history makes it possible to re-execute individual workflow runs and to investigate potential failure causes.
1. Option: History server
The Restate runtime itself does not store the history of invocation runs. Instead, one idea could be to offload this task to an external component, the HistoryServer. To implement such a HistoryServer, the runtime would need to expose information about ongoing and completed invocations (e.g. via a CDC stream).
2. Option: Cache in Restate server
An alternative w/o an additional dependency is to cache the past invocations in the Restate server. What we need to figure out is which information needs to be retained and for how long.
I wanted to bump this feature request up. For workflows, while the result is retained for a configurable duration, the invocation journal is immediately removed.
A configuration parameter to delay journal deletion would be helpful - enough time for a periodic process (maybe a cron scheduled using Restate itself) to query the admin API and save the data.
- This removes the need for a separate data store for low frequency usage. A Restate deployment can be a complete job queue server.
- To track the steps in a completed/canceled workflow, the other alternative is wrapping each run/sleep/service-call with actions that publish start/stop events to the workflow state.
FYI we're working on this.
Dropping some design decisions here.
There will be 3 knobs:
- `idempotency_retention` (the one we already got)
- `workflow_completion_retention`
- `journal_retention`
There's also an invariant: `idempotency_retention` and `workflow_completion_retention` must be greater than or equal to `journal_retention`. If they are not, we set them accordingly. This invariant is an implementation requirement.
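To illustrate the invariant, here is a minimal sketch in Python. It is not the server's implementation: `RetentionConfig` and `enforce_invariant` are hypothetical names, and it assumes that "set them accordingly" means raising `idempotency_retention` and `workflow_completion_retention` up to `journal_retention` when they are smaller.

```python
from dataclasses import dataclass, replace
from datetime import timedelta


@dataclass(frozen=True)
class RetentionConfig:
    """Hypothetical container for the three knobs (not a Restate type)."""
    idempotency_retention: timedelta
    workflow_completion_retention: timedelta
    journal_retention: timedelta


def enforce_invariant(cfg: RetentionConfig) -> RetentionConfig:
    """Return a copy where both completion knobs are >= journal_retention.

    Assumption: values that are too small are raised to journal_retention;
    journal_retention itself is never changed.
    """
    floor = cfg.journal_retention
    return replace(
        cfg,
        idempotency_retention=max(cfg.idempotency_retention, floor),
        workflow_completion_retention=max(cfg.workflow_completion_retention, floor),
    )


cfg = enforce_invariant(
    RetentionConfig(
        idempotency_retention=timedelta(hours=1),
        workflow_completion_retention=timedelta(hours=24),
        journal_retention=timedelta(hours=6),
    )
)
```

In this example `idempotency_retention` is raised from 1h to 6h, while `workflow_completion_retention` stays at 24h because it already satisfies the invariant.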
The three knobs should be configurable on a service/handler basis, from the code too (similar to #3201). The behavior then is:
| Invocation type | When journal retention == 0/unset | When journal retention > 0 |
|---|---|---|
| Regular | Retain nothing | Retain journal (also status) for `journal_retention` time |
| With idempotency-key | Deduplicate requests for `idempotency_retention` time, don't retain journal | Deduplicate requests for `idempotency_retention` time, retain journal for `journal_retention` |
| Workflow handler (run) | Deduplicate requests for `workflow_completion_retention` time, don't retain journal | Deduplicate requests for `workflow_completion_retention` time, retain journal for `journal_retention` |
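
The table above can also be read as a small decision function. The following Python sketch only restates the table; `InvocationKind`, `Outcome` and `retention_outcome` are hypothetical names used for illustration and are not part of Restate or its SDKs.

```python
from datetime import timedelta
from enum import Enum
from typing import NamedTuple

ZERO = timedelta(0)


class InvocationKind(Enum):
    REGULAR = "regular"
    IDEMPOTENCY_KEY = "with idempotency-key"
    WORKFLOW_RUN = "workflow handler (run)"


class Outcome(NamedTuple):
    deduplicate_for: timedelta   # ZERO means no deduplication
    keep_journal_for: timedelta  # ZERO means the journal is not retained


def retention_outcome(
    kind: InvocationKind,
    journal_retention: timedelta = ZERO,
    idempotency_retention: timedelta = ZERO,
    workflow_completion_retention: timedelta = ZERO,
) -> Outcome:
    """Map an invocation type and the three knobs to the table's outcome."""
    if kind is InvocationKind.REGULAR:
        dedup = ZERO  # regular invocations are never deduplicated
    elif kind is InvocationKind.IDEMPOTENCY_KEY:
        dedup = idempotency_retention
    else:
        dedup = workflow_completion_retention
    # The journal column depends only on journal_retention, for every row.
    return Outcome(deduplicate_for=dedup, keep_journal_for=journal_retention)


regular = retention_outcome(InvocationKind.REGULAR)
workflow = retention_outcome(
    InvocationKind.WORKFLOW_RUN,
    journal_retention=timedelta(hours=24),
    workflow_completion_retention=timedelta(hours=24),
)
```

Here `regular` retains nothing, while `workflow` deduplicates and keeps its journal for 24h each. The sketch assumes the inputs already satisfy the invariant described earlier.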
In terms of defaults:
- We keep the defaults of `idempotency_retention` and `workflow_completion_retention` as they are
- For newly registered workflows, we set `journal_retention` default to 24h (same as `workflow_completion_retention`)
#3296 merges the support; we now need to implement the annotations in the SDKs.