[Feature Request] Activities concept documentation should have information about guarantees
The page https://docs.temporal.io/docs/concepts/activities/
should state our guarantees regarding at-least-once versus at-most-once execution of Activities. It should probably say that execution is "at most once" kindaaaaa (if we don't count an Activity that finished successfully but failed before or during the reporting stage; this edge case should probably be described in the doc).
For context, here is the related user question that made me check and find that we don't have this in writing on the Activities concept page:
are there situations where activity execution guarantees change from “at most once” to “at least once”, eg. during cluster failover?
Ryland Goldstein: yes multi-cluster replication is one such case.
Can you break this down more from your perspective? When it comes to defining a guarantee, "kindaaaa" seems a little bit problematic.
Sounds more like the request is to explicitly answer the question "What happens to an Activity Execution during a Cluster failover?"
Per @vikstrous2 in https://github.com/temporalio/documentation/issues/671, the docs make inconsistent statements about guarantees:
- https://github.com/temporalio/documentation/blob/master/docs/go/side-effect.md says: "unlike the Temporal guarantee of at-most-once execution for Activities"
- https://github.com/temporalio/documentation/blob/master/docs/go/activities.md#activity-timeouts says: "Temporal guarantees that Activities are executed at least once"
> Sounds more like the request is to explicitly answer the question "What happens to an Activity Execution during a Cluster failover?"
No, the problem is that the Activity doc page doesn't clearly define this guarantee for activities.
> Can you break this down more from your perspective? When it comes to defining a guarantee, "kindaaaa" seems a little bit problematic.
Well, I kindaaaa did in the brackets, and Ryland did in his response. The problem is that I don't know all the corner cases like that, which is why it's "kindaaaa". There could be more. We should collect more knowledge and feedback here.
I now know at least three situations when "at-most-once" breaks:
- Per Ryland, multi-cluster replication
- When the Activity has actually finished, but reporting the result to the server failed
- Local Activities, because reporting of their result is delayed until the end of the Workflow Task (WFT)
- Something more?
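
To make the "finished, but reporting failed" case concrete, here is a minimal toy model. It is not Temporal SDK code: `Ledger`, `run_with_retries`, and `ReportLost` are hypothetical names invented for this sketch, and it assumes the server retries any attempt whose completion it never saw. It shows that with retries the side effect can run twice unless the Activity is idempotent, and that with a single attempt the side effect can still have happened even though the server never saw it complete.

```python
class ReportLost(Exception):
    """No attempt was ever reported to the (simulated) server as complete."""


class Ledger:
    """Hypothetical external system that the activity mutates."""

    def __init__(self):
        self.charges = []        # every side effect that was actually applied
        self.seen_keys = set()   # idempotency keys already processed

    def charge(self, amount, idempotency_key=None):
        if idempotency_key is not None:
            if idempotency_key in self.seen_keys:
                return           # duplicate delivery: skip the side effect
            self.seen_keys.add(idempotency_key)
        self.charges.append(amount)


def run_with_retries(activity, max_attempts, lose_first_report):
    """Toy stand-in for a server retrying an activity; returns the attempt that was reported."""
    for attempt in range(1, max_attempts + 1):
        activity()               # the side effect happens here
        if lose_first_report and attempt == 1:
            continue             # completion report lost, so the attempt looks failed
        return attempt
    raise ReportLost("no attempt was reported as complete")


# Retries enabled, non-idempotent activity: the charge is applied twice.
naive = Ledger()
run_with_retries(lambda: naive.charge(10), max_attempts=3, lose_first_report=True)

# Retries enabled, idempotency key supplied: the second attempt is a no-op.
safe = Ledger()
run_with_retries(lambda: safe.charge(10, idempotency_key="order-1"),
                 max_attempts=3, lose_first_report=True)

# Single attempt: the charge happened once, but the server never saw it complete.
single = Ledger()
try:
    run_with_retries(lambda: single.charge(10), max_attempts=1, lose_first_report=True)
    reported = True
except ReportLost:
    reported = False
```

In this model, `naive.charges` ends up with two entries, `safe.charges` with one, and `single.charges` with one while `reported` is `False`. Whichever wording the doc settles on, it would help to state which of these outcomes users should design for.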