[Feature Request] Activities concept documentation should have information about guarantees

Open Spikhalskiy opened this issue 4 years ago • 3 comments

The page https://docs.temporal.io/docs/concepts/activities/ should state our guarantees regarding at-least-once vs. at-most-once execution of Activities. It should probably say that execution is "at most once" kindaaaaa (not counting an activity that finished successfully but failed before or during the reporting stage; this edge case should probably be described in the doc as well).
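To make the reporting edge case concrete, here is a toy model in plain Python. It does not use the Temporal SDK, and `ToyActivityRun`, `max_attempts`, and `report_fails_on` are hypothetical names invented for this sketch. It only assumes that a lost completion report looks like a failed attempt to the server, so any allowed retry runs the side effect again.

```python
from dataclasses import dataclass


@dataclass
class ToyActivityRun:
    """Toy model of one scheduled activity; NOT the Temporal SDK."""

    max_attempts: int        # hypothetical retry limit
    report_fails_on: set     # attempt numbers whose completion report is lost
    executions: int = 0      # how many times the side effect actually ran
    completed: bool = False  # whether the server ever recorded a completion

    def run(self) -> None:
        for attempt in range(1, self.max_attempts + 1):
            # The worker performs the side effect on every attempt.
            self.executions += 1
            if attempt in self.report_fails_on:
                # The work is done, but the server never hears about it,
                # so from its point of view the attempt failed.
                continue
            self.completed = True
            return


# One attempt only: the side effect ran once, yet no completion was recorded.
no_retry = ToyActivityRun(max_attempts=1, report_fails_on={1})
no_retry.run()

# Retries allowed: the side effect ran twice before a report got through.
with_retry = ToyActivityRun(max_attempts=3, report_fails_on={1})
with_retry.run()
```

In this model a single attempt never runs the side effect more than once but may leave it unrecorded, while allowing retries gets a completion recorded at the cost of a duplicate execution.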

For context, here is the related user question that made me check, and confirm, that we don't have this in writing on the Activities concept page:

are there situations where activity execution guarantees change from “at most once” to “at least once”, eg. during cluster failover?

Ryland Goldstein: yes multi-cluster replication is one such case.

Spikhalskiy • Oct 26 '21 16:10

Can you break this down more from your perspective? When it comes to defining a guarantee, "kindaaaa" seems a little bit problematic.

Sounds more like the request is to explicitly answer the question "What happens to an Activity Execution during a Cluster failover?"

flossypurse • Dec 14 '21 15:12

Per @vikstrous2 https://github.com/temporalio/documentation/issues/671

https://github.com/temporalio/documentation/blob/master/docs/go/side-effect.md

unlike the Temporal guarantee of at-most-once execution for Activities

https://github.com/temporalio/documentation/blob/master/docs/go/activities.md#activity-timeouts

Temporal guarantees that Activities are executed at least once

These two pages make inconsistent statements about the guarantee.

flossypurse • Jan 14 '22 20:01

Sounds more like the request is to explicitly answer the question "What happens to an Activity Execution during a Cluster failover?"

No, the problem is that the Activity doc page doesn't clearly define this guarantee for activities.

Can you break this down more from your perspective? When it comes to defining a guarantee, "kindaaaa" seems a little bit problematic.

Well, I kindaaaa did in the brackets, and Ryland did in his response. The problem is that I don't know all the corner cases like that, which is why it's "kindaaaa". There could be more. We should collect more knowledge and feedback here.

I now know at least three situations in which "at-most-once" breaks:

  1. Per Ryland, multi-cluster replication
  2. When the activity has actually finished, but reporting its result to the server failed
  3. Local Activities, because reporting of their result is delayed until the end of the Workflow Task (WFT)
  4. Something more?
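Whatever the final list of cases turns out to be, the usual way to tolerate a repeated execution is to make the activity idempotent. Below is a minimal sketch in plain Python, not Temporal SDK code. `PaymentLedger`, `charge_activity`, and the `workflow_id:activity_id` key format are all hypothetical and chosen only for illustration; it assumes the downstream system can deduplicate on a caller-supplied key.

```python
class PaymentLedger:
    """Hypothetical downstream system that deduplicates by idempotency key."""

    def __init__(self) -> None:
        self._applied = {}  # idempotency key -> amount already charged

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # Apply the charge only the first time this key is seen;
        # a repeated call returns the original result unchanged.
        if idempotency_key not in self._applied:
            self._applied[idempotency_key] = amount_cents
        return self._applied[idempotency_key]

    def total_charged(self) -> int:
        return sum(self._applied.values())


def charge_activity(ledger: PaymentLedger, workflow_id: str,
                    activity_id: str, amount_cents: int) -> int:
    # Assumption: the key is stable across retries of the same activity.
    key = f"{workflow_id}:{activity_id}"
    return ledger.charge(key, amount_cents)


ledger = PaymentLedger()
# Simulate a duplicate execution, e.g. after a lost completion report.
charge_activity(ledger, "order-42", "charge-card", 500)
charge_activity(ledger, "order-42", "charge-card", 500)
```

With this shape, a second execution of the same activity leaves the ledger total at 500 rather than 1000, so the external effect happens once even if the execution does not.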

Spikhalskiy • Jan 14 '22 21:01