sdk-go
sdk-go copied to clipboard
Activity cancel error returned when not initiated server side is rejected by server
Expected Behavior
When a worker stops while an activity is running and that context is cancelled and bubbled out as an activity error, that activity attempt should be recorded as failed server side. This is how it works in most SDKs today.
Actual Behavior
Today if you stop a worker, the activity tries to return to the server the cancelled error but the server fails the activity task completion with:
unable to mark activity as canceled without activity being request canceled first
I think we need to send 1) only send activity cancelled if the server requested cancel (otherwise wrap as application error if canceled error received), and 2) write a test proving that worker stop during a non-heartbeating activity does the next attempt on the next worker.
This maybe could be done at the same time as #1086
Am I correct in thinking the impact of this is just performance—that is, it takes longer overall to successfully complete the activity than it otherwise might?
My mental model of this is:
- Activity cancellation error gets sent to server
- Server fails the task completion, and therefore doesn't realize the activity was canceled
- Eventually the activity times out server-side (<-- this is where the extra time goes)
- Server retries the activity (hopefully on a new worker since the old one died)
- Eventually the activity completes (if nothing else goes wrong)
lmk what I got wrong. :)
Yes, that seems correct. I think in most Core-based SDKs, we already convert non-server-requested-activity-cancellation errors (i.e. worker shutdown) to activity errors. I think we probably just confirm and do this in all SDKs.