
[Bug] Cancellation when using local activities

Open Irvenae opened this issue 1 year ago • 2 comments

What are you really trying to do?

In a workflow I am running a local activity. When I receive a signal I want to cancel the activity which runs in a cancellable scope and start a new activity.

Describe the bug

In some situations when the signal is received I get a

Fatal("Invalid transition while attempting to cancel LocalActivityMachine in MarkerCommandCreated")

after which the workflow is stuck. I think this happens when the worker receives a signal from the server, but my local activity on the worker finishes before I handle that signal. The signal handler then tries to cancel the scope while the local activity is already in a completed state, which results in an invalid state machine transition?
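To make the suspected race concrete, here is a toy model in TypeScript. It is a sketch under assumptions, not sdk-core's actual implementation: the class `ToyLocalActivityMachine`, its methods, and every state name except `MarkerCommandCreated` are hypothetical. It only shows how a cancel that arrives after completion can hit a state with no cancel transition, and how a tolerant variant could treat the late cancel as a no-op.

```typescript
// Toy model only. Every name here except the MarkerCommandCreated state
// is hypothetical and does not mirror sdk-core's real LocalActivityMachine.
type State = "Executing" | "MarkerCommandCreated" | "Cancelled";

class ToyLocalActivityMachine {
  state: State = "Executing";

  // The local activity finished and its result marker has been queued.
  complete(): void {
    if (this.state !== "Executing") {
      throw new Error(`Invalid transition while attempting to complete in ${this.state}`);
    }
    this.state = "MarkerCommandCreated";
  }

  // Strict cancel: only legal while the activity is still running.
  cancelStrict(): void {
    if (this.state !== "Executing") {
      throw new Error(
        `Invalid transition while attempting to cancel LocalActivityMachine in ${this.state}`
      );
    }
    this.state = "Cancelled";
  }

  // Tolerant cancel: a cancel arriving after completion is ignored.
  // Returns true only if the cancel actually took effect.
  cancelTolerant(): boolean {
    if (this.state === "Executing") {
      this.state = "Cancelled";
      return true;
    }
    return false;
  }
}

// Replays the suspected ordering: completion first, then the signal's cancel.
function raceOutcome(): string {
  const machine = new ToyLocalActivityMachine();
  machine.complete();
  try {
    machine.cancelStrict();
    return "cancelled";
  } catch (e) {
    return (e as Error).message;
  }
}

console.log(raceOutcome());
```

In this toy model the strict variant fails with the same message text as the error above, while `cancelTolerant()` returns `false` and leaves the state unchanged. Whether Core should ignore a late cancel in this way is a question for the maintainers.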

I don't know exactly how the state machine works for local activities. I thought that while a local activity is running, signals are not handled until it is done. So maybe the question is why the LocalActivityMachine was asked to cancel at all.

[Screenshot: 2024-03-14 at 09 04 01]

Here the local activity markers were received by the server. In some other situations the local activity marker is not there; this probably depends on the timing between the local activity, the workflow task, and the signal.

Minimal Reproduction

I tried to reproduce this, but because it is a race condition I did not manage to reproduce it in a simple example.

Environment/Versions

Linux x86_64, AMD EPYC 7B12
Temporal Server Version: 1.22.0
Temporal UI Version: 2.16.2
Temporal TS SDK: 1.9.1
Kubernetes

Irvenae, Mar 14 '24 08:03

Hey @Irvenae - thanks for opening. If you have code that even sometimes repros this, that would be useful.

Also, please try out the newest TS SDK; it has an updated Core that might address this.

Sushisource, Mar 14 '24 16:03

Ok, I missed that I was behind 😊 Seems like the last patch update has some fixes which might resolve this ^^ Unfortunately, I can't share this code. I will make a simplified case and try to randomly load it to see if I can reproduce.

Irvenae, Mar 14 '24 17:03