Timeout exception on IWorkflowService#ResetWorkflowExecution
Code (modified samples):
public static void main(String[] args) throws TException, IOException {
    IWorkflowService cadenceService = new WorkflowServiceTChannel(
        "127.0.0.1",
        7933,
        new WorkflowServiceTChannel.ClientOptions.Builder()
            .setRpcTimeout(1_000_000L)
            .setListArchivedWorkflowRpcTimeout(1_000_000_000L)
            .setQueryRpcTimeout(1_000_000_000L)
            .setRpcLongPollTimeout(1_000_000_000L)
            .build()
    );
    System.out.println("---------------------------------------------------------------");
    System.out.println("Run for " + 4);
    ResetWorkflowExecutionRequest request = new ResetWorkflowExecutionRequest();
    request.setWorkflowExecution(
        new WorkflowExecution()
            .setWorkflowId("f5e392e2-20ed-4239-9633-65a352fbd202")
            .setRunId("5115e281-f48b-4f51-a3de-f1b9880677a3")
    );
    request.setDomain("DOMAIN");
    request.setDecisionFinishEventId(4);
    try {
        cadenceService.ResetWorkflowExecution(request);
        System.out.println("Success");
    } catch (Exception e) {
        LoggerFactory.getLogger("Logger").error("Error", e);
    }
    System.exit(0);
}
What I get:
09:06:20.822 [main] ERROR Logger - Error
org.apache.thrift.transport.TTransportException: timeout
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.throwOnRpcError(WorkflowServiceTChannel.java:546)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.doRemoteCall(WorkflowServiceTChannel.java:519)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.resetWorkflowExecution(WorkflowServiceTChannel.java:1597)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.lambda$ResetWorkflowExecution$25(WorkflowServiceTChannel.java:1586)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCall(WorkflowServiceTChannel.java:569)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.ResetWorkflowExecution(WorkflowServiceTChannel.java:1585)
at com.uber.cadence.samples.common.RegisterDomain.main(RegisterDomain.java:65)
Through the CLI everything works. To pre-empt the obvious question: I need to be able to rerun workflows programmatically so that I can do it from a Spring application.
It seems like the Cadence server stops processing because a timeout is not configured (the CLI has the --context_timeout option for that), but I'm not sure that's true.
Can you help me with that?
Docker-compose
version: '3.2'
services:
  cassandra:
    image: cassandra:3.11
    restart: unless-stopped
    networks:
      - cross-comms
    volumes:
      - type: volume
        source: mycassandrastore
        target: /var/lib/cassandra
    ports:
      - "${CASSANDRA_PORT}:${CASSANDRA_PORT}"
  statsd:
    image: graphiteapp/graphite-statsd
    restart: unless-stopped
    networks:
      - cross-comms
    ports:
      - "8080:80"
      - "2003:2003"
      - "8125:8125"
      - "8126:8126"
  cadence:
    image: ubercadence/server:master-auto-setup
    restart: unless-stopped
    networks:
      - cross-comms
    ports:
      - "${CADENCE_PORT}:${CADENCE_PORT}"
      - "7934:7934"
      - "7935:7935"
      - "7939:7939"
    environment:
      - "CASSANDRA_SEEDS=cassandra"
      - "STATSD_ENDPOINT=statsd:8125"
      - "DYNAMIC_CONFIG_FILE_PATH=config/dynamicconfig/development.yaml"
      - "CADENCE_CONTEXT_TIMEOUT=600"
    depends_on:
      - cassandra
      - statsd
  cadence-web:
    image: ubercadence/web:latest
    restart: unless-stopped
    networks:
      - cross-comms
    environment:
      - "CADENCE_TCHANNEL_PEERS=cadence:${CADENCE_PORT}"
    ports:
      - "${CADENCE_WEB_PORT}:${CADENCE_WEB_PORT}"
    depends_on:
      - cadence
  cadence-cli-shell:
    image: crux-cadence-cli-shell:latest
    restart: unless-stopped
    networks:
      - cross-comms
    environment:
      - "CADENCE_HOST=cadence"
      - "CADENCE_PORT=${CADENCE_PORT}"
      - "CADENCE_DOMAIN=${CADENCE_DOMAIN}"
    depends_on:
      - cadence
    volumes:
      - cadencedata:/var/lib/cadencedata
volumes:
  mycassandrastore:
  cadencedata:
networks:
  cross-comms:
cadence --domain DOMAIN --address host.docker.internal:7933 workflow reset -w f5e392e2-20ed-4239-9633-65a352fbd202 -r 5115e281-f48b-4f51-a3de-f1b9880677a3 --event_id 4 --reason "<Some string>"
Works fine
If I set event id = 5 then it returns this error:
19:29:50.332 [main] ERROR Logger - Error
org.apache.thrift.TException: Rpc error:<ErrorResponse id=5 errorType=UnexpectedError message=cadence internal error, msg: nDCStateRebuilder unable to rebuild mutable state to event ID: 4, version: -24>
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.throwOnRpcError(WorkflowServiceTChannel.java:548)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.doRemoteCall(WorkflowServiceTChannel.java:519)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.resetWorkflowExecution(WorkflowServiceTChannel.java:1597)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.lambda$ResetWorkflowExecution$25(WorkflowServiceTChannel.java:1586)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCall(WorkflowServiceTChannel.java:569)
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.ResetWorkflowExecution(WorkflowServiceTChannel.java:1585)
at com.uber.cadence.samples.common.RegisterDomain.main(RegisterDomain.java:65)
This is a screenshot of the event types and their IDs around events 4 and 5.
@sokada1221 @meiliang86 @mfateev Guys, please help me with this.
@polyansky-syberry sorry for the late response. Were you able to resolve the issue in the end? Basically, reset is only allowed at a DecisionTask boundary (DecisionTaskCompleted/Failed/TimedOut events; in newer server versions we also support Scheduled/Started).
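The boundary rule above can be sketched as a filter over the event history. This is a minimal, self-contained illustration and not the Cadence client API: `ResetPoints` and its nested `Event` class are hypothetical stand-ins for the real history types, and the sample history is made up. It only covers the Completed/Failed/TimedOut case; the Scheduled/Started support in newer server versions is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Cadence client.
final class ResetPoints {
    // Stand-in for a history event: just an ID and an event type name.
    static final class Event {
        final long id;
        final String type;

        Event(long id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    // Event types that close a decision task, per the rule quoted above.
    private static final Set<String> DECISION_CLOSE_TYPES = new HashSet<>(Arrays.asList(
            "DecisionTaskCompleted", "DecisionTaskFailed", "DecisionTaskTimedOut"));

    // Returns the IDs of all events that sit on a decision task boundary.
    static List<Long> candidateResetIds(List<Event> history) {
        List<Long> ids = new ArrayList<>();
        for (Event e : history) {
            if (DECISION_CLOSE_TYPES.contains(e.type)) {
                ids.add(e.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Made-up sample history for illustration only.
        List<Event> history = Arrays.asList(
                new Event(1, "WorkflowExecutionStarted"),
                new Event(2, "DecisionTaskScheduled"),
                new Event(3, "DecisionTaskStarted"),
                new Event(4, "DecisionTaskCompleted"),
                new Event(5, "ActivityTaskScheduled"));
        System.out.println(candidateResetIds(history)); // prints [4]
    }
}
```

The returned IDs are only candidates to try as the decisionFinishEventId; whether the server accepts a given one still depends on the server version.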
@longquanzheng Hi! You mentioned that it's now possible to reset a workflow execution from a DecisionTaskScheduled event.
I have a timed-out execution with this event history:
[image: https://user-images.githubusercontent.com/77055765/153611967-840240f2-6e4c-4813-8b43-039fb74ba37c.png]
I tried to reset the execution from event 2 (using java-client 3.6.1 with server v0.23.2 and v0.22.4).
public String resetWorkflow() {
    var request = new ResetWorkflowExecutionRequest();
    var workflowExecution = new WorkflowExecution()
        .setRunId(runId)
        .setWorkflowId(workflowId);
    request.setWorkflowExecution(workflowExecution);
    request.setDecisionFinishEventId(2);
    request.setDomain(domain);
    try {
        return cadenceService.ResetWorkflowExecution(request).getRunId();
    } catch (TException ex) {
        throw new CadenceServiceException("Couldn't reset workflow execution", ex);
    }
}
But it throws an exception while resetting the execution:
Caused by: com.uber.cadence.BadRequestError: nDCStateRebuilder unable to rebuild mutable state to event ID: 1, version: -24, baseLastEventID + baseLastEventVersion is not the same as the last event of the last batch, event ID: 2, version :-24 ,typically because of attemptting to rebuild to a middle of a batch
at com.uber.cadence.WorkflowService$ResetWorkflowExecution_result$ResetWorkflowExecution_resultStandardScheme.read(WorkflowService.java:38530) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.WorkflowService$ResetWorkflowExecution_result$ResetWorkflowExecution_resultStandardScheme.read(WorkflowService.java:38507) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.WorkflowService$ResetWorkflowExecution_result.read(WorkflowService.java:38406) ~[cadence-client-3.6.1.jar:na]
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:81) ~[libthrift-0.9.3.jar:0.9.3]
at org.apache.thrift.TDeserializer.deserialize(TDeserializer.java:67) ~[libthrift-0.9.3.jar:0.9.3]
at com.uber.tchannel.messages.ThriftSerializer.decodeBody(ThriftSerializer.java:101) ~[tchannel-core-0.8.30.jar:na]
at com.uber.tchannel.messages.Serializer.decodeBody(Serializer.java:49) ~[tchannel-core-0.8.30.jar:na]
at com.uber.tchannel.messages.EncodedResponse.getBody(EncodedResponse.java:85) ~[tchannel-core-0.8.30.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.resetWorkflowExecution(WorkflowServiceTChannel.java:1490) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.lambda$ResetWorkflowExecution$27(WorkflowServiceTChannel.java:1477) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCallWithTags(WorkflowServiceTChannel.java:374) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCall(WorkflowServiceTChannel.java:362) ~[cadence-client-3.6.1.jar:na]
If I try to reset from event 3 programmatically, it throws this exception:
Caused by: org.apache.thrift.TException: Rpc error:<ErrorResponse id=6 errorType=UnexpectedError message=cadence internal error, msg: CreateWorkflowExecution operation failed. Error: invalid UUID "">
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.throwOnRpcError(WorkflowServiceTChannel.java:345) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.doRemoteCall(WorkflowServiceTChannel.java:316) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.resetWorkflowExecution(WorkflowServiceTChannel.java:1488) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.lambda$ResetWorkflowExecution$27(WorkflowServiceTChannel.java:1477) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCallWithTags(WorkflowServiceTChannel.java:374) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.measureRemoteCall(WorkflowServiceTChannel.java:362) ~[cadence-client-3.6.1.jar:na]
at com.uber.cadence.serviceclient.WorkflowServiceTChannel.ResetWorkflowExecution(WorkflowServiceTChannel.java:1476) ~[cadence-client-3.6.1.jar:na]
If I reset this execution from event 3 via the CLI, it is reset successfully.
cadence --domain WORKFLOWS_PRIMARY --address host.docker.internal:7933 workflow reset -w timeout_test_with_childWF.2022-01-19T11:05:06Z -r 15b64382-28ef-4c03-8bfe-5be59ac4b390 --event_id 3 --reason "<Reset>"
{
"runId": "2d6caf81-e780-4eed-a117-d167dd5d0c92"
}
But how can such an execution be reset programmatically? Can the whole workflow be reset from the beginning?
Yeah, I think the new feature just allows resetting to the event right after DecisionTaskScheduled. You can look up the history to find the first DecisionTaskScheduled event and add 1 to its event ID.
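The lookup described above can be sketched as follows. This is a self-contained illustration under stated assumptions: `FirstDecisionReset` and its nested `Event` class are hypothetical stand-ins, and the sample history is made up. With the real client the history would presumably be fetched from the service first, and the result passed to request.setDecisionFinishEventId(...).

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

// Hypothetical helper, not part of the Cadence client.
final class FirstDecisionReset {
    // Stand-in for a history event: just an ID and an event type name.
    static final class Event {
        final long id;
        final String type;

        Event(long id, String type) {
            this.id = id;
            this.type = type;
        }
    }

    // Returns (ID of the first DecisionTaskScheduled event) + 1,
    // or an empty value if the history has no such event.
    static OptionalLong resetEventId(List<Event> history) {
        for (Event e : history) {
            if ("DecisionTaskScheduled".equals(e.type)) {
                return OptionalLong.of(e.id + 1);
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Made-up sample history for illustration only.
        List<Event> history = Arrays.asList(
                new Event(1, "WorkflowExecutionStarted"),
                new Event(2, "DecisionTaskScheduled"),
                new Event(3, "DecisionTaskStarted"));
        System.out.println(resetEventId(history).getAsLong()); // prints 3
    }
}
```

Returning an OptionalLong rather than a sentinel keeps the "no decision task was ever scheduled" case explicit for the caller.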
Thanks, Quanzheng
Hi, @longquanzheng
If I have this event history in a workflow run:
[image: https://user-images.githubusercontent.com/77055765/156365393-8d05f0af-b05c-4220-9c91-34991fb6b80e.png]
and I reset the execution from event 3, the Java client returns this error:
Caused by: org.apache.thrift.TException: Rpc error:<ErrorResponse id=7 errorType=UnexpectedError message=cadence internal error, msg: CreateWorkflowExecution operation failed. Error: invalid UUID "">
Is this a server-side error? Via the CLI the same workflow is reset successfully.
How can such a workflow be reset using the Java client?
What if you reset to event 2?
Thanks, Quanzheng
Hey, @longquanzheng
If I reset from event 2, from either the java-client or the CLI, it fails:
Error: reset failed Error Details: BadRequestError{Message: nDCStateRebuilder unable to rebuild mutable state to event ID: 1, version: -24, baseLastEventID + baseLastEventVersion is not the same as the last event of the last batch, event ID: 2, version :-24 ,typicaly because of attemptting to rebuild to a middle of a batch} ('export CADENCE_CLI_SHOW_STACKS=1' to see stack traces)
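One reading of the error messages in this thread is that the server rebuilds state up to (decisionFinishEventId - 1) and rejects the reset when that event is not the last event of a batch. The sketch below models only that reading. It is inferred from the error text, not from the server source; `BatchBoundaryCheck` is a hypothetical name, and the batch layout is an assumption apart from events 1 and 2 sharing a batch, which the error message indicates.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the check suggested by the error text.
final class BatchBoundaryCheck {
    // batches: each inner list holds the event IDs of one batch, in order.
    // Returns true when (decisionFinishEventId - 1) is the last event of some batch.
    static boolean isResettable(List<List<Long>> batches, long decisionFinishEventId) {
        long rebuildTo = decisionFinishEventId - 1;
        for (List<Long> batch : batches) {
            if (!batch.isEmpty() && batch.get(batch.size() - 1) == rebuildTo) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Events 1 and 2 share a batch (per the error text); event 3 alone is assumed.
        List<List<Long>> batches = Arrays.asList(
                Arrays.asList(1L, 2L),
                Arrays.asList(3L));
        System.out.println(isResettable(batches, 2)); // prints false: event 1 is mid-batch
        System.out.println(isResettable(batches, 3)); // prints true: event 2 ends a batch
    }
}
```

Under this model, event 2 fails and event 3 passes the batch check, which matches what the CLI reported for this run.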
@longquanzheng, hi! Can you please explain how to reset such executions?