Workflow Reset: add logic of determining the reset point to the service
Is your feature request related to a problem? Please describe.
tctl workflow reset supports a reset_type argument, but the ResetWorkflowExecution gRPC API accepts only workflow_task_finish_event_id. As a result, all the logic for finding the reset point resides in tctl, so it cannot be reused when SDKs invoke the reset operation directly.
Describe the solution you'd like
Move the logic for finding the reset point to the service by adding a reset_type argument to ResetWorkflowExecutionRequest.
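To make the proposal concrete, here is a minimal sketch of how a service-side resolver could map a reset type to an event ID. Everything in it is an assumption for illustration: the `Event` and `ResetType` types, the two enum values, and `ResolveResetPoint` are hypothetical names, not part of the existing API, and the history is a simplified in-memory slice.

```go
package main

import (
	"errors"
	"fmt"
)

// EventType is a simplified stand-in for the service's history event types.
type EventType int

const (
	WorkflowExecutionStarted EventType = iota
	WorkflowTaskScheduled
	WorkflowTaskStarted
	WorkflowTaskCompleted
	ActivityTaskScheduled
)

// Event is a hypothetical, minimal history event.
type Event struct {
	ID   int64
	Type EventType
}

// ResetType is a hypothetical enum for the proposed reset_type argument.
type ResetType int

const (
	ResetTypeFirstWorkflowTask ResetType = iota
	ResetTypeLastWorkflowTask
)

var errNoResetPoint = errors.New("no completed workflow task found")

// ResolveResetPoint returns the ID of the first or last WorkflowTaskCompleted
// event in history, depending on resetType.
func ResolveResetPoint(history []Event, resetType ResetType) (int64, error) {
	var found int64
	for _, ev := range history {
		if ev.Type != WorkflowTaskCompleted {
			continue
		}
		found = ev.ID
		if resetType == ResetTypeFirstWorkflowTask {
			break // the first match is enough
		}
	}
	if found == 0 {
		return 0, errNoResetPoint
	}
	return found, nil
}

func main() {
	history := []Event{
		{1, WorkflowExecutionStarted},
		{2, WorkflowTaskScheduled},
		{3, WorkflowTaskStarted},
		{4, WorkflowTaskCompleted},
		{5, ActivityTaskScheduled},
		{6, WorkflowTaskScheduled},
		{7, WorkflowTaskStarted},
		{8, WorkflowTaskCompleted},
	}
	first, _ := ResolveResetPoint(history, ResetTypeFirstWorkflowTask)
	last, _ := ResolveResetPoint(history, ResetTypeLastWorkflowTask)
	fmt.Println(first, last) // prints "4 8"
}
```

With something along these lines in the service, an SDK caller would only pass the reset type and would not need to read history itself.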
Hey @mfateev, would you mind if I work on this issue? I have been using temporal.io for quite a long time and want to contribute something :) Thanks.
Hello,
When a workflow has parallel branches, it can be difficult to find the "reset point" event corresponding to the branch one wishes to restart. In the Java SDK, we haven't found any correlation between the "workflow task started" event and the activity events that follow.
If the logic of determining the reset point were moved to the service, ideally one could just invoke an API to e.g. "reset a workflow to the point preceding the first failure".
Thanks
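As a rough illustration of the "point preceding the first failure" idea, the sketch below scans a simplified history and returns the last completed workflow task seen before the first failed activity. The `Event` type and `ResetPointBeforeFirstFailure` are hypothetical names, and the sketch makes no attempt to tell parallel branches apart; it only shows the kind of lookup the service could own.

```go
package main

import "fmt"

// EventType is a simplified stand-in for the service's history event types.
type EventType int

const (
	WorkflowTaskCompleted EventType = iota
	ActivityTaskScheduled
	ActivityTaskCompleted
	ActivityTaskFailed
)

// Event is a hypothetical, minimal history event.
type Event struct {
	ID   int64
	Type EventType
}

// ResetPointBeforeFirstFailure returns the ID of the last WorkflowTaskCompleted
// event that precedes the first ActivityTaskFailed event. ok is false when the
// history has no failure, or no completed workflow task before the failure.
func ResetPointBeforeFirstFailure(history []Event) (id int64, ok bool) {
	var lastCompleted int64
	for _, ev := range history {
		switch ev.Type {
		case WorkflowTaskCompleted:
			lastCompleted = ev.ID
		case ActivityTaskFailed:
			return lastCompleted, lastCompleted != 0
		}
	}
	return 0, false
}

func main() {
	history := []Event{
		{4, WorkflowTaskCompleted},
		{5, ActivityTaskScheduled},
		{6, ActivityTaskScheduled},
		{7, ActivityTaskCompleted},
		{10, WorkflowTaskCompleted},
		{11, ActivityTaskScheduled},
		{12, ActivityTaskFailed},
		{15, WorkflowTaskCompleted},
	}
	id, ok := ResetPointBeforeFirstFailure(history)
	fmt.Println(id, ok) // prints "10 true"
}
```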
Auto‐select reset point when not provided
Problem
By default the Reset API requires clients to supply a WorkflowTaskFinishEventId, forcing callers to inspect history and pick an internal event ID themselves. This makes resets brittle and user‐unfriendly.
Solution
If the client omits WorkflowTaskFinishEventId (i.e. it’s zero), the service will:
- Read the workflow’s history branch up to (NextEventID - 1).
- Scan the returned HistoryEvents for the last WORKFLOW_TASK_COMPLETED event.
- Set WorkflowTaskFinishEventId to that event’s ID, so the reset will roll back to just before it completed.
- Proceed with the existing validation and reset logic.
Key Code Snippet
// in service/history/api/resetworkflow/api.go → Invoke(...)
baseMutableState := baseLease.GetMutableState()
// 1) Auto‐select finish ID if caller omitted it
if req.GetWorkflowTaskFinishEventId() == 0 {
	// read history up to last event
	resp, err := shardCtx.GetExecutionManager().ReadHistoryBranch(ctx, &persistence.ReadHistoryBranchRequest{
		BranchToken: baseMutableState.GetExecutionInfo().GetCurrentBranchToken(),
		MinEventID:  common.FirstEventID,
		MaxEventID:  baseMutableState.GetNextEventID() - 1,
		PageSize:    defaultPageSize,
	})
	if err != nil {
		return nil, serviceerror.NewInternal("fetching history for reset: " + err.Error())
	}
	// 2) find last WorkflowTaskCompleted
	var lastComplete int64
	for _, ev := range resp.HistoryEvents {
		if ev.GetEventType() == enumspb.EVENT_TYPE_WORKFLOW_TASK_COMPLETED {
			lastComplete = ev.GetEventId()
		}
	}
	if lastComplete <= common.FirstEventID {
		return nil, serviceerror.NewInvalidArgument("no completed workflow task found to reset to")
	}
	// 3) use that as the finish‐event ID
	req.WorkflowTaskFinishEventId = lastComplete
}
// 4) existing validation now passes, then core reset logic runs...
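One thing worth checking: the snippet issues a single ReadHistoryBranch call, so for a history longer than one page the scan would only see the first page. Below is a minimal, self-contained sketch of the same scan following page tokens. It assumes the read response carries a next-page token; `Page`, `PageReader`, `LastCompletedWorkflowTask`, and `fakeReader` are hypothetical stand-ins for illustration, not the persistence API.

```go
package main

import (
	"errors"
	"fmt"
)

// EventType is a simplified stand-in for the service's history event types.
type EventType int

const (
	WorkflowTaskStarted EventType = iota
	WorkflowTaskCompleted
)

// Event is a hypothetical, minimal history event.
type Event struct {
	ID   int64
	Type EventType
}

// Page is a hypothetical stand-in for one page of a history read.
type Page struct {
	Events        []Event
	NextPageToken []byte
}

// PageReader fetches one page; token is nil for the first page.
type PageReader func(token []byte) (Page, error)

// LastCompletedWorkflowTask walks every page and returns the ID of the last
// WorkflowTaskCompleted event, or an error if none exists.
func LastCompletedWorkflowTask(read PageReader) (int64, error) {
	var last int64
	var token []byte
	for {
		page, err := read(token)
		if err != nil {
			return 0, fmt.Errorf("fetching history for reset: %w", err)
		}
		for _, ev := range page.Events {
			if ev.Type == WorkflowTaskCompleted {
				last = ev.ID
			}
		}
		if len(page.NextPageToken) == 0 {
			break // no more pages
		}
		token = page.NextPageToken
	}
	if last == 0 {
		return 0, errors.New("no completed workflow task found to reset to")
	}
	return last, nil
}

// fakeReader serves in-memory pages; the token is the next page index.
// It expects at least one page.
func fakeReader(pages [][]Event) PageReader {
	return func(token []byte) (Page, error) {
		i := 0
		if len(token) > 0 {
			i = int(token[0])
		}
		p := Page{Events: pages[i]}
		if i+1 < len(pages) {
			p.NextPageToken = []byte{byte(i + 1)}
		}
		return p, nil
	}
}

func main() {
	read := fakeReader([][]Event{
		{{3, WorkflowTaskStarted}, {4, WorkflowTaskCompleted}},
		{{7, WorkflowTaskStarted}, {8, WorkflowTaskCompleted}},
	})
	id, err := LastCompletedWorkflowTask(read)
	fmt.Println(id, err) // prints "8 <nil>"
}
```

A single-page scan over the same data would return 4 instead of 8, which is the case the loop is meant to cover.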