cli icon indicating copy to clipboard operation
cli copied to clipboard

Combine `workflow reset` and `workflow reset-batch`?

Open josh-berry opened this issue 1 year ago • 5 comments

We should explore whether we want to combine the workflow reset and workflow reset-batch commands. AFAIK, they do the same thing, and the only difference is whether they operate on a single workflow or multiple workflows.

There are other workflow commands, like workflow terminate and workflow cancel, that also support operating on multiple workflows at once. So having reset-batch be separate from reset seems inconsistent.

josh-berry avatar Mar 15 '24 18:03 josh-berry

reset-batch uses client side batches. We want to deprecate it and use the server's batch reset capabilities. When we do that it'll be a completely different API and all of these options will be gone:

   --exclude-file value                    Input file that specifies Workflow Executions to exclude from a reset.
   --input-file value                      Input file that specifies Workflow Executions to reset. Each line contains one Workflow Id as the base Run and, optionally, a Run Id.
   --input-parallelism value               Number of goroutines to run in parallel. Each goroutine processes one line per second. (default: 1)
   --input-separator                         Separator for the input file. The default is a tab (  ). (default: "\t")

We should not port this command as is in the rewrite.

bergundy avatar Mar 18 '24 17:03 bergundy

The question is whether this will be enough of a functionality loss moving from client side to server side (I haven't compared the server capabilities compared to the CLI behavior today). But if it's not considered too much of a functionality loss, I definitely agree, client-side batch is bad.

cretz avatar Mar 18 '24 18:03 cretz

We decided a while ago that dropping client-side batch is fine. We can add more features to server-side batch as requested (and tctl still exists in the meantime).

dnr avatar Mar 18 '24 18:03 dnr

Sounds good—though I thought reset-batch also supports more --query functionality than batch does today, does it not? Do we have something to track that missing functionality already?

josh-berry avatar Mar 21 '24 23:03 josh-berry

D'oh, so I'm realizing we actually handled batch resets in #473. So I think this can be closed so long as we understand (and have other issues for) whatever functionality is still missing server-side and CLI-side.

josh-berry avatar Mar 22 '24 19:03 josh-berry

Here are the features I am seeing that were in manual client-side batch reset in tctl/0.11.0 that are not in server-side batch reset in 0.12.0:

  • --input-file/--exclude-file - Ability to provide/exclude specific workflow/run IDs. I think this can be done on the query.
  • --input-parallelism - Ability to set client-side concurrency. Server-side batch handles its own parallelism.
  • --skip-current-open - Ability to skip if there are any open runs for the workflow ID. This can be done on the query.
  • --skip-base-is-not-current - Ability to only reset if the run ID is the current for the workflow ID. I don't see how to limit query to only latest run ID for a workflow ID. Is this what https://github.com/temporalio/api/blob/8c57e7b694efd2fee3180c9422d40d421bf14df4/temporal/api/common/v1/message.proto#L179 is? We don't expose it in CLI today. Regardless, this never could be 100% accurate client side because a workflow could roll over mid-batch.
  • --non-deterministic - Ability to only reset ones with non-deterministic as the last task failure. This is not supported on server-side batch reset.
  • --type LastContinuedAsNew - Ability to reset to the continue as new task on the previous run ID (the implementation is shoddy though). This is not supported on server-side batch reset.
  • --dry-run - Ability to do everything but the actual reset. This is not supported on server-side batch reset.

Will confer with team on whether these are needed.

cretz avatar May 30 '24 15:05 cretz

Ability to only reset if the run ID is the current for the workflow ID. I don't see how to limit query to only latest run ID of current execution chain. Is this what https://github.com/temporalio/api/blob/8c57e7b694efd2fee3180c9422d40d421bf14df4/temporal/api/common/v1/message.proto#L179 is? We don't expose it in CLI today. Regardless, this never could be 100% accurate client side because a workflow could roll over mid-batch.

what do you mean by current execution chain?

paulnpdev avatar May 30 '24 15:05 paulnpdev

what do you mean by current execution chain?

I shouldn't have said "of current execution chain", I should have said "for a workflow ID". I have updated the text.

cretz avatar May 30 '24 15:05 cretz

Got it. I see no reason we should not be able to apply the reset to the current run of each workflow key provided by the query, input, or file with optional limitation to only open or closed workflows

paulnpdev avatar May 30 '24 15:05 paulnpdev

Agreed, but this is about documenting what does exist vs what could exist. For anything that doesn't exist but we want it to, I support adding to the server. The limitation to only open or only closed can already be done via query filtering IIUC.

cretz avatar May 30 '24 15:05 cretz

Closing issue. The commands have already been combined and the differences are documented at https://github.com/temporalio/cli/issues/499#issuecomment-2139910124.

cretz avatar May 30 '24 22:05 cretz