`pulumi cancel` on workflow cancelation

Open · blampe opened this issue 1 year ago • 1 comment

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Sometimes a user might cancel workflow steps because they no longer need to run them, or because they know they will fail for some reason. If one of these steps happens to be in the middle of a Pulumi update, the stack is left in a locked state until someone manually restores it by running `pulumi cancel`.

This is inconvenient, and we should be able to do this automatically because the workflow cancelation sends us a SIGINT. We have roughly 10 seconds in total (7500 ms after SIGINT, then 2500 ms after SIGTERM) to catch that and call `pulumi cancel`. From the docs:

For steps that need to be canceled, the runner machine sends SIGINT/Ctrl-C to the step's entry process (node for javascript action, docker for container action, and bash/cmd/pwd when using run in a step). If the process doesn't exit within 7500 ms, the runner will send SIGTERM/Ctrl-Break to the process, then wait for 2500 ms for the process to exit. If the process is still running, the runner kills the process tree.
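
To make the idea concrete, here is a minimal sketch of what catching the signal could look like in a Node-based action. It is not the action's actual code: the helper names (`buildCancelArgs`, `runCancel`, `installCancelHandler`) and the optional `stack` argument are hypothetical, and it assumes the `pulumi` binary is on `PATH`. The 7500 ms budget comes from the runner docs quoted above.

```typescript
import { spawn } from "node:child_process";

// Time between SIGINT and SIGTERM, per the runner docs quoted above.
const SIGINT_GRACE_MS = 7500;

// Hypothetical helper: builds the argument list for `pulumi cancel`.
// `--yes` skips the interactive confirmation prompt.
function buildCancelArgs(stack?: string): string[] {
  const args = ["cancel", "--yes"];
  if (stack) {
    args.push("--stack", stack);
  }
  return args;
}

// Hypothetical helper: runs `pulumi cancel` and resolves with its exit code,
// or with null if it could not be started or did not finish in time.
function runCancel(
  stack?: string,
  timeoutMs: number = SIGINT_GRACE_MS,
  bin: string = "pulumi",
): Promise<number | null> {
  return new Promise((resolve) => {
    const child = spawn(bin, buildCancelArgs(stack), { stdio: "inherit" });
    const timer = setTimeout(() => {
      child.kill();
      resolve(null);
    }, timeoutMs);
    child.on("error", () => {
      clearTimeout(timer);
      resolve(null);
    });
    child.on("exit", (code) => {
      clearTimeout(timer);
      resolve(code);
    });
  });
}

// Hypothetical helper: on the first SIGINT, try to cancel the in-flight
// update before exiting. 130 is the conventional exit code for SIGINT.
function installCancelHandler(stack?: string): void {
  process.once("SIGINT", async () => {
    const code = await runCancel(stack);
    process.exit(code === 0 ? 130 : 1);
  });
}
```

The action would call `installCancelHandler(stackName)` once, before starting the update. Whether the in-flight `pulumi` child process also reacts to the signal on its own is not covered here and would need to be checked.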

Affected area/feature

blampe avatar May 02 '23 22:05 blampe

An even bigger problem is that in many cases cancellation ends up with Pulumi dropping resource creation in the middle. The resource is then created by the provider, but not added to state. In most cases this results in zombie resources that can't be removed, yet block parent resources from being cleaned up and block any attempt to restart the upsert. Refresh might help, but only if the name is fixed, not randomized. If we could improve that, it would be a massive upgrade. I understand that the pulumi process must exit, but Pulumi should record such attempts internally and confirm whether the resource was ultimately created or not (e.g. during refresh).

mikocot avatar Mar 05 '24 09:03 mikocot