unable to delete job
Hi,
I'm attempting to delete a job, but it hangs indefinitely. Upon inspecting AWS State Machines, I noticed that the job machine has a status of "deleting"...
Upon Googling, I came across an AWS workaround described in https://repost.aws/knowledge-center/step-functions-stuck-deleting-status. After a few clicks to stop the running executions, everything worked as expected.
Should Copilot automatically stop those running executions itself?
Thanks
hi @sebastianovide! I agree that this is something that can be looked into! I assume it was the CloudFormation stack deletion that was hanging, is that correct?
One option is, as you said, for Copilot to stop all executions. This assumes that running copilot job delete implies the intent to stop any running executions. It would also be a best-effort attempt: any executions triggered after Copilot's attempt would not be handled.
Another option that I'm thinking of right now is something of a compromise on what you suggested: Copilot can detect the problem and help you diagnose it. If Copilot detects that deleting the AWS::StepFunctions::StateMachine resource is taking a very long time, it prints to the terminal a script that stops all executions. This option lets users decide whether to stop existing executions, rather than having Copilot make assumptions.
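For illustration, here is a rough sketch of what such a "stop all executions" script could look like. It is not something Copilot ships today. The function name stop_running_executions and the default cause text are made up for this example, and the sfn argument is assumed to be a boto3 Step Functions client (boto3.client("stepfunctions")), whose list_executions and stop_execution calls are used.

```python
def stop_running_executions(sfn, state_machine_arn,
                            cause="Stopped to unblock state machine deletion"):
    """Stop every RUNNING execution of one state machine.

    `sfn` is assumed to be a boto3 Step Functions client.
    Returns the ARNs of the executions that were asked to stop.
    Best effort only: executions started after a page has been
    listed are not seen by this call.
    """
    stopped = []
    kwargs = {"stateMachineArn": state_machine_arn, "statusFilter": "RUNNING"}
    while True:
        page = sfn.list_executions(**kwargs)
        for execution in page["executions"]:
            arn = execution["executionArn"]
            sfn.stop_execution(executionArn=arn, cause=cause)
            stopped.append(arn)
        # Follow the pagination token until the listing is exhausted.
        token = page.get("nextToken")
        if not token:
            return stopped
        kwargs["nextToken"] = token


# Hypothetical usage (needs AWS credentials and the real state machine ARN):
#   import boto3
#   stop_running_executions(boto3.client("stepfunctions"), "<state machine ARN>")
```

Since new executions can still start after the call returns, it may need to be run again if the deletion stays stuck.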
Yeah, this option would definitely provide more transparency and ensure that we are not stopping tasks unintentionally, perhaps with the usual 'force' flag too.