runner icon indicating copy to clipboard operation
runner copied to clipboard

Using `force-cancel` returns a 500 error on a hanging workflow

Open abelbraaksma opened this issue 1 year ago • 2 comments

Describe the bug Calling the force-cancel endpoint using the following syntax, returns a 500 error (related to the recently implemented #1846).

To Reproduce

Have a workflow run that's hanging (if you need to repro this, there are currently two hanging runs in my org, see getaddify and ping me if you want me to try some things).

gh api --method POST repos/GetAddify/colonizer/actions/runs/9420995077/force-cancel

Response:

{
  "message": "Failed to cancel workflow run",
  "documentation_url": "https://docs.github.com/rest/actions/workflow-runs#force-cancel-a-workflow-run",
  "status": "500"
}

Expected behavior

Run is canceled. However, this does not happen.

Runner Version and Platform

Version of your runner? this in GitHub itself

OS of the machine running the runner? online in github.com

What's not working?

See response JSON above. It won't cancel the hanging run (even after 5 days).

Job Log Output

None. The runs appear as running, but didn't even start. It looks like this in the overview:

image

Runner and Worker's Diagnostic Logs

N/A

abelbraaksma avatar Jun 11 '24 16:06 abelbraaksma

any news on how to fix this?

mishelhiti avatar Jul 09 '24 08:07 mishelhiti

When this occurs with a job that has concurrency defined, the inability to cancel the previous hung job prevents new jobs from running. The workaround we've identified so far is to remove the concurrency option from the job, which is not particularly desirable for this job.

EDIT: there appear to be other reports of this issue, e.g. https://github.com/orgs/community/discussions/114588

Cardds avatar Sep 19 '24 13:09 Cardds