runner Workflow run is stuck for over 2 hours on GitHub-hosted macOS runners (macOS 10.15, 11 and 12)

Describe the bug

I have a workflow that regularly gets stuck waiting for macOS VMs. See for instance https://github.com/pombredanne/scancode-toolkit/actions/runs/2877397119 where the workflow has been stuck for over 2 hours waiting for macOS runners. This has happened to me a couple of times over the last week.

To Reproduce

See https://github.com/pombredanne/scancode-toolkit/actions/runs/2877397119. I cannot find any pattern.

Expected behavior

The macOS VMs should become available without waiting hours.

Runner Version and Platform

That's the MSFT/GitHub hosted runner

OS of the machine running the runner? OSX/Windows/Linux/...

macOS

What's not working?

The log of each stuck job has something like this:

Test POSIX PyPI wheels (macos-12, 3.9)
Started 2h 22m 15s ago
The agent pool assigned to this job has hit their MacOs concurrency limits
Requested labels: macos-12
Job defined at: pombredanne/scancode-toolkit/.github/workflows/scancode-release.yml@refs/tags/v31.0.0rc51
Waiting for a runner to pick up this job...

Screenshot from 2022-08-17 22-38-13

Job Log Output

There is no log yet.

Aug 17 '22 20:08 pombredanne

Some recent jobs were subject to the same issue:

https://github.com/pombredanne/scancode-toolkit/actions/runs/2853533833 : 5 hours!

With these two I lost patience and eventually killed some of them:

https://github.com/pombredanne/scancode-toolkit/actions/runs/2863073652 : 2+ hours
https://github.com/pombredanne/scancode-toolkit/actions/runs/2862796857 : ~3 hours

Aug 17 '22 20:08 pombredanne

I kicked off another job for kicks and the macOS runners are stuck too:

Test POSIX PyPI wheels (macos-12, 3.9)
Started 1m 17s ago
The agent pool assigned to this job has hit their MacOs concurrency limits
Requested labels: macos-12
Job defined at: pombredanne/scancode-toolkit/.github/workflows/scancode-release.yml@refs/tags/v31.0.0rc51
Waiting for a runner to pick up this job...

This is not something that would be under my control AFAIK.

Aug 17 '22 20:08 pombredanne

Your waiting jobs are waiting for macOS concurrency on your account: all 5 of your free macOS hosted concurrency slots are currently used by https://github.com/pombredanne/PyOxidizer/actions/runs/2876456583.

Aug 17 '22 20:08 TingluoHuang

> Your waiting jobs are waiting for macOS concurrency on your account: all 5 of your free macOS hosted concurrency slots are currently used by https://github.com/pombredanne/PyOxidizer/actions/runs/2876456583.

Thanks! Good catch... but duh... that's a fork! I never asked for nor wanted any workflow to run on these forks. I wonder if I can disable workflows globally unless I choose to run some. Otherwise this wastes tons of resources.

Aug 17 '22 21:08 pombredanne

There seems to be no way to disable Actions globally and then enable them selectively on certain repos only... And since I have a fairly large number of forks, I cannot humanly control what is happening. I cannot even know which forked repos are running workflows behind my back. Or could I?

Aug 17 '22 21:08 pombredanne

@TingluoHuang How did you figure out there was some workflow stealing my quotas somewhere? How can I find that out?
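One way to approximate this without internal telemetry might be to ask the GitHub REST API, for every repository the account owns, how many workflow runs are currently queued or in progress. The sketch below is only that: it assumes a personal access token that can read the repositories, uses the `GET /user/repos` and `GET /repos/{owner}/{repo}/actions/runs` endpoints, and the function names are hypothetical. It does not tell macOS jobs apart from other jobs, omits error handling, and makes two requests per repository, so API rate limits apply.

```python
import json
import urllib.request

API = "https://api.github.com"


def github_get(path, token):
    """GET a GitHub REST API path and return the decoded JSON body."""
    request = urllib.request.Request(
        API + path,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_owned_repos(token, get=github_get):
    """Return every repository owned by the authenticated user, page by page."""
    repos, page = [], 1
    while True:
        batch = get(f"/user/repos?affiliation=owner&per_page=100&page={page}", token)
        if not batch:  # an empty page means there is nothing left to fetch
            return repos
        repos.extend(batch)
        page += 1


def repos_with_active_runs(token, get=github_get):
    """Map repository full names to their count of queued or in-progress runs."""
    active = {}
    for repo in list_owned_repos(token, get):
        count = 0
        for status in ("queued", "in_progress"):
            # per_page=1 keeps the response small; only total_count is needed.
            runs = get(
                f"/repos/{repo['full_name']}/actions/runs?status={status}&per_page=1",
                token,
            )
            count += runs["total_count"]
        if count:
            active[repo["full_name"]] = count
    return active
```

Calling `repos_with_active_runs(token)` with a real token would return a dictionary keyed by repository name, so a fork that is holding runner slots should show up with a non-zero count. The `get` parameter exists only so the HTTP layer can be swapped out, for example for a stub when trying the logic offline.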

Aug 17 '22 21:08 pombredanne

@pombredanne I am able to check it via internal telemetry. It's bad that you can't self-serve this kind of problem. 🙇

Aug 17 '22 21:08 TingluoHuang

@TingluoHuang re:

> It's bad that you can't self-serve this kind of problem. 🙇

Yep, it would be awesome if I could... Alternatively, or in addition, disabling Actions on forks by default and making them opt-in would make the issue much simpler.

Or ... just give me access to your telemetry :smiling_imp:

Aug 17 '22 21:08 pombredanne

Maybe there is something I can hack together with some API calls, at least, so that I can disable Actions on all my repos except for the few where I want them to run?
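A rough sketch of such a hack, assuming the REST endpoint `PUT /repos/{owner}/{repo}/actions/permissions` with a body of `{"enabled": false}` and a token allowed to administer the repositories (a classic personal access token with the `repo` scope, as far as I know). The function names and the allowlist are hypothetical, and the script defaults to a dry run so that nothing is changed by accident.

```python
import json
import urllib.request

API = "https://api.github.com"


def repos_to_disable(repos, keep):
    """Return the full names of all repositories that are not in `keep`."""
    return [r["full_name"] for r in repos if r["full_name"] not in keep]


def build_disable_request(full_name, token):
    """Build (without sending) the request that turns Actions off for one repo."""
    return urllib.request.Request(
        f"{API}/repos/{full_name}/actions/permissions",
        data=json.dumps({"enabled": False}).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )


def disable_actions(repos, keep, token, dry_run=True):
    """Disable Actions on every repo outside `keep`; return the affected names."""
    names = repos_to_disable(repos, keep)
    for name in names:
        if dry_run:
            # Only report what would happen; no request is sent.
            print(f"would disable Actions on {name}")
        else:
            urllib.request.urlopen(build_disable_request(name, token)).close()
    return names
```

Here `repos` is expected to be a list of repository objects such as the ones `GET /user/repos` returns, and `keep` a set of full names like `{"pombredanne/scancode-toolkit"}`. Running with `dry_run=True` first and reviewing the printed list seems the safer order before passing `dry_run=False`.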

Aug 17 '22 21:08 pombredanne

Side note: I surmise that running jobs by default on all forks must waste quite a bit of CPU and other resources globally, because it eventually saturates capacity with jobs that users never requested. It's probably millions' worth of wasted resources.

Aug 17 '22 21:08 pombredanne

You might want to report that feedback about forked repos to https://github.com/community/community/discussions/categories/actions-and-packages. The runner itself has no control over those. 😢

Aug 17 '22 22:08 TingluoHuang

I guess that this problem is more real than ever, and we even have zero output on the console, so no clue regarding what is really happening there. https://github.com/ansible/vscode-ansible/actions/runs/3731365458/jobs/6329509879

Dec 19 '22 13:12 ssbarnea

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 15 days.

Dec 25 '23 00:12 github-actions[bot]

This issue was closed because it has been stalled for 15 days with no activity.

Jan 15 '24 00:01 github-actions[bot]