Self-Hosted Runner does not pick up queued jobs
Describe the bug I have a workflow with several jobs that have to be run one by one by the runner, but the runner only picks up the first one or two jobs and leaves the rest in the queue. It goes idle while there are still multiple jobs queued; the only way to get the runner to pick them up and finish the workflow is to manually cancel and re-run the failed jobs.
This was not happening a couple of days ago; it started out of nowhere, without any significant change to the workflow or the runner.
Expected behavior The self-hosted runner should finish all the jobs in the queue.
Runner Version and Platform
Version of your runner? 2.319.1
OS of the machine running the runner? Linux (Ubuntu 24.04.1 LTS)
I am having the same issue; the self-hosted runner is not picking up queued jobs (version 2.319.1, Linux arm64). It shows up as idle on the GitHub org-level runners page. I have tried recreating the runner multiple times and still hit the issue.
Have you tried restarting the listener? Does that work? I seem to have the same issue: Actions did not pick up the job, but after restarting the listener it did.
It's not a long-term solution, but I have no idea how to solve this issue yet. Actions seems to have had this problem for several versions, up to the latest one.
@hikouki-gumo Restarting the GitHub Actions runner did not solve the problem for me. My runner was created at the organization level, but what worked was creating a repo-level runner instead. The repo-level runner was able to pick up jobs. Not sure why the org-level runner broke.
To create a repo-level GitHub Actions runner: MyRepo -> Settings -> Actions -> Runners -> New self-hosted runner.
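If you prefer doing it from the shell, re-registering an existing runner against the repo looks roughly like this. OWNER/REPO and the tokens are placeholders; the registration token comes from that "New self-hosted runner" page:
# Run inside the runner install directory; tokens below are placeholders.
./config.sh remove --token <ORG_REMOVAL_TOKEN>                          # detach the runner from the org first
./config.sh --url https://github.com/<OWNER>/<REPO> --token <REPO_REGISTRATION_TOKEN>
./run.sh                                                                # or re-install the service with sudo ./svc.sh install && sudo ./svc.sh start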
I found a resolution for my issue.
By default the self-hosted runner was being created at the org level. I had to delete it and create one from the repo settings instead.
Thank you!! After hours of troubleshooting, removing the runner from the org settings and adding it from the repo settings worked like a charm. Much appreciated.
We had a similar issue. In our case we can't have GitHub runners at the repository level; we need to provision them at the org level.
We are running self-hosted GitHub runners on Kubernetes using runner scale sets (ARC).
We had an issue where the scale set wasn't scaling up with the demand from workflow runs: there were 50 workflow runs pending for a particular scale set listener, but it was only running 10-15 of them while the other 35-40 runners were sitting idle.
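For context, our scale set is installed roughly like the sketch below. The release name, namespace, secret name and sizes are placeholders for our setup, not a recommendation; the point is that maxRunners has to be at least as large as the burst of jobs you expect:
# Rough shape of a gha-runner-scale-set install; names and sizes are placeholders.
helm upgrade --install my-runner-set \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
  --namespace arc-runners --create-namespace \
  --set githubConfigUrl="https://github.com/<your-org>" \
  --set githubConfigSecret=<pre-created-github-secret> \
  --set minRunners=0 \
  --set maxRunners=50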
When we started troubleshooting, we found the following container logs in one of the runner pods:
[RUNNER 2024-11-14 13:57:12Z INFO GitHubActionsService] AAD Correlation ID for this token request: Unknown
[RUNNER 2024-11-14 13:57:13Z ERR GitHubActionsService] GET request to https://broker.actions.githubusercontent.com/message?sessionId=5244b719-b408-4da3-993c-c03a3fb81b1
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] Catch exception during request
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] System.Exception: Failed to get job message. Request to https://broker.actions.githubusercontent.com/message failed with
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] at GitHub.Actions.RunService.WebApi.BrokerHttpClient.GetRunnerMessageAsync(Nullable`1 sessionId, String runnerVersion,
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] at GitHub.Runner.Common.BrokerServer.<>c__DisplayClass7_0.<<GetRunnerMessageAsync>b__0>d.MoveNext()
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] --- End of stack trace from previous location ---
[RUNNER 2024-11-14 13:57:13Z ERR BrokerServer] at GitHub.Runner.Common.RunnerService.RetryRequest[T](Func`1 func, CancellationToken cancellationToken, Int32 maxRetry
[RUNNER 2024-11-14 13:57:13Z WARN BrokerServer] Back off 12.599 seconds before next retry. 4 attempt left.
[RUNNER 2024-11-14 13:57:25Z ERR GitHubActionsService] GET request to https://broker.actions.githubusercontent.com/message?sessionId=5244b719-b408-4da3-993c-c03a3fb81b1
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] Catch exception during request
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] System.Exception: Failed to get job message. Request to https://broker.actions.githubusercontent.com/message failed with
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] at GitHub.Actions.RunService.WebApi.BrokerHttpClient.GetRunnerMessageAsync(Nullable`1 sessionId, String runnerVersion,
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] at GitHub.Runner.Common.BrokerServer.<>c__DisplayClass7_0.<<GetRunnerMessageAsync>b__0>d.MoveNext()
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] --- End of stack trace from previous location ---
[RUNNER 2024-11-14 13:57:25Z ERR BrokerServer] at GitHub.Runner.Common.RunnerService.RetryRequest[T](Func`1 func, CancellationToken cancellationToken, Int32 maxRetry
[RUNNER 2024-11-14 13:57:25Z WARN BrokerServer] Back off 7.012 seconds before next retry. 3 attempt left.
[RUNNER 2024-11-14 13:57:33Z ERR GitHubActionsService] GET request to https://broker.actions.githubusercontent.com/message?sessionId=5244b719-b408-4da3-993c-c03a3fb81b1
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] Catch exception during request
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] System.Exception: Failed to get job message. Request to https://broker.actions.githubusercontent.com/message failed with
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] at GitHub.Actions.RunService.WebApi.BrokerHttpClient.GetRunnerMessageAsync(Nullable`1 sessionId, String runnerVersion,
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] at GitHub.Runner.Common.BrokerServer.<>c__DisplayClass7_0.<<GetRunnerMessageAsync>b__0>d.MoveNext()
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] --- End of stack trace from previous location ---
[RUNNER 2024-11-14 13:57:33Z ERR BrokerServer] at GitHub.Runner.Common.RunnerService.RetryRequest[T](Func`1 func, CancellationToken cancellationToken, Int32 maxRetry
[RUNNER 2024-11-14 13:57:33Z WARN BrokerServer] Back off 9.356 seconds before next retry. 2 attempt left.
[RUNNER 2024-11-14 13:58:33Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[RUNNER 2024-11-14 13:59:23Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[RUNNER 2024-11-14 14:00:13Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
Many of the pods that were sitting idle while there was demand for them have the following logs:
[2024-11-14 11:43:31Z INFO GitHubActionsService] AAD Correlation ID for this token request: Unknown
[2024-11-14 11:43:31Z INFO MessageListener] Session created.
[2024-11-14 11:43:31Z INFO Terminal] WRITE LINE: Current runner version: '2.317.0'
[2024-11-14 11:43:31Z INFO Terminal] WRITE LINE: 2024-11-14 11:43:31Z: Listening for Jobs
[2024-11-14 11:43:31Z INFO JobDispatcher] Set runner/worker IPC timeout to 30 seconds.
[2024-11-14 11:43:31Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:43:31Z INFO RSAFileKeyManager] Loading RSA key parameters from file /home/bittide/Downloads/actions-runner-1/.credentials_rsaparams
[2024-11-14 11:43:31Z INFO GitHubActionsService] AAD Correlation ID for this token request: Unknown
[2024-11-14 11:44:22Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:45:12Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:46:02Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:46:53Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:47:43Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:48:33Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:49:23Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:50:13Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
[2024-11-14 11:51:04Z INFO MessageListener] BrokerMigration message received. Polling Broker for messages...
The runner scale set controller emits metrics that we scrape with our Prometheus server. This is what we observe on the Grafana dashboard:
There is a gap between the number of assigned jobs and the number of Kubernetes runner pods.
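A rough way to see that gap from the CLI, independent of Grafana; the repo, namespace and scale set name below are placeholders for our setup:
# Jobs waiting on the GitHub side (needs an authenticated gh CLI).
gh run list --repo <org>/<repo> --status queued --limit 100
# Runner pods / ephemeral runners that actually exist on the cluster side.
kubectl get pods -n arc-runners
kubectl get ephemeralrunners -n arc-runners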
We experience the same issue: with 2 jobs in a workflow, the second job always stays pending forever.
Same issue. I have to re-run and immediately cancel some ancient old runs so that the recent/actual queued jobs actually start instead of hanging while waiting for a runner to pick them up. Moving the runner from org to repo level has not fixed a thing, and it's still happening on almost every single action.
This issue started again yesterday without any changes on our side.
Randomly, after running a few jobs, the runner stays idle but nothing gets assigned to it.
We have to manually kill ./run.sh and start it again, and then jobs start.
Is there any option to see detailed logs from run.sh?
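The console output of run.sh is minimal, but the listener and worker write detailed diagnostics next to the binary; something like this shows them (paths are relative to the runner install directory):
ls _diag/                          # one Runner_*.log per listener session, one Worker_*.log per job
tail -f _diag/Runner_*.log         # session creation, broker polling, job assignment
tail -f _diag/Worker_*.log         # per-job execution details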
Same issue here... are there any updates?
Same for me. It seems to be intermittent. It queues until I reboot the server, then it only picks up a few jobs and gets stuck queuing again. Screenshots show before and after rebooting the server.
Same here; it seems to happen after running 1 or 2 jobs. Only a service restart fixes the issue (we're using repo-level runners).
My problem only occurs with the runner running inside Docker.
For me it runs one workflow (one or more jobs) and that's it; it will only pick up a new job if a job is already queued while I restart the runner (restart sketch after the logs below). I'm using the official Docker image and the compose setup described here: https://github.com/dorinclisu/devops-setup/tree/main/github_runner
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO ProcessInvokerWrapper] Finished process 35 with exit code 100, and elapsed time 00:00:10.6715035.
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobDispatcher] Worker finished for job 222fe9af-eb80-5509-9f7a-53ef6b651d8a. Code: 100
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobDispatcher] finish job request for job 222fe9af-eb80-5509-9f7a-53ef6b651d8a with result: Succeeded
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO Terminal] WRITE LINE: 2025-03-13 05:53:04Z: Job test completed with result: Succeeded
github-runner-1 | 2025-03-13 05:53:04Z: Job test completed with result: Succeeded
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobDispatcher] Stop renew job request for job 222fe9af-eb80-5509-9f7a-53ef6b651d8a.
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobDispatcher] job renew has been cancelled, stop renew job 222fe9af-eb80-5509-9f7a-53ef6b651d8a.
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobNotification] Entering JobCompleted Notification
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO JobNotification] Entering EndMonitor
github-runner-1 | [RUNNER 2025-03-13 05:53:04Z INFO MessageListener] Received job status event. JobState: Online
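With that compose setup, the restart workaround mentioned above is roughly the following; the service name github-runner is assumed from the log prefix above:
docker compose restart github-runner     # restart while a job is already queued
docker compose logs -f github-runner     # watch it reconnect and pick up the queued job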
I don't have this issue with the runner installed on the host as per instructions in the Actions console:
√ Connected to GitHub
2025-03-13 10:05:15Z: Running job: test
2025-03-13 10:06:04Z: Job test completed with result: Canceled
2025-03-13 10:09:23Z: Running job: test
2025-03-13 10:10:34Z: Job test completed with result: Succeeded
2025-03-13 10:11:32Z: Running job: build-test
2025-03-13 10:12:25Z: Job build-test completed with result: Succeeded
2025-03-13 10:12:32Z: Running job: build-push
2025-03-13 10:19:27Z: Job build-push completed with result: Succeeded
I have the same issue. I uninstalled the services, reinstalled them, registered the runner again, and also changed the labels, but no luck. Every time, I need to restart for it to pick up the job. This has been happening for the last two weeks; otherwise, it has worked well for three years. Do we have a solution? I’m not keeping track of who made changes, and I have to restart the service. Thank you.
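If it helps anyone automating the workaround, the service wrapper that ships with the runner can do the restart (run from the runner install directory; on Linux it wraps systemd):
sudo ./svc.sh status      # check what the service thinks it is doing
sudo ./svc.sh stop
sudo ./svc.sh start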
Same error for me. After 1-2 jobs it doesn't pick up anything again. Self-hosted runner on Hetzner ARM64.
Any update on this? I have to press Enter in the cmd window all the time to make it run new jobs. If I restart run.cmd, I still have the same problem after it picks up 1 to 2 jobs.
+1
Also experiencing this issue at the moment. It seems to be related to the incident from March 8th, which is seemingly resolved.
Similar issue with similar recent complaints: https://github.com/orgs/community/discussions/120813#discussioncomment-12435979
Does anyone from GH monitor these issues?
GH has apparently fixed this issue (can confirm on our side), here's their comment: https://github.com/actions/runner/issues/3609#issuecomment-2722340062
I haven't had any issues after the fix. Seems to be fixed ^^
The same issue started for us yesterday. No errors in ARC controller or listener logs. Everything appears fine except for the fact that the queued job doesn't start.
k get ephemeralrunner
NAME GITHUB CONFIG URL RUNNERID STATUS JOBREPOSITORY JOBWORKFLOWREF AGE
nonprod-hpzjx-runner-6fqkj https://github.com/enterprises/<name> 333337 Running org/repo org/repo/.github/workflows/workflow_name.yml@refs/pull/9048/merge 77s
nonprod-hpzjx-runner-mh5km https://github.com/enterprises/<name> 332555 Running org/repo org/repo/.github/workflows/Pull_Request.yml@refs/pull/15174/merge 45h
nonprod-hpzjx-runner-qrg7c https://github.com/enterprises/<name> 351555 Running org/repo org/repo/.github/workflows/workflow_name.yml@refs/pull/9070/merge 22m
nonprod-hpzjx-runner-tdkr4 https://github.com/enterprises/<name> 355555 Running org/repo org/repo/.github/workflows/workflow_name.yml@refs/pull/9064/merge 45h
Looking at the ephemeral runners above, AGE suggests some of these jobs should have been served 45 hours ago, but on the GitHub Actions side the workflow keeps loading forever / stays stuck after the jobs are picked up from the queue.
No errors in the ARC controller or listener logs, as stated in the comment above.
Using the latest runner v2.325.0; ARC runner scale set version 0.11.0.
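For completeness, this is roughly how we pull the controller/listener logs and inspect the stuck ephemeral runner; namespaces and the controller deployment name depend on how ARC was installed, so treat them as placeholders:
kubectl get pods -n arc-systems                                               # controller and listener pods
kubectl logs -n arc-systems deploy/<release>-gha-rs-controller --since=2h
kubectl logs -n arc-systems <scale-set-listener-pod> --since=2h
kubectl describe ephemeralrunner nonprod-hpzjx-runner-mh5km -n arc-runners    # the 45h-old runner from the list above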
Same issue on our end. The self-hosted runners show status Idle, but the actions are all Queued.