Action workflow stuck in "Job is about to start running on the hosted runner"

Open dgillman opened this issue 1 year ago • 12 comments

Describe the bug

I am testing changes to the environment variables in a GitHub workflow, but the workflow never starts. Two workflows are spawned in parallel. Both end up hung with the message "Job is about to start running on the hosted runner", followed by the ID of a GitHub-hosted runner. I have attempted multiple executions of the workflow, and they all hang at this spot. The workflow was executing fine 15 minutes ago. This appears to be an issue with GitHub Actions scheduling/orchestration.

dgillman avatar Sep 16 '24 21:09 dgillman

Same here.

SoftCreatR avatar Sep 16 '24 21:09 SoftCreatR

Same here, our workflows are completely stopped because of this.

aiell0 avatar Sep 16 '24 22:09 aiell0

For me, it's working again now.

SoftCreatR avatar Sep 16 '24 22:09 SoftCreatR

We've been observing this too with our self-hosted runners. The runners launch, become visible (but offline) on GitHub, and ultimately terminate. In the logs, there are messages about a 403 response (partially redacted):

[2024-09-16 21:51:37Z ERR  GitHubActionsService] GET request to https://pipelinesghubeus2.actions.githubusercontent.com/$SOME_ID/_apis/distributedtask/pools/1/messages?sessionId=a0c42d06-b074-4263-b8f5-1df8c5568e8f&status=Online&runnerVersion=2.317.0&os=Linux&architecture=ARM64&disableUpdate=true failed. HTTP Status: Forbidden

I've confirmed that there were no changes on our side.
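For anyone comparing failing runners, the query string of the failing request already carries the runner version and the update setting. A minimal sketch, assuming Python is available; `runner_params` is a hypothetical helper name, and the URL is the redacted one from the log line above:

```python
from urllib.parse import urlsplit, parse_qs


def runner_params(url: str) -> dict[str, str]:
    """Return the query parameters of a runner request URL as a flat dict."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs returns a list per key; each key appears once here, so keep the first value.
    return {key: values[0] for key, values in query.items()}


# URL copied from the log line above (still partially redacted).
url = (
    "https://pipelinesghubeus2.actions.githubusercontent.com/$SOME_ID"
    "/_apis/distributedtask/pools/1/messages"
    "?sessionId=a0c42d06-b074-4263-b8f5-1df8c5568e8f&status=Online"
    "&runnerVersion=2.317.0&os=Linux&architecture=ARM64&disableUpdate=true"
)

params = runner_params(url)
print(params["runnerVersion"], params["disableUpdate"])  # prints: 2.317.0 true
```

In this case the request reports runner version 2.317.0 with automatic updates disabled.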

caramcc avatar Sep 17 '24 00:09 caramcc

Same here. EphemeralRunners are stuck in failed status.

These are the runner pod logs before termination:

[RUNNER 2024-09-17 01:26:32Z ERR  GitHubActionsService] GET request to https://pipelinesghubeus1.actions.githubusercontent.com/$SOME_ID/_apis/distributedtask/pools/1/messages?sessionId=$SESSION_ID&status=Online&runnerVersion=2.317.0&os=Linux&architecture=X64&disableUpdate=true failed. HTTP Status: Forbidden
[RUNNER 2024-09-17 01:26:32Z INFO JobDispatcher] Shutting down JobDispatcher. Make sure all WorkerDispatcher has finished.
[RUNNER 2024-09-17 01:26:32Z INFO Runner] Deleting Runner Session...
[RUNNER 2024-09-17 01:26:32Z ERR  Terminal] WRITE ERROR: An error occured: Runner version v2.317.0 is deprecated and cannot receive messages.
An error occured: Runner version v2.317.0 is deprecated and cannot receive messages.
[RUNNER 2024-09-17 01:26:32Z ERR  Listener] GitHub.DistributedTask.WebApi.AccessDeniedException: Runner version v2.317.0 is deprecated and cannot receive messages.
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.Runner.ExecuteCommand(CommandSettings command)
[RUNNER 2024-09-17 01:26:32Z ERR  Listener]    at GitHub.Runner.Listener.Program.MainAsync(IHostContext context, String[] args)
Runner listener exit with terminated error, stop the service, no retry needed.
Exiting runner...
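If you have many runner pods, it may help to scan their logs for this exact message rather than reading each one. A rough sketch in Python; the pattern is taken from the log text above, and `deprecated_versions` is a hypothetical helper, not part of the runner software:

```python
import re

# Message text as it appears in the runner log above.
DEPRECATED = re.compile(
    r"Runner version v(\d+\.\d+\.\d+) is deprecated and cannot receive messages"
)


def deprecated_versions(log_text: str) -> set[str]:
    """Return every runner version that the log reports as deprecated."""
    return set(DEPRECATED.findall(log_text))


sample = (
    "[RUNNER 2024-09-17 01:26:32Z INFO Runner] Deleting Runner Session...\n"
    "[RUNNER 2024-09-17 01:26:32Z ERR  Terminal] WRITE ERROR: An error occured: "
    "Runner version v2.317.0 is deprecated and cannot receive messages.\n"
)

print(deprecated_versions(sample))  # prints: {'2.317.0'}
```

An empty set means the log does not contain the deprecation message, so the failure probably has a different cause.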

flex-dongjin avatar Sep 17 '24 02:09 flex-dongjin

fyi that second WRITE ERROR looks to be a red herring according to this issue comment: https://github.com/actions/runner/issues/3381#issuecomment-2329370180 (we're seeing it as well)

jakeonfire avatar Sep 17 '24 02:09 jakeonfire

is anyone having this issue with self-hosted runners using a runner version > 2.317.0?

jakeonfire avatar Sep 17 '24 02:09 jakeonfire

Just changed the runner image to 2.319.1, and now EphemeralRunners work in Running status. @jakeonfire Thank you for the information.

+ But what exactly was the problem?

flex-dongjin avatar Sep 17 '24 02:09 flex-dongjin

Updating the image from 2.317.0 to 2.319.1 fixes this for me as well. Would be great to get an update from GitHub as to why this is happening. I shouldn't be forced to update my image if I don't want to.

aiell0 avatar Sep 17 '24 15:09 aiell0

Had the same issue here. It's not great seeing the runners be unceremoniously dumped when they don't match some arbitrary version requirement. There's also nothing in the UI about it - observability of this part of the Actions pipeline is extremely poor. I don't mind old versions being deprecated as long as there is some reasonable way to view it.

ohookins avatar Sep 17 '24 23:09 ohookins

Had the same issue here.

Sep 18 15:34:29 vmss-xxx-xx-xx-eastus000001 runsvc.sh[358641]: Started running service
Sep 18 15:34:31 vmss-xxx-xx-xx-eastus000001 runsvc.sh[358641]: √ Connected to GitHub
Sep 18 15:34:31 vmss-xxx-xx-xx-eastus000001 runsvc.sh[358641]: Current runner version: '2.317.0'
Sep 18 15:34:31 vmss-xxx-xx-xx-eastus000001 runsvc.sh[358641]: 2024-09-18 15:34:31Z: Listening for Jobs
Sep 18 15:34:31 vmss-xxx-xx-xx-eastus000001 runsvc.sh[358641]: An error occured: Runner version v2.317.0 is deprecated and cannot receive messages.

pavanpatilingenovis avatar Sep 18 '24 15:09 pavanpatilingenovis

If you have disabled automatic updates on your self-hosted runners, you will be responsible for manually updating them with the latest runner software as it's released.

For compatibility with the GitHub Actions service, you will need to manually update your runner within 30 days of a new runner version being available. For instructions on how to install the latest runner version, please see the installation instructions for the latest release in the runner repo.

ref: https://github.blog/changelog/2022-02-01-github-actions-self-hosted-runners-can-now-disable-automatic-updates/
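To make the 30-day rule concrete, here is a small sketch of how one might check a pinned runner version against it. The function names and dates are illustrative assumptions, not real release dates, and the service's actual enforcement logic may differ:

```python
from datetime import date, timedelta


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.317.0' or 'v2.317.0' into (2, 317, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def update_overdue(
    installed: str,
    newer: str,
    newer_available_on: date,
    today: date,
    window_days: int = 30,
) -> bool:
    """True if `newer` supersedes `installed` and the update window has passed.

    Assumption: the window starts on the day the newer version became available.
    """
    if parse_version(installed) >= parse_version(newer):
        return False  # already on this version or a later one
    return today > newer_available_on + timedelta(days=window_days)


# Versions from this thread; the dates are placeholders, not actual release dates.
print(update_overdue("2.317.0", "2.319.1", date(2024, 1, 1), date(2024, 3, 1)))  # True
print(update_overdue("2.319.1", "2.319.1", date(2024, 1, 1), date(2024, 3, 1)))  # False
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as "2.99.0" sorting after "2.317.0".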

gclhub avatar Sep 19 '24 16:09 gclhub

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 15 days.

github-actions[bot] avatar Sep 22 '25 00:09 github-actions[bot]

system.txt

Ebc44 avatar Nov 11 '25 05:11 Ebc44
