Rootless DIND runner set does not work following official documentation
Checks
- [X] I've already read https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/troubleshooting-actions-runner-controller-errors and I'm sure my issue is not covered in the troubleshooting guide.
- [X] I am using charts that are officially provided
Controller Version
0.9.3
Deployment Method
Helm
Checks
- [X] This isn't a question or user support case (For Q&A and community support, go to Discussions).
- [X] I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
1. Go to https://docs.github.com/en/enterprise-cloud@latest/actions/hosting-your-own-runners/managing-self-hosted-runners-with-actions-runner-controller/deploying-runner-scale-sets-with-actions-runner-controller#example-running-dind-rootless
2. Use the supplied snippet with the runner set chart to create a runner set that uses rootless DIND.
3. Set `minRunners` to 1 so that one runner pod is spawned for testing.
4. The pod initializes, but the DIND container fails with an error.
Describe the bug
The runner pods don't become Ready. When analyzing the logs of the DIND container, I get errors reporting "Permissions denied".
Describe the expected behavior
I expected the runner pod to become Ready and run pipeline jobs.
Additional Context
githubConfigUrl: "https://github.com/enterprises/REDACTED"
githubConfigSecret:
  github_token: "ghp_REDACTED"
maxRunners: 8
minRunners: 2
template:
  spec:
    initContainers:
    - name: init-dind-externals
      image: ghcr.io/actions/actions-runner:latest
      command: ["cp", "-r", "-v", "/home/runner/externals/.", "/home/runner/tmpDir/"]
      volumeMounts:
        - name: dind-externals
          mountPath: /home/runner/tmpDir
      securityContext:
        runAsUser: 0
    - name: init-dind-rootless
      image: docker:dind-rootless
      command:
        - sh
        - -c
        - |
          set -x
          cp -a /etc/. /dind-etc/
          echo 'runner:x:1001:1001:runner:/home/runner:/bin/ash' >> /dind-etc/passwd
          echo 'runner:x:1001:' >> /dind-etc/group
          echo 'runner:100000:65536' >> /dind-etc/subgid
          echo 'runner:100000:65536' >> /dind-etc/subuid
          chmod 755 /dind-etc;
          chmod u=rwx,g=rx+s,o=rx /dind-home
          chown 1001:1001 /dind-home
      securityContext:
        runAsUser: 0
        privileged: true
      volumeMounts:
        - mountPath: /dind-etc
          name: dind-etc
        - mountPath: /dind-home
          name: dind-home
    containers:
    - name: runner
      image: ghcr.io/actions/actions-runner:latest
      command: ["/home/runner/run.sh"]
      securityContext:
        privileged: true
        runAsUser: 1001
        runAsGroup: 1001
      env:
        - name: DOCKER_HOST
          value: unix:///var/run/docker.sock
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-sock
          mountPath: /var/run
    - name: dind
      image: docker:dind-rootless
      args:
        - dockerd
        - --host=unix:///var/run/docker.sock
      securityContext:
        privileged: true
        runAsUser: 1001
        runAsGroup: 1001
      volumeMounts:
        - name: work
          mountPath: /home/runner/_work
        - name: dind-sock
          mountPath: /var/run
        - name: dind-externals
          mountPath: /home/runner/externals
        - name: dind-etc
          mountPath: /etc
        - name: dind-home
          mountPath: /home/runner
    volumes:
    - name: work
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "gp2"
            resources:
              requests:
                storage: 50Gi
    - name: dind-sock
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "gp2"
            resources:
              requests:
                storage: 5Gi
    - name: dind-externals
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "gp2"
            resources:
              requests:
                storage: 15Gi
    - name: dind-etc
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "gp2"
            resources:
              requests:
                storage: 5Gi
    - name: dind-home
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: "gp2"
            resources:
              requests:
                storage: 20Gi
fsGroup: 1001
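For anyone reading the init-dind-rootless step in the values above: the subuid and subgid files use `name:start:count` lines, so `runner:100000:65536` gives the `runner` user a block of 65536 subordinate IDs starting at 100000. Below is a minimal Python sketch for reading such a line and working out the range it covers. The helper and class names are made up for illustration and are not part of ARC or Docker.

```python
from typing import NamedTuple


class SubIdRange(NamedTuple):
    """One entry of a subuid/subgid file (hypothetical helper type)."""
    owner: str
    start: int
    count: int

    @property
    def last(self) -> int:
        # The range is inclusive of start and covers `count` IDs.
        return self.start + self.count - 1


def parse_subid_line(line: str) -> SubIdRange:
    """Parse a `name:start:count` line as written by the init container."""
    owner, start, count = line.strip().split(":")
    return SubIdRange(owner, int(start), int(count))


entry = parse_subid_line("runner:100000:65536")
print(f"{entry.owner}: {entry.start}-{entry.last} ({entry.count} IDs)")
```

Running this against the files copied into the `dind-etc` volume is one way to confirm the entries actually landed there before the dind container starts.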
Controller Logs
https://gist.github.com/shodanwashere/72412fbeb9cae702847bf00c5e5037db
Runner Pod Logs
Describe: https://gist.github.com/shodanwashere/bac9e93a65084b4bb6e9efc5325cff99
Logs (dind): https://gist.github.com/shodanwashere/ab60c36e8166d35ed5847ba81e45ca62
I'm also following the official example exactly, but I'm running into a different issue where the listener pod is stuck in a Terminating/Running loop with the following error message:
2024-07-08T13:21:21Z INFO listener-app app initialized
2024-07-08T13:21:21Z INFO listener-app Starting metrics server
2024-07-08T13:21:21Z INFO listener-app Starting listener
2024-07-08T13:21:21Z INFO listener-app refreshing token {"githubConfigUrl": "https://github.com/myorg"}
2024-07-08T13:21:21Z INFO listener-app getting access token for GitHub App auth {"accessTokenURL": "https://api.github.com/app/installations/52488617/access_tokens"}
2024-07-08T13:21:21Z INFO listener-app getting runner registration token {"registrationTokenURL": "https://api.github.com/orgs/myorg/actions/runners/registration-token"}
2024-07-08T13:21:21Z INFO listener-app getting Actions tenant URL and JWT {"registrationURL": "https://api.github.com/actions/runner-registration"}
2024-07-08T13:21:22Z INFO listener-app.listener Current runner scale set statistics. {"statistics": "{\"totalAvailableJobs\":0,\"totalAcquiredJobs\":0,\"totalAssignedJobs\":0,\"totalRunningJobs\":0,\"totalRegisteredRunners\":5,\"totalBusyRunners\":0,\"totalIdleRunners\":5}"}
2024-07-08T13:21:22Z INFO listener-app.worker.kubernetesworker Calculated target runner count {"assigned job": 0, "decision": 5, "min": 5, "max": 10, "currentRunnerCount": 5, "jobsCompleted": 0}
2024-07-08T13:21:22Z INFO listener-app.worker.kubernetesworker Compare {"original": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":-1,\"patchID\":-1,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}", "patch": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":5,\"patchID\":0,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}"}
2024-07-08T13:21:22Z INFO listener-app.worker.kubernetesworker Preparing EphemeralRunnerSet update {"json": "{\"spec\":{\"patchID\":0,\"replicas\":5}}"}
2024-07-08T13:21:22Z INFO listener-app.worker.kubernetesworker Ephemeral runner set scaled. {"namespace": "arc-runners", "name": "arc-runner-set-test-vlbt4", "replicas": 5}
2024-07-08T13:21:22Z INFO listener-app.listener Getting next message {"lastMessageID": 0}
2024-07-08T13:21:24Z ERROR listener-app Retryable client error {"error": "Get \"https://pipelinesghubeus8.actions.githubusercontent.com/ID1/_apis/runtime/runnerscalesets/10/messages?sessionId=ID2&api-version=6.0-preview\": context canceled", "method": "GET", "url": "https://pipelinesghubeus8.actions.githubusercontent.com/ID1/_apis/runtime/runnerscalesets/10/messages?sessionId=ID2&api-version=6.0-preview", "error": "request failed"}
github.com/actions/actions-runner-controller/github/actions.(*clientLogger).Error
github.com/actions/actions-runner-controller/github/actions/client.go:76
github.com/hashicorp/go-retryablehttp.(*Client).Do
github.com/hashicorp/[email protected]/client.go:718
github.com/hashicorp/go-retryablehttp.(*RoundTripper).RoundTrip
github.com/hashicorp/[email protected]/roundtripper.go:47
net/http.send
net/http/client.go:259
net/http.(*Client).send
net/http/client.go:180
net/http.(*Client).do
net/http/client.go:724
net/http.(*Client).Do
net/http/client.go:590
github.com/actions/actions-runner-controller/github/actions.(*Client).Do
github.com/actions/actions-runner-controller/github/actions/client.go:273
github.com/actions/actions-runner-controller/github/actions.(*Client).GetMessage
github.com/actions/actions-runner-controller/github/actions/client.go:577
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).getMessage
github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:272
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).Listen
github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:163
github.com/actions/actions-runner-controller/cmd/ghalistener/app.(*App).Run.func1
github.com/actions/actions-runner-controller/cmd/ghalistener/app/app.go:124
golang.org/x/sync/errgroup.(*Group).Go.func1
golang.org/x/[email protected]/errgroup/errgroup.go:78
2024-07-08T13:21:24Z INFO listener-app.listener Deleting message session
2024/07/08 13:21:24 Application returned an error: http: Server closed
Opening the link in the error shows the following:
{
"$id": "1",
"innerException": null,
"message": "The user 'System:PublicAccess;aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa' is not authorized to access this resource.",
"typeName": "Microsoft.TeamFoundation.Framework.Server.UnauthorizedRequestException, Microsoft.TeamFoundation.Framework.Server",
"typeKey": "UnauthorizedRequestException",
"errorCode": 0,
"eventId": 3000
}
Everything is also at 0.9.3 and the GitHub application has "Metadata:Read-only" and "Self-hosted runners:Read and write" permissions. Kubernetes is at 1.30.
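When comparing listener logs like the ones above across clusters, note that the "Current runner scale set statistics" line carries its numbers as a JSON string nested inside the JSON payload, so it has to be decoded twice. This is a rough Python sketch, assuming the log line is copied as-is; `parse_statistics` is a made-up helper name, not something ARC ships.

```python
import json


def parse_statistics(log_line: str) -> dict:
    """Decode the nested statistics object from a listener log line."""
    # Everything from the first "{" onward is the structured payload.
    payload = json.loads(log_line[log_line.index("{"):])
    # The "statistics" value is itself a JSON document encoded as a string.
    return json.loads(payload["statistics"])


line = (
    '2024-07-08T13:21:22Z INFO listener-app.listener '
    'Current runner scale set statistics. '
    r'{"statistics": "{\"totalAvailableJobs\":0,\"totalAcquiredJobs\":0,'
    r'\"totalAssignedJobs\":0,\"totalRunningJobs\":0,'
    r'\"totalRegisteredRunners\":5,\"totalBusyRunners\":0,\"totalIdleRunners\":5}"}'
)
stats = parse_statistics(line)
print(stats["totalRegisteredRunners"], stats["totalIdleRunners"])
```

In the log above this shows five registered, idle runners just before the listener shuts down.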
I have the same issue on EKS 1.27 with containerMode.type: "dind".
2024-07-09T07:38:10.595179313Z stdout F [RUNNER 2024-07-09 07:38:10Z ERR JobDispatcher] Catch exception during renew runner jobrequest 79794.
2024-07-09T07:38:11.246446746Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] System.TimeoutException: The HTTP request timed out after 00:01:00.
2024-07-09T07:38:11.246461895Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
2024-07-09T07:38:11.246467637Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
2024-07-09T07:38:11.246487934Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.246493242Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
2024-07-09T07:38:11.246497368Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.AuthenticationHelper.SendWithAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean preAuthenticate, Boolean isProxyAuth, Boolean doRequestAuth, HttpConnectionPool pool, CancellationToken cancellationToken)
2024-07-09T07:38:11.246500393Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.246503597Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.246506861Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] --- End of inner exception stack trace ---
2024-07-09T07:38:11.246509682Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.246512302Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Services.Common.VssHttpRetryMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.246515406Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
2024-07-09T07:38:11.246518675Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
2024-07-09T07:38:11.246521713Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.DistributedTask.WebApi.TaskAgentHttpClient.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken, Func`3 processResponse)
2024-07-09T07:38:11.246530257Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.DistributedTask.WebApi.TaskAgentHttpClient.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken, Func`3 processResponse)
2024-07-09T07:38:11.246533658Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Runner.Listener.JobDispatcher.RenewJobRequestAsync(IRunnerServer runnerServer, Int32 poolId, Int64 requestId, Guid lockToken, String orchestrationId, TaskCompletionSource`1 firstJobRequestRenewed, CancellationToken token)
2024-07-09T07:38:11.246537357Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] #####################################################
2024-07-09T07:38:11.246539905Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] System.Threading.Tasks.TaskCanceledException: A task was canceled.
2024-07-09T07:38:11.246542555Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
2024-07-09T07:38:11.246545359Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.246556749Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
2024-07-09T07:38:11.246559539Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.AuthenticationHelper.SendWithAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean preAuthenticate, Boolean isProxyAuth, Boolean doRequestAuth, HttpConnectionPool pool, CancellationToken cancellationToken)
2024-07-09T07:38:11.246562315Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.246565222Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR JobDispatcher] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.246568123Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO JobDispatcher] Retrying lock renewal for jobrequest 79794. Job is valid until 07/09/2024 07:45:56.
2024-07-09T07:38:11.263080643Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] Refresh JobRequest VssConnection to get on a different AFD node.
2024-07-09T07:38:11.263095261Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] EstablishVssConnection
2024-07-09T07:38:11.26309838Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] Establish connection with 30 seconds timeout.
2024-07-09T07:38:11.338495992Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO GitHubActionsService] Starting operation Location.GetConnectionData
==> test-qrg55-runner-xpbcx_gha-runner_runner-d190265e0bbcee2b286569cd99773b3fc37d5db6cd5949101cd5e0e69fa0e76c.log <==
2024-07-09T07:38:11.271259732Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] System.TimeoutException: The HTTP request timed out after 00:01:00.
2024-07-09T07:38:11.299984221Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
2024-07-09T07:38:11.299993176Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
2024-07-09T07:38:11.299995401Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.Net.Sockets.SocketException (125): Operation canceled
2024-07-09T07:38:11.299997642Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.299999709Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
2024-07-09T07:38:11.300002274Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
2024-07-09T07:38:11.300004372Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
2024-07-09T07:38:11.300006737Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
2024-07-09T07:38:11.300009195Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300011656Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.300014109Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300016727Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.AuthenticationHelper.SendWithNtAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean isProxyAuth, HttpConnection connection, HttpConnectionPool connectionPool, CancellationToken cancellationToken)
2024-07-09T07:38:11.300019463Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
2024-07-09T07:38:11.300022352Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.AuthenticationHelper.SendWithAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean preAuthenticate, Boolean isProxyAuth, Boolean doRequestAuth, HttpConnectionPool pool, CancellationToken cancellationToken)
2024-07-09T07:38:11.300044553Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300047823Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.300050151Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.300052354Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.300054895Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.Common.VssHttpRetryMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.300057337Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
2024-07-09T07:38:11.300059662Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
2024-07-09T07:38:11.300061921Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
2024-07-09T07:38:11.30006677Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
2024-07-09T07:38:11.300069372Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
2024-07-09T07:38:11.300071974Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] #####################################################
2024-07-09T07:38:11.300074526Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] System.Threading.Tasks.TaskCanceledException: The operation was canceled.
2024-07-09T07:38:11.300077009Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
2024-07-09T07:38:11.300079781Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.Net.Sockets.SocketException (125): Operation canceled
2024-07-09T07:38:11.300082378Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.300084805Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
2024-07-09T07:38:11.300087555Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
2024-07-09T07:38:11.300089964Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
2024-07-09T07:38:11.300094473Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
2024-07-09T07:38:11.300096911Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300099741Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.300102172Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300104583Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.AuthenticationHelper.SendWithNtAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean isProxyAuth, HttpConnection connection, HttpConnectionPool connectionPool, CancellationToken cancellationToken)
2024-07-09T07:38:11.300106415Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
2024-07-09T07:38:11.30010809Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.AuthenticationHelper.SendWithAuthAsync(HttpRequestMessage request, Uri authUri, Boolean async, ICredentials credentials, Boolean preAuthenticate, Boolean isProxyAuth, Boolean doRequestAuth, HttpConnectionPool pool, CancellationToken cancellationToken)
2024-07-09T07:38:11.300109838Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.DecompressionHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.30011151Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at GitHub.Services.Common.VssHttpMessageHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
2024-07-09T07:38:11.300113531Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] #####################################################
2024-07-09T07:38:11.300115232Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
2024-07-09T07:38:11.300116915Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] ---> System.Net.Sockets.SocketException (125): Operation canceled
2024-07-09T07:38:11.300118599Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] --- End of inner exception stack trace ---
2024-07-09T07:38:11.300120643Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
2024-07-09T07:38:11.300134146Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
2024-07-09T07:38:11.300137117Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](TIOAdapter adapter)
2024-07-09T07:38:11.300139457Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](TIOAdapter adapter, Memory`1 buffer)
2024-07-09T07:38:11.300141325Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
2024-07-09T07:38:11.300145483Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] #####################################################
2024-07-09T07:38:11.30014886Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] System.Net.Sockets.SocketException (125): Operation canceled
2024-07-09T07:38:11.300150535Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO MessageListener] Retriable exception: The HTTP request timed out after 00:01:00.
2024-07-09T07:38:11.311101566Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR Terminal] WRITE ERROR: 2024-07-09 07:38:11Z: Runner connect error: The HTTP request timed out after 00:01:00.. Retrying until reconnected.
2024-07-09T07:38:11.331699233Z stderr F 2024-07-09 07:38:11Z: Runner connect error: The HTTP request timed out after 00:01:00.. Retrying until reconnected.
2024-07-09T07:38:11.334061151Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] Refresh MessageQueue VssConnection to get on a different AFD node.
2024-07-09T07:38:11.334073685Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] EstablishVssConnection
2024-07-09T07:38:11.334077469Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] Establish connection with 60 seconds timeout.
2024-07-09T07:38:11.338946708Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO GitHubActionsService] Starting operation Location.GetConnectionData
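Dumps like the one above are easier to compare once they are reduced to error counts per runner component. Each line is in the container runtime's log-file layout (timestamp, stream, an `F`/`P` tag, then the message), and the message starts with the runner's own `[RUNNER <time> <level> <component>]` prefix. Here is a small Python sketch under that assumption; the regex and the `count_errors` helper are illustrative only.

```python
import re
from collections import Counter

# <timestamp> <stream> <F|P> [RUNNER <date> <time>Z <level> <component>] <message>
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<stream>stdout|stderr) (?P<tag>[FP]) "
    r"\[RUNNER (?P<time>[\d-]+ [\d:]+Z) (?P<level>\w+)\s+(?P<component>\w+)\] ?"
    r"(?P<msg>.*)$"
)


def count_errors(lines):
    """Count ERR-level lines per runner component; other lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match["level"] == "ERR":
            counts[match["component"]] += 1
    return counts


sample = [
    "2024-07-09T07:38:10.595179313Z stdout F [RUNNER 2024-07-09 07:38:10Z ERR JobDispatcher] "
    "Catch exception during renew runner jobrequest 79794.",
    "2024-07-09T07:38:11.271259732Z stdout F [RUNNER 2024-07-09 07:38:11Z ERR MessageListener] "
    "System.TimeoutException: The HTTP request timed out after 00:01:00.",
    "2024-07-09T07:38:11.263095261Z stdout F [RUNNER 2024-07-09 07:38:11Z INFO RunnerServer] "
    "EstablishVssConnection",
]
errors = count_errors(sample)
print(dict(errors))
```

Lines that do not match the pattern, such as the `==> ... <==` file header or the bare stderr line, are ignored rather than treated as errors.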
Same issue here on k8s v1.29.2, running 0.9.3, which shows:
"The user 'System:PublicAccess;aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa' is not authorized to access this resource."
This causes the runner to stop working after some time.
@nikola-jokic Any thoughts here? We recently updated to ARC 0.9.3 and have been hit with listener restarts and the above messaging in the runners as well. FWIW this started when we updated to 0.9.3 and we did not observe this behavior in 0.7.0. The error messaging appears to be intermittent and lasts ~15 minutes.
EKS Version 1.29.0
GHES Version: 3.9.15
ARC Version: 0.9.3
Runner Version: 2.317.0
Runner exception:
[RUNNER 2024-07-15 06:11:33Z ERR MessageListener] Catch exception during get next message.
[RUNNER 2024-07-15 06:11:33Z ERR MessageListener] System.TimeoutException: The HTTP request timed out after 00:01:00.
[RUNNER 2024-07-15 06:11:33Z ERR MessageListener] ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
[RUNNER 2024-07-15 06:11:33Z ERR MessageListener] ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
[RUNNER 2024-07-15 06:11:33Z ERR MessageListener] ---> System.Net.Sockets.SocketException (125): Operation canceled
Listener Exception:
{
"error": "Get \"https://<GHES URL>/_services/pipelines/<UID>/_apis/runtime/runnerscalesets/8/messages?sessionId=<UID>&api-version=6.0-preview\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)",
"method": "GET",
"url": "https://<GHES URL>/_services/pipelines/<UID>/_apis/runtime/runnerscalesets/8/messages?sessionId=<UID>&api-version=6.0-preview"
}
Listener stack trace:
github.com/actions/actions-runner-controller/github/actions.(*clientLogger).Error
github.com/actions/actions-runner-controller/github/actions/client.go:76
github.com/hashicorp/go-retryablehttp.(*Client).Do
github.com/hashicorp/[email protected]/client.go:718
github.com/hashicorp/go-retryablehttp.(*RoundTripper).RoundTrip
github.com/hashicorp/[email protected]/roundtripper.go:47
net/http.send
net/http/client.go:259
net/http.(*Client).send
net/http/client.go:180
net/http.(*Client).do
net/http/client.go:724
net/http.(*Client).Do
net/http/client.go:590
github.com/actions/actions-runner-controller/github/actions.(*Client).Do
github.com/actions/actions-runner-controller/github/actions/client.go:273
github.com/actions/actions-runner-controller/github/actions.(*Client).GetMessage
github.com/actions/actions-runner-controller/github/actions/client.go:577
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).getMessage
github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:272
github.com/actions/actions-runner-controller/cmd/ghalistener/listener.(*Listener).Listen
github.com/actions/actions-runner-controller/cmd/ghalistener/listener/listener.go:163
github.com/actions/actions-runner-controller/cmd/ghalistener/app.(*App).Run.func1
github.com/actions/actions-runner-controller/cmd/ghalistener/app/app.go:124
golang.org/x/sync/errgroup.(*Group).Go.func1
golang.org/x/[email protected]/errgroup/errgroup.go:78
Thanks to everyone who has also brought up these issues. I'd like to clarify, however, that these issues we've had in 0.9.3 have also been present in prior versions of the GHARC, at least since version 0.9.1 by my testing.
Getting the same issue in a k3s cluster.
Bumping this as I'm getting the same exact issue @jb-2020 has on EKS. Do we need to use previous versions?
We tried our luck with AKS and it never worked, then switched to Buildah rootless DinD, which also didn't work. Hoping to find a solution for this issue.
Same issue here. I added the template.spec snippet recommended here to set requests/limits for the runner container, but the same error reported above occurs. After reverting to containerMode.type: "dind", the listener starts working again.
Hello, same problem here. Any news?
Closing since the documentation has been updated. Thank you for reporting this issue!