Error messages when removing a self-hosted runner

Open wyphan opened this issue 4 years ago • 23 comments

Describe the bug When removing a self-hosted runner, I get the following error messages:

ldd: ./bin/libSystem.Security.Cryptography.Native.OpenSsl.so: No such file or directory
ldd: ./bin/libSystem.IO.Compression.Native.so: No such file or directory
# Runner removal
√ Runner removed successfully
√ Removed .credentials
√ Removed .runner
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

The removal seems to succeed though, as refreshing the page removes the self-hosted runner from the list.

To Reproduce Steps to reproduce the behavior:

  1. Get the removal line from "Settings" tab of the repository, then "Actions" tab, then the three-dot menu next to the self-hosted runner name, and "Remove"
  2. Run the removal line ./config.sh remove --token XXX
  3. See error

Expected behavior A clear and concise description of what you expected to happen.

Runner Version and Platform

Version of your runner? Sorry I forgot to check, but in both cases they were downloaded from the official download links as given out in "Add runner"

OS of the machine running the runner? Linux x86_64. This has happened twice: with Ubuntu Linux 20.04 LTS and CentOS 8.

wyphan avatar Feb 05 '21 19:02 wyphan

@wyphan did you click the Force remove this runner button before running the command on the runner?

You should either click Force remove this runner or execute ./config.sh remove, but not both, to remove the runner from service.

TingluoHuang avatar Feb 05 '21 19:02 TingluoHuang

No, I didn't click "Force remove this runner".

wyphan avatar Feb 05 '21 19:02 wyphan

@wyphan do you mind sharing a link to the repository or organization where you have this runner configured? And also the runner's name, if you still remember it?

TingluoHuang avatar Feb 05 '21 19:02 TingluoHuang

The two instances of the error message were for two different repositories:

  • https://github.com/wyphan/exciting-plus-gpu/ I think the runner was called escursionista-centos-amd64, or a permutation of those three words.

  • https://github.com/wyphan/wyphan.github.io/ For this one, the runner was simply called github-runner.

wyphan avatar Feb 08 '21 18:02 wyphan

@wyphan @TingluoHuang I'm also experiencing the same error message.

amenocal avatar Feb 09 '21 22:02 amenocal

Does anyone have the runner diag log available for me to check?

TingluoHuang avatar Feb 10 '21 04:02 TingluoHuang

I think I know what happened. @wyphan @amenocal did you guys start the runner interactively instead of configuring it as a service?

When the interactive runner auto-upgrades to a newer version, it gets partially detached from the terminal. STDIN is gone, but STDOUT/ERR are still hooked to the terminal.

So, after the upgrade, the runner is still running in the background with & and its output will show up in the terminal.

If you run ./config.sh remove to remove the runner without stopping the running one, you will see the error An error occurred: Access denied.
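
A minimal sketch of the "stop it first" step, assuming you already know the PIDs of the runner processes. The function name `stop_and_wait`, the timeout value, and the PIDs in the usage comment are illustrative placeholders, not part of the runner's own tooling.

```shell
#!/usr/bin/env bash
# Sketch: send SIGTERM to the given PIDs and wait until they are gone,
# so that ./config.sh remove is only run once nothing is still listening.
# stop_and_wait is a hypothetical helper, not shipped with the runner.

stop_and_wait() {
  local timeout_s="$1"; shift
  local pid elapsed=0 alive

  # Ask each process to terminate; ignore PIDs that have already exited.
  for pid in "$@"; do
    kill "$pid" 2>/dev/null || true
  done

  # Poll once per second until no PID is alive or the timeout is reached.
  while [ "$elapsed" -lt "$timeout_s" ]; do
    alive=0
    for pid in "$@"; do
      if kill -0 "$pid" 2>/dev/null; then
        alive=1
      fi
    done
    if [ "$alive" -eq 0 ]; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Example usage (PIDs and token are placeholders):
#   stop_and_wait 30 1948 1952 1956 && ./config.sh remove --token XXX
```

The function returns non-zero if anything is still alive after the timeout, so the removal command after `&&` is skipped in that case.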

TingluoHuang avatar Feb 10 '21 16:02 TingluoHuang

There are two issues to fix with this:

  • We should fix the error message so it's not unintelligible
  • This is likely caused by moving the runner to the background during auto-upgrade on Linux. Should we just not background during upgrade, or try to take back the window after upgrade? Seems like we should.

hross avatar Mar 30 '21 19:03 hross

Note that this issue is benign and the runner was still removed.

hross avatar Mar 30 '21 19:03 hross

@TingluoHuang That is correct. When I was still using them, usually I SSH into the machine, start GNU screen, then start the runner interactively, and detach from the GNU screen session.

Edit: typo

wyphan avatar Mar 30 '21 19:03 wyphan

I've started getting this error when an ephemeral runner on Windows finishes.

jeremyd2019 avatar Dec 21 '21 02:12 jeremyd2019

2021-12-22 03:16:40Z: Job CLANGARM64 completed with result: Canceled
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

Also of note is that it reports that the job result was Canceled, but the job was not canceled.

[2021-12-22 03:15:55Z INFO JobDispatcher] Successfully renew job request 554, job is valid till 12/22/2021 3:25:55 AM
[2021-12-22 03:16:35Z ERR  GitHubActionsService] GET request to https://pipelines.actions.githubusercontent.com/yCmu0F2oGfbA9DkO6Byr4wKOkszFHnzaBFmAngWq8HAMcu3T9a/_apis/distributedtask/pools/1/messages?sessionId=a2055832-e3bc-4d2c-9226-ea321e364000&lastMessageId=1 failed. HTTP Status: Forbidden, AFD Ref: Ref A: 08A8A04A5F8C4272A9D3805928073430 Ref B: ASHEDGE1213 Ref C: 2021-12-22T03:16:35Z
[2021-12-22 03:16:35Z ERR  MessageListener] Catch exception during get next message.
[2021-12-22 03:16:35Z ERR  MessageListener] GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
[2021-12-22 03:16:35Z INFO MessageListener] Non-retriable exception: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[2021-12-22 03:16:35Z INFO JobDispatcher] Shutting down JobDispatcher. Make sure all WorkerDispatcher has finished.
[2021-12-22 03:16:35Z INFO JobDispatcher] Ensure WorkerDispather for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87 run to finish, cancel any running job.
[2021-12-22 03:16:35Z INFO JobDispatcher] Send job cancellation message to worker for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87.
[2021-12-22 03:16:35Z INFO ProcessChannel] Sending message of length 0, with hash 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Scan all processes to find relationship between all processes.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Find all child processes of process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Need kill all child processes trees before kill process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Child process '6740' needs be killed first.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Find all child processes of process '6740'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Kill process '6740'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Kill process '6652'.
[2021-12-22 03:16:40Z INFO ProcessInvokerWrapper] Finished process 6652 with exit code 100, and elapsed time 02:24:48.8044470.
[2021-12-22 03:16:40Z INFO JobDispatcher] finish job request for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87 with result: Canceled
[2021-12-22 03:16:40Z INFO Terminal] WRITE LINE: 2021-12-22 03:16:40Z: Job CLANGARM64 completed with result: Canceled
[2021-12-22 03:16:40Z INFO JobDispatcher] Stop renew job request for job d4d6bd6c-c8bd-59d0-ecca-16324ffb3d87.
[2021-12-22 03:16:40Z INFO JobDispatcher] job renew has been canceled, stop renew job request 554.
[2021-12-22 03:16:40Z INFO JobNotification] Entering JobCompleted Notification
[2021-12-22 03:16:40Z INFO JobNotification] Entering EndMonitor
[2021-12-22 03:16:40Z INFO JobDispatcher] Fire signal for one time used runner.
[2021-12-22 03:16:40Z ERR  GitHubActionsService] DELETE request to https://pipelines.actions.githubusercontent.com/yCmu0F2oGfbA9DkO6Byr4wKOkszFHnzaBFmAngWq8HAMcu3T9a/_apis/distributedtask/pools/1/sessions/a2055832-e3bc-4d2c-9226-ea321e364000 failed. HTTP Status: Forbidden, AFD Ref: Ref A: 9082B93F2F8D401B97858A82DA5ACD1B Ref B: ASHEDGE1213 Ref C: 2021-12-22T03:16:40Z
[2021-12-22 03:16:40Z INFO Runner] Ignore any exception during DeleteSession for an ephemeral runner. GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpMethod method, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.DistributedTask.WebApi.TaskAgentHttpClientBase.DeleteAgentSessionAsync(Int32 poolId, Guid sessionId, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.DeleteSessionAsync()
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[2021-12-22 03:16:40Z ERR  Terminal] WRITE ERROR: An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[2021-12-22 03:16:40Z ERR  Listener] GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
   at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpRequestMessage message, Object userState, CancellationToken cancellationToken)
   at GitHub.Services.WebApi.VssHttpClientBase.SendAsync[T](HttpMethod method, IEnumerable`1 additionalHeaders, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
   at GitHub.Runner.Listener.MessageListener.GetNextMessageAsync(CancellationToken token)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
   at GitHub.Runner.Listener.Runner.ExecuteCommand(CommandSettings command)
   at GitHub.Runner.Listener.Program.MainAsync(IHostContext context, String[] args)

The ephemeral runner is removed from the org, but the .runner and .credentials are still present on the runner itself, whereas before this started happening those were removed when the ephemeral runner shut down.
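
A small sketch for checking whether those two files were left behind, assuming they sit at the top level of the runner directory as in the removal output earlier in this thread. `leftover_runner_files` is a hypothetical helper name, and the sketch only reports files, it does not delete anything.

```shell
# Sketch: print the runner config files that still exist in a runner directory.
# leftover_runner_files is a hypothetical helper, not part of the runner.
leftover_runner_files() {
  local dir="$1" f
  for f in .runner .credentials; do
    if [ -e "$dir/$f" ]; then
      printf '%s\n' "$dir/$f"
    fi
  done
}

# Example usage (the path is a placeholder):
#   leftover_runner_files /path/to/actions-runner
```

Empty output means neither file is present.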

jeremyd2019 avatar Dec 22 '21 03:12 jeremyd2019

Facing the same issue after clicking Force remove on a self-hosted runner.

kartikv11 avatar Apr 22 '22 12:04 kartikv11

Is there any plan to fix this issue? We're facing several of these errors daily as we're relying on ephemeral self-hosted runners.

jgutierrezglez avatar Aug 22 '22 11:08 jgutierrezglez

Any update on this issue? I'm facing this issue daily with enterprise-level ephemeral self-hosted runners (containerized). It is blocking us from implementing proper runner autoscaling.

maartengryp-liantis avatar Mar 15 '23 12:03 maartengryp-liantis

I am seeing runners not picking up jobs, staying idle, then exiting with this:

An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.
Restarting runner...

√ Connected to GitHub

Failed to create a session. The runner registration has been deleted from the server, please re-configure.
Runner listener exit with terminated error, stop the service, no retry needed.
Exiting runner...
2023-04-02 04:27:45.294  NOTICE --- Runner init exited. Exiting this process with code 0 so that the container and the pod is GC'ed Kubernetes soon.

Nuru avatar Apr 02 '23 04:04 Nuru

I am experiencing the same issue with actions-runner-controller on AWS EKS after trying to force remove the runner. All of my runner pods keep being created and then terminate themselves within 2 minutes. Any updates on this, or a workaround to avoid terminating the runner?

iv0rish avatar Aug 07 '23 09:08 iv0rish

we are seeing this issue too

pdeva avatar Sep 26 '23 22:09 pdeva

Any update on this?

matanbaruch avatar Mar 12 '24 09:03 matanbaruch

+1

joaoluiznaufel avatar Apr 01 '24 18:04 joaoluiznaufel

This is still relevant.

[RUNNER 2024-04-09 19:46:59Z INFO Runner] Deleting Runner Session...
[RUNNER 2024-04-09 19:46:59Z ERR  GitHubActionsService] DELETE request to https://pipelinesghubeus2.actions.githubusercontent.com/7aXbNwB1hnEgXD7F3ryv46BCYhHdYXwKwdh/_apis/distributedtask/pools/1/sessions/1f3eea89-41ab-4dc4-afa1-c6hf438dh3685 failed. HTTP Status: Forbidden
[RUNNER 2024-04-09 19:46:59Z INFO Runner] Ignore any exception during DeleteSession for an ephemeral runner. GitHub.DistributedTask.WebApi.AccessDeniedException: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.HandleResponseAsync(HttpResponseMessage response, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpRequestMessage message, HttpCompletionOption completionOption, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Services.WebApi.VssHttpClientBase.SendAsync(HttpMethod method, Guid locationId, Object routeValues, ApiResourceVersion version, HttpContent content, IEnumerable`1 queryParameters, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.DistributedTask.WebApi.TaskAgentHttpClientBase.DeleteAgentSessionAsync(Int32 poolId, Guid sessionId, Object userState, CancellationToken cancellationToken)
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Runner.Listener.MessageListener.DeleteSessionAsync()
[RUNNER 2024-04-09 19:46:59Z INFO Runner]    at GitHub.Runner.Listener.Runner.RunAsync(RunnerSettings settings, Boolean runOnce)
[RUNNER 2024-04-09 19:46:59Z INFO Listener] Runner execution been cancelled.

nickyfoster avatar Apr 09 '24 19:04 nickyfoster

Why has this not been fixed for three years now?

Is it possible that you only kill the run.sh process, which does not affect the other two?

When I check the processes, I can see 3 runner processes.

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1671 pts/0    S      0:00 /bin/bash ./run.sh
   1675 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1679 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1695 pts/0    R+     0:00 ps ax

Try 1 - killing only run.sh before ./config.sh remove

actions-runner$ kill 1671
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1675 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1679 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1696 pts/0    R+     0:00 ps ax
[1]+  Terminated              ./run.sh
actions-runner$ ./config.sh remove --token AAAAADYADRIFYHCKPHAO7F3GC246I

# Runner removal


√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$ An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1752 pts/0    R+     0:00 ps ax
actions-runner$

Try 2 - killing two processes (run.sh and run-helper.sh) before ./config.sh remove

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1811 pts/0    S      0:00 /bin/bash ./run.sh
   1815 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1819 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1835 pts/0    R+     0:00 ps ax
actions-runner$ kill 1811 1815
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1819 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1838 pts/0    R+     0:00 ps ax
actions-runner$ ./config.sh remove --token AAAAAD52TWEY7ILELITGOGLGC25IM

# Runner removal


√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$ An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.

Try 3 - killing all three processes before ./config.sh remove - works

actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1948 pts/0    S      0:00 /bin/bash ./run.sh
   1952 pts/0    S      0:00 /bin/bash /azp/actions-runner/run-helper.sh
   1956 pts/0    Sl     0:01 /azp/actions-runner/bin/Runner.Listener run
   1972 pts/0    R+     0:00 ps ax
actions-runner$ kill 1948 1952 1956
actions-runner$ ps ax
    PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:00 bash
   1976 pts/0    R+     0:00 ps ax
actions-runner$ ./config.sh remove --token AAAAAD2HSSZ3TDDA3YGSLGTGC25MY

# Runner removal


√ Runner removed successfully
√ Removed .credentials
√ Removed .runner

actions-runner$
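
The three attempts above suggest that run.sh, run-helper.sh and Runner.Listener all need to be gone before removal. A minimal sketch of picking those PIDs out of `ps ax` output, assuming the command lines look like the ones in the transcript and that only one runner is running on the machine (as in the container shown). `runner_pids_from_ps` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ps ax` output on stdin and print the PIDs of
# run.sh, run-helper.sh and Runner.Listener, one per line.
# Assumption: no unrelated process has one of these names in its command line.
runner_pids_from_ps() {
  # NR > 1 skips the header row; $1 is the PID column.
  awk 'NR > 1 && /run\.sh|run-helper\.sh|Runner\.Listener/ { print $1 }'
}

# Example usage (token is a placeholder):
#   kill $(ps ax | runner_pids_from_ps)
#   ./config.sh remove --token XXX
```

Fed the first process listing from this comment, the helper prints 1671, 1675 and 1679, the same three PIDs that had to be killed by hand.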

MmAtBosch avatar Apr 10 '24 15:04 MmAtBosch

I'm still seeing this error, in this case when an idle runner is terminated because the Node it is on is being deleted as part of autoscaling (down) the Kubernetes cluster.

Logs look approximately like this (ANSI color codes, timestamps, and some other stuff removed)

NOTICE --- Executing actions-runner-controller's SIGTERM handler.
NOTICE --- Note that if this takes more time than terminationGracePeriodSeconds, the runner will be forcefully terminated by Kubernetes, which may result in the in-progress workflow job, if any, to fail.
NOTICE --- Ensuring dockerd is still running.
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
/runner /
NOTICE --- Waiting for the runner to register first.
NOTICE --- Observed that the runner has been registered.
# Runner removal
An error occurred: Access denied. System:ServiceIdentity;DDDDDDDD-DDDD-DDDD-DDDD-DDDDDDDDDDDD needs View permissions to perform the action.
Runner listener exit with retryable error, re-launch runner in 5 seconds.
Does not exist. Skipping Removing runner from the server
√ Removed .credentials
√ Removed .runner
/
NOTICE --- The actions runner process exited.
NOTICE --- Holding on until runner init (pid 9) exits, so that there will hopefully be no zombie processes remaining.
Restarting runner...
An error occurred: Not configured. Run config.(sh/cmd) to configure the runner.
Runner listener exit with terminated error, stop the service, no retry needed.
Exiting runner...
NOTICE --- Graceful stop completed.
NOTICE --- Runner init exited. Exiting this process with code 0 so that the container and the pod is GC'ed Kubernetes soon.

Nuru avatar Jun 29 '24 07:06 Nuru