
Error: Command failed with exit code 1: ssh-agent

Open mikebrandl opened this issue 3 years ago • 7 comments

Hi

I have a standard Laravel deployment working locally. Because the server I want to deploy to is behind a strict firewall, I want to run a self-hosted GitHub runner to deploy to it.

However when I attempt to deploy I get the following:

Error: Command failed with exit code 1: ssh-agent -a /tmp/ssh-auth.sock
unix_listener: cannot bind to path /tmp/ssh-auth.sock: Address already in use

I understand what the error means, but is there something I can change as a workaround?

Thanks

Mike

mikebrandl avatar Aug 31 '21 22:08 mikebrandl

We fixed this in our pipeline by adding another step to the workflow job:

- name: Clean-up
  if: always()
  run: killall ssh-agent

While this works, I think the proper solution would be to add a similar clean-up step to the deployer action itself.

ngrie avatar Feb 01 '22 09:02 ngrie

I don't follow. The action works for me. Do you have any special steps?

antonmedv avatar Feb 01 '22 10:02 antonmedv

@antonmedv The important difference is:

I want to run a self hosted github runner to deploy to it

GitHub's hosted runners are VMs that are destroyed after every run. That is not the case for self-hosted runners, so processes and files from previous runs remain. This is why it is best practice for actions to clean up after themselves.

For this action in particular, running it self-hosted may be a common need, in order to deploy to environments with IP-based restrictions.

ngrie avatar Feb 01 '22 10:02 ngrie

Makes sense. Is there a way to automate this?

antonmedv avatar Feb 01 '22 18:02 antonmedv

@antonmedv

Since you're using execa to run ssh-agent, it looks like you should be able to pass an AbortController signal as an option when executing a new process.

import {execa} from 'execa';

// Pass an AbortSignal so the subprocess can be cancelled later.
const abortController = new AbortController();
const subprocess = execa('node', [], {signal: abortController.signal});

setTimeout(() => {
	abortController.abort();
}, 1000);

try {
	await subprocess;
} catch (error) {
	console.log(subprocess.killed); // true
	console.log(error.isCanceled); // true
}

This should allow you to trigger the abort after running the deploy action, thus killing the SSH agent.

It also looks like you can simply send a SIGTERM signal to the process:

const subprocess = execa('node');

setTimeout(() => {
	subprocess.kill('SIGTERM', {
		forceKillAfterTimeout: 2000
	});
}, 1000);
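
The same clean-up idea can be sketched with Node's built-in child_process module instead of execa. This is only an illustration under stated assumptions: withHelperProcess is a hypothetical helper, not part of the action, and an idle Node process stands in for ssh-agent. It also only handles a synchronous work callback.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical wrapper: starts a long-lived helper process, runs `work`,
// and always sends SIGTERM to the helper afterwards, even if `work` throws.
function withHelperProcess(command, args, work) {
  const child = spawn(command, args, { stdio: 'ignore' });
  try {
    return work(child);
  } finally {
    child.kill('SIGTERM');
  }
}

// Stand-in for ssh-agent: a Node process that idles for a minute.
const idleArgs = ['-e', 'setTimeout(() => {}, 60000)'];
const helper = withHelperProcess(process.execPath, idleArgs, (child) => child);

// `killed` reports whether kill() managed to send the signal.
console.log(helper.killed);
```

Putting the kill in a finally block means the helper does not outlive the step on a persistent runner, whether the deploy succeeds or fails.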

One other recommendation is to avoid modifying anything in the user's home directory. All self-hosted runners run under a single user that is re-used for all actions on that runner.

GitHub Actions in the cloud run on uniquely spawned instances, but self-hosted runners are persistent, so actions should aim to be as idempotent as possible to stay compatible with them.

So while running ssh-agent is probably fine, randomising the temporary socket name and avoiding changes to the default known_hosts would also improve compatibility with self-hosted runners.
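
A minimal sketch of the randomised socket name, assuming the action is free to choose the path it passes to ssh-agent -a. makeAgentSocketPath is a hypothetical helper and the ssh-agent- prefix and agent.sock name are made up for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: creates a fresh private directory for each run and
// returns a socket path inside it, so two runs never bind the same path.
function makeAgentSocketPath(baseDir = os.tmpdir()) {
  const dir = fs.mkdtempSync(path.join(baseDir, 'ssh-agent-'));
  return path.join(dir, 'agent.sock');
}

const socketPath = makeAgentSocketPath();
// The action could then start the agent with: ssh-agent -a <socketPath>
console.log(socketPath);
```

Unix socket paths have a fairly short length limit, so the base directory should be kept short.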

EDIT:

Another issue I found: running the action multiple times makes the ~/.ssh/config file invalid, because the StrictHostKeyChecking setting is appended to the same line each time, so it ends up looking like this:

StrictHostKeyChecking onStrictHostKeyChecking on

My current workaround is to also remove the ~/.ssh/config file post-deployment, since I don't rely on it at all.
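
One way to avoid the doubled setting would be to make the write idempotent. The sketch below is an assumption about how that could look, not the action's actual code: ensureLine is a hypothetical helper, and the demo writes to a throwaway file rather than the real ~/.ssh/config.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: appends `line` to `file` only if it is not already
// present, and always on a line of its own. Returns true if it wrote.
function ensureLine(file, line) {
  const current = fs.existsSync(file) ? fs.readFileSync(file, 'utf8') : '';
  const lines = current.split('\n').map((l) => l.trim());
  if (lines.includes(line.trim())) return false; // already there, nothing to do
  // Add a newline first if the existing content does not end with one.
  const separator = current === '' || current.endsWith('\n') ? '' : '\n';
  fs.appendFileSync(file, `${separator}${line}\n`);
  return true;
}

// Demo against a temporary file instead of the real SSH config.
const demoFile = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'ssh-config-')), 'config');
ensureLine(demoFile, 'StrictHostKeyChecking no');
ensureLine(demoFile, 'StrictHostKeyChecking no'); // second run is a no-op
console.log(fs.readFileSync(demoFile, 'utf8'));
```

With a check like this, repeated runs on the same runner leave the file unchanged after the first write.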

Sn0wCrack avatar Aug 08 '22 05:08 Sn0wCrack

I hit the same issue while trying to handle GHA workflow cancellations: you can't run two Deployer steps, because the second one fails with the same error. It would be great if this part were more customizable.

asychev avatar Oct 31 '22 14:10 asychev

I was able to fix the problem by setting skip-ssh-setup: true

      - name: deployphp/action
        uses: deployphp/action@v1
        with:
          dep: deploy
          # Private key for connecting to remote hosts. To generate private key:
          # `ssh-keygen -o -t rsa -C '[email protected]'`.
          # Optional
          private-key: ${{ secrets.PRIVATE_KEY }}
          ssh-config: |
            Host *
              StrictHostKeyChecking no
          # Option to skip over the SSH setup/configuration.
          # Self-hosted runners don't need the SSH configuration or the SSH agent
          # to be started.
          # Optional.
          skip-ssh-setup: true
          # You can specify the output verbosity level.
          # Optional. Defaults to -v.
          verbosity: -vvv

gutschik avatar Apr 20 '23 15:04 gutschik