Urgent: intermittent connection to remote host through Tailscale action

Open eman-cirrusgo opened this issue 8 months ago • 3 comments

We are using the Tailscale action in our GitHub Actions workflow. Recently, we've started receiving errors indicating that the remote host cannot be reached through Tailscale. The logs below were generated by the following commands:

Commands:

tailscale status
tailscale ping $SSH_HOST

Errors:

  tailscale status
100.69.57.42    github-fv-az1726-261 github-fv-az1726-261.taild7c8e.ts.net linux   -  
# Health check:
#     - no DERP home
# Update available: 1.52.0 -> 1.82.5, run `tailscale update` or `tailscale set --auto-update` to update.
ping "100.112.202.141" timed out
ping "100.112.202.141" timed out
ping "100.112.202.141" timed out
ping "100.112.202.141" timed out
pong from *** (100.112.202.141) via DERP(dbi) in 233ms
pong from *** (100.112.202.141) via DERP(dbi) in 266ms
pong from *** (100.112.202.141) via DERP(dbi) in 240ms
pong from *** (100.112.202.141) via DERP(dbi) in 358ms
pong from *** (100.112.202.141) via DERP(dbi) in 234ms
pong from *** (100.112.202.141) via DERP(dbi) in 249ms
direct connection not established

Run ping -c 5 $SSH_HOST
PING ***.taild7c8e.ts.net (100.112.202.141) 56(84) bytes of data.

--- ***.taild7c8e.ts.net ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 4091ms

We have retried the action jobs multiple times. Sometimes, we are able to access the remote host, and other times we cannot, even though we try rerunning the jobs after varying intervals.

eman-cirrusgo avatar Apr 28 '25 09:04 eman-cirrusgo

We really struggle with the same. I suspect it might be caused by the DERP server selection.

tpanum avatar Apr 28 '25 14:04 tpanum

Also seeing 100% packet loss

switz avatar May 03 '25 19:05 switz

Seeing the same issue. It's intermittent: eventually, after a few reruns, it works. This is on GitHub's public runners.

rmb938 avatar May 10 '25 21:05 rmb938

Also started seeing this. Seems to be somehow specific to DERP servers mostly used by GitHub Actions. Probably congestion?

Is the only workaround to run a custom DERP server?

igorbdl avatar Aug 25 '25 09:08 igorbdl

There can be a delay between the time that your GitHub Action's Tailscale client joins your tailnet and the destination Tailscale client learns of its presence and that it's allowed to connect, which manifests as lack of connectivity.

v4 of the GitHub action now includes a ping parameter that you can use to wait for connectivity before proceeding. We hope that this will resolve your issue. If it does not, please feel free to reopen this ticket.

oxtoacart avatar Oct 20 '25 18:10 oxtoacart
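
For anyone who cannot move to v4 yet, the "wait for connectivity before proceeding" idea from the reply above can be approximated in a workflow step with a small retry loop. This is only a sketch under stated assumptions, not what the action does internally: `wait_for_host` and the `PING_CMD` override are hypothetical names, the attempt count and delay are arbitrary, and it assumes `tailscale ping` exits non-zero when no pong arrives. `--until-direct=false` is passed so that a pong relayed via DERP (as in the logs above) counts as reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a reachability check until it succeeds
# or the attempts run out.
#   $1 = host to ping
#   $2 = max attempts (default 10)
#   $3 = seconds to sleep between attempts (default 5)
# PING_CMD can be overridden (e.g. for testing); by default it sends a
# single Tailscale ping and accepts a reply relayed through DERP.
wait_for_host() {
  local host="$1"
  local attempts="${2:-10}"
  local delay="${3:-5}"
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Assumption: the ping command returns a non-zero status on timeout.
    if ${PING_CMD:-tailscale ping --c 1 --until-direct=false} "$host" >/dev/null 2>&1; then
      echo "reachable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "unreachable after $attempts attempt(s)" >&2
  return 1
}

# Example use in a workflow step, before the SSH step runs:
#   wait_for_host "$SSH_HOST" 12 5
```

If the loop still fails after a generous number of attempts, the problem is probably not just the propagation delay described above, and the output of `tailscale status` is worth capturing for the report.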