rancher-redeploy-workload

i/o timeout error

Open troyfitzwater opened this issue 1 year ago • 7 comments

My redeployment workload is hitting a timeout error when it gets to the Post step. Here's what I'm seeing in my logs:

Workloads to redeploy:
* test-repo
Staring to redeploy...
❌ test-repo
Post "***/v3/project/***:***/workloads/deployment:default:test-repo?action=redeploy": dial tcp ***: i/o timeout

This is on a self-hosted runner because I'm running Rancher on-prem. I can ping the host successfully without any timeouts.

Here's my config:

steps:
      -
        name: Deploy to Dev
        uses: th0th/[email protected]
        with:
          debug: true
          rancher_bearer_token: ${{ secrets.RANCHER_BEARER_TOKEN_DEV }}
          rancher_cluster_id: ${{ secrets.RANCHER_CLUSTER_ID_DEV }}
          rancher_namespace: ${{ vars.RANCHER_NAMESPACE }}
          rancher_project_id: ${{ secrets.RANCHER_PROJECT_ID_DEV }}
          rancher_url: ${{ secrets.RANCHER_URL }}
          rancher_workloads: ${{ vars.RANCHER_WORKLOAD }}

Let me know if I can provide any additional information.

troyfitzwater avatar Feb 22 '23 20:02 troyfitzwater

hey @troyfitzwater,

This action uses docker and AFAIK self-hosted runners can't run docker stuff, yet: https://github.com/actions/runner/issues/406

Might that be the issue?

th0th avatar Feb 22 '23 20:02 th0th

Hmm, I'm not sure how to confirm if that's what the issue is.

It's running on a Linux VM that has Docker installed. I might be misunderstanding, but I thought the last bullet here confirmed that it should be possible in this particular scenario: https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners#requirements-for-self-hosted-runner-machines

troyfitzwater avatar Feb 22 '23 20:02 troyfitzwater

Yeah, I get the same idea from that last bullet, too.

Looking at the output, it feels like an actual connectivity issue in the container. And you know, most of the time it is about DNS.

Can you try creating a container on the VM that runs the action and try to send a request from the container manually yourself? Something like this should work:

$ docker run --rm -it th0th/rancher-redeploy-workload:0.9.2 bash
# apk update && apk add curl
# curl <rancher_url>

th0th avatar Feb 22 '23 21:02 th0th

Yeah, looks like a connectivity issue. Was unable to curl or ping Rancher. Running that container with --network="host" I can at least ping rancher, but I'm running into SSL issues when trying to curl it.

curl: (60) SSL certificate problem: unable to get local issuer certificate

So we've narrowed it down to a connectivity issue, but I'm not quite sure where to go from here. What are your thoughts?

troyfitzwater avatar Feb 22 '23 23:02 troyfitzwater

Hmm, is it an issue only within this container? Or in any container on that VM? Or maybe even the VM itself can't connect to Rancher?

  1. Can you try to curl directly from VM, without any container?

  2. Can you please try creating another container on the VM (with a different image), and try connecting from there?

    $ docker run --rm -it ubuntu bash
    # apt-get update && apt-get install -y curl
    # curl <rancher_url>
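
To make the comparison between the VM and a container easier to read, here is a rough sketch that runs the same curl request and translates the exit code into a likely failure layer. The function names `explain_curl_exit` and `probe` are made up for illustration and are not part of the action; the exit codes themselves (6, 7, 28, 60) are curl's documented ones.

```shell
#!/usr/bin/env bash

# Map a curl exit code to the layer that most likely failed.
explain_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns: could not resolve host" ;;
    7)  echo "tcp: failed to connect" ;;
    28) echo "timeout: possible firewall or routing issue" ;;
    60) echo "tls: certificate not trusted by the local CA store" ;;
    *)  echo "other: curl exit code $1" ;;
  esac
}

# Send one request with a short connect timeout and print the interpretation.
probe() {
  local url="$1" rc=0
  curl --silent --output /dev/null --connect-timeout 5 "$url" || rc=$?
  explain_curl_exit "$rc"
}
```

Running `probe <rancher_url>` once on the VM and once inside a container (after installing curl there) should show whether both fail at the same layer. A `timeout` on both would suggest the problem sits outside Docker, while a `tls` result would match the `curl: (60)` error above.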
    

th0th avatar Feb 23 '23 09:02 th0th

Looks like the VM itself can't connect to Rancher. I'm unable to curl Rancher directly from the VM, and wasn't able to from other containers, either.

At this point, it looks like the issue isn't with this Action, so you could go ahead and close this, if you want. Although, any insight into what I should look into next would be much appreciated, because I would love to get this working :)

troyfitzwater avatar Feb 23 '23 21:02 troyfitzwater

Let's figure it out together :)

  1. First, make sure that DNS resolves correctly. Run this on both the VM and your own computer, and compare the outputs:

    $ dig <rancher_domain>

    The outputs should be the same. If they differ, the DNS on the VM might be misconfigured.

  2. If DNS checks out, there might be a firewall issue. Is it possible that access to the Rancher instance is restricted to certain IP addresses?
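
If it helps, the two checks can be run in order with something like the sketch below. It is only a starting point: `host_from_url` and `check_layers` are hypothetical helper names, port 443 is an assumption about how Rancher is exposed, and it relies on bash's `/dev/tcp` plus `getent` and `timeout` being available on the VM.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reduce a URL to its bare hostname
# (drops scheme, path, and port; does not handle IPv6 literals or userinfo).
host_from_url() {
  local url="$1"
  url="${url#*://}"   # strip the scheme, if any
  url="${url%%/*}"    # strip the path
  url="${url%%:*}"    # strip the port
  printf '%s\n' "$url"
}

# Check DNS first, then a raw TCP connection, and report the first layer that fails.
# The port defaults to 443, which is an assumption.
check_layers() {
  local host port="${2:-443}"
  host="$(host_from_url "$1")"

  if ! getent hosts "$host" >/dev/null; then
    echo "dns: $host does not resolve on this machine"
    return 1
  fi

  # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>.
  if ! timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
    echo "tcp: cannot open $host:$port (firewall or routing?)"
    return 2
  fi

  echo "ok: $host resolves and accepts connections on port $port"
}
```

Calling `check_layers <rancher_url>` on the VM and on your own computer should make the difference visible: a `dns` failure only on the VM points at its resolver, while a `tcp` failure with working DNS fits the firewall theory and the original `dial tcp ... i/o timeout` message.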

th0th avatar Feb 23 '23 21:02 th0th