feat: Add container to propagate host network MTU

Open tiagoblackcode opened this issue 2 years ago • 21 comments

Some network environments use non-standard MTU values. In these situations, the dockerMTU setting can be used to specify the MTU of the bridge network created by Docker. However, when a GitHub Actions workflow creates its own networks, the bridge network's MTU is not propagated to them, which can lead to "connection reset by peer" errors.
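For illustration, the mismatch can be observed by comparing the MTU option on the Docker-managed bridge with that of a freshly created network (the network name and the 1400 value below are just examples):

# The bridge created by dockerd carries the configured MTU as a driver option:
docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}'
# => 1400

# A network created the way a workflow job would create one gets no MTU option,
# so containers attached to it fall back to the 1500 default:
docker network create example-net
docker network inspect example-net --format '{{index .Options "com.docker.network.driver.mtu"}}'
# => (empty)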

To overcome this, I've created a new Docker image called summerwind/actions-runner-mtu that shims the docker binary in order to propagate the MTU setting to networks created by GitHub workflows.

This is a follow-up on the discussion in #1046 and uses a separate image since there might be some unintended side-effects with this approach.

tiagoblackcode avatar Mar 10 '22 13:03 tiagoblackcode

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 09 '22 15:04 stale[bot]

Can this be done in the same image itself? The MTU is different on every public cloud, so this value is important and users must be aware of it.

Always set an MTU value in the image, defaulting to 1400, and give the user the flexibility to override it via the spec.

1400 should be compatible with most cloud providers.

arpitjindal97 avatar Apr 26 '22 09:04 arpitjindal97

This is done in a separate image because there might be some unintended consequences of using a shell script instead of the original docker binary. The shell script is required to propagate the MTU to custom networks when the creation of those networks does not explicitly set the MTU. You will be able to use the official modified image once this is released.

Instead of hard-coding the MTU to 1400, this leaves it up to the user to configure the MTU however they find best. I'm not sure about cloud providers, but in self-hosted setups with non-virtualised networking, the MTU can probably stay at 1500 since there is no IP-over-IP encapsulation.

tiagoblackcode avatar Apr 26 '22 10:04 tiagoblackcode

I mean to say that the default value can be 1400 and the user can modify it at any time. The reason for 1400 is that GCP uses an MTU of 1460, which makes setting the MTU mandatory there.

Ref: https://cloud.google.com/network-connectivity/docs/vpn/concepts/mtu-considerations

arpitjindal97 avatar Apr 26 '22 10:04 arpitjindal97

I'm not sure I'm following, but I don't think we should have a customisation specifically for a cloud provider. There are situations where the MTU can stay at 1500 and we should not decrease it implicitly.

tiagoblackcode avatar Apr 29 '22 10:04 tiagoblackcode

@tiagoblackcode Sorry for the delayed response! I just merged https://github.com/actions-runner-controller/actions-runner-controller/pull/1257, which adds some documentation on the MTU issue; it's worth revising it to also cover this work.

Also, I think we can include the shim in our standard container image while enhancing entrypoint.sh so that it toggles the use of the shim behind a feature flag (a runner container envvar specified via the RunnerDeployment/RunnerSet specs). WDYT?

mumoshu avatar May 16 '22 02:05 mumoshu

There really is no need for a separate image; this can all be done in the image we already have. Wrapping the original binary in a shell script is also no problem if we exec the original binary and thus replace the bash process with the docker process.

Docker should not be installed to /usr/local/bin but rather to /usr/bin since it is not a user binary, but one we provide as part of this project to users. (It should also not be installed manually in the first place, but that is a different story we are going to fix as part of https://github.com/actions-runner-controller/actions-runner-controller/issues/1137.) Once that is done we can override the executable in /usr/local/bin. The following script would result in a safe override:

#!/usr/bin/env bash
set -Eeuo pipefail

# Only rewrite `docker network create …` invocations, and only when the feature
# flag is enabled and the bridge network actually has an MTU driver option set.
if [[ ${ARC_DOCKER_MTU_PROPAGATION:-false} == true ]] &&
  (($# >= 2)) && [[ $1 == network && $2 == create ]] &&
  mtu=$(/usr/bin/docker network inspect bridge --format '{{index .Options "com.docker.network.driver.mtu"}}' 2>&-); then
  # Re-assemble the arguments with the bridge MTU injected as a driver option.
  shift 2
  set -- network create --opt com.docker.network.driver.mtu="$mtu" "$@"
fi

# Replace this shell process with the real docker binary.
exec /usr/bin/docker "$@"
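For illustration (flag enabled, bridge MTU 1400, network name purely illustrative), an invocation like

docker network create github_network_foo

would effectively become

/usr/bin/docker network create --opt com.docker.network.driver.mtu=1400 github_network_foo

while any other docker command, or any network create issued with the flag off or without an MTU option on the bridge, passes through untouched.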

Fleshgrinder avatar May 16 '22 18:05 Fleshgrinder

I'm okay with using the official image and wrapping the docker binary with the shell script. Should we wait until the docker binary is moved to /usr/bin as suggested by @Fleshgrinder or shall we go ahead?

Once docker binary is moved:

  • /usr/bin/docker is the docker binary
  • /usr/local/bin/docker is the docker shim (the shell script which propagates the MTU)

Before the docker binary is moved:

  • /usr/local/bin/docker-unwrapped becomes the official docker binary (perhaps the naming isn't the best?)
  • /usr/local/bin/docker is the docker shim

In this case we don't need to add further directories to the PATH.
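For the first layout, a rough sketch of the build steps would be (file names here are purely illustrative, not what the Dockerfiles actually use):

# Real binary goes to /usr/bin; the shim shadows it from /usr/local/bin,
# which comes first in the default PATH:
install -m 0755 docker/docker  /usr/bin/docker
install -m 0755 docker-shim.sh /usr/local/bin/docker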

tiagoblackcode avatar May 18 '22 07:05 tiagoblackcode

You can do the move as part of this PR; these two lines need to be adjusted:

https://github.com/actions-runner-controller/actions-runner-controller/blob/8a8ec43364f8c7335dbd0dca65c215d6f1aa6fa5/runner/actions-runner.dockerfile#L61
https://github.com/actions-runner-controller/actions-runner-controller/blob/8a8ec43364f8c7335dbd0dca65c215d6f1aa6fa5/runner/actions-runner-dind.dockerfile#L71

Fleshgrinder avatar May 18 '22 07:05 Fleshgrinder

Updated the PR with the changes. I haven't had the chance to test it properly yet, and it's still missing the flag to enable the MTU propagation.

@Fleshgrinder in your shell script, where does the ARC_DOCKER_MTU_PROPAGATION come from? Is that something that should be configurable in the Runner CRD?

Also, AFAIK the dind runner doesn't have this problem, so I just placed the shim in the standard runner. If you wish, I can add it to the dind runner as well.

tiagoblackcode avatar May 18 '22 11:05 tiagoblackcode

The ARC_DOCKER_MTU_PROPAGATION in my example script is an environment variable that the user can set as part of their deployment.
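For example, a sketch using the existing RunnerDeployment fields (the flag name is the one from the script above; the MTU value is illustrative):

---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
spec:
  template:
    spec:
      dockerMTU: 1400
      env:
        - name: ARC_DOCKER_MTU_PROPAGATION
          value: "true"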

Fleshgrinder avatar May 18 '22 12:05 Fleshgrinder

Wouldn't it be better to have a configuration on the Runner CRD, alongside dockerMTU, called dockerMTUPropagation? e.g.

---
apiVersion: actions.summerwind.dev/v1alpha1
kind: Runner
spec:
  dockerMTU: 1420
  dockerMTUPropagation: true

instead of having to provide an environment variable?

tiagoblackcode avatar May 18 '22 12:05 tiagoblackcode

These are also turned into environment variables so that they can be passed to the Bash scripts. I leave the API design decisions here to @mumoshu.

Fleshgrinder avatar May 18 '22 12:05 Fleshgrinder

I understand, but in terms of API I'd rather configure the runner through the spec than through an environment variable. I'll wait for @mumoshu's input before committing to a solution.

Thanks for your input.

tiagoblackcode avatar May 18 '22 12:05 tiagoblackcode

@tiagoblackcode In this specific case, letting the user define an envvar to enable this feature would be nicer.

I believe this propagation should be enabled by default in the future, with no way to turn it off, once we are confident in the solution.

If we add dockerMTUPropagation to the spec now, it will take a lot of effort to remove it later because that requires bumping the API version. An envvar can easily be made a no-op without breaking the API, which is great in terms of maintenance effort! (Actually, that's one of the reasons we used envvars for some runner settings, like RUNNER_FEATURE_FLAG_EPHEMERAL and RUNNER_FEATURE_FLAG_ONCE.)

mumoshu avatar May 19 '22 00:05 mumoshu

That makes sense. I'll add some documentation about the flag and do some testing. I'll ping back when it's ready.

tiagoblackcode avatar May 19 '22 07:05 tiagoblackcode

That makes sense. I'll add some documentation about the flag and do some testing. I'll ping back when it's ready.

I've run into the MTU issue. I built an image with this update, and containers now run with no connection issues. Thank you.

tomsherrod avatar Jul 14 '22 23:07 tomsherrod

@mumoshu @Fleshgrinder I just picked this up again, sorry for the delay.

I made some tests with the ARC_DOCKER_MTU_PROPAGATION flag and it seems to be working correctly. I also updated the TROUBLESHOOTING.md doc to mention the flag with regard to the MTU issue.

I can set up a test similar to those in test/entrypoint/should_* for the docker shim if you wish. Otherwise I think this is ready to merge.

tiagoblackcode avatar Jul 18 '22 09:07 tiagoblackcode

@tiagoblackcode Thanks for the update! This looks good to me overall. I'll merge this after I give it a try on my own cluster (although I don't have quick access to a cluster with a non-standard MTU, at least verifying that it works on my cluster would make me more confident that we won't break anything for a standard env 😄).

I can set up a test similar to those in test/entrypoint/should_* for the docker shim if you wish. Otherwise I think this is ready to merge.

I'd greatly appreciate it if you could do so, in this PR or in another PR depending on the situation!

mumoshu avatar Jul 19 '22 00:07 mumoshu

@tiagoblackcode Thanks for the update! This looks good to me overall. I'll merge this after I give it a try on my own cluster (although I don't have quick access to a cluster with a non-standard MTU, at least verifying that it works on my cluster would make me more confident that we won't break anything for a standard env 😄).

You can actually put together some kind of test:

# lower the main interface MTU in the k8s host(s)
# WARNING: this may cause issues in other pods
for host in "<host>" "<host>"
do 
  ssh "root@$host" "ip link set <interface> mtu 1420"
done

# update the Runner to use a lower MTU for the docker container and keep the original image
kubectl patch -n <namespace> runners.actions.summerwind.dev <runner> --type='merge' \
  -p '{"spec": {"dockerMTU": 1420}}'

# force the runners to recreate
for p in $(kubectl get pod -n <namespace> -o custom-columns="NAME:.metadata.name,CONTROLLER:.metadata.ownerReferences[].kind" | grep Runner | awk '{ print $1 }')
do
  kubectl -n <namespace> delete pod $p
done

# perform the test inside one of the runners

runner=$(kubectl get pod -n <namespace> -o custom-columns="NAME:.metadata.name,CONTROLLER:.metadata.ownerReferences[].kind" | grep Runner | awk '{ print $1 }' | head -n 1)

kubectl exec -n <namespace> -it $runner -c runner -- bash

# this is similar to what GitHub Actions does when running a workflow
docker network create no-mtu

# this is similar to what the `docker-shim` does when proxying the network creation to docker
docker network create mtu --opt com.docker.network.driver.mtu=1420

# this should fail (time out) with the original, unpatched image
docker run --rm --network no-mtu curlimages/curl --max-time 10 https://www.google.com

# this should work with both patched and unpatched images
docker run --rm --network mtu curlimages/curl --max-time 10 https://www.google.com

# To tear the test down: 
# - restore the host MTU
# - patch the runner to use/remove the `dockerMTU` configuration

The test above should use the original image to assert that the current image has issues with docker networks created using the default configuration when the underlying network infrastructure has a lower MTU.

To test the new image, patch the runner CRD to use an image created with this PR, and re-run the test above. The test using the no-mtu network should now work properly. This should assert the new image propagates the MTU to docker networks created with the default configuration.

tiagoblackcode avatar Jul 19 '22 08:07 tiagoblackcode

I can set up a test similar to those in test/entrypoint/should_* for the docker shim if you wish. Otherwise I think this is ready to merge.

I'd greatly appreciate it if you could do so, in this PR or in another PR depending on the situation!

Regarding this, I need to explore a bit how to mock calls to /usr/bin/docker without interfering with the host and/or the script. Doing a chroot into a mocked FS might be an option. So I'd say we merge without the test if it's okay with you.
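One possible approach, just as a sketch: run the shim inside a throwaway container where /usr/bin/docker is a small stub that answers the bridge inspection and otherwise echoes its arguments (docker-shim.sh and fake-docker.sh are hypothetical file names, not part of the repo):

#!/usr/bin/env bash
# fake-docker.sh (hypothetical stub): report a known MTU for `network inspect`;
# for everything else, print the arguments so the test can assert on them.
if [[ ${1:-} == network && ${2:-} == inspect ]]; then
  echo 1400
else
  echo "$*"
fi

# Both files must be executable before mounting them over the docker paths:
chmod +x docker-shim.sh fake-docker.sh
docker run --rm \
  -v "$PWD/docker-shim.sh:/usr/local/bin/docker:ro" \
  -v "$PWD/fake-docker.sh:/usr/bin/docker:ro" \
  -e ARC_DOCKER_MTU_PROPAGATION=true \
  ubuntu:22.04 docker network create test-net
# expected output: network create --opt com.docker.network.driver.mtu=1400 test-net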

tiagoblackcode avatar Jul 19 '22 09:07 tiagoblackcode

Oops, you already touched the dind runner's Dockerfile! This doesn't work, because you missed updating supervisor.conf, which is used to run the dind dockerd within the runner container and still points to the now-nonexistent /usr/local/bin/dockerd (note the trailing "d": the dockerd binary moved to /usr/bin along with the docker command). Let me add a commit to fix it. Otherwise, users of dind runners would see their runners start failing after this gets merged.
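The fix is essentially just repointing the dockerd command in that config to the new location, along these lines (the config file path is illustrative; the actual location in the image may differ):

# Point the supervisord-managed dockerd at /usr/bin instead of /usr/local/bin:
sed -i 's|/usr/local/bin/dockerd|/usr/bin/dockerd|' /etc/supervisor/conf.d/dockerd.conf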

mumoshu avatar Sep 23 '22 07:09 mumoshu

I'll follow up with rootless dind runners later.

mumoshu avatar Sep 23 '22 08:09 mumoshu