feat: build CI Docker images with GH Actions

Open jjmaestro opened this issue 1 year ago • 1 comments

Fixes #2020

Workflow to build and deploy Bazel CI Docker images

This workflow builds the images, pushes them to GHCR (the GitHub Container Registry) and links them to this repo.

For each Dockerfile it builds several types of images, each corresponding to a named build stage (... AS <NAME>) in the Dockerfile.
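
As a rough illustration of how named build stages can be discovered, here is a minimal Python sketch. It is not the workflow's actual implementation: the regex, the `named_stages` helper and the example Dockerfile contents (including the stage names) are assumptions made for this example.

```python
import re

# Matches "FROM [--flag ...] <image> AS <name>" on a single line and captures <name>.
# Stages without an "AS <name>" part are unnamed and therefore ignored.
STAGE_RE = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--\S+[ \t]+)*\S+[ \t]+AS[ \t]+(\S+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def named_stages(dockerfile_text):
    """Return the named build stages in the order they appear."""
    return STAGE_RE.findall(dockerfile_text)


# Hypothetical Dockerfile contents, for illustration only.
example = """\
FROM ubuntu:24.04 AS ubuntu2404-nojdk
RUN apt-get update
FROM ubuntu2404-nojdk AS ubuntu2404
FROM --platform=linux/amd64 ubuntu2404 AS testimage
FROM scratch
"""

print(named_stages(example))
```

Each name returned this way would be a candidate build target for that Dockerfile.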

The workflow is triggered when a push to the main/master or testing branch contains changes to one or more of the CI Dockerfiles (buildkite/docker/*/Dockerfile). It can also be triggered manually via the Actions web UI, the GH REST API or the GH CLI tool, e.g.:

gh workflow run build-ci-docker-images
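
The path condition for push events can be sketched in Python as follows. This only illustrates the glob `buildkite/docker/*/Dockerfile`, where `*` is taken to stand for exactly one directory name. The helper name and the example paths are hypothetical.

```python
import re

# One path segment between "buildkite/docker/" and "/Dockerfile",
# mirroring the glob buildkite/docker/*/Dockerfile.
CI_DOCKERFILE_RE = re.compile(r"^buildkite/docker/[^/]+/Dockerfile$")


def changed_ci_dockerfiles(changed_paths):
    """Keep only the changed paths that are CI Dockerfiles."""
    return [p for p in changed_paths if CI_DOCKERFILE_RE.match(p)]


# Hypothetical list of paths changed by a push.
changed = [
    "buildkite/docker/ubuntu2404/Dockerfile",
    "buildkite/docker/README.md",
    "buildkite/docker/ubuntu2404/extra/Dockerfile",
    "docs/index.md",
]

print(changed_ci_dockerfiles(changed))
```

A push that yields an empty list here would not touch any CI Dockerfile.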

When triggered by a push event the workflow will:

  1. Determine which Dockerfiles were changed.
  2. For each of those Dockerfiles, determine its build context (the Dockerfile's directory) and its build targets (the named build stages in it).
  3. Filter the build targets:
     • first with RE_TARGET_INCLUDE, which defaults to empty so it matches all the named build stages.
     • then with RE_TARGET_EXCLUDE, which by default removes some of the build stages, e.g. deprecated images like centos7 and targets that we don't want to build as images, like the nojdk ones.
  4. Then, exclude all testimage targets, because that image is only used for manually testing the workflow.
  5. Finally, spawn a docker/build-push-action job for each of the build targets. For every image built, it pushes three image tags to the registry:

  • a sha tag with the short hash of the commit that triggered the push
  • a date tag with the current date in ISO format (YYYYMMDD)
  • a latest tag
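
The filtering and tagging steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's code: the function names, the example targets, the exclude pattern `centos7|nojdk`, the 7-character short hash and the bare tag strings (no prefixes) are all hypothetical.

```python
import re
from datetime import date


def filter_targets(targets, re_include="", re_exclude="", drop_testimage=True):
    """Apply the include filter, then the exclude filter, then drop testimage targets."""
    # An empty include pattern matches every target name.
    selected = [t for t in targets if re.search(re_include, t)]
    if re_exclude:
        selected = [t for t in selected if not re.search(re_exclude, t)]
    if drop_testimage:
        selected = [t for t in selected if "testimage" not in t]
    return selected


def image_tags(commit_sha, today, short_len=7):
    """Return the sha, date (YYYYMMDD) and latest tags for one image."""
    return [commit_sha[:short_len], today.strftime("%Y%m%d"), "latest"]


# Hypothetical build targets and exclude pattern.
targets = ["centos7", "ubuntu2404-nojdk", "ubuntu2404", "testimage", "testimage-nojdk"]
print(filter_targets(targets, re_exclude=r"centos7|nojdk"))
print(image_tags("0123456789abcdef0123456789abcdef01234567", date(2024, 11, 13)))
```

Applying the include filter before the exclude filter means an excluded target can never be re-added by a broad include pattern.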

When triggered manually (workflow_dispatch event) the workflow will default to "running in test mode": it will follow the same steps as a push run but with different default values (see workflow_dispatch.inputs):

  • RE_TARGET_INCLUDE set to testimage (a very simple image to exercise the build that doesn't take much compute) and
  • RE_TARGET_EXCLUDE set to the same pattern as in the push event.

This effectively limits the build targets to the testimage targets that are not removed by RE_TARGET_EXCLUDE (which drops, e.g., the nojdk ones).

The "test run" also limits the PLATFORMS to linux/amd64, to further reduce the cost and time of a test run.

Finally, it will build those testimage targets but it won't tag latest or push any of the image tags to the registry.

This "test mode" behavior can be changed by setting the workflow_dispatch.inputs variables: RE_TARGET_EXCLUDE, RE_TARGET_INCLUDE, PLATFORMS, TAG_DATE, TAG_LATEST and PUSH, e.g.:

gh workflow run build-ci-docker-images \
  -f RE_TARGET_INCLUDE=ubuntu2404 -f TAG_DATE=20241101
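
How such overrides combine with the test-mode defaults can be sketched in Python as follows. The input names come from the description above. The value encodings (string booleans, an empty TAG_DATE meaning "use the current date"), the exclude pattern and the `resolve_inputs` helper are assumptions made for this example.

```python
# Hypothetical test-mode defaults for a manual (workflow_dispatch) run.
DISPATCH_DEFAULTS = {
    "RE_TARGET_INCLUDE": "testimage",
    "RE_TARGET_EXCLUDE": r"centos7|nojdk",  # assumed pattern, same as for push
    "PLATFORMS": "linux/amd64",
    "TAG_DATE": "",         # assumption: empty means "use the current date"
    "TAG_LATEST": "false",  # test mode does not tag latest
    "PUSH": "false",        # test mode does not push to the registry
}


def resolve_inputs(overrides):
    """Merge user-supplied inputs over the test-mode defaults."""
    unknown = sorted(set(overrides) - set(DISPATCH_DEFAULTS))
    if unknown:
        raise KeyError(f"unknown inputs: {unknown}")
    return {**DISPATCH_DEFAULTS, **overrides}


# Mirrors the two -f flags of the gh command above.
resolved = resolve_inputs({"RE_TARGET_INCLUDE": "ubuntu2404", "TAG_DATE": "20241101"})
for name, value in resolved.items():
    print(f"{name}={value!r}")
```

Under these assumptions, a run that overrides only RE_TARGET_INCLUDE and TAG_DATE would still build for linux/amd64 only and would not push, because PLATFORMS and PUSH keep their test-mode defaults.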

jjmaestro avatar Nov 13 '24 20:11 jjmaestro

@meteorcloudy Here's a proposal for a GH Actions workflow that will build the CI images for linux/amd64 and linux/arm64. It could easily be adapted to also authenticate to GCR and push the images there. I think it would be as easy as generating an auth token in GCR and adding it as a secret to the repo.

I would then probably modify one or some of the CI images to have a slim version and use one of those as the base public Bazel image, and delete bazel/oci.

Also, as I mentioned in #2020, this action can initially run in parallel with the official builds, until you decide whether this will be the main way of building them (or not :).

Plus, GH Actions also supports self-hosted runners, so it could be adapted to run against capacity in GCP if running in GH is a problem. That said, I think the free minutes should be enough to build the images: they are heavy and there's a bunch of them, but I think it will be OK.

What do you think?

jjmaestro avatar Nov 13 '24 20:11 jjmaestro