feat: build CI Docker images with GH Actions
Fixes #2020
Workflow to build and deploy Bazel CI Docker images
This workflow builds the images, pushes them to GHCR (the GitHub Container Registry) and links them with this repo.
For each Dockerfile it builds different types of images, each corresponding to a named build stage (... AS <NAME>) in the Dockerfile.
The workflow is triggered when a push to the main/master or testing branch contains changes to one or more of the CI Dockerfiles (buildkite/docker/*/Dockerfile). It can also be triggered manually via the Actions web UI, the GH REST API or the GH CLI tool, e.g.:
gh workflow run build-ci-docker-images
When triggered by a push event the workflow will:
- Determine which Dockerfiles were changed.
- For each of those Dockerfiles, determine its build context (the Dockerfile directory) and its build targets (the named build stages in it).
- Filter the build targets:
  - first with RE_TARGET_INCLUDE, which defaults to empty so it will match all the named build stages.
  - then with RE_TARGET_EXCLUDE, which is set by default to remove some of the build stages, e.g. the deprecated images like centos7 and some targets that we don't want to build as images, like the nojdk ones.
- Then, it will also exclude all testimage targets because that image is only used for manually testing the workflow.
- Finally, it will spawn a docker/build-push-action job for each of the build targets. For every image built, it will push three image tags to the registry:
  - a sha tag with the short hash of the commit that triggered the push
  - a date tag with the current date in ISO format (YYYYMMDD)
  - a latest tag
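The target discovery, filtering and tagging steps above can be sketched roughly as follows. This is only an illustration, not the workflow's actual code: the function names (list_targets, filter_targets, image_tags), the sample Dockerfile, the exclude pattern, the "sha-" prefix and the 7-character short hash are all assumptions.

```shell
# Sketch only: helper names, sample data and tag format are hypothetical.

# Print the named build stages ("FROM ... AS <NAME>") of a Dockerfile read from stdin.
list_targets() {
  sed -nE 's/^[[:space:]]*[Ff][Rr][Oo][Mm][[:space:]]+.*[[:space:]]+[Aa][Ss][[:space:]]+([^[:space:]]+)[[:space:]]*$/\1/p'
}

# Keep stdin lines matching the include regex $1 (empty matches everything),
# then drop lines matching the exclude regex $2 (empty excludes nothing).
filter_targets() {
  grep -E -- "$1" | if [ -n "$2" ]; then grep -Ev -- "$2"; else cat; fi
}

# Print the three tags for an image.
# $1 = commit hash, $2 = optional date override (defaults to today, UTC).
image_tags() {
  printf 'sha-%.7s\n' "$1"
  printf '%s\n' "${2:-$(date -u +%Y%m%d)}" latest
}

# Hypothetical Dockerfile content, used only for this demo.
dockerfile='FROM ubuntu:24.04 AS ubuntu2404-nojdk
RUN true
FROM ubuntu2404-nojdk AS ubuntu2404
FROM centos:7 AS centos7
FROM scratch AS testimage'

# Push-event behaviour: empty include, assumed exclude pattern, then drop testimage.
printf '%s\n' "$dockerfile" |
  list_targets |
  filter_targets '' 'centos7|nojdk' |
  filter_targets '' 'testimage'
# prints: ubuntu2404

# Tags for a hypothetical commit on a fixed date.
image_tags 0123456789abcdef 20241101
```

Keeping the include and exclude patterns as two separate regexes means a manual run can narrow the set with RE_TARGET_INCLUDE without having to restate the default exclusions.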
When triggered manually (workflow_dispatch event) the workflow will default to "running in test mode": it will follow the same steps as a push run but with different default values (see workflow_dispatch.inputs):
RE_TARGET_INCLUDE set to testimage (a very simple image to exercise the build that doesn't take much compute) and RE_TARGET_EXCLUDE set to the same pattern as in the push event.
This effectively limits the build targets to only those in testimage not excluded by RE_TARGET_EXCLUDE (e.g. nojdk).
The "test run" also limits the PLATFORMS to linux/amd64, to further reduce the cost and time of a test run.
Finally, it will build those testimage targets but it won't tag latest or push any of the image tags to the registry.
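To make the difference between the two trigger modes easier to see, here is a small sketch that prints the default settings per event. It is not taken from the workflow file: the default_settings function, the EXCLUDE_PATTERN value and the true/false spelling of the boolean inputs are assumptions made for illustration.

```shell
# Sketch only: EXCLUDE_PATTERN is a placeholder, the real default lives in the workflow file.
EXCLUDE_PATTERN='centos7|nojdk'

# Print the default settings for a trigger event ($1 = push | workflow_dispatch).
default_settings() {
  case "$1" in
    push)
      # Full build: all stages not excluded, both platforms, tag latest and push.
      printf '%s\n' \
        "RE_TARGET_INCLUDE=" \
        "RE_TARGET_EXCLUDE=$EXCLUDE_PATTERN" \
        "PLATFORMS=linux/amd64,linux/arm64" \
        "TAG_LATEST=true" \
        "PUSH=true"
      ;;
    workflow_dispatch)
      # Test mode: only testimage targets, one platform, nothing tagged latest or pushed.
      printf '%s\n' \
        "RE_TARGET_INCLUDE=testimage" \
        "RE_TARGET_EXCLUDE=$EXCLUDE_PATTERN" \
        "PLATFORMS=linux/amd64" \
        "TAG_LATEST=false" \
        "PUSH=false"
      ;;
    *)
      echo "unsupported event: $1" >&2
      return 1
      ;;
  esac
}

default_settings workflow_dispatch
```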
This "test mode" behavior can be changed by setting the workflow_dispatch.inputs variables: RE_TARGET_EXCLUDE,
RE_TARGET_INCLUDE, PLATFORMS, TAG_DATE, TAG_LATEST and PUSH, e.g.:
gh workflow run build-ci-docker-images \
-f RE_TARGET_INCLUDE=ubuntu2404 -f TAG_DATE=20241101
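The same override can also go through the GH REST API (the "create a workflow dispatch event" endpoint). The following is a minimal sketch under a few assumptions: the helper names are made up, the workflow file is assumed to be called build-ci-docker-images.yml, and the JSON builder does no escaping, so it only suits simple values like the ones shown.

```shell
# Sketch only: helper names and the workflow file name are assumptions.

# Build the JSON request body.
# $1 = git ref to run on; remaining args are KEY=VALUE workflow inputs.
# No JSON escaping is done, so values must not contain quotes or backslashes.
dispatch_payload() (
  ref="$1"
  shift
  inputs=''
  for kv in "$@"; do
    key=${kv%%=*}
    value=${kv#*=}
    inputs="${inputs:+$inputs,}\"$key\":\"$value\""
  done
  printf '{"ref":"%s","inputs":{%s}}\n' "$ref" "$inputs"
)

# Send the dispatch event with the GH CLI; "gh api" fills in {owner} and {repo}
# from the current repository. Not called here because it needs authentication.
dispatch_workflow() {
  dispatch_payload "$@" |
    gh api --method POST \
      "repos/{owner}/{repo}/actions/workflows/build-ci-docker-images.yml/dispatches" \
      --input -
}

# Show the body that would be sent for the example above.
dispatch_payload main RE_TARGET_INCLUDE=ubuntu2404 TAG_DATE=20241101
```

Running `dispatch_workflow main RE_TARGET_INCLUDE=ubuntu2404 TAG_DATE=20241101` from a checkout of the repo would then be roughly equivalent to the gh workflow run command above.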
@meteorcloudy Here's a proposal for a GH Actions workflow that will build the CI images for linux/amd64 and linux/arm64. It could easily be adapted to also authenticate to GCR and push the images there; I think it would be as easy as generating an auth token in GCR and adding it as a secret to the repo.
I would then probably modify one or more of the CI images to have a slim version, use one of those as the base public Bazel image, and delete bazel/oci.
Also, as I mentioned in #2020, this action can initially run in parallel to the official builds, until you decide if this will be the main way of building them (or not :).
Plus, GH Actions also supports external workers (self-hosted runners), so maybe it could be adapted to run against capacity in GCP if running in GH is a problem. That said, I think the free minutes should be enough to build the images: they are heavy and there's a bunch of them, but I think it will be ok.
What do you think?