[FR] gcr.io/bazel-public/bazel images for linux/arm64
While following the "Getting Started with Bazel Docker Container" tutorial, I tried the Bazel Docker image and got the following warning:
$ docker run --rm gcr.io/bazel-public/bazel version
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
WARNING: Invoking Bazel in batch mode since it is not invoked from within a workspace (below a directory having a WORKSPACE file).
Extracting Bazel installation...
OpenJDK 64-Bit Server VM warning: Options -Xverify:none and -noverify were deprecated in JDK 13 and will likely be removed in a future release.
Build label: 7.2.1
Build target: @@//src/main/java/com/google/devtools/build/lib/bazel:BazelServer
Build time: Tue Jun 25 15:53:05 2024 (1719330785)
Build timestamp: 1719330785
Build timestamp as int: 1719330785
Would it be possible to generate gcr.io/bazel-public/bazel images for linux/arm64 to avoid having to run the image with Rosetta?
Thanks!
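As a stopgap until an arm64 image exists, the amd64 image can be requested explicitly. This is only a sketch: the wrapper name `run_bazel_amd64` is made up for illustration, and it assumes the first warning goes away once a platform is specified (the message itself says it appears because no specific platform was requested). The container still runs under emulation on an arm64 host.

```shell
# Sketch: run the existing amd64 image while stating the platform explicitly.
# run_bazel_amd64 is a hypothetical helper name, not part of any tool.
run_bazel_amd64() {
  # --platform selects the image variant instead of relying on the host default.
  docker run --rm --platform linux/amd64 gcr.io/bazel-public/bazel "$@"
}
```

Usage: `run_bazel_amd64 version`.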
@meteorcloudy just wanted to know if there's any blocker and/or "dependency" blocking this.
From reading https://github.com/bazelbuild/continuous-integration/blob/9db28ad30766734e1b1ec2fb166b66d85c47a6a9/bazel/oci/README.md?plain=1#L1-L10
I understand that the images can be built anywhere and the only requirement is that whoever builds them has push permissions to the registry.
If so, would it be possible to do this? I understand it's labeled as P2, but it's really doable in minutes! 😬
Also, would you be open to automatically building the Docker image(s) with GitHub Actions? That way it would be fully automated.
Thanks!
One major blocker is that we don't currently have a Linux arm64 machine on Bazel CI.
One question: can we actually push different images for different architectures under gcr.io/bazel-public/bazel? Or does it have to be named something like gcr.io/bazel-public/bazel/arm64?
One major blocker is that we don't currently have Linux arm64 machine on Bazel CI.
But even if you don't use it to run Bazel CI jobs in arm64, people can still use the image as the "official Bazel Docker image". As I mentioned when I opened the issue, the example in the Bazel doc to build using a Docker image does use this image :)
Can you confirm that the images are built by hand by one of you? Or do you actually use remote build servers, and you don't have arm64 servers there to produce the image?
One question, can we actually push different images for different arch under gcr.io/bazel-public/bazel? Or it has to be named to something like gcr.io/bazel-public/bazel/arm64?
When I build images, I usually build multi-platform images with --platform linux/amd64,linux/arm64. I think those would just show in the registry as one image and then docker pull would pull the right image by default matching the host architecture (unless you specify it with --platform IIRC).
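For reference, a minimal sketch of the multi-platform build described above. The function names and the image argument are placeholders for illustration; `docker buildx build --platform ... --push` and `docker buildx imagetools inspect` are existing Docker CLI commands, but whether a given builder can produce both platforms depends on its setup.

```shell
# Sketch: build one tag for several platforms and push it as a single image.
# build_multiarch and show_platforms are hypothetical helper names.

# Build the Dockerfile in the current directory for each listed platform
# and push the result (a manifest list) under one tag.
build_multiarch() {
  image="$1"                                    # e.g. registry/name:tag (placeholder)
  platforms="${2:-linux/amd64,linux/arm64}"     # default matches the flag quoted above
  docker buildx build --platform "$platforms" --tag "$image" --push .
}

# Print the manifest list for a tag, to check which platforms it provides.
show_platforms() {
  docker buildx imagetools inspect "$1"
}
```

With a tag pushed this way, a plain `docker pull` should pick the variant matching the host architecture, as noted above.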
If you have access to an M{1,2,3,4} Mac, you can try building a multi-platform image there. If so, you should use Docker Desktop because it already comes with the multi-platform QEMU VMs that are needed to build such images.
I'm happy to help with this, as I'd like to have the image so I can use it on my Mac! :)
We are actually building those images in a CI pipeline on a GCP Linux VM. Do you mean we can actually build the arm64 image on an amd64 machine by adding --platform linux/amd64,linux/arm64?
We are actually building those image in a CI pipeline on a GCP Linux VM
Out of curiosity, is the config for this setup public? I haven't checked the repo thoroughly, maybe it's already here :-?
do you mean we can actually build the arm64 image on a amd64 machine by adding --platform linux/amd64,linux/arm64?
You should probably be able to set it up, yeah. Docker Desktop on Mac already comes with the right QEMU VMs and all the "plumbing", so it's as easy as just using the --platform flag. In your case, my guess is that you'll have to set things up manually.
From the link I pasted re. building multi-platform images, there are some prerequisites (mostly, enabling the containerd image store) and different strategies to build the images.
One of those is using build clusters with multiple native nodes, but the other one is the one I mentioned: using QEMU VMs.
Since you already run in a VM, I guess it'll all depend on being able to use nested virtualization and then following those steps in the Docker docs to set up QEMU inside the VM and configure Docker manually to run with QEMU.
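A rough sketch of the QEMU route on an amd64 Linux host, assuming Docker with the buildx plugin is installed. The function names and the builder name `multiarch` are placeholders. The `tonistiigi/binfmt` image is the one the Docker documentation uses to register QEMU handlers; as far as I know it relies on user-mode emulation through binfmt_misc, so nested virtualization may not be required, but that has not been checked against the Bazel CI VMs.

```shell
# Sketch: prepare an amd64 Linux host to build arm64 images through emulation.
# install_arm64_emulation and create_builder are hypothetical helper names.

# Register QEMU binfmt handlers for arm64 (needs a privileged container).
install_arm64_emulation() {
  docker run --privileged --rm tonistiigi/binfmt --install arm64
}

# Create and select a buildx builder backed by the docker-container driver,
# which can build for more than one platform in a single invocation.
create_builder() {
  builder_name="${1:-multiarch}"   # placeholder name
  docker buildx create --name "$builder_name" --driver docker-container --use
}
```

After both steps, a build with `--platform linux/amd64,linux/arm64` should be possible on that host, though emulated arm64 builds are typically slower than native ones.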
I've just checked and it looks like you are already using GH Actions to build the release for the bazelci-agent so... I insist, why not just do away with the GCP VM and use the Docker GitHub Action? 😝
It does support multi-platform images out-of-the-box: 😄 https://docs.docker.com/build/ci/github-actions/multi-platform/ And it'll be 10x easier than setting up a nested VM in GCP, I think!
The pipeline config is at https://github.com/bazelbuild/continuous-integration/blob/master/pipelines/docker-update.yml
why not just do away with the GCP VM and use the Docker GitHub Action?
Hmm, the problem is that we still need to push the image to GCP's artifact registry, and there are some security implications if we want to do that from a GitHub Action. We currently don't have time to properly set that up.
Hmm, the problem is that we still need to push the image to GCP's artifact registry, there is some security implications if we want to do it in GitHub Action. We currently don't have time to properly set that up.
Ah, I see... then, here's another idea:
You could set up and use the GH Action to create the Docker image and push it to GH's ghcr.io registry. That should work out-of-the-box and, once the image is published in the ghcr.io registry, you can simply change the external pipeline to pull from there and push it to the gcr.io registry.
Also, if the image is in the GH registry, you can link the image to the repo.
IMHO, this way, you get the best of both worlds: you get to build multi-platform images easily, the images are clearly associated to the repo and you get two mirrors for the images 🚀
Plus, the setup can be done in parallel, so that you keep the current external pipeline and only if everything looks good on the GH side, you can decide to change it to a "pull from ghcr.io" pipeline instead of building the image there.
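The "pull from ghcr.io and push to gcr.io" step could look roughly like the sketch below. Both repository paths in the comments are illustrative (OWNER and TAG are placeholders), `sync_image` is a made-up name, and it assumes the pipeline is already logged in to both registries. `docker buildx imagetools create` is an existing command that can create a tag from an existing image in another registry.

```shell
# Sketch of a "sync pipeline" step: copy a tag from one registry to another.
# sync_image is a hypothetical helper name.
sync_image() {
  src="$1"   # e.g. ghcr.io/OWNER/bazel:TAG (placeholder)
  dst="$2"   # e.g. gcr.io/bazel-public/bazel:TAG
  # Copies the manifest list, so every platform of the source tag is
  # carried over without rebuilding anything.
  docker buildx imagetools create --tag "$dst" "$src"
}
```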
What do you think? 😄
Sorry, I just don't see much value in creating and maintaining another pipeline for now. In fact, I don't quite understand why people need those Docker images for Bazel at all. Why not just download the Bazel binary with Bazelisk? It's much easier.
So then, why do you have Bazel CI images that don't use Bazelisk? And why are you basing your image off the official Ubuntu image? 😄
IMO people definitely will benefit from having official images in many ways. Here are a few off the top of my head:
- First and foremost, having an official Docker image is good in and of itself. It's the standard on which people can base their own custom images. Also, you can do some "heavy lifting" in the base image, e.g. set up Bazel autocompletion, so that others can benefit by just extending your image without having to maintain all of those parts.
- Reproducibility (which is one of the big selling points of Bazel itself): it's nice to have a guarantee that everybody is running the exact same image when e.g. reporting a bug with a repro, so that everybody can easily test it within the exact same environment.
- Extending on that point, people will want to quickly run something without having to whip a whole image from scratch. For example, someone beginning to play with Bazel can just run the "Getting started with Bazel" tutorial, which as I mentioned when I opened the issue, was my exact case. As easy as it is to run Bazelisk, forcing people to install a binary that will, in turn, download another binary and/or create a whole Dockerfile that won't be just a one-liner VS docker run gcr.io/bazel-public/bazel is a higher barrier for adoption IMHO.
- Some people like me will be working in MacOS while developing for Linux, so again, not having to maintain my own Docker image is nice.
- Now, to your specific point of "just download the Bazel binary with Bazelisk", why aren't you doing it yourselves in the official CI Image? :) Because (1) you want to pin it to one version for reproducibility (2) you don't want to download it at runtime (this one is "easily" avoided by running bazelisk at build time and caching the result in the Docker image, but still)
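The "run bazelisk at build time and cache the result" idea from the last bullet can be sketched as follows. This assumes Bazelisk is already installed in the image; `prefetch_bazel` is a made-up name, while `USE_BAZEL_VERSION` is the environment variable Bazelisk reads to select a release.

```shell
# Sketch: pin a Bazel version and download it while the image is being built,
# so nothing needs to be fetched when the container runs.
# prefetch_bazel is a hypothetical helper name.
prefetch_bazel() {
  # Running any Bazelisk command with USE_BAZEL_VERSION set makes it
  # download that release into its cache before executing it.
  USE_BAZEL_VERSION="$1" bazelisk version
}
```

Calling `prefetch_bazel 7.2.1` from a build step would leave that release in the image layer. Containers started from the image would still need `USE_BAZEL_VERSION` (or a `.bazelversion` file) set to the same value to reuse it.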
Finally, re. maintaining two pipelines, you are already maintaining a bunch more, plus you already maintain a GitHub workflow to publish the release artifact. Sure, it's one more pipeline, but it would take most of the steps off the current one, so it would be a trade: the current build pipeline would just become a simple "sync pipeline".
For the cost of "maintaining another pipeline", you would be gaining (1) another mirror for the images and more importantly (2) multiple platforms that people can then test against, using the official Bazel CI image. Currently, people can only test one architecture. In my book, it's a good trade-off.
Anyway, thanks for all the info and explaining everything. If you change your mind, I'm happy to help with this! I'll probably build my own Bazel image and will probably set up the build pipeline, so it should be easy to copy it in here. If I do, I'll keep this issue up-to-date so we can talk more in the future about this.
Thanks!!
We are actually not using gcr.io/bazel-public/bazel on our CI; instead, we are using Docker images like gcr.io/bazel-public/ubuntu2404, built from https://github.com/bazelbuild/continuous-integration/blob/b05134e3758ff0fd0bdd504de1cdd0c7333d76fc/buildkite/docker/ubuntu2404/Dockerfile#L71-L72
which only have Bazelisk installed.
gcr.io/bazel-public/bazel was only built because some users requested it and is never used by the Bazel team. In general, it's a maintenance burden for us. Since we now have https://github.com/bazel-contrib, I think it would be nice if the community could take over and own this.
Here's my take on Bazel images being built with GH Actions: https://github.com/jjmaestro/bzldocker 😄
We are actually not using gcr.io/bazel-public/bazel on our CI, instead we are using docker images like gcr.io/bazel-public/ubuntu2404 built from (...)
Oh, I see! Cool, thanks! I'll definitely have a look and compare with my approach!
gcr.io/bazel-public/bazel was only built because some users requested it and is never used by the Bazel team. In general, it's a maintenance burden for us. Since we now have https://github.com/bazel-contrib, I think it would be nice if the community could take over and own this.
Happy to kick it off! I can whip up something based off what I've been playing with in my repo and what you already have / already use.
What do you think?
@jjmaestro Sounds good! Maybe you can start a new proposal here https://github.com/bazel-contrib/SIG-rules-authors/discussions
OK, I've checked bazel/oci/ VS buildkite/docker/ and I see what you mean.
IMHO
- I'd still keep "something like bazel/oci/" as a sort-of "minimal image", something equivalent to the "base flavor" in my bzldocker.
- I'd kill the build.sh and push.sh scripts and turn them into GH actions that run like the one I wrote in bzldocker: build and push to GHCR and, potentially, push to GCR as well (I think it could be as easy as adding a GCR token to the secret vault in the bazelbuild/continuous-integration repo).
- Also, maybe consider refactoring the Dockerfiles since there's probably a lot of repeated code (e.g. between Debian and Ubuntu, and Ubuntu versions).
@jjmaestro Sounds good! Maybe you can start a new proposal here https://github.com/bazel-contrib/SIG-rules-authors/discussions
Cool! Will do! 🙌
Are there any guidelines about how to write a proposal? I'll go over open and closed ones to learn, but if there's any documentation, please let me know :)
Will these images be maintained at bazel-contrib? Is there a proposal / discussion already for this?
Will these images be maintained at bazel-contrib?
IMHO that would be the desired outcome, I think.
Is there a proposal / discussion already for this?
I haven't done anything yet, sorry :S I'll try to write something soon! 😅
I did play with the GH Action a bit and I think I got it mostly working, check jjmaestro/bzldocker. I just have to sit down and write a proposal... I'll ping back when I do, hopefully before the end of the month!