
Add pre-compiled ARM binaries for releases

Open zaolin opened this issue 7 years ago • 59 comments

Hey guys,

Feature Request

I wanted to use concourse CI workers on my Raspberry Pi 3 for doing QA with x86 firmware. The QA setup should be really cheap (< $100) so that people can attach their own test stand. Would it be possible to deliver pre-compiled ARM binaries for releases?

Best Regards, Zaolin


Edit from @taylorsilva April 2022

No plans for official ARM images yet because it's hard to add this to our process. AWS is the only cloud provider with ARM instances and we do everything on GCP currently. Adding a single ARM instance is therefore really hard. This will be more feasible once more cloud providers add ARM instances to their offerings.

Currently this comment, farther down in this issue, is your best option: https://github.com/concourse/concourse/issues/1379#issuecomment-929625288

zaolin avatar Jul 14 '17 12:07 zaolin

What is the status on this?

Niraj-Fonseka avatar Mar 16 '18 21:03 Niraj-Fonseka

@Niraj-Fonseka it's currently in our icebox under the Operations project. Unfortunately it's not a priority for the team right now.

jama22 avatar Mar 19 '18 19:03 jama22

@jama-pivotal I'm also interested in this, and open to building my own binaries, but I can't find any documentation anywhere about how to build the set of concourse binaries. Could you please point me to documentation?

brownjohnf avatar Apr 02 '18 17:04 brownjohnf

@brownjohnf we have some getting-started material for engineering contributors in our contributing guide: https://github.com/concourse/concourse/blob/master/CONTRIBUTING.md

Not sure if that has enough specific detail to get you started though...

/cc @vito to see if he has any more resources

jama22 avatar Apr 03 '18 21:04 jama22

I also tried to take a look at that and to be honest I struggled quite a bit.

I looked at the existing pipelines at https://github.com/concourse/pipelines to run with my own concourse setup, and I also tried to build things locally on my machine (and various combinations thereof), but I didn't get very far... I have a local concourse setup I can experiment with; it's just that duplicating the whole pipeline touches on so many custom things (versioning in S3, all the BOSH stuff maybe?) that shouldn't be necessary just to build, and yet they seem so integrated.

So any help would really be appreciated. I might be able to help out with various things in that context, since we do have a new requirement for building ARM stuff here. The concrete thing I need is a concourse binary built with GOARCH=arm.
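
For what it's worth, the invocation I have in mind is roughly the following; the package path is a guess on my part and CGO might complicate things for some components, so treat it as a sketch rather than a recipe:

  # rough sketch: cross-compile the concourse binary for 32-bit ARM from an
  # amd64 Linux host (use GOARCH=arm64 instead for aarch64 workers);
  # the package path is illustrative and may differ between concourse versions
  export GOOS=linux GOARCH=arm GOARM=7 CGO_ENABLED=0
  go build -o concourse ./cmd/concourse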

neumayer avatar Jul 25 '18 07:07 neumayer

tl;dr

There are armv7 and aarch64 builds available that work with caveats.

Building

I had a go at this and went down a deep rabbit hole, but I finally managed to cross-compile both an armv7 and an aarch64 concourse binary. This was a chicken-and-egg problem because the normal build process of concourse uses concourse itself, and the build process pulls in various prebuilt resources from other build processes.

So I took the path of building everything from scratch using a bash script that bootstraps a concourse binary. The bash script more or less follows what the pipeline does. The idea is that this stage0 concourse binary can then be used on an actual arm machine to run the normal build pipeline and produce the concourse binary.

There are some build issues with both armv7 and aarch64; I think I fixed all of them, and some of the fixes are already upstreamed:

  • https://github.com/concourse/baggageclaim/pull/11
  • https://github.com/cloudfoundry/guardian/pull/118
  • https://github.com/concourse/time-resource/pull/31

You can find the bootstrap repo here: https://github.com/resin-io/concourse-arm. To cross-compile, all you need is a Linux system with a working Go and Docker installation. You can also find pre-compiled binaries on the GitHub releases page.

Running

I haven't yet tested the arm64 version but I believe it will work without issues.

Unfortunately there are still some runtime issues with armv7 builds, which are due to a bug in golang's syscall package. If you attempt to run concourse on 32-bit hardware with user namespaces enabled you'll get:

panic: integer overflow on token 4294967295 while parsing line "         0          0 4294967295"

goroutine 1 [running]:
github.com/concourse/baggageclaim/uidgid.must(0x0, 0xc3050c0, 0x1d7d9650, 0xc3050c0)
	/home/petrosagg/projects/concourse-arm-stage0/workdir/concourse/src/github.com/concourse/baggageclaim/uidgid/max_valid_uid.go:81 +0x40
github.com/concourse/baggageclaim/uidgid.MustGetMaxValidUID(0x0)
	/home/petrosagg/projects/concourse-arm-stage0/workdir/concourse/src/github.com/concourse/baggageclaim/uidgid/max_valid_uid.go:22 +0x40
github.com/concourse/baggageclaim/uidgid.NewPrivilegedMapper(0x1659e01, 0x5)
	/home/petrosagg/projects/concourse-arm-stage0/workdir/concourse/src/github.com/concourse/baggageclaim/uidgid/mapper_linux.go:14 +0x14
github.com/concourse/baggageclaim/baggageclaimcmd.
[...]

This is because the entries in /proc/<pid>/uid_map are unsigned 32-bit integers, but baggageclaim uses golang's syscall package, which defines them as ints.

A potential way forward (fixing go aside) would be to use runc's libraries, which have already fixed this issue: https://github.com/opencontainers/runc/pull/1819. It's still the wrong datatype, but at least it doesn't overflow.
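
To make the failure concrete, the value that blows up is the mapping length the kernel reports for a process outside a user namespace (this is where the 4294967295 in the panic above comes from):

  # a process outside a user namespace gets the identity mapping, whose length
  # field is 4294967295 (2^32 - 1); parsing that into a signed 32-bit int on
  # armv7 overflows, which is exactly the panic shown above
  cat /proc/self/uid_map
  #          0          0 4294967295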

Resource types

The official concourse binaries currently ship with the following resource types embedded:

  • bosh-deployment
  • bosh-io-release
  • bosh-io-stemcell
  • cf
  • docker-image
  • git
  • github-release
  • hg
  • pool
  • s3
  • semver
  • time
  • tracker

Of those I only cross-compiled:

  • docker-image
  • git
  • s3
  • time

This means that the current binaries are not yet able to run the full concourse pipeline and build themselves, since that pipeline uses more resource types, but you can definitely use them if your pipelines don't need the missing ones. Depending on the resource type, cross-compiling could be as simple as switching the base image. I expect the BOSH ones to be the trickiest.
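
As a rough sketch of what "switching the base image" looks like in practice, assuming the resource repository ships a Dockerfile on top of a multi-arch base such as alpine (the repo and tag below are just illustrative):

  # run on an ARM host (or under emulation) so the base image resolves to the
  # ARM variant; the resulting tag name is arbitrary
  git clone https://github.com/concourse/s3-resource
  cd s3-resource
  docker build -t s3-resource:armv7 .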

Upstreaming

Currently the build process changes the projects in a way that breaks normal amd64 builds. So there is work left to be done to have a multi-arch build process that can be upstreamed.

petrosagg avatar Aug 11 '18 20:08 petrosagg

@petrosagg Thanks for looking into this!

vito avatar Aug 13 '18 15:08 vito

Yeah, thanks a lot. I'll try to set this up myself and will report back on how it goes :-)

neumayer avatar Aug 14 '18 07:08 neumayer

I just pushed a couple of PRs for the int overflow issue

  • https://github.com/concourse/baggageclaim/pull/12
  • https://github.com/cloudfoundry/idmapper/pull/3

and published a new arm binary as v3.14.1-rc2 :)

https://github.com/resin-io/concourse-arm/releases/tag/v3.14.1-rc2

It now initialises correctly and I can load the UI. I haven't done further testing or actual builds yet.

petrosagg avatar Aug 16 '18 12:08 petrosagg

I managed to successfully run an aarch64 worker (before your update).

Now I just need to figure out how to bootstrap the aarch64 image used to build another aarch64 image in my pipeline :-)

neumayer avatar Aug 16 '18 14:08 neumayer

@neumayer nice! What do you mean by bootstrapping the aarch64 image? We (resin) build the cross-platform base images that I used for this, so if you have any questions about how the cross-compilation works or how to run them natively, I can help.

petrosagg avatar Aug 16 '18 18:08 petrosagg

The actual task I'm trying to solve is to make concourse build (and publish) docker images for me. In my normal (x86_64) workflow I use a dind image to build docker images (well, build and run inspec tests). Naturally, the dind image that I usually use is not aarch64. It is built with concourse itself in another pipeline. So I think what I will look at next is how to create multi-arch images in a nice way in concourse.

This is a bit new to me but I'll keep you posted once I get that working (and especially if I don't get it working :-)). Thanks for the offer, I just need to find the time to read up on the multi-arch docker thing a bit first.

neumayer avatar Aug 17 '18 08:08 neumayer

Hi!

I have some updates on my multi arch journey :-)

One thing I thought would be great for this whole effort is to make all the images involved multi-arch aware. I.e. the images I want to use in concourse should exist for both amd64 and aarch64, and this is easily accomplished by using alpine-based images: wherever they are built, the architecture is properly propagated, the right repos are set up, and the packages for the right architecture are installed. I don't know how this works for other base images (I do expect some issues there, and at some point I think it'll be inevitable to have one Dockerfile per architecture or lots of if statements in the Dockerfiles).
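
A quick way to see what I mean (purely illustrative): pulling plain alpine resolves to the native architecture of whatever host pulls it.

  # prints x86_64 on an amd64 host and aarch64 on an arm64 worker, so the same
  # FROM alpine line pulls in the right packages on each architecture
  docker run --rm alpine uname -m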

I built a modified dind image that supports multi-arch, which in turn can be used to build other docker images. It's a bit of a chicken-and-egg problem, but once the initial image is bootstrapped you're ready to go.

Then I made a concourse pipeline that uses this image with two tasks: one for amd64 (i.e. the normal job) building an xxx:amd64 image, and one with a tag specified so the right worker is used (aarch64 is the tag I use for the aarch64 workers) to build an xxx:aarch64 image. There's a third task that builds and pushes a manifest referencing both images (xxx -> xxx:amd64, xxx:aarch64).
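
The manifest task boils down to something like the following (image names are placeholders, and at the time of writing the docker CLI's manifest subcommands are still experimental):

  # xxx is a placeholder image name; the two per-arch tags must already be pushed
  export DOCKER_CLI_EXPERIMENTAL=enabled
  docker manifest create xxx:latest xxx:amd64 xxx:aarch64
  docker manifest annotate xxx:latest xxx:aarch64 --os linux --arch arm64
  docker manifest push xxx:latest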

I'm quite new to the multi-arch stuff in docker, but it seems to me that the only viable way forward is to add architecture tags to all images anyway (so the manifests can refer to them later).

There's one thing I noticed: when concourse pulls the image for the aarch64 task, i.e. the one that is run on the aarch64 host, it seems to ask for the amd64 image explicitly (via the architecture setting in docker). So my assumption is that the code somewhere has some custom logic that falls back to the amd64 architecture rather than propagating the architecture of the worker in a generic way. I'll double-check that if I find the time; it's just a theory so far. When I pull the image without tags from inside the build container, the right image is pulled (via the docker binary). The easy workaround for now is to add aarch64 tags to both the task and the image used in the task.

Ideally this would be integrated into the docker-image resource somehow, but I don't see how the manifest stuff fits there; maybe a separate docker-manifest resource. If both the docker-image and docker-manifest resources agreed on the same set of architecture tags to check for, this might work nicely (the docker-image resource gets a multi-arch flag that adds these tags, and the manifest resource takes the outputs and builds manifests for all known architectures). If my assumption from before holds, the actual code changes would be minimal (just propagate the architecture properly).

I'll try to post updates when I have time to look for the architecture propagation code.

neumayer avatar Sep 04 '18 07:09 neumayer

I have a short update on my previous ramblings about the architecture propagation code being buggy. That is not true: it works as intended; I had mistagged my images. That happens a lot when building multi-arch images :-), keeping things consistent seems to be the main challenge.

Anyway, I've been running this rogue aarch64 worker from that binary for almost three months, and so far it's been very stable and has worked well together with the concourse web instance (4.0.0).

So is there any chance of producing official arm64 binaries? And is there anything I can do to make that easier (even with my limited understanding of the concourse build process)?

neumayer avatar Oct 31 '18 10:10 neumayer

Hey @neumayer, I followed your journey with great interest. I'm currently thinking of using concourse on my Raspberry Pi 3, which is how I stumbled over this issue. Can you give us a short description or mini tutorial on how you achieved your goal? Perhaps we could manage to create a PR to integrate the changes into the build pipeline.

arrkiin avatar Nov 08 '18 09:11 arrkiin

I just used @petrosagg's binary.

neumayer avatar Nov 08 '18 10:11 neumayer

Ok, thanks for the info. I will look into https://github.com/resin-io/concourse-arm and try to get to the point of using it on my raspi.

arrkiin avatar Nov 08 '18 10:11 arrkiin

The question of #arm64 builds of Concourse came up yesterday at DockerCon; I am looking forward to seeing this support from my point of view at the @worksonarm project.

vielmetti avatar Dec 05 '18 06:12 vielmetti

Just here to give my +1 for having an ARM build.

Bo0mer avatar May 30 '19 16:05 Bo0mer

I took another look after concourse 5 came out, hoping the new build structure would be easier to deal with when it comes to ARM. That was true: creating the binary was super easy. But since the resource images are packaged separately now, those of course were missing, and I did not investigate how they are built now. At least that's what I remember; it's been a couple of months since I last looked at this.

But at the latest when the worker protocol changes, I'll have to take another look.

neumayer avatar May 31 '19 10:05 neumayer

Hey,

I was working on https://github.com/cirocosta/concourse-arm some time ago, which builds Concourse v5.2.0 with Armv7 and Arm64 support; maybe that'd be interesting for you. It's not well polished yet (more of a proof of concept before bringing better-structured multi-arch support to our builds), but at the moment it works well (despite the lack of seccomp; see the guardian fork included as a submodule in the repo).

Thanks!

cirocosta avatar May 31 '19 12:05 cirocosta

Any update on this? Given AWS is pushing Graviton2 as next gen it'd be really handy if Concourse came with arm64 support out of the box.

analytically avatar Nov 12 '20 10:11 analytically

Definitely see this getting higher priority in the future, but it's currently not on the roadmap.

taylorsilva avatar Nov 12 '20 20:11 taylorsilva

@taylorsilva @vito given Apple's M1 chip is also arm64, I think it should definitely be added to the road map

analytically avatar Nov 12 '20 21:11 analytically

In the meantime, I can offer this.

We've been using native ARM workers via docker for quite some time now, and containerd on ARM has been working well here for months. I have uploaded my almost self-contained build setup for creating a multi-platform docker image to run workers with. I say almost self-contained because the registry-image resource (the only resource needed to run build jobs on ARM) must be built first, then zipped. The reason for this is really to use the ARM version of the alpine image that the registry-image resource is based on.

https://github.com/neumayer/concourse-arm-worker

It does not cross-compile, so you need a working ARM machine with docker to build it (so that it uses the correct ARM versions of the alpine images). It is not 100 percent up to date, but uses the git sources for concourse v6.5.1 (roughly). The way I build it is with a concourse task that runs on the existing ARM worker with a multi-arch docker-in-docker build (not shared, but it basically calls the build script in a dind-arm container).
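
Running it then looks roughly like running any other containerised concourse worker; the image name, key paths, and mounts below are placeholders, so check the repo's README for the actual invocation:

  # placeholders throughout; the worker needs --privileged and the usual
  # concourse worker settings passed in as environment variables
  docker run --privileged \
    -e CONCOURSE_TSA_HOST=web.example.com:2222 \
    -e CONCOURSE_TSA_PUBLIC_KEY=/keys/tsa_host_key.pub \
    -e CONCOURSE_TSA_WORKER_PRIVATE_KEY=/keys/worker_key \
    -v "$PWD/keys:/keys" \
    concourse-arm-worker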

I wanted to update to the newest release soonish, if I remember I can try updating here and on github (I remember reading some new guardian versions and parameter changes from the release pages, but should be fine to integrate with this).

neumayer avatar Nov 13 '20 06:11 neumayer

Given AWS is pushing Graviton2 as next gen it'd be really handy if Concourse came with arm64 support out of the box.

and yeah, the numbers tell quite a nice story for Graviton2 when it comes to performance/$:

https://www.phoronix.com/scan.php?page=article&item=amazon-graviton2-benchmarks&num=7

"To little surprise, the M6g instances were generally offering the best performance-per-dollar."

cirocosta avatar Nov 13 '20 12:11 cirocosta

Our team shrank by quite a bit recently, so I don't really know when we'll have the bandwidth for this, but I'll optimistically put it at the bottom of the (pretty shallow) backlog so we don't forget about it, since it definitely makes sense to at least have this on the radar.

Realistically, this seems like it might involve a lot of pipeline plumbing and infrastructure support, so it's hard to square something like this with the other high-priority items on the roadmap (v10, native K8s runtime) given that we can only really do 3 things in parallel at the moment.

If anyone is interested in attempting to move this forward from the outside, e.g. by forking our pipelines and submitting PRs, we'd be happy to help.

/cc @scottietremendous @matthewpereira

vito avatar Nov 26 '20 16:11 vito

One more +1 for official aarch64 binaries!

martin-g avatar Jan 05 '21 13:01 martin-g

Adding support for ARM64 would be awesome! I hope it can be done soon! Great work @concourse/team!

emiliofernandes avatar Jan 12 '21 07:01 emiliofernandes

I've written a blog article about how to run a worker node on ARM64, based on https://github.com/neumayer/concourse-arm-worker - https://martin-grigorov.medium.com/concourseci-on-arm64-229833883ca9 Thanks to @neumayer and @taylorsilva for their help!

martin-g avatar Jan 20 '21 11:01 martin-g