obi
add explicit queues to buildkite agents
Right now, we use the default Buildkite queue for every job in the infrastructure. A big bulk build can take every agent slot for an architecture, leaving all other jobs stuck in the queue behind it.
To solve this, we need the following queues:
- bulk: for bulk builds
- infra: for infrastructure containers
- default: misc stuff
Unfortunately, there is no way to prioritise queues, so we need to overcommit them on the agents: each host runs more agent slots in total than it can comfortably serve at once. If every queue is maxed out at the same time, the hosts will run at high load averages. Builds will be slow in that case, but nothing will fail.
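To make the overcommit concrete, here is a rough Python sketch of how a per-host slot plan could be checked. The slot counts and the helper names `overcommit_ratio` and `agent_tags` are made up for illustration and are not part of our setup. It assumes each agent process runs one job at a time and picks its queue through a `queue=<name>` tag.

```python
import os

# Hypothetical per-host plan: number of agent processes per queue.
# Queue names come from this issue; the counts are illustrative only.
SLOTS_PER_HOST = {"bulk": 4, "infra": 2, "default": 2}


def overcommit_ratio(slots, cores):
    """Return worst-case concurrent jobs divided by available cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return sum(slots.values()) / cores


def agent_tags(slots):
    """Expand the plan into one 'queue=<name>' tag string per agent process."""
    return [f"queue={name}" for name, count in slots.items() for _ in range(count)]


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    tags = agent_tags(SLOTS_PER_HOST)
    ratio = overcommit_ratio(SLOTS_PER_HOST, cores)
    print(f"{len(tags)} agents on {cores} cores, overcommit ratio {ratio:.2f}")
```

A ratio above 1.0 means the host is overcommitted when all queues are busy, which is the slow-but-not-fatal case described above.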
There is some discussion on https://github.com/buildkite/feedback/issues/147 about improving this situation upstream in Buildkite, but for now overcommit should be fine for us.
We addressed this by auto-scaling our agent nodes up and down according to metrics from buildkite/buildkite-metrics.
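For anyone wanting to try the same approach, the scaling decision could look roughly like the Python sketch below. The comment above does not say which metrics or policy were used, so this is an assumption: it takes the count of scheduled (waiting) and running jobs for a queue as plain integers, and `desired_nodes`, `jobs_per_node`, `min_nodes`, and `max_nodes` are hypothetical names.

```python
import math


def desired_nodes(scheduled, running, jobs_per_node=1, min_nodes=1, max_nodes=10):
    """Nodes needed to cover waiting plus running jobs, clamped to the bounds."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    needed = math.ceil((scheduled + running) / jobs_per_node)
    # Never scale below min_nodes or above max_nodes.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    # Illustrative numbers: 10 jobs waiting, 6 running, 4 agent slots per node.
    print(desired_nodes(10, 6, jobs_per_node=4))
```

Counting running jobs as well as waiting ones keeps the target from dropping while builds are still in progress, so busy nodes are not scaled away mid-build.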