Set up Buildkite as an alternative to AZP

Open vchuravy opened this issue 2 years ago • 11 comments

TODO:

  • [ ] Nail down artifact upload story
    • Using buildkite upload is neat since we get something in the UI
    • Ideally we should upload directly to our own buckets, but need to make sure not to leak keys
      • Should be good citizens and not cause tremendous traffic to BK buckets
      • Can we generate ephemeral keys? We could inject those from the generator script
  • [ ] Test jll_init and register steps
  • [x] CCACHE/distcc
  • [x] Limit concurrency of registration
  • [ ] @giordano hook up remarks from binary verifier to BK annotations (maybe a custom logger?)
  • [ ] Party like it's the 90s

vchuravy avatar Apr 16 '22 21:04 vchuravy

Ah great, we can also extract warnings from the logs and turn them into annotations: https://buildkite.com/docs/agent/v3/cli-annotate
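
A minimal sketch of that log-extraction idea, assuming warnings appear as compiler-style lines containing `warning:` (the binary verifier may format its remarks differently). The function names and the `build-warnings` context are hypothetical; the only Buildkite surface used is `buildkite-agent annotate` with its `--style` and `--context` options.

```python
import re

# Assumption: warnings show up as lines containing "warning:" (compiler style).
WARNING_RE = re.compile(r"\bwarning:", re.IGNORECASE)


def extract_warnings(log_text, limit=20):
    """Return up to `limit` distinct warning lines, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        line = line.strip()
        if WARNING_RE.search(line) and line not in seen:
            seen.append(line)
    return seen[:limit]


def annotate_command(warnings, context="build-warnings"):
    """Build the argv for `buildkite-agent annotate`, or None if nothing to report."""
    if not warnings:
        return None
    # The body starts with plain text so the CLI cannot mistake it for a flag.
    body = "Warnings found in the build log:\n\n" + "\n".join(
        f"- `{w}`" for w in warnings
    )
    return ["buildkite-agent", "annotate", body,
            "--style", "warning", "--context", context]
```

Inside a Buildkite job the resulting list could be handed to `subprocess.run(cmd, check=True)`; outside of one, the `buildkite-agent` binary will not be available.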

vchuravy avatar Apr 16 '22 22:04 vchuravy

Some comments about what we currently do in Azure Pipelines. In terms of race conditions:

  • the JLL repo init job is not safe: in some cases we can build multiple versions of the same package (when they're split in different directories, e.g. CUDA, OpenBLAS or LLVM)
  • the build and package registration jobs are safe; however, in some cases registration may cause merge conflicts in the registry (but we have relatively little control over that). Note that currently we do registrations in sequence (#208)

Note that Buildkite is currently also running, and failing, on master: https://buildkite.com/julialang/yggdrasil/builds/4#d85857dd-92f6-43cd-82b5-2827d84e5602. It'd be nice to disable it until this is working.

giordano avatar Apr 17 '22 19:04 giordano

It'd be nice to disable it until this is working

Done.

vchuravy avatar Apr 18 '22 07:04 vchuravy

So the current failure is real, and I don't know why or how it happens:

ERROR: LoadError: failed process: Process(git merge-base --fork-point remotes/origin/master HEAD, ProcessExited(1)) [1]

It always seems to occur when master is ahead of this feature branch, but git does not give a useful error.
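
One plausible cause, offered as a guess rather than a diagnosis: `--fork-point` consults the reflog of the upstream ref, and when it finds no fork point it exits with status 1 and prints nothing, which a fresh CI clone with an almost empty reflog could trigger. A hedged workaround sketch is to fall back to the plain merge base. `find_base` and the injectable `run` parameter are hypothetical names, and this is not what the thread ended up doing.

```python
import subprocess


def _git(args):
    """Run git and return (exit_code, stripped stdout)."""
    proc = subprocess.run(["git", *args], capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def find_base(upstream="remotes/origin/master", head="HEAD", run=_git):
    """Try `merge-base --fork-point` first, then fall back to plain `merge-base`."""
    code, out = run(["merge-base", "--fork-point", upstream, head])
    if code == 0 and out:
        return out
    # No fork point found (exit status 1, empty output): use the common ancestor.
    code, out = run(["merge-base", upstream, head])
    if code == 0 and out:
        return out
    raise RuntimeError(f"no common ancestor between {upstream} and {head}")
```

The `run` argument exists only so the logic can be exercised without a repository; by default the function shells out to the real `git`.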

vchuravy avatar Apr 19 '22 21:04 vchuravy

I guess that's because this doesn't run on the merge commit (which is my gripe with buildkite) :upside_down_face:

giordano avatar Apr 19 '22 21:04 giordano

When I have some free time, I want to get https://github.com/JuliaCI/pull-request-merge-commit-buildkite-plugin working. Then you would just stick that plugin at the top of your list of plugins, and it would automatically take care of checking out the merge commit.

Don't use it yet though, it doesn't work.

DilumAluthge avatar Apr 19 '22 21:04 DilumAluthge

I guess that's because this doesn't run on the merge commit (which is my gripe with buildkite) 🙃

It's particularly weird since I can't reproduce locally

vchuravy avatar Apr 19 '22 21:04 vchuravy

Alright, the minimum functionality of https://github.com/JuliaCI/merge-commit-buildkite-plugin is now there.

Basically, you just add this to the top of your plugin list:

- JuliaCI/merge-commit: ~

So e.g.

steps:
  - plugins:
    - JuliaCI/merge-commit: ~

You'll want to make sure you do this for every step.

Note: eventually, you'll want to replace JuliaCI/merge-commit: ~ with JuliaCI/merge-commit#v1: ~, but that'll have to wait until we first make a tag on the https://github.com/JuliaCI/merge-commit-buildkite-plugin repo.
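
Since the plugin has to be listed first on every step, a generator script could add it programmatically instead of relying on each step being written by hand. The following is a rough sketch under that assumption: the pipeline is a plain dict (as it would be before being dumped to YAML or JSON), `None` stands in for YAML's `~`, and `with_merge_commit` is a hypothetical helper, not part of the plugin.

```python
import copy

MERGE_COMMIT = "JuliaCI/merge-commit"


def _has_merge_commit(plugins):
    """True if the plugin is already listed, with or without a #version suffix."""
    return any(
        key.split("#")[0] == MERGE_COMMIT
        for plugin in plugins
        if isinstance(plugin, dict)
        for key in plugin
    )


def with_merge_commit(pipeline):
    """Return a copy of `pipeline` with the plugin first on every command step."""
    result = copy.deepcopy(pipeline)
    for step in result.get("steps", []):
        if not isinstance(step, dict):
            continue  # e.g. a bare "wait" step
        if not any(k in step for k in ("command", "commands", "plugins")):
            continue  # not a command step
        plugins = step.setdefault("plugins", [])
        if not _has_merge_commit(plugins):
            plugins.insert(0, {MERGE_COMMIT: None})
    return result
```

The sketch leaves a step alone if the plugin is already present, even when it is not first in the list; reordering would need an extra pass.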

DilumAluthge avatar Apr 20 '22 01:04 DilumAluthge

You should run the generator inside Sandbox.jl.

DilumAluthge avatar Apr 20 '22 07:04 DilumAluthge

You should run the generator inside Sandbox.jl.

How can I tell if I am? The cp error just turned out to be a user error

vchuravy avatar Apr 20 '22 08:04 vchuravy

It looks like we are now checking out the merge commit correctly!

DilumAluthge avatar Apr 21 '22 02:04 DilumAluthge

@giordano sadly NUMA has no warnings. Do you have a suggestion for a package that builds quickly and does emit some warnings?

vchuravy avatar Jan 07 '23 16:01 vchuravy

GR has some hundreds of warnings and it shouldn't take too long: https://dev.azure.com/JuliaPackaging/Yggdrasil/_build/results?buildId=24439&view=results

giordano avatar Jan 07 '23 18:01 giordano

From Elliot:

  1. Generator job that analyzes the diff, and launches pipelines based on that.
  2. Builder jobs that do the actual building. We want these to be skipped if they’re building the same thing as was built previously, so we’ll use coppermind (with cryptic, to store the S3 access key) to skip the actual build step if a treehash match is found
  3. Registration job that downloads the binaries from buildkite artifacts again (since coppermind’s caching on S3 is an implementation detail, not the true storage location), then uploads them to GH releases
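
The generator and registration parts of this plan can be sketched roughly as follows. Everything here is an assumption made for illustration: a "project" is taken to be the nearest parent directory containing a `build_tarballs.jl`, the `.buildkite/build.sh` and `.buildkite/register.sh` commands are hypothetical placeholders, and the treehash-based skip via coppermind is not modeled at all. The step attributes (`key`, `depends_on`, `concurrency`, `concurrency_group`) are standard Buildkite ones, with the concurrency limit keeping registrations sequential.

```python
import json
import os
import re
from pathlib import PurePosixPath


def _on_disk(directory):
    """Default project test: the directory holds a build_tarballs.jl."""
    return os.path.isfile(os.path.join(directory, "build_tarballs.jl"))


def changed_projects(changed_files, is_project=_on_disk):
    """Map changed file paths to the nearest enclosing project directory."""
    projects = []
    for path in changed_files:
        for parent in PurePosixPath(path).parents:
            directory = str(parent)
            if directory == ".":
                break  # reached the repository root without finding a project
            if is_project(directory):
                if directory not in projects:
                    projects.append(directory)
                break
    return projects


def make_pipeline(projects):
    """Emit one build step and one serialized registration step per project."""
    steps = []
    for project in projects:
        key = re.sub(r"[^A-Za-z0-9]+", "-", project).strip("-")
        steps.append({
            "label": f"build {project}",
            "key": f"build-{key}",
            "command": f".buildkite/build.sh {project}",  # hypothetical script
        })
        steps.append({
            "label": f"register {project}",
            "key": f"register-{key}",
            "depends_on": f"build-{key}",
            "command": f".buildkite/register.sh {project}",  # hypothetical script
            "concurrency": 1,
            "concurrency_group": "yggdrasil/register",
        })
    return {"steps": steps}
```

A generator job could print `json.dumps(make_pipeline(...))` and pipe it to `buildkite-agent pipeline upload`, which accepts JSON as well as YAML.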

vchuravy avatar Jan 14 '23 20:01 vchuravy

So the last steps are:

  • Use a token for jlbuild instead of one of mine
  • Improve https://github.com/JuliaPackaging/BinaryBuilder.jl/pull/1255 (but we can do that separately and I can revert the BB bump here)

vchuravy avatar Jan 16 '23 13:01 vchuravy

Looking at https://github.com/JuliaBinaryWrappers/HelloWorldC_jll.jl/commit/bd36c4c02a9eb2fb017c95abe647d2c36feee5d9 I believe you aren't exporting the environment variable

YGGDRASIL=true

are you? That's currently set in https://github.com/JuliaPackaging/Yggdrasil/blob/8e7a9c4f1761fcb71c8d2e33688d36ff6fb9f9d2/.ci/azp_agent/.env, so you may have missed it.

giordano avatar Jan 16 '23 17:01 giordano