Set up bare metal GitHub self-hosted runner on Oracle Cloud
Our current bare metal self-hosted runners are going away, so we are working on setting up a bare metal instance in Oracle Cloud as a GitHub self-hosted runner.
So far
- [x] Set up the Bare Metal instance in Oracle Cloud
- [x] Set it up as a GitHub self-hosted runner (but we need to leave it disabled for now, see below)
Todo:
- [ ] Migrate all existing usages of `runs-on: self-hosted` to `runs-on: equinix-bare-metal`
  - This is needed because all self-hosted runners have the `self-hosted` label, and so as soon as we enable the new self-hosted runner, those workflows will sometimes get one and sometimes get the other.
  - Usages of the new self-hosted runner should use `runs-on: oracle-bare-metal-64cpu-512gb-x86-64` to differentiate.
- [ ] Decide if we want to use container-based workflows in order to isolate environments, or do we want to run the jobs directly on the host, in which case what software do we need to install on the host?
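
As a rough sketch of the label migration in the first todo item, a workflow would change along these lines. The workflow name, trigger, job names, and steps are hypothetical; only the runner labels come from this issue, and the sketch assumes each runner has been registered with its custom label.

```yaml
# Hypothetical workflow: only the two runner labels are taken from this issue.
name: benchmark-example
on:
  push:
    branches: [main]

jobs:
  existing-runner-job:
    # Before the migration this job used the generic label, which matches
    # ANY self-hosted runner registered for the repository or organization:
    #   runs-on: self-hosted
    # After the migration it targets the existing machine explicitly:
    runs-on: equinix-bare-metal
    steps:
      - uses: actions/checkout@v4
      - run: echo "Running on the existing bare metal runner"

  new-runner-job:
    # Jobs meant for the new Oracle Cloud machine opt in via its own label.
    runs-on: oracle-bare-metal-64cpu-512gb-x86-64
    steps:
      - uses: actions/checkout@v4
      - run: echo "Running on the new Oracle Cloud bare metal runner"
```

Once no workflow relies on the bare `self-hosted` label, enabling a second runner should not change where existing jobs are scheduled.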
Related to
- #2701 (cc @scottgerring)
- #2789 (cc @cijothomas)
I’ll let @cijothomas and @clhain explicitly answer the questions mentioned in this issue. From my perspective, a key criterion is selecting a solution that maximizes our chances of obtaining benchmark results with minimal external interference and dependencies, ensuring the results remain as stable as possible over time.
Migrate all existing usages of `runs-on: self-hosted` to `runs-on: equinix-bare-metal`
PRs have been sent to all the repos that are using runs-on: self-hosted.
@open-telemetry/python-maintainers @open-telemetry/javascript-maintainers can you review the two PRs below? we need all existing repos to migrate to the new label before we can move forward with adding a second self-hosted runner. thanks!
- https://github.com/open-telemetry/opentelemetry-python/pull/4622
- https://github.com/open-telemetry/opentelemetry-js/pull/5747
Hey @trask cool!
Decide if we want to use container-based workflows in order to isolate environments, or do we want to run the jobs directly on the host, in which case what software do we need to install on the host?
We'd want `unzip`, `rustup`, and `build-essential` (assuming Ubuntu) at least. If you have the inclination, you could also clone `opentelemetry-rust` onto the node and run `cargo criterion` from the root of the project; this should make it clear if we're missing anything else.
Once you've done that I can switch our benchmark build on pushes to main back to run on the shared workers!
https://github.com/open-telemetry/opentelemetry-rust/blob/1f0d9a9f62a3f7829e6065191fa1c3d4065b269c/.github/workflows/benchmark.yml#L29-L32
@scottgerring @cijothomas the new self-hosted runner is available now
you can see an example here: https://github.com/open-telemetry/sig-project-infra/pull/43/files
@scottgerring using a container-based workflow will give you the chance to pick a container that has the tools you need and will be more portable if we spin up more self-hosted runners in the future, let me know if that works
btw, I've given access to opentelemetry-rust and otel-arrow repos, with no restrictions on workflows / git refs for now so you can test it out via PRs like in the example above
once you have things working and merged then we can further restrict it
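
For illustration, the container-based approach suggested above could look roughly like the sketch below. The image tag, job name, trigger, and the step that installs `cargo-criterion` are assumptions, not what the linked PR or the opentelemetry-rust workflow actually uses; container jobs also require Docker to be available on the runner host.

```yaml
# Hypothetical container-based benchmark job; the image and steps are assumptions.
name: benchmark-container-example
on:
  push:
    branches: [main]

jobs:
  benchmark:
    # Runner label taken from this issue.
    runs-on: oracle-bare-metal-64cpu-512gb-x86-64
    # The job's steps run inside this container rather than directly on the
    # host, so the toolchain comes from the image, not from host packages.
    container:
      image: rust:latest  # assumed image; pin a specific version for stable results
    steps:
      - uses: actions/checkout@v4
      # Assumption: cargo-criterion is not preinstalled in the image.
      - run: cargo install cargo-criterion
      - run: cargo criterion
```

Pinning the image to a fixed version rather than `latest` would likely help keep benchmark results comparable over time, since the toolchain would no longer change between runs.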
Hey @trask - thanks for slogging away at this! I'm having a play with it now. I can see that the worker runs and will try get the job going :)
Hey @trask it lives! Here's runs from main on the dedicated workers:
https://github.com/open-telemetry/opentelemetry-rust/actions/workflows/benchmark.yml?query=event%3Apush
Thanks for your help, and feel free to remove the special blessed branches rule :)