Test GHA self-hosted runners for events

Open apeltzer opened this issue 3 years ago • 3 comments

Description of feature

We should have some way of setting up custom runners for events such as hackathons to speed up CI on such days.

This should be fairly easy: it only requires Linux machines on which the runner setup scripts from GitHub have been started. We then have to figure out whether to change the labels in the GitHub Actions workflow files or the labels on our runners.

apeltzer avatar Oct 12 '22 10:10 apeltzer

Basic configuration description (only visible with permissions):

https://github.com/organizations/nf-core/settings/actions/runners

The runner can then be configured by following the instructions there. Add the label ubuntu-20.04 to it so that jobs get scheduled on it too. Docker also needs to be installed on the machine hosting the runner, as we use Docker to run jobs.

apeltzer avatar Oct 12 '22 10:10 apeltzer

OK, documenting everything learned on this one:

a.) Jobs need to be allowed to run on public repositories, as all nf-core repositories are public.
b.) The labels of the runner should be set to ubuntu-latest so it can run other jobs without changing labels in the GHA YAML files.

The EC2 needs some preinstallation to make things work:

a.) Docker installed and usable by the default user. Running the runner as root would circumvent that, but is probably not the best idea.
b.) nodejs installed, with directories writable by everyone (for prettier etc.).
c.) Java JDK > 13 installed for running Nextflow tests.

The Python installation works out of the box on an Ubuntu AMI, so it should not need any configuration.
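The preinstallation list above can be checked on a fresh machine before the runner is registered. The following is only a sketch: missing_tools and java_major are hypothetical helper names, and the tool list (docker, node, npm, java, python3) is an assumption based on the requirements named in this comment.

```shell
#!/usr/bin/env bash
# Sketch: report which of the tools named in this thread are missing.
# missing_tools and java_major are hypothetical helpers, not part of
# any GitHub runner tooling.

# Print every argument that is not found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Read `java -version` style text on stdin and print the major version,
# e.g. 'openjdk version "17.0.6" 2023-01-17' gives 17.
java_major() {
  sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

missing=$(missing_tools docker node npm java python3)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing on this machine:" $missing
fi

# `java -version` writes to stderr, hence the redirect.
if command -v java >/dev/null 2>&1; then
  major=$(java -version 2>&1 | java_major)
  [ "${major:-0}" -gt 13 ] || echo "Java ${major:-?} found, but a JDK newer than 13 is needed"
fi
```

Whether Docker is usable by the default user can be checked separately by running docker info as that user without sudo.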

apeltzer avatar Oct 12 '22 11:10 apeltzer

This is a small how-to for creating our own self-hosted EC2 runner on AWS for GHA execution (date 2023-03-23).

Note: This is only required when setting things up from scratch. There is also an AMI with the ID ami-0203602079711f367 (accessible to core members) that can be used to boot an arbitrary EC2 instance that already connects to the self-hosted backend of GitHub for nf-core repositories.
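Booting from that AMI could be done from the AWS console or with the AWS CLI. The sketch below only prints a candidate aws ec2 run-instances command for review. The instance type t3.large and the key pair name my-key are hypothetical placeholders, since this thread names neither an instance type, a key pair, nor a region, and launch_runner_cmd is a made-up helper name.

```shell
#!/usr/bin/env bash
# Sketch: print an AWS CLI command that boots a runner from the shared AMI.
# INSTANCE_TYPE and KEY_NAME are hypothetical placeholders; the thread
# does not name an instance type, key pair, or region.
launch_runner_cmd() {
  local ami="$1" instance_type="$2" key_name="$3"
  printf 'aws ec2 run-instances --image-id %s --instance-type %s --key-name %s --count 1\n' \
    "$ami" "$instance_type" "$key_name"
}

# Dry run: show the command instead of executing it.
launch_runner_cmd ami-0203602079711f367 "${INSTANCE_TYPE:-t3.large}" "${KEY_NAME:-my-key}"
```

Printing the command first makes it easy to adjust the placeholders before anything is launched and billed.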

Steps:

a.) Use the latest Ubuntu image for the EC2 instance and boot the machine with sufficient local storage (256GB is enough).
b.) Start the machine and log in as the standard user.
c.) Install the required packages and set up groups as shown in the code below:

# prerequisites for adding the Docker apt repository
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce
sudo systemctl start docker
# let the default user run docker without sudo (takes effect after logging in again)
sudo groupadd -f docker
sudo usermod -aG docker ubuntu
# nodejs and npm, with world-writable install directories for prettier etc.
sudo apt-get install -y nodejs npm
sudo chmod -R 777 /usr/local/lib
sudo chmod -R 777 /usr/local/bin/
#install conda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && bash Miniconda3-latest-Linux-x86_64.sh
#install singularity
sudo apt-get update && sudo apt-get install -y \
    build-essential \
    uuid-dev \
    libgpgme-dev \
    squashfs-tools \
    libseccomp-dev \
    wget \
    pkg-config \
    git \
    cryptsetup-bin \
    uidmap
wget https://github.com/sylabs/singularity/releases/download/v3.11.1/singularity-ce_3.11.1-jammy_amd64.deb
sudo dpkg -i singularity-ce_3.11.1-jammy_amd64.deb

d.) Go to https://github.com/organizations/nf-core/settings/actions/runners (only works for core members), then follow the instructions there for setting up a runner, copying the code line by line and executing it ;-)

Warning: Make sure that you run the ./config.sh ... step in a way that adds the label ubuntu-latest, so that any nf-core pipeline repository can run its jobs on the created runner. Otherwise your runner will not take any jobs from repositories in nf-core and will be useless ;-) You will be asked interactively whether you want to add labels, so this is easy!
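If you would rather skip the interactive prompts, config.sh also accepts the label on the command line. The sketch below only prints such a command. RUNNER_TOKEN stands for the registration token shown on the settings page, and both the runner name and the helper runner_config_cmd are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: build a non-interactive registration command for config.sh.
# RUNNER_TOKEN is a placeholder for the token shown on the runner
# settings page; runner_config_cmd is a hypothetical helper.
runner_config_cmd() {
  local token="$1" name="$2"
  printf './config.sh --url https://github.com/nf-core --token %s --name %s --labels ubuntu-latest --unattended\n' \
    "$token" "$name"
}

# Print the command for review; run it from the actions-runner directory.
runner_config_cmd "${RUNNER_TOKEN:-<token>}" "${HOSTNAME:-ec2-runner}"
```

The --labels value is what lets workflows that request ubuntu-latest land on this machine, so it is worth checking the printed command before running it.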

e.) Manually set up the service so that it starts automatically when the machine boots, as described here: https://docs.github.com/en/actions/hosting-your-own-runners/configuring-the-self-hosted-runner-application-as-a-service

sudo ./svc.sh install
sudo ./svc.sh start

f.) You can check whether the runner is up and picking up jobs here: https://github.com/organizations/nf-core/settings/actions/runners
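The same check can probably be scripted instead of using the settings page. This is a sketch that assumes the gh CLI is installed and authenticated with organization admin rights; list_runners and count_online are hypothetical helper names. On the machine itself, sudo ./svc.sh status reports whether the service is running.

```shell
#!/usr/bin/env bash
# Sketch: list organization runners through the REST API using the gh CLI.
# Assumes gh is installed and authenticated with org admin rights.
# Prints one tab-separated line per runner: name, status, busy.
list_runners() {
  local org="$1"
  gh api "orgs/${org}/actions/runners" \
    --jq '.runners[] | [.name, .status, (.busy | tostring)] | @tsv'
}

# Count the runners reported as online in the listing read from stdin.
count_online() {
  awk -F '\t' '$2 == "online" { n++ } END { print n + 0 }'
}

# Usage (not executed here, since it needs credentials):
#   list_runners nf-core | count_online
```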

apeltzer avatar Oct 12 '22 11:10 apeltzer

By the way, if you want a no-maintenance method to achieve that, you can use cirun.io for the same.

aktech avatar Nov 17 '22 19:11 aktech

Once we start the runner, we need to assign it the label we use everywhere, ubuntu-latest, so that it is available to all jobs.

https://docs.github.com/de/actions/hosting-your-own-runners/using-labels-with-self-hosted-runners

apeltzer avatar Feb 28 '23 20:02 apeltzer

https://github.com/philips-labs/terraform-aws-github-runner autoscaling with spot instances!

The module will scale down to zero runners by default

I wonder if it will kill jobs that run up to the 6-hour limit 🤔

Main docs on the topic https://docs.github.com/en/actions/hosting-your-own-runners/autoscaling-with-self-hosted-runners

edmundmiller avatar Mar 08 '23 16:03 edmundmiller

Just looked at cirun.io, it's free for open source and seems pretty easy to set up with spot instances, and the best part IMO could be the "triggered by org" so we could just run it for maintainers/core members.

edmundmiller avatar Mar 08 '23 16:03 edmundmiller

Hm, I'm happy to give it a try to some extent - how much effort would this be?

apeltzer avatar Mar 10 '23 15:03 apeltzer

We'll take a look at cirun fairly soon - hosting for now manually on EC2, as we'd need a special config file.

apeltzer avatar Mar 23 '23 20:03 apeltzer

I took a look at cirun a couple of days ago. I don't think it'll work for us:

  • Requires each repo to be switched on manually
    • We want the runners to be available to all repos on the org all the time, we add new repos very frequently and don't want to have to go and click buttons each time
  • Requires a config file in the repo
    • Need to update all repos, plus more dotfile bloat in the repo root
  • Not 100% sure on this, but I think it'll mean we can only use EC2 and won't make use of the free GitHub runners we have

In our case, we don't really care where the jobs are running or what hardware they're using - we just want more capacity. I think that having one or two manual EC2 instances running which drain the default Actions job queues is much easier than setting up cirun and actually works better in our use case.

Happy to be corrected on any and all points 😀 (especially as we have the author on the thread! 😅 )

ewels avatar Mar 23 '23 22:03 ewels

Closing this in favour of https://github.com/nf-core/actions-runners

apeltzer avatar Mar 24 '23 19:03 apeltzer