
Handle GitHub runners possibly running out of space

Open edmundmiller opened this issue 1 year ago • 5 comments

          Sorry, this feels very draft status. Move it into another PR before merging this one.

Originally posted by @mashehu in https://github.com/nf-core/modules/pull/6286#discussion_r1846901512

  # get-number-of-shards:
  #   runs-on: ubuntu-latest
  #   outputs:
  #     # Needs to be a json array
  #     shards: ${{ steps.shards.outputs.shards }}
  #     total_shards: ${{ steps.shards.outputs.total_shards }}
  #   steps:
  #     - name: Install nf-test
  #       uses: nf-core/setup-nf-test@v1
  #       with:
  #         version: ${{ env.NFT_VER }}

  #     - id: shards
  #       run: |
  #         nftest_output=$(nf-test test --dry-run --changed-since HEAD^ --filter process --follow-dependencies)
  #         number_of_shards=$(echo $nftest_output | grep -o 'Found [0-9]* related test' | tail -1 | awk '{print $2}')
  #         three_tests_per_shard=$(echo $(($number_of_shards / 3)) | awk '{print int($1+0.5)}')
  #         shards_array=$(for shard in $(seq 1 $number_of_shards); do echo $shard; done | tr ' ' '\n' | jq -R . | jq -s .)
  #         echo "shards=${shards_array}" >> $GITHUB_OUTPUT
  #         echo "total_shards=${number_of_shards}" >> $GITHUB_OUTPUT

WIP Code

edmundmiller avatar Nov 18 '24 17:11 edmundmiller
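
For reference, the shard calculation in the WIP snippet above can be sketched in plain shell. This is a minimal sketch, not the action's actual implementation: it assumes the intent was roughly three tests per shard with rounding up, and the helper names (`count_related_tests`, `shards_for`, `shard_array`) and the sample output string are hypothetical. The array is built without `jq`, so the sketch has no external dependencies.

```shell
#!/usr/bin/env bash
# Extract N from the last "Found N related test" line of nf-test dry-run output.
count_related_tests() {
  printf '%s\n' "$1" | grep -o 'Found [0-9]* related test' | tail -n 1 | awk '{print $2}'
}

# Ceiling division: shards needed for $1 tests at $2 tests per shard.
shards_for() {
  local tests="$1" per_shard="$2"
  echo $(( (tests + per_shard - 1) / per_shard ))
}

# Compact JSON array [1,2,...,n]; prints [] when n is 0.
shard_array() {
  local n="$1" out="" i=1
  while [ "$i" -le "$n" ]; do
    out="${out}${out:+,}${i}"
    i=$((i + 1))
  done
  echo "[${out}]"
}

# Hypothetical sample output; a real run would capture the nf-test command instead.
sample_output='Found 7 related test(s)'
tests="$(count_related_tests "$sample_output")"
total="$(shards_for "$tests" 3)"
echo "shards=$(shard_array "$total")"   # shards=[1,2,3]
echo "total_shards=${total}"            # total_shards=3
```

In a workflow step, the two `echo` lines would be appended to `$GITHUB_OUTPUT`, and the array could then feed a job matrix via `fromJSON`.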

Tested in #6716 with some examples. We'll see how many issues we run into with it.

edmundmiller avatar Nov 18 '24 17:11 edmundmiller

Wondering if we could use Fusion with S3 locally to avoid this 🤔

edmundmiller avatar Nov 19 '24 15:11 edmundmiller

To document this idea somewhere...

The GitHub-hosted runners have relatively little available space on the root partition (18G on ubuntu-latest) because of the size of the default runner image, but they also have a separate, largely unused partition (66G available on ubuntu-latest) mounted at /mnt; e.g.:

runner@runnervmg1sw1:~/work/modules/modules$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        72G   55G   18G  76% /
tmpfs           7.9G   84K  7.9G   1% /dev/shm
tmpfs           3.2G  1.1M  3.2G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sdb16      881M   62M  758M   8% /boot
/dev/sdb15      105M  6.2M   99M   6% /boot/efi
/dev/sda1        74G  4.1G   66G   6% /mnt
tmpfs           1.6G   12K  1.6G   1% /run/user/1001
runner@runnervmg1sw1:~/work/modules/modules$ ls -ld /mnt
drwxr-xr-x 3 root root 4096 Nov 17 21:38 /mnt
runner@runnervmg1sw1:~/work/modules/modules$ ls -lh /mnt
total 4.1G
-rw-r--r-- 1 root root  333 Nov 17 21:38 DATALOSS_WARNING_README.txt
drwx------ 2 root root  16K Nov 17 21:38 lost+found
-rw------- 1 root root 4.0G Nov 17 21:38 swapfile

Relocating the Nextflow & nf-test workdirs, the Docker daemon data directory, and conda environments to /mnt can reduce the likelihood that a GitHub-hosted runner runs out of space. E.g., in .github/actions/nf-test-action/action.yml, modifying the following job step:

    - name: Run nf-test
...
      run: |
        sudo mkdir -m 777 -p /mnt/runner
        echo '{"data-root": "/mnt/docker"}' | sudo tee /etc/docker/daemon.json
        sudo systemctl restart docker

        export NFT_WORKDIR=/mnt/runner/nf-test
        export NXF_WORK=/mnt/runner/work
        export NXF_SINGULARITY_CACHEDIR=/mnt/runner/singularity-cachedir
        export CONDA_ENVS_DIRS=/mnt/runner/conda/envs

        conda config --prepend pkgs_dirs /mnt/runner/conda/pkgs_dirs

        nf-test test \
...

I've used this method to run the nf-test workflow in a fork (substituting runs-on: ubuntu-latest).

nathanweeks avatar Nov 18 '25 15:11 nathanweeks
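
To confirm that the relocation described above leaves enough headroom, a step could check free space before the tests start and fail early with a clear message. A minimal sketch, assuming POSIX `df` output; the helper names (`avail_kib`, `require_free_gib`) and the 20 GiB threshold are hypothetical and not part of the existing action:

```shell
#!/usr/bin/env bash
# Available space, in KiB, on the filesystem containing the given path.
# -P forces the POSIX single-line format; column 4 is "Available".
avail_kib() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Return non-zero if the filesystem holding $1 has less than $2 GiB available.
require_free_gib() {
  local path="$1" min_gib="$2" avail
  avail="$(avail_kib "$path")"
  if [ "$avail" -lt $(( min_gib * 1024 * 1024 )) ]; then
    echo "Only $(( avail / 1024 / 1024 )) GiB free on ${path}; need ${min_gib} GiB" >&2
    return 1
  fi
}

# Example: report whether / currently offers the (hypothetical) 20 GiB minimum.
if require_free_gib / 20; then
  echo "Enough space on /"
else
  echo "Low on space on /"
fi
```

Running the same check against the relocated work directory's parent (for example /mnt) before and after a test run gives a rough view of how much space the tests actually consume.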

Interesting, would you mind opening a PR with these changes to the action?

mashehu avatar Nov 18 '25 15:11 mashehu

I haven't tested this approach with the AWS-hosted runners that nf-core/modules uses. Do they have only one large root filesystem instead? (FWIW, I used mxschmitt/action-tmate to interactively poke around a GitHub-hosted runner.)

nathanweeks avatar Nov 18 '25 17:11 nathanweeks
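
One way to make the relocation safe on runners whose layout is unknown is to use /mnt only when it is a separate filesystem, and fall back otherwise. A minimal sketch, assuming GNU `stat` (as on Ubuntu runners); the helper names (`same_filesystem`, `pick_scratch_root`) are hypothetical, and `RUNNER_TEMP` is the temporary directory that GitHub Actions sets for each job:

```shell
#!/usr/bin/env bash
# Return 0 if both paths live on the same filesystem (compares device IDs; GNU stat assumed).
same_filesystem() {
  [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

# Print the candidate if it exists and is on a different filesystem from /,
# otherwise print the fallback.
pick_scratch_root() {
  local candidate="$1" fallback="$2"
  if [ -d "$candidate" ] && ! same_filesystem / "$candidate"; then
    echo "$candidate"
  else
    echo "$fallback"
  fi
}

SCRATCH_ROOT="$(pick_scratch_root /mnt "${RUNNER_TEMP:-/tmp}")"
echo "Using scratch root: ${SCRATCH_ROOT}"
```

On a runner with a single large root filesystem, this would leave the work directories under the fallback, so the same step could run unchanged on both runner types.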