ci: set up platform/OS matrix of builds
The ARM64 architecture wasn't the problem in https://github.com/filecoin-project/ref-fvm/issues/599; the ancient Linux kernel was. However, we should run the build plus the unit, integration and conformance tests on CI for every platform and OS we support. Wasmtime currently limits this to the following:
- Linux x86_64 (already tested)
- Linux aarch64 (no coverage)
- macOS x86_64 (no coverage)
- Windows x86_64 (no coverage)
Windows is not supported by Lotus (see https://lotus.filecoin.io/lotus/install/prerequisites/), and it's safe to assume that other Filecoin clients like Forest, Venus and Fuhon don't aim to support it either. However, developers building on the FVM will run Windows, so we can't discard this OS.
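For illustration, the GitHub-hosted part of such a matrix could look roughly like the sketch below. It is only a minimal sketch: the workflow and job names and the plain cargo commands are placeholders, not ref-fvm's actual build steps. It also assumes the Rust toolchain preinstalled on GitHub-hosted images is sufficient. Which architecture macos-latest maps to depends on GitHub's current image, so pin a specific image if x86_64 is required. Linux aarch64 is left out here because it needs self-hosted runners.

```yaml
# Hypothetical workflow: names and commands are placeholders, not ref-fvm's real CI.
name: platform-matrix
on: [push, pull_request]

jobs:
  test:
    strategy:
      # Let every platform finish even if one of them fails.
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v3
      # Assumes the toolchain preinstalled on the hosted image is enough.
      - name: Build
        run: cargo build --workspace
      - name: Test
        run: cargo test --workspace
```

With fail-fast disabled, a failure on one OS doesn't cancel the other jobs, which makes it easier to tell platform-specific breakage from a general one.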
@galargh has agreed to help out in setting up Linux on aarch64 via self-hosted runners. From a conversation with him:
I’m able to bring up Graviton machines as GitHub self-hosted runners now. Here’s where this setup lives: https://github.com/pl-strflt/tf-aws-gh-runner. These are the instructions on how to start using runners defined in that repo in other repositories: https://github.com/pl-strflt/tf-aws-gh-runner#how-to-use-an-existing-self-hosted-runner-type-in-your-repository (please see the note about the security of using self-hosted runners too). The runner type you’re after is linux-arm64-default. Here’s how it can be used in a workflow: https://github.com/pl-strflt/tf-aws-gh-runner/blob/7f94e9c19d62104de01a2f0df915d72d9d71164a/.github/workflows/playground.yml#L9. And here’s an example workflow run that confirms the setup is OK: https://github.com/pl-strflt/tf-aws-gh-runner/runs/6753443481?check_suite_focus=true
On setting it up on ref-fvm:
See the installation instructions here: https://github.com/pl-strflt/tf-aws-gh-runner#how-to-use-an-existing-self-hosted-runner-type-in-your-repository
TL;DR, it requires:
- installing a GitHub App for the org
- changing org settings to allow self-hosted runners usage in public repos
- changing repo settings to require approval for ALL workflow runs from outside contributors, or changing org settings to allow self-hosted runners usage only in specific workflows (the latter cannot be applied to pull_request workflows; doing both is a valid choice too)
(Oops, fat fingers made me accidentally close this)
I'll get up to speed with the implications of using self-hosted runners via the method you propose, @galargh. In the meantime, do you have some bandwidth to refactor the GitHub Actions workflow and the Rust actions so they compose well? IMO they use matrix builds incorrectly, which now obstructs the one use case that actually calls for matrix builds (i.e. this one).
I'm looking to put this at the top of my work list next week, since the current setup doesn't allow for the easy special-casing that generating code coverage for integration tests requires (i.e. it's blocking that thread of work). @galargh I will be aiming for an MVP and would love to collaborate and get feedback on the design of the workflows.
A bit delayed, but I've finally managed to make the CI workflow pass on ARM64 runners; see https://github.com/galorgh/ref-fvm/pull/1. I did it in a fork because it requires setting up self-hosted runners for the org and I cannot do that for filecoin-project. Also, in this version of the PR I changed the runner config from ubuntu-latest to the ARM one, whereas here we'd want to run on both. It won't be too hard to achieve though.
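Running on both could be sketched roughly as below. This is a hedged sketch, not the config from the PR: it assumes the runner type linux-arm64-default can be referenced directly as a runs-on label (the linked playground.yml shows the exact form to use), and the job name and cargo command are placeholders.

```yaml
# Hypothetical job: shows only how the two runner types could share one matrix.
name: ci
on: [push]

jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        runner:
          - ubuntu-latest         # GitHub-hosted, x86_64
          - linux-arm64-default   # self-hosted Graviton; label form is an assumption
    runs-on: ${{ matrix.runner }}
    steps:
      - uses: actions/checkout@v3
      # Assumes a Rust toolchain is available on both runner types.
      - name: Test
        run: cargo test --workspace
```

The trigger is limited to push here only to keep the sketch small. Enabling pull_request runs on the self-hosted runner would first need the approval and org settings described in the installation instructions.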
Of course! Happy to help :) Ping me if you want to sync on this or just tag me on whatever you want my input on.
Related PRs: #616 #637 (sorry I didn't reference these earlier)
@galargh #652 fixes cache size growth and completes the fixes I wanted to make to CI; setting up the ARM runners would be next. The CI changes also added macOS to the matrix, so adding the ARM runners should be easier.