nixpkgs icon indicating copy to clipboard operation
nixpkgs copied to clipboard

github-runner: fix package layout and script patches

Open cmoog opened this issue 1 year ago • 3 comments

Description of changes
  1. Fix fatal startup crashes caused by recent changes to the source layout.
  2. Add a default value for RUNNER_ROOT.

How to test:

nix-build -A github-runner
./result/bin/config.sh --url https://github.com/xxx/xxxx --token xxxx --ephemeral
./result/bin/run.sh
Things done
  • Built on platform(s)
    • [x] x86_64-linux
    • [x] aarch64-linux
    • [ ] x86_64-darwin
    • [ ] aarch64-darwin
  • [ ] For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • [ ] Tested, as applicable:
  • [ ] Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • [x] Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.11 Release Notes (or backporting 22.05 Release notes)
    • [ ] (Package updates) Added a release notes entry if the change is major or breaking
    • [ ] (Module updates) Added a release notes entry if the change is significant
    • [ ] (Module addition) Added a release notes entry if adding a new NixOS module
    • [ ] (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • [ ] Fits CONTRIBUTING.md.

cmoog avatar Aug 12 '22 19:08 cmoog

cc @veehaitch @newAM

cmoog avatar Aug 12 '22 19:08 cmoog

  1. Can you explain what crashes you saw? I tested the last pull-request on aarch64-linux and had no issues. If you experienced problems I would like to know what I can test in the future to catch this :)
  2. What problem are you trying to solve by setting RUNNER_ROOT? I tried to find what this variable does, but the only relevant result was this nixpkgs issue: https://github.com/NixOS/nixpkgs/issues/158970 cc @newAM
  1. Here is what happens on master on x86_64-linux. I haven't looked too closely at the NixOS module wrapper, but it looks like some of that logic may be causing different behavior for your test workflow.
$ nix-build -A github-runner
this path will be fetched (22.55 MiB download, 73.57 MiB unpacked):
  /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0
copying path '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0' from 'https://cache.nixos.org'...
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0
$ ./result/bin/config.sh
touch: cannot touch '.env': Permission denied
./env.sh: line 37: .path: Permission denied
./env.sh: line 32: .env: Permission denied
./env.sh: line 32: .env: Permission denied
Unhandled exception. System.UnauthorizedAccessException: Access to the path '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/_diag' is denied.
 ---> System.IO.IOException: Permission denied
   --- End of inner exception stack trace ---
   at System.IO.FileSystem.CreateDirectory(String fullPath)
   at System.IO.Directory.CreateDirectory(String path)
   at GitHub.Runner.Common.HostTraceListener..ctor(String logFileDirectory, String logFilePrefix, Int32 pageSizeLimit, Int32 retentionDays)
   at GitHub.Runner.Common.HostContext..ctor(String hostType, String logFile)
   at GitHub.Runner.Listener.Program.Main(String[] args)
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/config.sh: line 81: 3054140 Aborted                 (core dumped) /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/Runner.Listener configure "$@"
$ ./result/bin/run.sh
cp: cannot stat '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh.template': No such file or directory
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run.sh: line 16: /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh: No such file or directory
Exiting runner...

$ RUNNER_ROOT=$(pwd) ./result/bin/run.sh
cp: cannot stat '/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh.template': No such file or directory
/nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run.sh: line 16: /nix/store/5ksdp3f5pkdgn3pq83s8k6361yiwy3j9-github-runner-2.295.0/lib/run-helper.sh: No such file or directory
Exiting runner...
  1. RUNNER_ROOT is an environment variable added only in the nix build of github-runner, relevant patch code here. In the regular runner distribution, it writes runtime configuration state to a child directory of the source code. Given that /nix/store is readonly at runtime, we need to specify an alternative directory for local state. At present, RUNNER_ROOT is only set in the NixOS module for github-runner and isn't really documented anywhere else. For users trying to use the github-runner pkg derivation directly, this causes a broken-by-default experience. (When RUNNER_ROOT is unset the config script crashes immediately when it tries to write to the store). In addition to the default value provided in this PR, we should also probably document its behavior in the longDescription.

cmoog avatar Aug 13 '22 15:08 cmoog

Unfortunately with this pull-request I can no longer start a runner on aarch64-linux.

I'm realizing now I should also test that these changes work with the NixOS module. I'll update this PR with the relevant fixes as soon as possible.

cmoog avatar Aug 13 '22 15:08 cmoog

I undid the change that broke compatibility with the NixOS service module. I've verified that a basic usage of the service module now works again on x86_64-linux.

cmoog avatar Aug 15 '22 02:08 cmoog

@veehaitch Like I mentioned here, all of those proposed changes would cause breakage due to their escape/quote handling. Making those proposed changes would break this PR.

cmoog avatar Aug 21 '22 13:08 cmoog

Something I'm confused about is why these errors seemingly happen inconsistently (see https://github.com/NixOS/nixpkgs/issues/187009 and https://github.com/NixOS/nixpkgs/pull/182189#pullrequestreview-1046678010). Does anyone (@cmoog maybe) have any clue why this is?

winterqt avatar Aug 22 '22 07:08 winterqt

Something I'm confused about is why these errors seemingly happen inconsistently (see https://github.com/NixOS/nixpkgs/issues/187009 and https://github.com/NixOS/nixpkgs/pull/182189#pullrequestreview-1046678010). Does anyone (@cmoog maybe) have any clue why this is?

That is likely due to the code paths introduced by the service module wrapper. This PR fixes the derivation itself, which is broken on master 100% of the time.

cmoog avatar Aug 22 '22 13:08 cmoog

Successfully created backport PR #187909 for release-22.05.

github-actions[bot] avatar Aug 22 '22 18:08 github-actions[bot]