error: not an absolute path: 'nix-archive-1' with `--store` builds

Open kevincox opened this issue 3 years ago • 3 comments

Describe the bug

Occasionally builds fail with the error:

error: not an absolute path: 'nix-archive-1'

This appears to occur when copying paths back from the builder when using --store.

I'm not sure, but it also appears to occur most often when multiple builds are running in parallel. Maybe both builds end up waiting on the same derivation and this somehow breaks the copy?

The issue is transient. I haven't seen a case where the first retry didn't fix it.

Steps To Reproduce

nix-build --store builder.example
# or
sudo nixos-rebuild boot --build-host builder.example

Expected behavior

The builds succeed reliably.

% nix-env --version
nix-env (Nix) 2.6.1

kevincox avatar Mar 13 '22 23:03 kevincox

I have recently parallelized my deployments. This often involves concurrent nix copy operations that are copying drvs and/or outputs, or evaluating, at the same time.

After the parallelization, I'm regularly seeing "sporadic" failures, and this is one that occurs pretty frequently.

==>> nix copy --derivation /nix/store/31r8a1vg9hk4n62y7h0jrd78824i735r-nixos-system-rpizerotwo1-22.11.20220530.35f6d41.drv --eval-store auto --to ssh-ng://[email protected] --no-check-sigs
copying 15 paths...
copying path '/nix/store/15ndjvv7mshgwznjm3c361spgzz3aa51-stage-1-init.sh.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/daykpcdggpzp2hj62zq7s350f9p5p3yr-tow-boot-update.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/1m7d6pyr4lkrwlnvhwzf6259xbhnl038-system-path.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/2yv9pdkzmrv5d7d86rsgw95jic4g2na0-initrd-linux-5.18.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/abcfrlwlsbw1dpni5ni98jf264yql4w3-etc-os-release.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/3i6ywdrkzmph9j7wjndvjaya6wy7dbdp-unit-systemd-fsck-.service.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/7ga729wkcsx9hj8r6qnrshl2765kh6ma-unit-polkit.service.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/mf0sj6q327kvxcm4qls1iydw66j8qchk-dbus-1.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/8x3dj795wy4f450vzqc39yk6hnwpw0xh-unit-dbus.service.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/hv794k5ry1c7ch7pj88wpgjbaggx6h56-system-units.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/l64h3x5w5bmqzl40qfd92kq17x5ak71f-issue.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/ilwn6qqfip0vqb0f2li0kjlbypxl34j3-unit-dbus.service.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/p8gvqjvkzbggmd2fnmbr9kj98ij10gin-user-units.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/qjsa0fjrn7zvc26makyk9272ynw15fvl-etc.drv' to 'ssh-ng://[email protected]'...
copying path '/nix/store/31r8a1vg9hk4n62y7h0jrd78824i735r-nixos-system-rpizerotwo1-22.11.20220530.35f6d41.drv' to 'ssh-ng://[email protected]'...
error: not an absolute path: 'nix-archive-1'

colemickens avatar May 30 '22 20:05 colemickens

This sounds strongly like a NAR parsing failure. I don't think the same NAR should be parsed by multiple threads in parallel, or that a stream of NARs should get corrupted, so I guess the cause is missing synchronization in NAR stream parsing/handling, leading to interleaved NARs?

fogti avatar Sep 15 '22 21:09 fogti

We found that #6612 fixes the issue, see https://github.com/NixOS/nix/issues/6730#issuecomment-1189300423. I don't believe we've seen the issue since in our environment.

gbpdt avatar Sep 16 '22 07:09 gbpdt

Correct me if I am wrong, but to me it looks like the code path taken inside LocalStore::addToStore could consider the path already valid (if it was created by a parallel daemon process) and never actually consume the NAR data. The next attempt to parse a ValidPathInfo would then read the leftover NAR bytes and fail with this message. I am going to try draining the NAR archive in that case and see if it fixes the issue.

This would not happen when a single copy process is active client-side, since the copy is preceded by a QueryValidPaths and only paths that are not yet valid are sent by the client. With two nix copy processes running in parallel, however, it reproduces 100% of the time.

abbec avatar Jan 31 '23 08:01 abbec
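
The hypothesis in the last comment can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not Nix's actual worker protocol or the real LocalStore::addToStore: the stream here is just a path string followed by a toy NAR (the "nix-archive-1" magic plus one contents string), the store paths are hypothetical, and the function names are invented for illustration. It does use the same length-prefixed, 8-byte-padded string framing that NAR uses, which is why a skipped NAR makes the next "path" read come back as 'nix-archive-1'.

```python
import io
import struct

NAR_MAGIC = b"nix-archive-1"  # the real magic string that opens every NAR


def write_string(buf, data: bytes) -> None:
    """Write a 64-bit little-endian length, the bytes, then pad to 8 bytes."""
    buf.write(struct.pack("<Q", len(data)))
    buf.write(data)
    buf.write(b"\0" * (-len(data) % 8))


def read_string(buf) -> bytes:
    """Read one length-prefixed, 8-byte-padded string."""
    (n,) = struct.unpack("<Q", buf.read(8))
    data = buf.read(n)
    buf.read(-n % 8)  # skip padding
    return data


def send_path(buf, path: str, contents: bytes) -> None:
    """Toy sender: path info (just the path here) followed by a toy NAR."""
    write_string(buf, path.encode())
    write_string(buf, NAR_MAGIC)
    write_string(buf, contents)


def build_stream() -> io.BytesIO:
    """Two hypothetical store paths sent back to back on one stream."""
    buf = io.BytesIO()
    send_path(buf, "/nix/store/aaaa-foo", b"foo contents")
    send_path(buf, "/nix/store/bbbb-bar", b"bar contents")
    buf.seek(0)
    return buf


def receive_all(buf, already_valid: set, drain: bool) -> list:
    """Toy receiver. If a path is already valid and drain is False, the NAR
    bytes are left on the stream, which desynchronizes the next read."""
    imported = []
    end = len(buf.getvalue())
    while buf.tell() < end:
        path = read_string(buf).decode()
        if not path.startswith("/"):
            raise ValueError(f"not an absolute path: '{path}'")
        if path in already_valid:
            if drain:
                read_string(buf)  # consume the NAR magic
                read_string(buf)  # consume the NAR contents
            continue
        assert read_string(buf) == NAR_MAGIC
        read_string(buf)
        imported.append(path)
    return imported


if __name__ == "__main__":
    # Another process made the first path valid after the sender queried.
    valid = {"/nix/store/aaaa-foo"}
    print(receive_all(build_stream(), valid, drain=True))
    try:
        receive_all(build_stream(), valid, drain=False)
    except ValueError as e:
        print("error:", e)
```

In this model, draining the unused NAR keeps the stream aligned and the second path imports normally, while skipping it without draining fails on the very next read, which matches the shape of the error reported above.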