ca-derivations: warning: output out of input /nix/store/xyz.drv missing, aborting the resolving
Describe the bug
When enabling ca-derivations, if I build a bunch of drvs that use it or are downstream from one that does, I frequently get a crash like:
warning: output out of input /nix/store/096ihak5gdyh2x88r0ywjgh1hmjylfkg-Groq.Compiler.test.end_to_end.resize_linear_align_corners_float16_1x1x20x30_1x1x40x60_slt2x2.test.drv missing, aborting the resolving
error: unexpected end-of-file
This seems to be preceded by this in the nix-daemon journal:
src/libstore/build/derivation-goal.cc:463: void nix::DerivationGoal::inputsRealised(): Assertion `attempt' failed.
From the source it looks like tryResolve warns about the failure, but it's actually an error, not a warning, because the next thing that happens is an assert tripping over it.
However, it is incorrect about the drv failing, because the output has in fact been generated. I can no longer check that non-invasively, because ca-derivations breaks drv->out and (apparently?) does not expose its internal lookup mechanism, but nix-store -r instantly gives me the output, so it must have worked. So if I just keep retrying the build until the error stops happening, it is able to get all the way through.
Steps To Reproduce
I only started getting this when I started doing remote builds, so it's probably related to either that or a certain amount of parallelism.
nix-env --version output
2.4.1. I'm not sure if newer versions have fixed this, but I couldn't find any references in closed issues. I'll try updating to 2.8, but it's tricky due to nix's lack of cross version support and tendency to introduce new bugs.
I see (roughly) the same thing with nix 2.9.1.
error: derivation '/nix/store/h3m67kyhkwbdg795nj5ra5j2pqdy5k4a-libpng-1.2.59.drv' doesn't have expected output 'dev' (derivation-goal.cc/resolvedFinished,realisation)
Redoing the build typically fixes it. I made a script to just keep retrying until status code == 0 :stuck_out_tongue_winking_eye:
I would have expected versions after 2.7 to be more robust against this kind of breakage (because of https://github.com/NixOS/nix/pull/6221), but apparently they're not :/
Unfortunately I can't really reproduce, but I think both issues might be different since the first one appears at the beginning of the build and the second one at the end.
ca-derivations breaking drv->out and (apparently?) not expose its internal lookup mechanism
Maybe nix realisation info --json could help?
Yeah, the errors I'm seeing are a bit different; not sure if they're the same underlying problem.
It may be hard to reproduce without building a lot of things in parallel across different builders... I don't know if that's what triggers it, but that's always where I've seen it. I could try doing a full build locally and without parallelism to see if it comes up, but it will take a long time.
I already have a framework for retrying on certain kinds of crashes, may be able to plug that new error in.
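For anyone hitting the same thing, here is a minimal sketch of that kind of retry wrapper. It is not the framework mentioned above; the names `build_with_retry`, `is_transient` and `TRANSIENT_MARKERS` are made up for illustration, and the marker strings are simply substrings of the error messages quoted in this thread. It reruns the build only when stderr contains one of those messages, and gives up after a fixed number of attempts.

```python
import subprocess
import sys
from typing import Callable, Sequence, Tuple

# Substrings of the failures quoted in this thread (assumed to be transient).
TRANSIENT_MARKERS = (
    "aborting the resolving",
    "unexpected end-of-file",
    "doesn't have expected output",
)


def run_command(cmd: Sequence[str]) -> Tuple[int, str]:
    """Run cmd, echo its stderr, and return (exit status, stderr text)."""
    proc = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    sys.stderr.write(proc.stderr)
    return proc.returncode, proc.stderr


def is_transient(stderr: str) -> bool:
    """True if stderr contains one of the known flaky error messages."""
    return any(marker in stderr for marker in TRANSIENT_MARKERS)


def build_with_retry(
    cmd: Sequence[str],
    max_attempts: int = 3,
    runner: Callable[[Sequence[str]], Tuple[int, str]] = run_command,
) -> int:
    """Rerun cmd while it fails with a known transient error.

    Returns the exit status of the last attempt. Failures that do not
    match a known marker are returned immediately without retrying.
    """
    status = 1
    for _ in range(max_attempts):
        status, stderr = runner(cmd)
        if status == 0 or not is_transient(stderr):
            return status
    return status
```

Something like `build_with_retry(["nix-build", "default.nix"])` would be the intended use. Since each attempt seems to make some progress, raising `max_attempts` may be enough to get through a large build.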
But sounds like I should try upgrading to 2.8 also? That may take a while but we should do it eventually anyway. The upgrade to 2.4 took about a year but hopefully things are better now.
ca-derivations breaking drv->out and (apparently?) not expose its internal lookup mechanism
Maybe nix realisation info --json could help?
Oh interesting, I didn't know it existed. When run on a drv it does seem to give the outPath, which is exactly what I was looking for! So I guess this is the new nix show-derivation? Or maybe stuff that used to be in show-derivation but now may not be? I wouldn't have guessed from the help, because it's about flakes and building things and installables (not sure what that is), which doesn't make it sound like it's intended to be an "info about drv" tool.
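In case it helps with scripting that lookup, here is a rough sketch that shells out to nix realisation info --json and collects the outPath fields. It assumes the command prints a JSON array of objects that each carry an `outPath` key; that matches what I have seen for this experimental command, but the shape may differ between Nix versions. The function names are made up for illustration.

```python
import json
import subprocess
from typing import List


def parse_out_paths(realisation_json: str) -> List[str]:
    """Extract outPath values from `nix realisation info --json` output.

    Assumes a JSON array of objects with an "outPath" key; entries
    without that key are skipped.
    """
    entries = json.loads(realisation_json)
    return [entry["outPath"] for entry in entries if "outPath" in entry]


def realised_out_paths(installable: str) -> List[str]:
    """Run `nix realisation info --json` on installable and parse the result.

    Raises subprocess.CalledProcessError if nix exits with a non-zero status.
    """
    proc = subprocess.run(
        ["nix", "realisation", "info", "--json", installable],
        check=True,
        stdout=subprocess.PIPE,
        text=True,
    )
    return parse_out_paths(proc.stdout)
```

The parsing is kept separate from the subprocess call so it can be checked against saved output without a Nix installation.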
BTW, I can now pretty easily reproduce this. Even after implementing a retry, it fails so often that every single build exceeds the 3-try maximum. Each time it makes a bit of progress, but somehow I stumbled across something that causes the error very frequently. Is there anything I can do to gather more data?
This is still on nix 2.4 BTW. An alternate path would be to work on upgrading to 2.10 just to eliminate old version as a possibility.
Just to be clear, this is the problem with output out of input /nix/store/blah.drv missing, aborting the resolving. Notably, blah.drv itself is not ca, but it's now descended from one, so maybe this is related to a non-CA derivation with a CA parent?
I can also confirm this behaviour on 2.10.3; let me know if there is any debugging that would be helpful
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/tweag-nix-dev-update-40/23480/1
Can you try with https://github.com/NixOS/nix/pull/7390 ? Hopefully it will fix that problem in the same way that #7283 fixed #6572
Wow, this is good news. Unfortunately my old code to use ca-derivations has surely gotten quite obsolete, so it'll take some time to bring it back to test this. I'll put it on the queue though!