Locking issues
With my current version of Nix, I observe the following behaviour after a few invocations: the command hangs forever.
$ nix-build ...
these derivations will be built:
/nix/store/<hash>-<name>.drv
waiting for locks or build slots...
This occurs after cancelling some other builds through the daemon.
When I cancel a user nix-build (or nix-env) command, it shows me an error: interrupted by the user.
This error also appears in the logs of the nix-daemon:
aoû 07 23:05:58 ankh-morpork systemd[1]: Started Nix Daemon.
aoû 07 23:05:58 ankh-morpork nix-daemon[2971]: accepted connection from pid 2970, user layus
aoû 07 23:06:41 ankh-morpork nix-daemon[2971]: accepted connection from pid 3070, user layus
aoû 07 23:07:06 ankh-morpork nix-daemon[2971]: unexpected Nix daemon error: interrupted by the user
aoû 07 23:08:38 ankh-morpork nix-daemon[2971]: accepted connection from pid 3442, user layus
aoû 07 23:09:29 ankh-morpork nix-daemon[2971]: unexpected Nix daemon error: interrupted by the user
...
and so on. My guess is that after some such cancellations, there is no build user available anymore.
Do you have any idea:
- How to clear the locks without rebooting?
- How to investigate the issue?
Needless to say that not being able to build anything is a huge annoyance :-).
- Are the nix-daemon processes unkillable?
- If you run df, does that also block indefinitely?
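One way to check the first question is to look for processes stuck in uninterruptible sleep (state D in ps), since those ignore most signals. This is only a rough sketch, assuming a ps that accepts -A and -o with the pid, stat, wchan and comm keywords; list_blocked is a name made up for this example.

```shell
# list_blocked is a hypothetical helper: it reads ps-style output on stdin
# and keeps the header line plus every row whose second column (STAT)
# starts with D, i.e. processes in uninterruptible sleep.
list_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Show PID, state, kernel wait channel and command name for all processes.
ps -A -o pid,stat,wchan,comm | list_blocked
```

If nix-daemon or df shows up in that list, the wait channel column may hint at which filesystem the kernel is waiting on.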
Thanks @rvl for your hint about df. The nix-daemon processes are pretty much unkillable, and yes, df also blocks. This reminded me that I have a Samba share mounted over an (Open)VPN connection. Restarting the systemd automount unblocks df and nix, so this is clearly related to my mount point being somehow "stale".
So my problem is half-solved. Why does nix block on a mount point that it does not have to use? (And, for my own sanity, why does that mount point fail without really failing?)
Anyway, thanks again @rvl.
I can't remember what I fixed exactly when I had the same issue. Maybe I had configured a dodgy mount point.
Nix probably needs to enumerate mount points to check if /nix/store is mounted correctly, or something like that.
Unfortunately, I'm not sure how to rescue Linux from that state. Maybe try a lazy unmount (umount -l), or find the right process to kill.
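To find out which mount point is the unresponsive one without hanging your own shell, each mount point can be probed in a child process with a time limit. The following is a minimal sketch, assuming Linux with /proc/self/mounts and the coreutils timeout command; check_mount, scan_mounts and PROBE_TIMEOUT are names made up here. If the probe ends up in truly uninterruptible sleep, even this can hang.

```shell
# Seconds to wait for each mount point (hypothetical variable name).
PROBE_TIMEOUT="${PROBE_TIMEOUT:-2}"

# Probe one path. stat runs in a child process, so only the child blocks.
# timeout exits with 124 when the limit is hit, or 137 when it had to
# follow up with SIGKILL (-k 1 sends KILL one second after TERM).
check_mount() {
    timeout -k 1 "$PROBE_TIMEOUT" stat -- "$1" >/dev/null 2>&1
    case $? in
        124|137) echo "unresponsive $1" ;;
        *)       echo "responded $1" ;;
    esac
}

# Probe every mount point listed in a mounts file (field 2 of each line).
# Paths containing spaces are escaped in that file and are not decoded here.
scan_mounts() {
    mounts_file="${1:-/proc/self/mounts}"
    while read -r _dev mountpoint _rest; do
        check_mount "$mountpoint"
    done < "$mounts_file"
}
```

Running scan_mounts and looking for "unresponsive" lines should point at the candidate for umount -l. Note that "responded" only means stat returned in time, not that the mount is healthy.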
nix-collect-garbage -d also helps with this
I also have locking issues when connection to a remote builder fails. There does not seem to be a timeout.
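Until there is a built-in timeout, one workaround is to put a time limit around the client command yourself. This is only a sketch, assuming the coreutils timeout command is available; run_with_limit is a made-up helper name. It only stops the client: as noted elsewhere in this thread, a build may keep running in the daemon after the client is gone.

```shell
# run_with_limit LIMIT COMMAND [ARGS...]
# Runs COMMAND under coreutils timeout. After LIMIT it sends SIGTERM,
# and SIGKILL 10 seconds later if the command is still alive.
run_with_limit() {
    limit="$1"
    shift
    timeout -k 10 "$limit" "$@"
    status=$?
    # 124 means the limit was hit; 137 means SIGKILL was needed.
    if [ "$status" -eq 124 ] || [ "$status" -eq 137 ]; then
        echo "gave up after $limit: $*" >&2
    fi
    return "$status"
}
```

For example, prefixing a nix-build invocation with run_with_limit 30m would give up after thirty minutes instead of waiting for locks indefinitely.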
A suggestion: Nix should implement a subcommand such as nix break-lock <store path> (modeled after BorgBackup) to automatically find and kill the associated daemon that holds the lock for that store path.
I'm currently in this situation, despite collecting garbage and restarting and killing the nix-daemon. I have no clue how to realise this one derivation that has a lock on it.
This happened to me after ^C-ing a nix-shell that was unexpectedly building mongodb. nix-collect-garbage -d didn't help. No solution yet.
EDIT: for some reason killing nix-shell didn't stop the build and it's still running. So the solution is just to wait for the mongodb build to finish. Rebooting or systemctl restart nix-daemon.service would stop the build, but in my case I don't want to.
FWIW, in order to remove a stuck lock, I had to exit one of the nix-shell instances that was using the nix expression that wanted the thing to be built (in my case chromium). I couldn't kill any nix processes until all instances of that nix-shell were exited.
Killing all the nix processes (in my case I just killed all of my $USER's processes :P using killall --user $USER) and then running sudo systemctl restart nix-daemon.service helped.
On macOS, this worked for me:
killall nix
sudo launchctl kickstart -k system/org.nixos.nix-daemon
nix-collect-garbage -d