nix-collect-garbage fails due to 'delete from ValidPaths where path = ?;': constraint failed
Describe the bug
Due to another issue, where using nix 2.3.11 and nix 2.5.1 with the same nix store corrupted the nix sqlite db, we restored our db, but it ended up with foreign key constraint violations.
When garbage collection runs, it eventually reaches the function invalidatePathChecked, which begins a transaction, checks whether the number of referrers is zero (ignoring self-references), and then attempts to delete the ValidPath. At first glance, it seems impossible for the constraint to fail, because a transaction wraps both the read from Refs and the write to ValidPaths.
https://github.com/NixOS/nix/blob/280543933507839201547f831280faac614d0514/src/libstore/local-store.cc#L1533-L1540
However, if there is a row in Refs that references the ValidPath but whose referrer has no corresponding ValidPath (due to missing data), then the query's INNER JOIN returns zero referrers (excluding self-references). The delete then goes ahead and fails with error: executing SQLite statement 'delete from ValidPaths where path = ?;': constraint failed, because the dangling Refs row still triggers the ON DELETE RESTRICT violation.
https://github.com/NixOS/nix/blob/280543933507839201547f831280faac614d0514/src/libstore/local-store.cc#L355
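The mismatch between the referrer check and the constraint can be sketched in isolation. This is a minimal illustration using Python's sqlite3 module, not Nix's actual code: the two-table schema is a reduced stand-in, the referrers query is only modeled on the INNER JOIN described above, and the store path and helper names are made up for the example.

```python
import sqlite3

# Reduced stand-in for the Nix schema (assumption: only the columns needed here).
SCHEMA = """
create table ValidPaths (
    id   integer primary key autoincrement not null,
    path text unique not null
);
create table Refs (
    referrer  integer not null,
    reference integer not null,
    primary key (referrer, reference),
    foreign key (referrer)  references ValidPaths(id) on delete cascade,
    foreign key (reference) references ValidPaths(id) on delete restrict
);
"""

def build_corrupt_db():
    """Create an in-memory db with a Refs row whose referrer has no ValidPaths row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("insert into ValidPaths (id, path) values (1, '/nix/store/aaa-dep')")
    # Foreign keys are off by default in SQLite, so this dangling row is accepted.
    conn.execute("insert into Refs (referrer, reference) values (2, 1)")
    conn.commit()
    # Enforce foreign keys from here on, so the RESTRICT clause applies to deletes.
    conn.execute("pragma foreign_keys = 1")
    return conn

def referrers(conn, path):
    """Referrers of `path` found via an inner join, which skips dangling Refs rows."""
    rows = conn.execute(
        "select path from Refs join ValidPaths on referrer = id "
        "where reference = (select id from ValidPaths where path = ?)",
        (path,),
    ).fetchall()
    return [row[0] for row in rows]

def try_delete(conn, path):
    """Attempt the delete; return 'deleted' or the SQLite error message."""
    try:
        conn.execute("delete from ValidPaths where path = ?", (path,))
        conn.commit()
        return "deleted"
    except sqlite3.IntegrityError as err:
        conn.rollback()
        return str(err)

if __name__ == "__main__":
    db = build_corrupt_db()
    print(referrers(db, "/nix/store/aaa-dep"))   # join sees no referrers
    print(try_delete(db, "/nix/store/aaa-dep"))  # delete is still rejected
```

The join finds no referrers because the referrer's ValidPaths row is missing, while the foreign key check looks only at Refs and so still blocks the delete.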
This also seems to be the root cause of this issue (running out of space during nix build): https://github.com/NixOS/nix/issues/2218#issuecomment-618410588
I know we should fix our sqlite db, but I'm curious what makes sense going forward:
- Expect the user to fix the sqlite db and nothing else (we should at least make the error less obscure, e.g. pragma foreign_key_check)
- nix-store --verify --repair to fix these (perhaps by deleting Refs that don't have a corresponding ValidPath). Currently this command doesn't help.
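Both options can be sketched against a copy of the database. The following is a rough illustration in Python, assuming only that the db has ValidPaths(id) and Refs(referrer, reference) tables; the function names are hypothetical, and this is not what nix-store --verify --repair does today.

```python
import sqlite3

# Condition matching Refs rows that point at a ValidPaths id that no longer exists.
DANGLING = (
    "referrer not in (select id from ValidPaths) "
    "or reference not in (select id from ValidPaths)"
)

def foreign_key_violations(conn):
    """Rows reported by SQLite's built-in check, one per violation:
    (table, rowid, parent table, foreign key index)."""
    return conn.execute("pragma foreign_key_check").fetchall()

def find_dangling_refs(conn):
    """(referrer, reference) pairs where either side has no ValidPaths row."""
    return conn.execute(
        "select referrer, reference from Refs where " + DANGLING
    ).fetchall()

def delete_dangling_refs(conn):
    """Delete the dangling Refs rows and return how many were removed."""
    cur = conn.execute("delete from Refs where " + DANGLING)
    conn.commit()
    return cur.rowcount
```

A cautious way to use this would be to run foreign_key_violations on a copy of db.sqlite first, inspect what find_dangling_refs reports, and only then decide whether deleting those rows is acceptable.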
Steps To Reproduce
- Copy db.sqlite as db2.sqlite
- sqlite3 ./db2.sqlite
- Don't enable pragma foreign_keys = 1;
- Delete a ValidPath that has a Ref
- Run nix-collect-garbage
- See constraint failed
Expected behavior
It seems there are multiple ways to reach this invalid state with the db. I would expect nix to either run pragma foreign_key_check and report the problem, or repair the db.
Here are workaround scripts to fix the database automatically, in case it's in severe disrepair: indiscipline/nix-db-repair.