Confusing error when cache doesn’t match source
If I have two files:
- ./A
\(x : {}) -> x
- ./B
./A {=}
And I freeze them, they get stored in the cache, but then I locally edit them to
- ./A
\(x : Bool) -> x
- ./B
./A sha256:...
True
(where ... is still the original hash from freeze).
Now, when I run dhall <<< "./B" I get an error containing
↳ ./B
↳ ./A
but when I look at the contents of ./A, it should compile.
The problem is that it actually retrieved ./A from the cache, and didn’t really look at the file it says it did.
I think that when an error comes from a file loaded from the cache, dhall should then hash the uncached import, if only to improve the error message:
↳ ./B
↳ ./A (from cache, which doesn’t match import)
And in the case where there’s a legit error in ./A, it could still make sense to do
↳ ./B
↳ ./A (from cache, matching import)
(since fixing ./A without updating/removing the hash from ./B will just result in the “doesn’t match import” error the next time around).
NB: I don’t think my error messages are particularly good, but hopefully they get the idea across.
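To make the proposal a bit more concrete, here is a minimal Python sketch of the annotation step. It is not how dhall works today. The names `annotate_import` and `hash_of_source` are hypothetical, and `hash_of_source` stands in for "resolve the import without the cache and compute its hash", returning None or raising when that fails.

```python
from typing import Callable, Optional


def annotate_import(
    path: str,
    pinned_hash: str,
    hash_of_source: Callable[[str], Optional[str]],
) -> str:
    """Build one line of the import trace for an import that was served from cache.

    `hash_of_source` is a hypothetical callback that re-resolves `path`
    without the cache and returns its hash, or None if that is not possible.
    """
    try:
        current = hash_of_source(path)
    except Exception:
        # The source no longer resolves or type-checks, so it cannot match.
        current = None

    if current == pinned_hash:
        note = "from cache, matching import"
    else:
        note = "from cache, which doesn't match import"
    return f"↳ {path} ({note})"
```

The extra hashing only happens on the error path, so successful evaluations would not pay for it.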
@sellout: This is the behavior specified by the standard. You can even replace ./A sha256:... with missing sha256:... and it will still retrieve the import from cache and not complain about the missing.
Nix behaves the same way. If you:
- add a fixed output hash to a derivation
- build the derivation
- change the derivation without changing the hash
- rebuild the derivation
... you will get a cache hit for the old build product even though it no longer matches the new derivation.
For both Dhall and Nix, the hash takes priority over the instructions for how to retrieve/compute the cached product. The "Intensional Model" section of Eelco Dolstra's Thesis explains the advantages of doing so.
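The "hash takes priority" rule can be illustrated with a small Python sketch. This is a simplification under stated assumptions, not the actual resolution algorithm: `resolve_import`, `fetch`, and the in-memory `cache` dict are hypothetical, and the sketch hashes raw bytes, whereas Dhall hashes the binary encoding of the normalized expression.

```python
import hashlib
from typing import Callable, Dict, Optional


def resolve_import(
    expected_hash: Optional[str],
    fetch: Callable[[], bytes],
    cache: Dict[str, bytes],
) -> bytes:
    """Resolve an import, letting the hash take priority over the location.

    `fetch` stands in for the retrieval instructions (a path, a URL, or
    even `missing`); it is only called on a cache miss.
    """
    if expected_hash is not None and expected_hash in cache:
        # Cache hit: the retrieval instructions are never consulted.
        return cache[expected_hash]

    data = fetch()
    if expected_hash is not None:
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected_hash:
            raise ValueError("integrity check failed")
        cache[expected_hash] = data
    return data
```

Once an entry for a hash exists, changing what `fetch` would return (or making it fail outright) has no effect on the result, which mirrors the ./A and missing cases above.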
@sellout: Oh, I see. Yeah, it would be a bit difficult, since the AST doesn't currently preserve which parts of the subexpression originated from imports.
Maybe this is particular to how I'm using Dhall, but this ends up being a significant usability problem, as I have:
- An application that performs a large number of Dhall evaluations,
- running on embedded systems whose disks sometimes fail, or which lose power at an inopportune moment, etc.
The end result is that users sometimes encounter extremely confusing error messages. Suppose I have:
$ cat a.dhall
42
$ cat b.dhall
./a.dhall + 42
$ dhall --file ./b.dhall
84
Now suppose something bad happens to my disk, or the program doing the Dhall evaluation is killed at an unlucky time, leaving a truncated cache entry:
$ : > ~/.cache/dhall-haskell/1220c39cde2e11e3d5a57cccbc06f6599256ece67b3d16d1bc1df1d0cfa79d9be605
Now the user will see this very confusing error:
$ dhall --file ./b.dhall
dhall:
↳ ./a.dhall
Error: Cannot decode CBOR to Dhall
The following bytes do not encode a valid Dhall expression
↳ 0x
1│ ./a.dhall
./b.dhall:1:1
What's more, the program doing the Dhall evaluation also uses CBOR to shuttle expressions around, separately from the cache, so developers who see this error but don't know about Dhall's evaluation caching go on a wild goose chase.
This is tricky; I'm not sure what might make this better. It might be nice to have the option to disable the evaluation cache, but that could also make evaluation unacceptably slow. Maybe having some equivalent of nix-store --check-contents could help (I could have my program run that operation when evaluation fails, before showing any error message to the user)?
You can implement something like that check by looping over all files in the cache and doing dhall decode on them, like this:
$ for file in ~/.cache/dhall-haskell/*; do dhall decode --file "${file}" 1>/dev/null; done
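For an application that wants a cheap check before showing an error to the user, a narrower test may already cover the truncation case shown earlier. The Python sketch below only flags zero-length cache entries; it does not validate CBOR, so it complements rather than replaces the dhall decode loop. The function name `find_empty_cache_entries` is hypothetical, and the cache directory is taken from the example above (it may differ if XDG_CACHE_HOME is set).

```python
from pathlib import Path
from typing import List


def find_empty_cache_entries(cache_dir: Path) -> List[Path]:
    """Return cache files that are zero bytes long, sorted by path.

    An empty file cannot encode a Dhall expression, so these entries
    are certainly corrupt. Non-empty files are NOT validated here.
    """
    if not cache_dir.is_dir():
        return []
    return sorted(
        entry
        for entry in cache_dir.iterdir()
        if entry.is_file() and entry.stat().st_size == 0
    )


if __name__ == "__main__":
    # Directory assumed from the example in this thread.
    for entry in find_empty_cache_entries(Path.home() / ".cache" / "dhall-haskell"):
        print(f"empty cache entry: {entry}")
```

Removing a flagged entry should then produce an ordinary cache miss on the next evaluation, though I have not checked that on a failing disk.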