Should we standardize error messages?
There's a quote I can't find anymore that goes something like
You think a compiler is a program that takes good code and produces an executable. I say a compiler is a program that takes bad code and produces diagnostics.
Diagnostics (aka error messages) are hard, but they make a profound difference in the experience of a user. A good message helps the user find the source of the error, or the relevant documentation. It can also help avoid common pitfalls.
As far as I can see, dhall-haskell's error messages are pretty good, dhall-rust's are appalling, and I have no idea about other implementations. Yet I think this should be a major distinguishing factor for dhall, one that makes it truly great (and that was one of its selling points early on IIRC).
So I was thinking: could error messages be part of the dhall standard? Currently anything that does not match the typing rules is considered incorrect, but there's nothing saying what to do with such code. Also, effort is duplicated between implementations.
Also, some features could greatly improve the helpfulness of error messages, and I'm thinking of bidirectional type inference: even if we don't allow omitting any more types than we currently do, I believe it's way better at finding where the error "really" comes from, compared to the current approach.
I’m a weak +1 to this idea. I agree that error messages are really important and I think standardising on them is a good idea. Dhall-golang has literally copypasted a bunch of error messages from dhall-haskell. So I can see benefits from moving error messages into a machine-readable format inside the dhall-lang repo. If we were feeling cute, the error messages could even be in Dhall format.
I’m just a little hesitant because this could raise the barrier to entry for new implementations (though I can also see how it might lower it). Maybe we can make it an optional part of the standard?
Finally, there is the related issue of dhall-haskell’s detailed error messages (from the --explain command line switch) which I think we should leave out of the standard for now (but we could revisit this later).
I would have very much appreciated standardized error messages (and tests for them) when working on Dhall for Java (I also did a lot of copy-pasting from dhall-haskell, and am not terribly confident everything matched).
Yeah, I would be fine with this
Oh yeah, tests for error messages would be very helpful
dhall-haskell has an --explain flag for more verbose error messages. Should we standardize those as well? If so, test files could be named with suffixes like:
A.dhall, for source
Error.dhall, for brief error messages
ErrorVerbose.dhall, for long error messages
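To make the naming scheme concrete, here is a minimal Python sketch of how a test runner might group such files into test cases. Only the three suffixes come from the proposal above; the function name `group_error_tests`, the role labels, and the example file names are hypothetical.

```python
from collections import defaultdict

# Suffixes proposed in the thread, mapped to illustrative role labels.
SUFFIXES = [
    ("ErrorVerbose.dhall", "verbose"),  # long (--explain style) message
    ("Error.dhall", "brief"),           # brief message
    ("A.dhall", "source"),              # the ill-typed input
]


def group_error_tests(filenames):
    """Group file names into {case_name: {role: filename}}.

    Files that match none of the suffixes are ignored.
    """
    cases = defaultdict(dict)
    for name in filenames:
        for suffix, role in SUFFIXES:
            if name.endswith(suffix):
                case_name = name[: -len(suffix)]
                cases[case_name][role] = name
                break
    return dict(cases)
```

A runner could then typecheck each case's `source` file and compare the output against the `brief` file, and against the `verbose` file when one is present.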
I actually was also interested in the typechecking part of the error messages, in particular which bit of the source should we reference to explain the error, or which type should we mention where. Yet again probably not something we can do before #959 :/ Standardizing actual error outputs is nice too!
I would add that the more you standardize, the less freedom you give to implementers. I think rather than standardising error messages, it would be better to provide a repository of templates or something. If people have an implementation that can use them, great, but it still lets people do other things.
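As a rough illustration of the template idea, here is a Python sketch in which an implementation can either use a shared template or substitute its own wording. The template keys and texts are invented for this example and are not taken from dhall-haskell or any other implementation.

```python
# Hypothetical shared templates; keys and wording are illustrative only.
TEMPLATES = {
    "type-mismatch": "Expected type {expected}, but got {actual}",
    "unbound-variable": "Unbound variable: {name}",
}


def render(key, override=None, **fields):
    """Fill in the shared template for `key`.

    An implementation that prefers its own wording passes `override`,
    so the shared templates stay optional rather than mandatory.
    """
    template = override if override is not None else TEMPLATES[key]
    return template.format(**fields)
```

For example, `render("unbound-variable", name="x")` uses the shared text, while passing `override="{actual} is not {expected}"` to `render("type-mismatch", ...)` keeps the same fields but changes the presentation.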
Another option would be to standardize an error code instead of an error message. What I have in mind is something similar to what shellcheck or rust do (something like DH0001, or any format that would be easily google-able).
Then we could have a wiki / page on the website to index all the existing error codes with detailed explanations.
This way, implementers would still have some freedom in how they want to present the errors while being able to re-use all or parts of the reference error message if they want.
When standardizing error codes, we can do so incrementally and let the various dhall implementations error as they please for non standard errors (as they currently do). Once an error code is standardized, an implementation that is complying with the standard should at least expose an API (CLI flag, env var, library function) that outputs only the error code when encountering such an error (that would be useful for testing error code compliance), and ideally also show it to the user as part of their usual error message flow.
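Here is a minimal Python sketch of what that could look like inside an implementation. Everything in it is an assumption for illustration: the class `TypeCheckError`, the function `report`, the `code_only` switch, and the pairing of `DH0001` with an unbound-variable error are all hypothetical.

```python
class TypeCheckError(Exception):
    """An error that may or may not have a standardized code attached."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message
        self.code = code  # e.g. "DH0001"; None if not standardized yet


def report(error, code_only=False):
    """Format an error for output.

    With code_only=True, a standardized error is reported as its bare
    code, which is what a compliance test would compare against.
    Errors without a code fall back to the implementation's own message.
    """
    if error.code is None:
        return error.message
    if code_only:
        return error.code
    return f"[{error.code}] {error.message}"
```

With this shape, `report(TypeCheckError("Unbound variable: x", code="DH0001"), code_only=True)` yields just the code, and the default mode shows the code alongside the implementation's own wording.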
I like it and I think it gets more useful as Dhall is pushed further down the stack. Error messages in logs would be concise and the diagnostic procedure would be the same regardless of implementation.
Hello, I'm writing this as an end-user configuring cheogram-muc-bridge using the dhall language (haskell implementation). In about an hour of manual configuration and templating (from jinja2) I've encountered two situations where the error messages were useless (tickets: 1 2) and it took some time and a fresh pair of eyes to spot the mistake.
Please standardize error messages. Standard error codes and messages are useful to people like me, and given a proper test suite they're useful to people writing an implementation for the file format to detect the corner cases they're not handling well. They make the entire ecosystem better, and enable localization when required to deliver better feedback to the user.
Contrary to what I read above, having standards does not mean less freedom for implementers. Everyone is free to deviate from a standard. However, if you intend to have a thriving ecosystem of compatible implementations, error codes/messages should not be an afterthought. There's nothing more confusing than a "standard" format that behaves differently from one library to another; see for example the Markdown interoperability problem, and why nowadays most people use/implement the CommonMark specification instead of just doing random stuff according to their varying tastes.
@southerntofu: I'm in favor of standardizing error messages, but I believe those particular error messages you mentioned would not have been fixed by this. Those are parsing errors and parsing errors are much much harder to standardize on because you'd have to standardize on a very specific parsing algorithm to ensure uniformity of error messages.
My understanding is that the original proposal here is to standardize on type errors, which are much easier to bring into alignment between implementations.