serde_with
serde_with copied to clipboard
Consider adding helpers for better deserialize error messages for untagged enums
As described in https://github.com/jonasbb/serde_with/pull/586, serde
outputs unsatisfactory messages when an error occurs during untagged enum deserialization.
Recently, the serde
project explicitly rejected improving error messages for this situation by collecting deserialization errors, suggesting instead that applications use custom visitors to achieve better diagnostics. Incidentally, serde_with
has been identified as a candidate crate to provide reusable functionality to make it easier to create these kinds of custom visitors.
As a result, I'd love to hear if such a feature would be in scope for serde_with
. In my view, user-facing applications that parse configuration files will benefit most from this feature, as good error messages enable users to more easily troubleshoot and fix their configuration file syntax.
I would love to have support for that, but for now, I do not see a way how I can support that. The problem is that the error messages are generated by the Deserialize
implementations for each type. That means the only place where improvements can be made is in the derive macro that generates them. Creating a converter type/wrapper, like what most of this crate does, will not work here.
This means, fixing the issue requires forking serde_derive
. I would love to help with something like that, but that task of creating that and keeping it close to the serde upstream is too much for me to maintain.
There are a bunch of issues with the serde crate which can be addressed by changing/adding to the derive macros. serde_with
could benefit from it too, for example by eliminating the need for the serde_as
macro, instead the derive macros could directly support the new attributes.
Maybe such a fork will manifest at some point. There are some contributors to the serde repo and I saw some talk about the idea on Discord too. Or maybe serde will implement enough features. It was more activity there in the last weeks.
Thanks for your prompt reply!
This means, fixing the issue requires forking
serde_derive
. I would love to help with something like that, but that task of creating that and keeping it close to the serde upstream is too much for me to maintain.
I have been maintaining such a fork for over a year now, and I would gladly welcome help with that. However, I don't think that forking is a tenable long-term solution.
Right now, my fork does not build when third-party crates patch the serde_derive
dependency in their Cargo.toml
to use it, due to recent changes to use precompiled serde_derive
binaries, and those changes are so large and young that, quite frankly, I'm on the verge of running out of patience to bother anyone with this tedious work.
So far, serde
has not been particularly communicative, supportive or thoughtful about the ease of maintaining this fork, and I don't expect this to improve, especially given their now explicit unwillingness to support collecting error messages when deserializing untagged enums out of the box.
The problem is that the error messages are generated by the
Deserialize
implementations for each type. That means the only place where improvements can be made is in the derive macro that generates them. Creating a converter type/wrapper, like what most of this crate does, will not work here.
For the reasons stated above, I want to focus my efforts on ditch a fork and build a less brittle, long-term solution instead. This means that we have to play by the rules of serde
's API and somehow come up with clever ways to wrap the opaque Deserialize
implementation for an untagged enum type in a way that shows better error messages.
I gave this angle a try on this Rust Playground snippet, where I managed to replace the default data did not match any variant of untagged enum
error message with a custom one. With some improvements (gathering type info to build error messages, applying or creating the wrapper type when and only when the Deserialize
implementation is automatically derived on an untagged enum...), I can see a somewhat clear path to improve error messages with the current serde
API.
If we were to implement this functionality, what small, likely acceptable by upstream things would we need serde
to add to achieve it? I think that answering this question would lead to a more productive dialogue to achieve the goal of better error messages.
I have been maintaining such a fork for over a year now, and I would gladly welcome help with that. However, I don't think that forking is a tenable long-term solution.
Are you trying to fork serde+serde_derive? That would be more powerful, but I am uncertain if requiring patching the dependency for every user is worth it. I was more thinking or providing a separate crate with new proc-macros, similar to how serde_repr
, serde_tuple
, serde-indexed
, serde_query
, or serde_with
all provide proc-macros for implementing Serialize
or Deserialize
.
I gave this angle a try on this Rust Playground snippet, where I managed to replace the default
data did not match any variant of untagged enum
error message with a custom one. With some improvements (gathering type info to build error messages, applying or creating the wrapper type when and only when theDeserialize
implementation is automatically derived on an untagged enum...), I can see a somewhat clear path to improve error messages with the currentserde
API.
I am not sure what you are trying to do in the snippet. Your goal is to replace data did not match any variant of untagged enum UntaggedEnum
with cannot parse either `A` or `B` for UntaggedEnum
? But that still doesn't tell you why deserialization has failed. Overall, PrettyErrorWrapper
looks like a complicated variant of serde(expecting = "...")
:
#[derive(Deserialize, Debug)]
#[serde(untagged, expecting = "cannot parse either `A` or `B` for UntaggedEnum")]
enum UntaggedEnum {
A(u64),
B(String),
}
If a different (static) error message is all you want, you can create a proc-macro, that runs before the derive(Deserialize)
and adds the serde(expecting = "...")
attribute. All without changes to serde.
Are you trying to fork serde+serde_derive? That would be more powerful, but I am uncertain if requiring patching the dependency for every user is worth it. I was more thinking or providing a separate crate with new proc-macros, similar to how
serde_repr
,serde_tuple
,serde-indexed
,serde_query
, orserde_with
all provide proc-macros for implementingSerialize
orDeserialize
.
Yes, that's exactly what I did. Your idea of providing custom proc-macros for deriving Deserialize
in separate crates sounds much better, though!
I am not sure what you are trying to do in the snippet. Your goal is to replace
data did not match any variant of untagged enum UntaggedEnum
withcannot parse either `A` or `B` for UntaggedEnum
? But that still doesn't tell you why deserialization has failed. Overall,PrettyErrorWrapper
looks like a complicated variant ofserde(expecting = "...")
: [...]
With that snippet I was trying to bring forward a general way to improve error messages. The fact that it replaces the error message with a constant string is incidental: the code could easily be modified to display a non-constant string, unlike with the expecting
attribute (which, by the way, I didn't know about because it's undocumented, so thank you a bunch for bringing it to my attention!).
Ideally, I would like serde-with
(or some other crate) to provide an ergonomic way for applications to generate more helpful, context-aware error messages when data can't be deserialized into any variant of an untagged enum. The way I see it, this can be achieved by:
- Collecting the deserialization errors for each variant into a single error for the entire enum, like several PRs for
serde
attempted. (This does not seem possible to do with the currentserde
API without customDeserialize
implementations, though. Custom proc-macro derivedDeserialize
implementations may also leverage custom attributes to offer the customization described in the next point.) - Introducing builders or macros to create custom
Deserialize
wrappers to customize error messages with better context information (e.g., the possible variants, choosing whether to show the error for the variant where most fields were deserialized...). In other words, "productionizing" the snippet I shared.
As a result, I suppose this issue boils down to whether it would be nice for serde-with
to either offer a new proc-macro to derive Deserialize
or new wrapper types so that it shows better errors for untagged enums. For either of these approaches (or other approaches you might come up with!), what's the missing piece to get this done? I would love to help in some way with this.
I do not see a way forward with wrappers that goes beyond a static message. There is no way for wrappers to get any information about why deserialization fails. Not without changing the Deserialize
implementation of the untagged enum. You can already get perfect error messages with a handwritten Deserialize
implementation. Writing a proc-macro to do the same is possible, but will either be limited (no support for many of the serde attributes) or it ends up a fork of serde_derive. As already explained, I am unable to maintain such a derive macro.
If you find a way to make the wrappers emit a dynamic message explaining what failed, then I would very much like to see that.
serde attributes are indeed not always documented, but expecting
is listed under container attributes.
If you find a way to make the wrappers emit a dynamic message explaining what failed, then I would very much like to see that.
I understand. I will let you know if I find some way to do it!
serde attributes are indeed not always documented, but
expecting
is listed under container attributes.
That attribute has just been documented moments ago thanks to a PR I submitted :wink: