serde icon indicating copy to clipboard operation
serde copied to clipboard

Collect errors when deserializing untagged enums

Open killercup opened this issue 5 years ago • 18 comments

This also adds a verbose-debug feature that enables the new behavior.

Previously:

data did not match any variant of untagged enum ReferenceOr at line 6 column 4

Now, with verbose-debug feature enabled:

data did not match any variant of untagged enum ReferenceOr

  • attempted to deserialize Reference but failed with: missing field $ref
  • attempted to deserialize Item but failed with: invalid value: string "lol", expected expected format \dXX at line 6 column 4

cf. #773

killercup avatar Jun 01 '19 14:06 killercup

See this more as a draft that just happens to help me debug stuff locally :)

Bikeshedding of feature name and error format very welcome.

killercup avatar Jun 01 '19 14:06 killercup

Just used this to debug some nested enums that had a typo. Very helpful! Thanks!

BlinkyStitt avatar Aug 16 '19 22:08 BlinkyStitt

@dtolnay are you interested in merging this? I'm not sure what Serde's approach to verbose debugging is/whether there is any precedence for this.

killercup avatar Aug 17 '19 11:08 killercup

What would be holding us back from merging this in? Seems fantastic.

DerrikMilligan avatar Apr 05 '20 07:04 DerrikMilligan

I cannot get this branch to work. Compiling throws a lot of errors like:

error[E0277]: the trait bound `serde_yaml::Mapping: serde::Deserialize<'_>` is not satisfied
  --> src/config.rs:48:2
   |
48 |     #[serde(default)]
   |     ^ the trait `serde::Deserialize<'_>` is not implemented for `serde_yaml::Mapping`
   |
   = note: required by `serde::de::SeqAccess::next_element`

piegamesde avatar Apr 16 '20 20:04 piegamesde

how to use it along with serde_json cause I can override serde version but not serde_json version of serde

Well after try everything, I clone serde_bytes, serde_json, serde, did some git rebase and it's worked. BTW, that help me a lot, need this feature in serde but maybe named verbose_debug

Stargateur avatar Aug 09 '20 11:08 Stargateur

You can use [patch] in your Cargo.toml. Check out this section of the rust book: https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html

On Aug 9, 2020, at 4:09 AM, Antoine PLASKOWSKI [email protected] wrote:

 how to use it along with serde_json cause I can override serde version but not serde_json version of serde

— You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

BlinkyStitt avatar Aug 09 '20 15:08 BlinkyStitt

You can use [patch] in your Cargo.toml. Check out this section of the rust book: https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html On Aug 9, 2020, at 4:09 AM, Antoine PLASKOWSKI @.***> wrote:  how to use it along with serde_json cause I can override serde version but not serde_json version of serde — You are receiving this because you commented. Reply to this email directly, view it on GitHub, or unsubscribe.

This wouldn't work cause there would be two version of serde, and so as @piegamesde reported this cause Deserialize to not be implement for serde_json version of serde.

Stargateur avatar Aug 09 '20 20:08 Stargateur

Thanks for doing this work. Is this behind a flag because the message is too long or because collecting the debugging information comes at a performance cost (or something else)?

yaymukund avatar Sep 17 '20 22:09 yaymukund

Just had a use for this again because the error messages serde gives me when reading in a config file are not helpful to users.

@dtolnay I'd really appreciate if you could have a look at this :)

killercup avatar Sep 25 '20 11:09 killercup

+1 for adding this or more extensive error messages - some failures are very hard to debug.

andrewdavidmackenzie avatar Sep 30 '20 12:09 andrewdavidmackenzie

Can any of the maintainers take a look at this change? Debugging deserialization problems into enums is basically impossible without it.

calavera avatar Dec 31 '20 01:12 calavera

I know it has been asked already but I think maintainers should seriously consider checking this. It looks like a small change and it will be of huge help to all of us using untagged enums in our projects.

panther99 avatar May 27 '21 20:05 panther99

Is there any reason why this is not being considered? This would be so helpful for actually finding errors in YAML files.

torkleyy avatar Jul 29 '21 08:07 torkleyy

I really need something like this. Unfortunately I was not able to use the git version in the PR repo because it's a bit outdated. My use-case: I'm querying an API that either returns an error JSON or the value requested. The API is apparently unstable because after some time, my code no longer works. I'm trying to work out what's not matching in the JSON, but it's really difficult because it's a huge chunk and it almost fits. Is it a type not matching? Is it a field that became optional? I have no way of knowing without looking at each field in my struct and trying to manually match that against the JSON I got.

My workaround is likely going to be: identify an error response without deserializing (i.e. dumb string matching) and then deserializing only the concrete types and use serde_path_to_error to gain some insight.

sztomi avatar Aug 29 '21 11:08 sztomi

@dtolnay ?

Stargateur avatar Aug 29 '21 11:08 Stargateur

@dtolnay there seems to be a fair amount of interest for this issue. We'd all appreciate if you let us know what your concerns are, and if you see a path for it to get merged.

nirvana-msu avatar Feb 11 '22 10:02 nirvana-msu

For anyone who encounters this issue today, you need a branch where the serde version matches your other dependencies.

I took a similar commit (that uses no compiler feature flag) from this fork: https://github.com/jonasbb/serde/tree/dev and cherry picked it on top of mainline's master.

Result: https://github.com/kurtbuilds/serde is version 1.0.144 of serde, and it will give you better error messages. Use it with these lines in your Cargo.toml

[patch.crates-io]
serde_derive = { git = "https://github.com/kurtbuilds/serde" }

Hopefully this is mainlined soon.

kurtbuilds avatar Sep 13 '22 01:09 kurtbuilds

Serde has such a fast release cadence that the repository that @kurtbuilds kindly provided is already out of date. If anyone wants this patch, they practically need to keep their own automatically-updated fork with the patch (using tools such as the wei/pull GitHub Action), or else bend over backwards to pin to a Serde version and make sure that nothing ends up pulling a slightly different one, including dependencies of dependencies. Both approaches are inconvenient and contribute to technical debt in downstream applications.

What has been blocking this improvement for more than three years now? It's a really, really welcome change for those that use Serde to parse user-provided configuration.

Edit: I've set up a fork that is automatically updated with wei/pull here. It should be easy to keep it up to date at least.

Edit 2: wei/pull was not working as expected and I could not troubleshoot the cause. I've switched to a periodic GitHub Actions job that automatically opens and merges PRs instead. The fork is updated to 1.0.147 at the time of writing. For reference, the only relevant patch commit applied to my fork is this one: https://github.com/ComunidadAylas/serde/commit/553519b9f110e0fc56229ad8444053ec032a2511

Edit 3 (November 28th, 2022): wei/pull suddenly started "working" and rewriting the tree so that the patches were discarded. I've disabled it for good and reviewed the GitHub Actions sync job to merge upstream commits by squashing them, so hopefully commit hashes stay constant without the noise of merge commits. I've also updated the fork to 1.0.148. The aforementioned patch commit is now this one.

AlexTMjugador avatar Oct 01 '22 17:10 AlexTMjugador

@dtolnay Did you have any thoughts on this approach? When trying to build user facing tools/services this would be a big help.

If you have thoughts on a different approach that's cool too. It'd be great to get some collaboration going on it.

coltfred avatar Nov 30 '22 21:11 coltfred

https://github.com/serde-rs/serde/pull/2403 introduced new merge conflicts for this PR, which I've just fixed in my fork (if anyone's using my fork and notices any problem due to that, please file an issue on the fork). If conflicts are keeping this PR from being merged, I'd be happy to contribute and submit my updated patches upstream.

Edit (2023-08-04): https://github.com/serde-rs/serde/commit/6f1f38d0467fdf4659d3579e7dfc861ad336f1fc introduced yet another merge conflict, which I've fixed in my fork. The full list of changes from upstream that will be updated automatically as I resolve conflicts can be found here.

Edit 2 (2023-08-04): the above conflict resolution was not enough to get serde to build on at least some dependent projects, as the Deserialize and Serialize procedural macro re-exports at serde fail to locate such procedural macro definitions on serde_derive. I suspect this is caused due to recent changes to precompile serde_derive, but I'm at a loss as why this happens. I'm exploring more maintainable solutions based on custom visitors instead, as recommended in the PR closing comment below.

AlexTMjugador avatar Jun 11 '23 10:06 AlexTMjugador

When will this PR be merged? This is really useful! There is so little misinformation now!

0xffffharry avatar Aug 04 '23 14:08 0xffffharry

One way to substitute this is with serde_path_to_error, for me, that has completely satisfied what this pr did.

Emilgardis avatar Aug 04 '23 16:08 Emilgardis

Thank you @dtolnay for your clarification on the status of this PR - I appreciate being decisive and explicit with PRs as important to certain users as this one has been for years.

AlexTMjugador avatar Aug 04 '23 17:08 AlexTMjugador

This approach was rejected in a different PR at #2376.

@dtolnay I'm sorry but this just doesn't make sense to me. The reasoning in rejecting #2376 suggests that this is a hack and there's a better way to do this, even though that's not true at all.

EDIT: serde-with has also been suggested as a place to implement this, and they also said it's not doable without forking serde_derive.

Implementing a visitor only works when the enum doesn't require rolling the parser back which is not possible in serde or buffering the input, which is as far as I know only possible with an internal API __private::de::Content. The options are to reimplement or use directly. Using it directly is not ideal since it's a private unstable API. Reimplementation seems to be at least 1000 LOC of duplicated code and effort.

There's also serde-untagged which basically only supports matching on the type, so it's not really a solution either. And even then it's also quite complex at more than >1000 LOC.

Forking serde and serde_derive is obviously complicated since that would also require patching the entire dependency chain to use that fixed version. That is on top of maintaining the fork, keeping it up to date and resolving any conflicts.

Implementing a derive macro is not a good option either since it has the same issues as implementing a custom visitor, on top of the even higher complexity of implementing the derive macro.

There is quite a decent amount of interest in this since this PR is 2nd most upvoted PR and would be the 5th most upvoted issue. And there is no good solution still. This PR basically provides the only solution that is not overly complex and works in every case. I think this feature is pretty much basic and a must-have, since currently it's basically impossible to tell what went wrong in an untagged enum, especially when it's the top level type or has a lot of children, then the entire error is completely worthless.

33KK avatar Sep 07 '23 01:09 33KK

When trying to use serde-untagged I also ran into issues with not being able to roll the parser back thus not being able to easily deserialize an enum containing multiple structs. However I guess an alternate implementation could potentially either fork/copy over the Content API or buffer to some other intermediate struct such as serde_json::Value - it does however require re-implementing a fair bit of code for each struct just to get error messages that are decent.

GeeWee avatar Sep 07 '23 04:09 GeeWee

Adding to some of the points exposed by @33KK:

Forking serde and serde_derive is obviously complicated since that would also require patching the entire dependency chain to use that fixed version. That is on top of maintaining the fork, keeping it up to date and resolving any conflicts.

I've been doing that maintenance work for a while now, so other people don't have to. Maintaining that fork has been straightforward so far, barring the time when precompiled binaries where introduced, but they ended up being reverted for a plethora of reasons. That does not solve the problem that patching serde and serde_derive is only viable for end applications, but fortunately those are the most affected by these cryptic error messages.

[...] I think this feature is pretty much basic and a must-have, since currently it's basically impossible to tell what went wrong in an untagged enum, especially when it's the top level type or has a lot of children, then the entire error is completely worthless.

While I agree with this, I recently stumbled upon and documented the little-known expecting container attribute, which allows setting a custom fixed error message when deserializing untagged enums. By design, such a message lacks any situational information, unlike the changes proposed by this PR, but applications can write it so to be user-friendly enough for most circumstances, and supersede this PR for the applicable use cases.

AlexTMjugador avatar Sep 07 '23 09:09 AlexTMjugador