serde
Combination of flattened internally-tagged enum and deny_unknown_fields results in unsatisfiable requirements
I think it is best explained by a bit of code:
#[macro_use]
extern crate serde_derive;
extern crate serde; // 1.0.70
extern crate serde_json; // 1.0.24
#[derive(Deserialize, Debug)]
#[serde(tag = "type")]
enum Y {
    A {
        x: i32,
    },
    B,
}
#[derive(Deserialize, Debug)]
#[serde(deny_unknown_fields)]
struct X {
    #[serde(flatten)]
    y: Y,
    z: Option<i32>,
}
fn main() {
    let x: X = serde_json::from_str(r#"{"type": "A", "x": 32 }"#).unwrap();
    println!("{:?}", x);
}
This panics with "unknown field type". If the type field is removed from the input, it complains that type is missing, as if it can't make up its mind whether it wants type or not.
If deny_unknown_fields is removed from the X struct, the input parses fine.
I ran into this, too.
The reason for it is kinda complicated. When parsing untagged or internally tagged enums, the enum has to look at every field, searching for the "type" field or other identifying features. Because it has to look at them all, it would normally consume every one and then only use a few. However, that would break situations where multiple flattened fields sit next to each other. So instead it doesn't mark any field as consumed, which lets the other flattened fields work normally. Unfortunately, that causes deny_unknown_fields to raise false alarms.
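To make the trade-off concrete, here is a toy model in plain Rust. It is only a sketch: Buffer, take_z, greedy_enum and peeking_enum are hypothetical names made up for illustration, not serde's real internals. It assumes a shared buffer where a consumed entry is replaced by None.

```rust
// Toy model only: these names are hypothetical and are not serde's internals.
// A consumed entry is represented by `None`.
type Buffer = Vec<Option<(String, i32)>>;

/// The buffered map `{"x": 32, "z": 7}`.
fn sample() -> Buffer {
    vec![Some(("x".to_string(), 32)), Some(("z".to_string(), 7))]
}

/// A flattened struct that only wants the key `z` and consumes that entry.
fn take_z(buf: &mut Buffer) -> Option<i32> {
    let idx = buf
        .iter()
        .position(|slot| matches!(slot, Some((k, _)) if k == "z"))?;
    buf[idx].take().map(|(_, v)| v)
}

/// An enum-like consumer that has to see every entry and consumes them all.
fn greedy_enum(buf: &mut Buffer) -> Vec<String> {
    buf.iter_mut()
        .filter_map(|slot| slot.take())
        .map(|(k, _)| k)
        .collect()
}

/// An enum-like consumer that sees every entry by reference and consumes nothing.
fn peeking_enum(buf: &Buffer) -> Vec<String> {
    buf.iter().flatten().map(|(k, _)| k.clone()).collect()
}
```

In this model, running greedy_enum first leaves nothing for take_z, which is the "multiple flattened fields" breakage. Running peeking_enum first keeps take_z working, but it also leaves x in the buffer, and that leftover is what a strict unknown-fields check would trip over.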
The issue is complex enough that a solution seems impossible without a major rewrite of how flattening works.
It would probably be a good idea to add to the docs that deny_unknown_fields doesn't work when one of the fields is a flattened enum.
I've also got this issue.
I don't know much about serde's implementation, but maybe it wouldn't be too difficult to implement it like this:
 #[derive(Deserialize, Debug)]
-#[serde(tag = "type")]
+#[serde(untagged)]
 enum Y {
     A {
         x: i32,
     },
     B,
 }
 #[derive(Deserialize, Debug)]
 #[serde(deny_unknown_fields)]
 struct X {
-    #[serde(flatten)]
+    #[serde(flatten_with_tag = "type")]
     y: Y,
     z: Option<i32>,
 }
Here flatten_with_tag would be a new attribute that combines flatten with inserting a new discriminator field, "type".
Is there any workaround to this or a way to implement Deserialize manually to work around it?
EG:
#[derive(Deserialize)]
enum EnumName {
    A,
    B,
}
#[derive(Deserialize)]
#[serde(untagged)]
enum EnumValue {
    A(AData),
    B(BData),
}
struct Enum {
    name: EnumName,
    #[serde(flatten)]
    value: EnumValue
}
impl<'de> Deserialize<'de> for Enum {
    // ...
}
struct MyOutermostStruct<'a> {
    other_field_1: &'a str,
    #[serde(flatten)]
    some_field: Enum,
    other_field_2: u64,
}
Been looking into this, but I don't understand implementing Deserialize well enough to know what the proper way to do this would be.
And yeah, in the case I'm dealing with, the enum's data fields are flattened into the outer structure, and the enum is internally tagged as well.
An update: you can do this with a lot of boilerplate using:
#[serde(try_from = "...", into = "...")]
together with a wrapper that mirrors the struct but also contains a field to discriminate the variant.
Like this:
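// Clone is needed because serde's `into` attribute requires it (FooEnum must be Clone too).
#[derive(Clone, Serialize, Deserialize)]
#[serde(try_from = "FooWrapper", into = "FooWrapper")]
pub struct Foo {
    example: i32,
    enum_value: FooEnum,
}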
#[derive(Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct FooWrapper<'a> {
    #[serde(rename = "C")]
    example: i32,
    #[serde(rename = "A?")]
    enum_value_variant: &'a str,
    #[serde(flatten)]
    enum_value: FooEnum,
}
impl<'a> std::convert::TryFrom<FooWrapper<'a>> for Foo {
    type Error = String;
    fn try_from(that: FooWrapper<'a>) -> Result<Self, Self::Error> {
        let FooWrapper { example, enum_value, enum_value_variant } = that;
        if enum_value_variant != enum_value.variant() {
            return Err(format!("Mismatched variant, expecting '{}', got '{}'", enum_value_variant, enum_value.variant()));
        }
        Ok(Self { example, enum_value })
    }
}
impl<'a> From<Foo> for FooWrapper<'a> {
    fn from(that: Foo) -> Self {
        let Foo { example, enum_value } = that;
        let enum_value_variant = enum_value.variant();
        FooWrapper { example, enum_value, enum_value_variant }
    }
}
Where variant returns an identifier for each variant on the enum. The wrapper has a lifetime because I just used a &'a str.
I imagine this could all be done with derive macros, and more efficient reuse of the existing EnumAccess implementation rather than needing a separate variant method.
It fails appropriately when unknown fields are encountered, because the enum is externally tagged, which is unambiguous.
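For reference, a variant helper along those lines could look roughly like the sketch below. The definition of FooEnum is not shown in this thread, so the variants A and B here are hypothetical placeholders, and the serde derives are left out to keep the sketch free of dependencies.

```rust
// Hypothetical variants: the real `FooEnum` definition is not shown above.
#[derive(Debug, Clone, PartialEq)]
pub enum FooEnum {
    A { x: i32 },
    B,
}

impl FooEnum {
    /// Returns a fixed identifier for the active variant.
    /// Returning `&'static str` lets the result be stored in a `&'a str` field.
    pub fn variant(&self) -> &'static str {
        match self {
            FooEnum::A { .. } => "A",
            FooEnum::B => "B",
        }
    }
}
```

Because the identifier is a string literal, it does not borrow from the enum value, so the value can still be moved into the wrapper after variant has been called.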
When X is deserialized, it drains the external (serde_json) deserializer into an internal buffer. It then creates a FlatMapDeserializer around that buffer and expects all keys to be drained from it while its fields are deserialized.
When Y is deserialized, it uses visit_map to drain the provided deserializer (here the FlatMapDeserializer) entirely. Because FlatMapDeserializer is aware of this behavior of the internally tagged deserializer, its MapAccess implementation hands out references and does not drain the entries:
https://github.com/serde-rs/serde/blob/05a5b7e3c6de502d45597cbc083f28bc1d4f4626/serde/src/private/de.rs#L2752-L2768
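The sequence described above can be imitated with a small std-only model. This is a simplified sketch, not the code behind the link: Entry, take_field, read_tag, leftover_keys and deserialize_x_model are hypothetical names, and values are kept as strings for brevity. It runs on the input from the original report, {"type": "A", "x": 32}.

```rust
// Toy model only: names and types are hypothetical, not serde's private API.
// A drained entry is represented by `None`.
type Entry = Option<(String, String)>;

/// Ordinary field: finds its key and takes the entry out of the buffer.
fn take_field(buf: &mut [Entry], key: &str) -> Option<String> {
    let idx = buf
        .iter()
        .position(|e| matches!(e, Some((k, _)) if k == key))?;
    buf[idx].take().map(|(_, v)| v)
}

/// Flattened internally tagged enum: reads the tag by reference, drains nothing.
fn read_tag(buf: &[Entry]) -> Option<String> {
    buf.iter()
        .flatten()
        .find(|(k, _)| k == "type")
        .map(|(_, v)| v.clone())
}

/// Keys of all entries that are still present in the buffer.
fn leftover_keys(buf: &[Entry]) -> Vec<String> {
    buf.iter().flatten().map(|(k, _)| k.clone()).collect()
}

/// Models deserializing `X`; returns the variant tag on success.
fn deserialize_x_model(deny_unknown_fields: bool) -> Result<String, String> {
    let mut buf: Vec<Entry> = vec![
        Some(("type".to_string(), "A".to_string())),
        Some(("x".to_string(), "32".to_string())),
    ];
    // Field `y`: the enum only looks at the entries.
    let tag = read_tag(&buf).ok_or("missing field `type`")?;
    // Field `z`: optional and absent in this input, so nothing is taken.
    let _z = take_field(&mut buf, "z");
    // Strict check: anything still in the buffer is treated as unknown.
    if deny_unknown_fields {
        if let Some(first) = leftover_keys(&buf).first() {
            return Err(format!("unknown field `{}`", first));
        }
    }
    Ok(tag)
}
```

With the strict check enabled, the model rejects the input because type (and x) are still in the buffer after the enum has read them. With the check disabled it accepts the input, which matches the behavior reported at the top of this issue.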
This problem could be related to #2200