rust-csv
rust-csv copied to clipboard
Error Deserializing Struct With `flatten`-ed Tagged Enum
Version
Tested on both v1.1.6
and master
branch.
Briefly describe the bug.
Deserializing a struct with a flattened tagged enum field does not seem to work.
Include a complete program demonstrating a problem.
Code:
#![allow(dead_code)]
use serde::de::DeserializeOwned;
use std::fmt::Debug;
mod tagged_enum {
use serde::Deserialize;
#[derive(Debug, Deserialize)]
pub struct Record {
common: u64,
#[serde(flatten)]
kind: Kind,
}
#[derive(Debug, Deserialize)]
#[serde(tag = "kind", content = "parameter", rename_all = "lowercase")]
enum Kind {
Foo(u64),
Bar(bool),
}
}
mod flattened_struct {
use serde::Deserialize;
#[derive(Debug, Deserialize)]
pub struct Record {
common: u64,
#[serde(flatten)]
inner: Inner,
}
#[derive(Debug, Deserialize)]
struct Inner {
kind: Kind,
parameter: Value,
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "lowercase")]
enum Kind {
Foo,
Bar,
}
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Value {
Num(u64),
Bool(bool),
}
}
const CSV: &[u8] = b"\
common,kind,parameter
1,foo,42
2,bar,true";
fn dbg_records<T>()
where
T: Debug + DeserializeOwned,
{
let mut reader = csv::Reader::from_reader(CSV);
for record in reader.deserialize::<T>() {
println!("{:#?}", record);
}
}
fn main() {
dbg_records::<tagged_enum::Record>(); // Error!
dbg_records::<flattened_struct::Record>(); // OK.
}
What is the observed behavior of the code above?
Fails to deserialize with invalid type: byte array, expected "kind", "parameter", or other ignored fields
error:
Err(
Error(
Deserialize {
pos: Some(
Position {
byte: 22,
line: 2,
record: 1,
},
),
err: DeserializeError {
field: None,
kind: Message(
"invalid type: byte array, expected \"kind\", \"parameter\", or other ignored fields",
),
},
},
),
)
Err(
Error(
Deserialize {
pos: Some(
Position {
byte: 31,
line: 3,
record: 2,
},
),
err: DeserializeError {
field: None,
kind: Message(
"invalid type: byte array, expected \"kind\", \"parameter\", or other ignored fields",
),
},
},
),
)
What is the expected or desired behavior of the code above?
To correctly deserialize records as they are done in the case of flattened structs (also included in code example).
I did a little more digging, using cargo +nightly rustc --profile=check -- -Zunpretty=expanded
to check out the generated Deserialize
implementation.
It looks like flattened structs generate a "field visitor" type that looks like this:
impl<'de> _serde::de::Visitor<'de> for __FieldVisitor {
type Value = __Field;
fn expecting(&self,
__formatter: &mut _serde::__private::Formatter)
-> _serde::__private::fmt::Result {
_serde::__private::Formatter::write_str(__formatter,
"variant identifier")
}
fn visit_u64<__E>(self, __value: u64)
-> _serde::__private::Result<Self::Value, __E> where
__E: _serde::de::Error {
match __value {
0u64 => _serde::__private::Ok(__Field::__field0),
1u64 => _serde::__private::Ok(__Field::__field1),
_ =>
_serde::__private::Err(_serde::de::Error::invalid_value(_serde::de::Unexpected::Unsigned(__value),
&"variant index 0 <= i < 2")),
}
}
fn visit_str<__E>(self, __value: &str)
-> _serde::__private::Result<Self::Value, __E> where
__E: _serde::de::Error {
match __value {
"foo" => _serde::__private::Ok(__Field::__field0),
"bar" => _serde::__private::Ok(__Field::__field1),
_ => {
_serde::__private::Err(_serde::de::Error::unknown_variant(__value,
VARIANTS))
}
}
}
fn visit_bytes<__E>(self, __value: &[u8])
-> _serde::__private::Result<Self::Value, __E> where
__E: _serde::de::Error {
match __value {
b"foo" => _serde::__private::Ok(__Field::__field0),
b"bar" => _serde::__private::Ok(__Field::__field1),
_ => {
let __value = &_serde::__private::from_utf8_lossy(__value);
_serde::__private::Err(_serde::de::Error::unknown_variant(__value,
VARIANTS))
}
}
}
}
Meaning it accepts map keys as:
- indices starting from 0
-
&str
matching the names of the fields -
&[u8]
matching the UTF-8 bytes of the names of the fields
In contrast, tagged enums use a "tag-content-other field visitor":
impl<'de> Visitor<'de> for TagContentOtherFieldVisitor {
type Value = TagContentOtherField;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
write!(
formatter,
"{:?}, {:?}, or other ignored fields",
self.tag, self.content
)
}
fn visit_str<E>(self, field: &str) -> Result<Self::Value, E>
where
E: de::Error,
{
if field == self.tag {
Ok(TagContentOtherField::Tag)
} else if field == self.content {
Ok(TagContentOtherField::Content)
} else {
Ok(TagContentOtherField::Other)
}
}
}
Note that the tagged enum field visitor only accepts &str
field names. This is a problem since the deserializer uses "borrowed bytes deserializer" which forwards all deserialization calls to visit_bytes
which TagContentOtherFieldVisitor
does not implement (or, to be more specific, implements the method by returning an error unconditionally):
https://github.com/BurntSushi/rust-csv/blob/41c71ed353a71526c52633d854466c1619dacae4/src/deserializer.rs#L647
So, I'm not sure how this should be fixed. Should
-
csv
do a little UTF-8 decoding dance?if let Ok(field) = str::from_utf8(field) { seed.deserialize(BorrowedStrDeserializer::new(field)).map(Some) } else { seed.deserialize(BorrowedBytesDeserializer::new(field)).map(Some) }
-
serde::TagContentOtherFieldVisitor
should also add implementations forvisit_bytes
?
Personally, the second option looks "cleaner" to me. Not sure what others think.
I think I have much smaller example that might be related to this issue.