serde icon indicating copy to clipboard operation
serde copied to clipboard

Binary serialize/deserialize inconsistency of enums with skipped fields

Open farmaazon opened this issue 11 months ago • 1 comments

When running tests on this snippet:

use serde::{Serialize, Deserialize};

#[derive(Debug, Eq, PartialEq, Serialize, Deserialize)]
enum Abcd {
    A,
    #[serde(skip)]
    B,
    C,
    D,
}

#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn bincode() {
        let c = Abcd::C;
        let serialized = bincode2::serialize(&c).unwrap();
        let deserialized: Abcd = bincode2::deserialize(&serialized).unwrap();
        assert_eq!(deserialized, Abcd::C);
    }

    #[test]
    fn postcard() {
        let c = Abcd::C;
        let serialized = postcard::to_allocvec(&c).unwrap();
        let deserialized: Abcd = postcard::from_bytes(&serialized).unwrap();
        assert_eq!(deserialized, Abcd::C);
    }
}

both of them fail, because deserialized is a variant D.

This seems to be a serde_derive problem: when expanding macro, the serialization the value index of C variant is 2:

// ...
        fn serialize<__S>(&self, __serializer: __S) -> _serde::__private::Result<__S::Ok, __S::Error> where __S: _serde::Serializer, {
            match *self {
                Abcd::A => _serde::Serializer::serialize_unit_variant(__serializer, "Abcd", 0u32, "A" ),
                Abcd::B => _serde::__private::Err(_serde::ser::Error::custom("the enum variant Abcd::B cannot be serialized")),
                Abcd::C => _serde::Serializer::serialize_unit_variant(__serializer, "Abcd", 2u32, "C" ),
                Abcd::D => _serde::Serializer::serialize_unit_variant(__serializer, "Abcd", 3u32, "D" ),
            }
        }
// ...

While the FieldVisitor in derived Deserialize expects C being represented by 1u64 if I read code correctly:

// ...
struct __FieldVisitor;
            impl<'de> _serde::de::Visitor<'de> for __FieldVisitor {
                type Value = __Field;
                fn expecting(&self, __formatter: &mut _serde::__private::Formatter) -> _serde::__private::fmt::Result { _serde::__private::Formatter::write_str(__formatter, "variant identifier") }
                fn visit_u64<__E>(self, __value: u64) -> _serde::__private::Result<Self::Value, __E> where __E: _serde::de::Error, {
                    match __value {
                        0u64 => _serde::__private::Ok(__Field::__field0),
                        1u64 => _serde::__private::Ok(__Field::__field2),
                        2u64 => _serde::__private::Ok(__Field::__field3),
                        _ => _serde::__private::Err(_serde::de::Error::invalid_value(_serde::de::Unexpected::Unsigned(__value), &"variant index 0 <= i < 3" )),
                    }
                }
// ...

And indeed, when I implement Deserialize by hand changing only those values, the tests pass.

I'd expect that when both Serialize and Deserialize are derived, then serializing and then deserializing a value always gives same value, regardless of the data format.

farmaazon avatar Sep 12 '23 07:09 farmaazon

I dived into the code and seemingly the problem is that we lose information about actual variant index around here: https://github.com/serde-rs/serde/blob/360606b9a63ba4e594dc20ddf0e20228b60b34cb/serde_derive/src/de.rs#L1217

Here the enumerate is properly called before filter, but the index is used only to construct fieldX identifier.

Then in https://github.com/serde-rs/serde/blob/360606b9a63ba4e594dc20ddf0e20228b60b34cb/serde_derive/src/de.rs#L1339 the fields are enumerated again, this time without the skipped variant.

I could create a PR fixing this, but I'm not sure what is the policy about such breaking changes?

farmaazon avatar Sep 12 '23 07:09 farmaazon