serde_with

feat: force option to serialize as variants (with = option_as_enum)

Open CAD97 opened this issue 3 years ago • 3 comments

This is one way to successfully round-trip values such as Option<Option<u32>> through formats that have unwrap_or_null behavior by default (as opposed to, e.g., double_option or unwrap_or_skip).
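To make the round-trip problem concrete, here is a minimal std-only sketch. The `Json` enum and both `encode_*` functions are hypothetical stand-ins for a JSON-like format; they are not serde or serde_json APIs. The sketch assumes the default behavior maps every `None` to null and treats `Some` as transparent.

```rust
/// Toy stand-in for a JSON-like value (hypothetical, not serde_json::Value).
#[derive(Debug, PartialEq)]
enum Json {
    Null,
    Num(u32),
    Str(&'static str),
    /// Single-key object, e.g. { "Some": ... }.
    Obj(&'static str, Box<Json>),
}

/// Default-style encoding: every `None` becomes null, `Some` is transparent.
fn encode_default(v: Option<Option<u32>>) -> Json {
    match v {
        None | Some(None) => Json::Null,
        Some(Some(n)) => Json::Num(n),
    }
}

/// Outer layer encoded as enum variants; inner layer keeps the default encoding.
fn encode_outer_as_enum(v: Option<Option<u32>>) -> Json {
    match v {
        None => Json::Str("None"),
        Some(inner) => {
            let payload = match inner {
                None => Json::Null,
                Some(n) => Json::Num(n),
            };
            Json::Obj("Some", Box::new(payload))
        }
    }
}

fn main() {
    // The default encoding collapses two distinct values into one.
    assert_eq!(encode_default(None), encode_default(Some(None)));
    // The variant encoding keeps them apart, so decoding can be lossless.
    assert_ne!(encode_outer_as_enum(None), encode_outer_as_enum(Some(None)));
    println!("{:?}", encode_outer_as_enum(Some(Some(0))));
}
```

Only the outer `Option` needs the variant encoding here, because the inner layer's null is already unambiguous once it sits inside the "Some" wrapper.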

Example implementation:

use serde; // 1.0.130
use serde_json; // 1.0.67

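mod option_as_enum {
    /// Serialize `Option<T>` as an externally tagged enum named "Option"
    /// instead of using the format's default `Option` handling.
    pub fn serialize<T, S>(value: &Option<T>, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::ser::Serializer,
        T: serde::ser::Serialize,
    {
        match value {
            // Unit variant with index 0.
            None => serializer.serialize_unit_variant("Option", 0, "None"),
            // Newtype variant with index 1, wrapping the inner value.
            Some(value) => serializer.serialize_newtype_variant("Option", 1, "Some", value),
        }
    }

    /// Deserialize the enum representation back into `Option<T>`.
    pub fn deserialize<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
    where
        T: serde::de::Deserialize<'de>,
        D: serde::de::Deserializer<'de>,
    {
        // Local mirror of `Option`; the derive supplies the enum deserialization logic.
        #[derive(serde::Deserialize)]
        #[serde(rename = "Option")]
        enum Maybe<T> {
            None,
            Some(T),
        }
        match serde::de::Deserialize::deserialize(deserializer)? {
            Maybe::None => Ok(None),
            Maybe::Some(value) => Ok(Some(value)),
        }
    }
}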

#[derive(Debug, serde::Serialize, serde::Deserialize)]
struct S {
    #[serde(with = "option_as_enum")]
    f: Option<Option<u32>>,
}

fn main() {
    let vs = [
        S { f: None },
        S { f: Some(None) },
        S { f: Some(Some(0)) },
    ];
    let serialized = serde_json::to_value(&vs).unwrap();
    let deserialized: [S; 3] = serde_json::from_value(serialized.clone()).unwrap();
    println!("starting: {:#?}\nserialized: {:#}\ndeserialized: {:#?}", vs, serialized, deserialized);
}

Output, showing behavior (manually reformatted):

starting: [
    S { f: None },
    S { f: Some(None) },
    S { f: Some(Some(0)) },
]
serialized: [
  { "f": "None" },
  { "f": { "Some": null } },
  { "f": { "Some": 0 } }
]
deserialized: [
    S { f: None },
    S { f: Some(None) },
    S { f: Some(Some(0)) },
]

CAD97, Sep 23 '21 02:09

This seems like a useful addition. I think I would prefer this to exist as a serde_as-compatible type, which would allow it to be used for nested types. It could then be used for Vec<Option<T>> or for Option<Option<Option<T>>>, if you want to ensure that all three layers of Option are serialized.
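The per-layer composition described here can be illustrated with a small std-only sketch. The `EncodeTagged` trait and the `Json` enum are hypothetical names used only for illustration; serde_with's actual mechanism (serde_as with SerializeAs/DeserializeAs adapter types) works differently. The sketch only shows that tagging every `Option` layer keeps all nested states distinct.

```rust
/// Toy stand-in for a JSON-like value (hypothetical).
#[derive(Debug, PartialEq)]
enum Json {
    Num(u32),
    Str(&'static str),
    /// Single-key object, e.g. { "Some": ... }.
    Obj(&'static str, Box<Json>),
    Arr(Vec<Json>),
}

/// Hypothetical trait: encode a value with every `Option` layer tagged.
trait EncodeTagged {
    fn encode(&self) -> Json;
}

impl EncodeTagged for u32 {
    fn encode(&self) -> Json {
        Json::Num(*self)
    }
}

/// Each `Option` layer becomes "None" or { "Some": <inner> }.
impl<T: EncodeTagged> EncodeTagged for Option<T> {
    fn encode(&self) -> Json {
        match self {
            None => Json::Str("None"),
            Some(inner) => Json::Obj("Some", Box::new(inner.encode())),
        }
    }
}

/// Containers forward to their elements, so Vec<Option<T>> works as well.
impl<T: EncodeTagged> EncodeTagged for Vec<T> {
    fn encode(&self) -> Json {
        Json::Arr(self.iter().map(|item| item.encode()).collect())
    }
}

fn main() {
    let values: Vec<Option<Option<Option<u32>>>> =
        vec![None, Some(None), Some(Some(None)), Some(Some(Some(7)))];
    // Four distinct inputs produce four distinct encodings.
    for value in &values {
        println!("{:?}", value.encode());
    }
}
```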

Would there be a benefit to also offering different enum representations, i.e., internally and adjacently tagged? Externally tagged is the most efficient one and also works with non-self-describing formats.

jonasbb, Sep 23 '21 20:09

I don't think internal/adjacent tagging would see any use; it's not really useful to have such structure for a unit variant, imho. But maybe I'm wrong; if the implementation effort is minimal enough, it might be worth providing just for completeness. That said, I do think it is useful to provide the different ways of mapping Option to other common enum representations in the serde data model.

  • unwrap_or_skip
    • none => {nothing}
    • some => {inner}
  • unwrap_or_unit (untagged)
    • none => unit
    • some => {inner}
  • as_option_variant (externally tagged)
    • none => unit_variant
    • some => newtype_variant
  • unwrap_or_tuple
    • none => unit
    • some => tuple { 0: {inner} } (this coerces transparent-newtype (encouraged) formats to emit a wrapper)
  • (internal, adjacent tagging)
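The mappings in this list can be written down as a small std-only sketch. The `Repr` enum is a hypothetical toy model of the relevant serde data model shapes, and the four functions only borrow the names from the list; none of this is existing serde_with API. For `unwrap_or_tuple`, the sketch takes `Some` to map to a one-element tuple.

```rust
/// Toy model of what a field can turn into (hypothetical names).
#[derive(Debug, PartialEq)]
enum Repr<T> {
    /// The field is not emitted at all.
    Skipped,
    Unit,
    /// The inner value is emitted transparently.
    Inner(T),
    UnitVariant(&'static str),
    NewtypeVariant(&'static str, T),
    /// One-element tuple wrapping the inner value.
    Tuple1(T),
}

fn unwrap_or_skip<T>(v: Option<T>) -> Repr<T> {
    match v {
        None => Repr::Skipped,
        Some(inner) => Repr::Inner(inner),
    }
}

fn unwrap_or_unit<T>(v: Option<T>) -> Repr<T> {
    match v {
        None => Repr::Unit,
        Some(inner) => Repr::Inner(inner),
    }
}

fn as_option_variant<T>(v: Option<T>) -> Repr<T> {
    match v {
        None => Repr::UnitVariant("None"),
        Some(inner) => Repr::NewtypeVariant("Some", inner),
    }
}

fn unwrap_or_tuple<T>(v: Option<T>) -> Repr<T> {
    match v {
        None => Repr::Unit,
        Some(inner) => Repr::Tuple1(inner),
    }
}

fn main() {
    println!("{:?} / {:?}", unwrap_or_skip::<u32>(None), unwrap_or_skip(Some(5u32)));
    println!("{:?} / {:?}", unwrap_or_unit::<u32>(None), unwrap_or_unit(Some(5u32)));
    println!("{:?} / {:?}", as_option_variant::<u32>(None), as_option_variant(Some(5u32)));
    println!("{:?} / {:?}", unwrap_or_tuple::<u32>(None), unwrap_or_tuple(Some(5u32)));
}
```

In this model only `as_option_variant` and `unwrap_or_tuple` wrap the `Some` case, which is what lets a nested `Option` survive a round trip.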

CAD97, Sep 23 '21 22:09

I think the listed variants work OK and could be added. unwrap_or_skip and unwrap_or_unit seem very similar, though, and would differ only in their deserialization behavior. They can use the same serialization function, but in the unwrap_or_skip case it would never be called for None. For deserialization, you could also use the same function, which would permit deserializing "foobar": null in the unwrap_or_skip case. This might or might not be desired. Other than that, the difference is mainly in the serde attributes that need to be applied to a field.

/// unwrap_or_skip
#[serde(with = "unwrap_or_skip", skip_serializing_if = "Option::is_none", default)]
foobar: Option<i32>,

/// unwrap_or_unit
#[serde(with = "unwrap_or_unit")]
foobar: Option<i32>,
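The deserialization difference between the two attribute sets can be modeled with a short std-only sketch. `Field` and `read_field` are hypothetical; the sketch assumes that a shared deserialize function maps null to None, that the `default` attribute supplies None for a missing field, and that a missing field without `default` is an error. The error text is illustrative only.

```rust
/// What the input contains for the key "foobar" (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Field {
    Missing,
    Null,
    Value(i32),
}

/// Models reading the field with a shared deserialize function.
/// `has_default` stands for the `default` attribute used with unwrap_or_skip.
fn read_field(field: Field, has_default: bool) -> Result<Option<i32>, String> {
    match field {
        // `default` fills in None without calling the deserialize function.
        Field::Missing if has_default => Ok(None),
        Field::Missing => Err("missing field `foobar`".to_string()),
        // The shared function accepts null in both configurations.
        Field::Null => Ok(None),
        Field::Value(n) => Ok(Some(n)),
    }
}

fn main() {
    for field in [Field::Missing, Field::Null, Field::Value(3)] {
        println!(
            "{:?}: unwrap_or_skip={:?} unwrap_or_unit={:?}",
            field,
            read_field(field, true),
            read_field(field, false)
        );
    }
}
```

Rejecting null in the unwrap_or_skip case would require a separate, stricter deserialize function, which is the trade-off raised above.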

If there is no immediate desire to add internal and adjacent variants, I would not implement them until they are requested.

Do you want to start implementing some of these variants and send a PR? I can assist too. As mentioned, I would prefer them to be structs implementing SerializeAs/DeserializeAs traits.

jonasbb, Sep 26 '21 12:09