
Support, or at least document the "double option" pattern.

Open golddranks opened this issue 8 years ago • 17 comments

When developing services that communicate with JSON or similar flexible data formats, it's a common pattern to communicate with "changesets", that is, updates to state with only the changed fields present. To represent this kind of data in Rust one can have a struct with every field wrapped as Option<T>, where the None variant signifies a missing field. Diesel, for one, supports this pattern when updating data in the database.

However, when the datatype itself is nullable, the meanings of a "missing field" (= don't update) and a "null value" (= update by setting to null) get conflated. Fortunately, there is a pattern to support this, the double option: Option<Option<T>>. Here the outer option signifies the existence of the field, and the inner option represents the value, or a null.
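As a minimal sketch of how the three states might be consumed once they have been decoded, assuming a hypothetical `User` record with a nullable `nickname` and a hypothetical `UserChangeset` (neither name comes from Serde or Diesel):

```rust
/// Hypothetical stored record; `nickname` is nullable.
#[derive(Debug, Clone, PartialEq)]
struct User {
    name: String,
    nickname: Option<String>,
}

/// Hypothetical changeset: the outer `Option` records whether the field was sent at all.
#[derive(Debug, Default)]
struct UserChangeset {
    name: Option<String>,
    nickname: Option<Option<String>>,
}

/// Apply only the fields that were present in the changeset.
fn apply(user: &mut User, change: UserChangeset) {
    if let Some(name) = change.name {
        user.name = name;
    }
    match change.nickname {
        None => {}                                // field missing: leave the stored value alone
        Some(None) => user.nickname = None,       // explicit null: clear the stored value
        Some(Some(n)) => user.nickname = Some(n), // new value: overwrite
    }
}
```

The match makes the difference between "don't update" and "set to null" explicit at the point where the update is applied.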

Serde doesn't support this pattern by default: the visitor only ever sets the outer Option, so both a missing field and an explicit null come out as None. However, I think that it would be valuable to support the double option pattern, where a null sets the inner Option instead.

Below is an implementation that I'm using with #[serde(deserialize_with = "double_option")].

In case of Option<Option<T>> at the moment, the inner Option never gets set as None – it's essentially transparent, and thus serves no purpose. So I see the double option pattern as more sensible for this case.


use std::fmt::{self, Formatter};
use std::marker::PhantomData;

use serde::de::{Deserialize, Deserializer, Error, Visitor};

pub fn double_option<'de, T: Deserialize<'de>, D>(de: D) -> Result<Option<Option<T>>, D::Error>
    where D: Deserializer<'de>
{
    #[derive(Debug)]
    struct DoubleOptionVisitor<T> {
        _inner: PhantomData<T>,
    }

    impl<'de, T: Deserialize<'de>> Visitor<'de> for DoubleOptionVisitor<T> {
        type Value = Option<Option<T>>;
        fn expecting(&self, formatter: &mut Formatter) -> fmt::Result {
            write!(formatter,
                   "Either a missing field, a field with null/None, or a field with value T")
        }
        // A field that is present but null maps to the inner None.
        fn visit_none<E>(self) -> Result<Self::Value, E>
            where E: Error
        {
            Ok(Some(None))
        }
        // A field that is present with a value maps to the inner Some.
        fn visit_some<D>(self, de: D) -> Result<Self::Value, D::Error>
            where D: Deserializer<'de>
        {
            T::deserialize(de).map(|val| Some(Some(val)))
        }
    }
    // This function only runs when the field is present; the outer None for a
    // missing field has to come from #[serde(default)] on the field.
    de.deserialize_option(DoubleOptionVisitor::<T> { _inner: PhantomData })
}

golddranks avatar Sep 05 '17 11:09 golddranks

Here is a more concise implementation from #984:

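use serde::{Deserialize, Deserializer};

pub fn double_option<'de, T, D>(de: D) -> Result<Option<Option<T>>, D::Error>
    where T: Deserialize<'de>,
          D: Deserializer<'de>
{
    // Deserialize the inner Option<T> as usual, then wrap it in the outer Some.
    Deserialize::deserialize(de).map(Some)
}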

I would be on board with explaining this pattern in an example on the website.

dtolnay avatar Sep 05 '17 14:09 dtolnay

Could we export this function somewhere, say serde::de::double_option? Copy and pasting things like this all over the place is icky.

chris-morgan avatar Nov 01 '17 23:11 chris-morgan

This would be a good candidate for https://github.com/serde-rs/serde/issues/553.

dtolnay avatar Nov 01 '17 23:11 dtolnay

Hello. I've recently discovered that the absence of a value and null are treated the same way. @dtolnay is there a way to make it somehow more generic (https://github.com/serde-rs/serde/issues/984#issuecomment-314143738)? What I mean by generic is being able to operate on untyped JSON values (https://docs.serde.rs/serde_json/index.html#operating-on-untyped-json-values). Thanks!

yamafaktory avatar Nov 05 '18 08:11 yamafaktory

-     a: Option<Option<i32>>,
+     a: Option<Value>,

https://github.com/serde-rs/serde/issues/984#issuecomment-314143738 is already generic to work on any type. The code uses Option<Option<i32>> so that the three cases are represented by None, Some(None), Some(Some(1)). But you can make that Option<Value> and get None, Some(Null), Some(Number(1)).

dtolnay avatar Nov 05 '18 10:11 dtolnay

Thanks for the reply 👍!

yamafaktory avatar Nov 05 '18 12:11 yamafaktory

Another question: is there a way to make that approach recursive without knowing what a will be (untyped JSON value)?

println!("{:?}", serde_json::from_str::<S>("{\"a\":{\"b\":null}}").unwrap());

yamafaktory avatar Nov 05 '18 12:11 yamafaktory

type S = serde_json::Value;

For a Value you can already tell whether some key is absent, present but null, or present and non-null so you don't need anything in this thread.

dtolnay avatar Nov 05 '18 17:11 dtolnay

Thanks a lot!

yamafaktory avatar Nov 05 '18 20:11 yamafaktory

Workaround: instead of Option<Option<i32>>, use Option<(Option<i32>,)> (an Option containing a 1-element tuple). The tuple serializes as a list.

None serializes to null
Some(None) serializes to [null]
Some(Some(3)) serializes to [3]

bgeron avatar Oct 18 '21 10:10 bgeron

I noticed that async-graphql defines a type MaybeUndefined for this, which feels like a semantically more accurate representation than Option<Option<T>>.

djc avatar Mar 30 '22 12:03 djc

For anyone else who arrived here looking to find an existing solution, I wanted to point out this function is already implemented in the serde_with crate:

https://docs.rs/serde_with/latest/serde_with/rust/double_option/index.html

Thanks to @jonasbb !

jacob-pro avatar Aug 12 '22 19:08 jacob-pro

Is there a macro-free implementation that uses a custom type instead of the outer option?

Macros are better for types used for multiple purposes thanks to not having to change the actual type, but editor support is better for plain types and there's no downside if the custom serialization type is necessary for other reasons.

Kinrany avatar May 04 '23 19:05 Kinrany

"I noticed that async-graphql defines a type MaybeUndefined for this, which feels like a semantically more accurate representation than Option<Option<T>>."

There's this type, though using a heavy dep like that just for that type is obviously not great.

oli-obk avatar May 05 '23 09:05 oli-obk

I'm new to Rust, so I might have gotten this wrong (I did this while learning Rust), but Option<Option<T>> seems to me like a really bad, recursive idea. Coming from other programming languages, one Option already feels like too much, but that's acceptable because Rust doesn't have null variables. Two nested Options is insanity, especially if you have to determine whether the field is missing, is null, or has a value. This really looks like one of those cases where a Rust enum can save the day.

Here is my solution: instead of Option<Option<T>>, you use JsonOption. You can convert it to an Option with let o: Option<T> = json_option.to_option();

JsonOption has a custom serializer and deserializer, along with a fake visitor. Every JsonOption field of a struct MUST be annotated with:

#[serde(default)]
#[serde(skip_serializing_if = "JsonOption::is_undefined")]

Cargo.toml:

[package]
name = "variables"
version = "0.1.0"
edition = "2021"

[dependencies]
serde = { version = "1.0", default-features = true, features = ["std","alloc","derive"]}
serde_json = { version = "1.0" }

src/serde_json_util/json_option.rs:

use serde::{Serialize, Serializer, Deserialize, Deserializer};
#[allow(unused_imports)]
use serde_json::json; // rust warning unused imports bug

#[derive(Debug)]
pub enum JsonOption<T> {
    Undefined,
    Null,
    Value(T),
}

pub struct JsonOptionVisitor<T> {
    marker: std::marker::PhantomData<T>,
}

impl<T> JsonOption<T> {
    pub const fn is_undefined(&self) -> bool {
        matches!(*self, JsonOption::Undefined)
    }
    pub const fn undefined() -> Self {
        JsonOption::Undefined
    }
    pub fn to_option(self) -> Option<T> {
        match self {
            JsonOption::Value(v) => Option::Some(v),
            _ => Option::None,
        }
    }
}

impl<T> Default for JsonOption<T> {
    fn default() -> Self {
        JsonOption::Undefined
    }
}

impl<T> Serialize for JsonOption<T>
where
    T: Serialize,
{
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        match *self {
            // Problem here: I can't figure out how to NOT serialize a null (i.e. serialize nothing).
            // This is worked around by #[serde(skip_serializing_if = "JsonOption::is_undefined")] on the struct field.
            // If we could return something here that is not an error and does not cause any serialization,
            // skip_serializing_if would not be necessary on the struct.
            JsonOption::Undefined => serializer.serialize_none(),
            JsonOption::Null => serializer.serialize_none(),
            JsonOption::Value(ref value) => value.serialize(serializer),
        }
    }
}

impl<'de, T> Deserialize<'de> for JsonOption<T>
where
    T: Deserialize<'de>,
{
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        deserializer.deserialize_option(JsonOptionVisitor::<T> {
            marker: std::marker::PhantomData,
        })
    }
}

impl<'de, T> serde::de::Visitor<'de> for JsonOptionVisitor<T>
where
    T: Deserialize<'de>,
{
    type Value = JsonOption<T>;

    #[inline]
    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        formatter.write_str("JsonOption<T>")
    }

    #[inline]
    fn visit_none<E>(self) -> Result<JsonOption<T>, E>
    where
        E: serde::de::Error,
    {
        Ok(JsonOption::Null)
    }

    #[inline]
    fn visit_some<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        T::deserialize(deserializer).map(JsonOption::Value)
    }
}

#[test]
fn test_json_option() {
    // This is not an actual TEST with asserts, just look at the standard output and check the print line messages
    //////// TYPES

    #[derive(
        // Default, // this unfortunately deserializes required strings to "", if this wasn't the case #[serde(default)] would not be necessary on each field
        Debug, Serialize, Deserialize)]
    #[serde(deny_unknown_fields)]
    // #[serde(default)] // this unfortunately deserializes required strings to ""
    struct TestStruct {
        name: String,
        nik: String,
        #[serde(default)] // use default 'undefined' if value not present
        #[serde(skip_serializing_if = "JsonOption::is_undefined")] // do not serialize undefined
        // it would be awesome if we could place this on the JsonOption enum variant directly, however that causes an error in either serial or deserial
        age: JsonOption<u8>,
        #[serde(default)]
        #[serde(skip_serializing_if = "JsonOption::is_undefined")]
        p5: JsonOption<InnerStruct>,
        #[serde(default)]
        #[serde(rename = "coolValueUnit")]
        #[serde(skip_serializing_if = "JsonOption::is_undefined")]
        unit: JsonOption<()>,
        #[serde(alias = "seTwo")] // can deser both se and setwo
        #[serde(default)]
        #[serde(skip_serializing_if = "JsonOption::is_undefined")]
        se : JsonOption<SomeEnum>,
    }

    #[derive(Debug, Serialize, Deserialize)]
    #[serde(rename_all = "snake_case")] 
    enum SomeEnum {
        #[serde(rename = "UAN")] 
        OneFirst,
        TwoSecond,
        #[serde(rename = "TRE")] 
        Three,
    }

    #[derive(Debug, Serialize, Deserialize)]
    #[serde(rename = "AnotherName!")] // currently buggy : https://github.com/serde-rs/serde/issues/2402
    struct InnerStruct {
        x: u32,
    }

    // let option = JsonOption::Value(3).to_option();

    //////// TEST

    // Serialize to JSON undefined (does not serialize undefined)
    let person = TestStruct {
        name: "John Doe".to_owned(),
        nik: "John nik Doe".to_owned(),
        age: JsonOption::Undefined, // field will be omitted
        p5: JsonOption::Undefined,
        unit: JsonOption::Undefined,
        se : JsonOption::Undefined,
    };

    let serialized = serde_json::to_string(&person).unwrap();
    println!("Serialized: {}", serialized);

    // Serialize to JSON null (serializes null)
    let person = TestStruct {
        name: "Jane Doe".to_owned(),
        nik: "John nik Doe".to_owned(),
        age: JsonOption::Null, // field will be set to null
        p5: JsonOption::Null,
        unit: JsonOption::Null,
        se : JsonOption::Null
    };

    let serialized = serde_json::to_string(&person).unwrap();
    println!("Serialized: {}", serialized);

    // Serialize to JSON value (serializes value as if JsonOption wasn't there)
    let person = TestStruct {
        name: "Jim Doe".to_owned(),
        nik: "Jim nik Doe".to_owned(),
        age: JsonOption::Value(30), // field will be set to 30
        p5: JsonOption::Value(InnerStruct { x: 3 }),
        unit: JsonOption::Value(()),
        se : JsonOption::Value(SomeEnum::OneFirst)
    };

    let serialized = serde_json::to_string(&person).unwrap();
    println!("Serialized: {}", serialized);

    // Serialize to JSON value (serializes value as if JsonOption wasn't there) (se is two second variant)
    let person = TestStruct {
        name: "Jim Doe".to_owned(),
        nik: "Jim nik Doe".to_owned(),
        age: JsonOption::Value(30), // field will be set to 30
        p5: JsonOption::Value(InnerStruct { x: 3 }),
        unit: JsonOption::Value(()),
        se : JsonOption::Value(SomeEnum::TwoSecond)
    };

    let serialized = serde_json::to_string(&person).unwrap();
    println!("Serialized: {}", serialized);

    // Deserialize Errors, garbage
    match serde_json::from_str::<TestStruct>(r#"{ dsd[] }"#) {
        Ok(deserialized) => {
            println!("Deserialized: {:?}", deserialized);
        }
        Err(e) => {
            println!("Deserialize ERROR: {}", e);
        }
    }

    // Deserialize Errors, missing required field (non JsonOption field)
    match serde_json::from_str::<TestStruct>(r#"{ "name" : "Got A Name" }"#) {
        Ok(deserialized) => {
            println!("Deserialized: {:?}", deserialized);
        }
        Err(e) => {
            println!("Deserialize ERROR: {}", e);
        }
    }

    // Deserialize Errors, unknown field this is because #[serde(deny_unknown_fields)]
    match serde_json::from_value::<TestStruct>(
        json!({"name": "Janet Doe", "nik" : "2", "surname" : "Doe"}),
    ) {
        Ok(deserialized) => {
            println!("Deserialized: {:?}", deserialized);
        }
        Err(e) => {
            println!("Deserialize ERROR: {}", e);
        }
    }

    // Deserialize Errors, wrong type
    match serde_json::from_value::<TestStruct>(
        json!({"name": "Janet Doe", "nik" : "2", "p5" : "inv_val"}),
    ) {
        Ok(deserialized) => {
            println!("Deserialized: {:?}", deserialized);
        }
        Err(e) => {
            println!("Deserialize ERROR: {}", e);
        }
    }

    // Deserialize optional Undefined, you should see JsonOption::Undefined rather than Option::None
    let deserialized: TestStruct =
        serde_json::from_value(json!({"name": "Janet Doe", "nik" : "2"})).unwrap();
    println!("Deserialized: {:?}", deserialized);

    // Deserialize optional Null, you should see JsonOption::Null rather than Option::None
    let deserialized: TestStruct = serde_json::from_value(
        json!({"name": "Janet Doe", "nik" : "2", "age" : null, "p5":null, "coolValueUnit":null}),
    )
    .unwrap();
    println!("Deserialized: {:?}", deserialized);

    // Deserialize Full with substructure
    let deserialized: TestStruct =
        serde_json::from_value(json!({"name": "Janet Doe", "nik" : "2", "age" : 45, "p5":{"x":3}}))
            .unwrap();
    println!("Deserialized: {:?}", deserialized);

}
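
Note that to_option above maps both Undefined and Null to None. If a lossless bridge to the Option<Option<T>> form discussed earlier in the thread is wanted, one possible sketch is below; the `Tristate` name is hypothetical and only mirrors the shape of JsonOption, so that the block stands on its own without serde:

```rust
/// Hypothetical three-state type with the same shape as `JsonOption<T>`.
#[derive(Debug, Clone, PartialEq)]
enum Tristate<T> {
    Undefined,
    Null,
    Value(T),
}

impl<T> From<Option<Option<T>>> for Tristate<T> {
    fn from(opt: Option<Option<T>>) -> Self {
        match opt {
            None => Tristate::Undefined,         // field missing
            Some(None) => Tristate::Null,        // field present, null
            Some(Some(v)) => Tristate::Value(v), // field present, with a value
        }
    }
}

impl<T> From<Tristate<T>> for Option<Option<T>> {
    fn from(t: Tristate<T>) -> Self {
        match t {
            Tristate::Undefined => None,
            Tristate::Null => Some(None),
            Tristate::Value(v) => Some(Some(v)),
        }
    }
}
```

Both conversions are total and inverse to each other, so no state is lost when moving between the two representations.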

mateiandrei94 avatar Jul 28 '23 21:07 mateiandrei94