quick-xml icon indicating copy to clipboard operation
quick-xml copied to clipboard

Option<bool> doesn't (de)serialize properly with serde

Open dralley opened this issue 1 year ago • 8 comments

Using the following code:

use serde;
use quick_xml;

#[derive(serde::Serialize, serde::Deserialize)]
pub struct Player {
    spawn_forced: Option<bool>
}

fn main() {
    let data = Player {
       spawn_forced: None,
    };

    let mut deserialize_buffer = String::new();
    quick_xml::se::to_writer(&mut deserialize_buffer, &data).unwrap();
    println!("{}", &deserialize_buffer);

    quick_xml::de::from_reader::<&[u8], Player>(&deserialize_buffer.as_bytes())
                    .unwrap();
}

The code crashes during the deserialize step

[dalley@thinkpad playground]$ cargo run
   Compiling playground v0.1.0 (/home/dalley/Devel/practice/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `target/debug/playground`
<Player><spawn_forced/></Player>
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidBoolean("")', src/main.rs:98:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

If instead spawn_forced is set to Some(true) or Some(false), it works fine. Only None, which serializes to <spawn_forced/>, panics upon deserialization.

[dalley@thinkpad playground]$ cargo run
   Compiling playground v0.1.0 (/home/dalley/Devel/practice/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.44s
     Running `target/debug/playground`
<Player><spawn_forced>true</spawn_forced></Player>
[dalley@thinkpad playground]$ cargo run
   Compiling playground v0.1.0 (/home/dalley/Devel/practice/playground)
    Finished dev [unoptimized + debuginfo] target(s) in 0.28s
     Running `target/debug/playground`
<Player><spawn_forced>false</spawn_forced></Player>

dralley avatar Oct 28 '22 03:10 dralley

Test case minimized from https://github.com/djkoloski/rust_serialization_benchmark/pull/34

dralley avatar Oct 28 '22 03:10 dralley

This is using the current master branch ^^

dralley avatar Oct 29 '22 03:10 dralley

This bug is not specific to bool -- any Option<T> fields that deserializing from empty element, are affecting. Let's look at such XMLs:

<root/>                               <!-- (1) None -->
<root><element/></root>               <!-- (2) ? -->
<root><element>$DATA</element></root> <!-- (3) Some(...) -->

While result for (1) and (3) are pretty obvious, that is hard to say for (2). The JSON equivalent of that structs is

{}
{ "element": null }
{ "element": $DATA }

How we should map the case (2)? serde_json maps it to None always. That means, that, for example, Option<()> and any Option<UnitStruct> are always deserialized as None, while they are serialized into "value": null.

quick-xml currently does the opposite: it always deserializes to Some(...). It is a bit tricky to handle this similar to JSON, because our Deserializer::deserialize_option does not advance event pointer, so the type under Option has access to an Event::Start. We only lookahead to one event, but Deserializer::deserialize_option requires lookahead for 2 events: we need to look at Event::Text in order to see that it is empty and we need to return None. The Event::Text generated, because <element/> is represented as <element></element> for a deserializer and we want no difference in handling between that two representations.

Fixing that likely will have a consequence, that Option<&str> and Option<String> will never be deserialized to Some("").

Mingun avatar Oct 30 '22 10:10 Mingun

How we should map the case (2)? serde_json maps it to None always. That means, that, for example, Option<()> and any Option<UnitStruct> are always deserialized as None, while they are serialized into "value": null.

Would it be possible to use a special attribute value, e.g. <element none="true" />?

dralley avatar Oct 31 '22 02:10 dralley

I would try to avoid that and use #[serde(deserialize_with = "path")] instead, if that would be possible, but handling xsi:nil="true" in that sense could be useful for interop (#464).

Mingun avatar Oct 31 '22 16:10 Mingun

Actually, that case is much more complex then I thought initially. It is unclear, how to process cases like

<element attribute="..."/>

At time of deciding if we should return None or not, we don't known how the type under Option would be deserialized. If it is a

struct Struct {
  #[serde(rename = "@attribute")]
  attribute: (),
}

then this XML snippet are definitely not None.

Because of the nature of XML we like to keep ability to deserialize things like

<element attribute="...">
  content
</element>

into a String "content", i.e. ignore excess attributes. This is unique situation for XML, because in JSON that XML would look like

{
  "@attribute": "...",
  "$text": "content"
}

and could not be deserialized into a String. Unfortunately, the situation when additional attributes added to the element are quite usual, so I think we should support that case. That means, however, that deserializing

<element attribute="..."/>

into an Option<String> we should return the same result as for

<element/>

which suggested to be None.

Mingun avatar Nov 20 '22 19:11 Mingun

Is there any update on this, or maybe some kind of workaraound?

Currently the only way I cound get the serialization of an optional value to work is when the target type is Option<String>, but even then the result is not correct, since empty strings serialize to Some("") and not None.

RoJac88 avatar Jul 28 '23 14:07 RoJac88

The simplest workaround I found if you control both the serializing and de-serializing parts is to use the following serde attributes: #[serde(skip_serializing_if = "Option::is_none", default)]

This problem is still quite bothersome though...

LB767 avatar Aug 21 '23 14:08 LB767