serde-xml-rs icon indicating copy to clipboard operation
serde-xml-rs copied to clipboard

Deserialize directly to vec

Open mward opened this issue 5 years ago • 15 comments

Updated to allow deserializing directly to a Vec, without need to define an intermediate struct. This does not break current functionality, so you can still use intermediate struct if you want.

This will resolve questions like this: https://stackoverflow.com/questions/58571896/rust-serde-deserialize-xml-directly-to-vect

mward avatar Jan 24 '20 17:01 mward

The travis CI build failed on my fork too, I had to clear cache in travis to get past this

mward avatar Jan 24 '20 18:01 mward

Travis error details:

error[E0514]: found crate `serde_derive` compiled by an incompatible version of rustc
 --> README.md:13:1
  |
3 | extern crate serde_derive;
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = help: please recompile that crate using this compiler (rustc 1.40.0 (73528e339 2019-12-16))
  = note: the following crate versions were found:
          crate `serde_derive` compiled by rustc 1.39.0 (4560ea788 2019-11-04): /home/travis/build/RReverser/serde-xml-rs/target/debug/deps/libserde_derive-01b75c0c69c91066.so
error: aborting due to 2 previous errors

mward avatar Jan 25 '20 17:01 mward

@mward, I keep seeing this problem and another one that is similar. As you say, clearing the cache works around the problem. I'd love to find a permanent fix that can be automated.

A few months back, I noticed that the option to clear the cache (https://docs.travis-ci.com/user/caching/#clearing-caches) is no longer available on travis-ci.org.

punkstarman avatar Jan 27 '20 11:01 punkstarman

@mward are you able to rebase/ merge master so we can see if the CI change I made fixes anything 😄 ?

ghost avatar Feb 13 '20 19:02 ghost

@efx I merged upstream master to my forked project.

Thanks

mward avatar Feb 13 '20 20:02 mward

Do we have a schedule for merging this fix ? It one solves part my issue with parsing:

// ----------------------------------------------------
#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all(deserialize = "lowercase"))]
pub struct LookupEntrySingle {
    #[serde(rename = "state")]
    lkup_state: String,

    #[serde(rename = "value")]
    lookup_value: u64,    

    #[serde(rename = "$value", default)]
    lkup_label: String,    
}

#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all(deserialize = "lowercase"))]
pub enum LookupEntry{
    #[serde(rename = "SingleInt")]
    SingleInt(LookupEntrySingle),

    #[serde(rename = "Boolean")]
    Boolean(LookupEntrySingle),

    #[serde(rename = "BitField")]
    BitField(LookupEntrySingle),
}



#[derive(Debug, Deserialize, PartialEq)]
pub struct ValueLookup {
    id: String,
    #[serde(rename = "desiredValue", default)]
    desiredvalue: u64,

    #[serde(rename = "undefinedState", default)]
    undefinedstate: Option<String>,

//    #[serde(rename = "Lookups", default)]
    #[serde(rename = "$value")]
    lookups: Vec<LookupEntry>,

}

static LOOKUP_STD: &'static str = r##"<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="int" desiredValue="1" >
<Lookups>
    <SingleInt state="None"    value="0"  >Unknown</SingleInt>
    <SingleInt state="Ok"      value="1" >OK</SingleInt>
</Lookups>
</ValueLookup>
"##;

#[test]
fn test_lookupInt_JSON() {
/* This one is returns one element of the vec, only the last one 
[
    Object(
        {
            "Lookups": Object(
                {
                    "SingleInt": Object(
                        {
                            "$value": String(
                                "OK",
                            ),
                            "state": String(
                                "Ok",
                            ),
                            "value": String(
                                "1",
                            ),
                        },
                    ),
                },
            ),
            "desiredValue": String(
                "1",
            ),
            "id": String(
                "prtgc.std.lookups.status",
            ),
        },
    ),
]
*/
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(LOOKUP_STD) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

#[test]
fn test_lookupInt(){
// This one works
    let _ = simple_logger::init_with_level(Level::Debug);
    let json_values: Vec<serde_json::Value> = 
    match serde_xml_rs::from_str(LOOKUP_STD) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}


#[test]
fn test_lookupdange() {
// This one does not work
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let s = r##"<?xml version="1.0" encoding="UTF-8"?>
<ValueLookup id="range" desiredValue="1" >
  <Lookups>
    <Range From="100" To="199" state="Ok">Information</Range>
    <Range From="200" To="299" state="Ok">Success</Range>
    <Range From="300" To="399" state="Warning">Redirection</Range>
    <Range From="400" To="499" state="Error">Client Error</Range>
    <Range From="500" To="599" state="Error">Server Error</Range>
  </Lookups>
</ValueLookup>
"##;

    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(s) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

#[test]
fn test_lookupBit() {
// This one does not work
    let _ = simple_logger::init_with_level(log::Level::Debug);
    let s = r##"<?xml version="1.0" encoding="UTF-8"?>
    <ValueLookup id="status" desiredValue="1" >
<Lookups>
  <BitField state="OK" value="0" BitNumber="1">No Alarm</BitField>
  <BitField state="Warning" value="1" BitNumber="1">
    Alarm Active
  </BitField>
</Lookups>
</ValueLookup>
"##;

    let json_values: ValueLookup = 
    match serde_xml_rs::from_str(s) {
        Ok(val) => val,
        Err(err) => {
            println!("{:#?}", err);
            panic!(err);
        },
    };
    println!("{:#?}", json_values);
}

JRAndreassen avatar Apr 20 '20 16:04 JRAndreassen

I've been experimenting with this and it seems to not behave properly when the considered Vec should be empty.

The following test (built similarly to your new vec_container test) does not pass:

/** For reference */
#[derive(Debug, Deserialize, PartialEq)]
struct Shelter {
    animals: Vec<Animal>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Animal {
    name: String,
    foo: String,
}
/********************/

#[test]
fn empty_vec_container() {
    let _ = simple_logger::init();

    let s = r##"
    <shelter>
      <animals>
      </animals>
    </shelter>
    "##;

    let shelter: Shelter = from_str(s).unwrap();

    assert_eq!(shelter.animals, Vec::new());
}
test empty_vec_container ... FAILED

failures:

---- empty_vec_container stdout ----
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(shelter, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked EndElement(animals)
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Pushed EndElement(animals)
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Pushed StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Fetched StartElement(animals, {"": "", "xml": "http://www.w3.org/XML/1998/namespace", "xmlns": "http://www.w3.org/2000/xmlns/"})
2020-04-27 15:19:56,975 DEBUG [serde_xml_rs::de] Peeked EndElement(animals)
thread 'empty_vec_container' panicked at 'called `Result::unwrap()` on an `Err` value: Custom { field: "missing field `name`" }', tests/test.rs:282:28

Other than that it's very practical!

Ten0 avatar Apr 27 '20 13:04 Ten0

@Ten0 I have a fix for the empty_vec_container on my branch.

Before fixing that I realized that the test "test_doctype_fail" was failing for some unknown reason. That test also fails when I run directly from "RReverser/serde-xml-rs" master. I temporarily ignored that test on my branch

mward avatar May 19 '20 15:05 mward

I removed ignore from 'test_doctype_fail' test. Locking xml-rs and thiserror deps to specific versions fixed the problem

mward avatar May 19 '20 17:05 mward

I removed ignore from 'test_doctype_fail' test. Locking xml-rs and thiserror deps to specific versions fixed the problem

I don't beleive pinning these crates to old versions is the right approach, I think it's better to rather fix the reason why it's failing in either the dependencies or this test (depending on which one is the cause). Besides, as you pointed out it's unrelated to this PR, and I had already opened an issue to track this: https://github.com/RReverser/serde-xml-rs/issues/124

Ten0 avatar May 19 '20 18:05 Ten0

Is there anything I can do to help get this merged?

This PR enables me to remove a lot of ugly container structs from my code base which has to support parsing very large and poorly designed XML document.

steven-joruk avatar Oct 08 '20 16:10 steven-joruk

@steven-joruk - I'd like to see this get merged too, but in the meantime I have my Cargo.toml point to:

serde-xml-rs = { git = "https://github.com/mward/serde-xml-rs.git", branch = "deserialize-directly-to-vec" }

BTW - I've merged master onto my fork and the CI build is still passing

mward avatar Oct 08 '20 18:10 mward

This might be my misunderstanding, but I'm seeing my "nested" example fail where I would expect it to succeed. This is pretty much the nested_collection unit test within a container deserialized in to a vec.

use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Root {
    #[serde(rename = "Groups")]
    groups: Vec<Group>,
}

#[derive(Debug, Deserialize)]
struct Group {
    #[serde(rename = "Item")]
    items: Vec<Item>,
}

#[derive(Debug, Deserialize)]
struct Item {}

fn main() {
    let not_nested = "
    <Root>
        <Groups>
            <Group>
            </Group>
        </Groups>
    </Root>
    ";

    let root: Root = serde_xml_rs::from_str(not_nested).unwrap();
    println!("{:#?}", root);

    let nested = "
    <Root>
        <Groups>
            <Group>
                <Item/>
                <Item/>
            </Group>
        </Groups>
    </Root>
    ";

    let root: Root = serde_xml_rs::from_str(nested).unwrap();
    println!("{:#?}", root);
}

Output:

Root {
    groups: [],
}
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnexpectedToken { token: "XmlEvent::EndElement { name, .. }", found: "StartElement(Item, {\"\": \"\", \"xml\": \"http://www.w3.org/XML/1998/namespace\", \"xmlns\": \"http://www.w3.org/2000/xmlns/\"})" }', src/main.rs:41:53

steven-joruk avatar Oct 26 '20 14:10 steven-joruk

Bump

malthe avatar Jan 23 '21 20:01 malthe

Been parsing though this PR to try and understand it and apply it to the latest version of this repo. i was able to cherry pick only commits 42b5c442a9608d333e23d63f962a46d5f4b5116f and 2e4d9f083f0d4b83372b6af58d32b40d07c2d392 and have things work the same (as far as my limited testing) as if i had included this whole PR. In case I don't end up figuring out how this PR works and fixing the issue that @steven-joruk and I are seeing, I hope this information at least helps someone else.

MoralCode avatar Jul 12 '23 05:07 MoralCode