quick-xml
serde example in README doesn't work
I'm having trouble parsing an XML file: I get errors saying "duplicate field". The serde example in the README does something similar, so I copied that code into an integration test, but I can't get that to pass either. I stripped out some of the XML to simplify it, and I'm currently at this:
use quick_xml::de::from_str;
use serde::Deserialize;

#[derive(Debug, Deserialize, PartialEq)]
struct Link {
    rel: String,
    href: String,
    sizes: Option<String>,
}

#[derive(Debug, Deserialize, PartialEq)]
#[serde(rename_all = "lowercase")]
enum Lang {
    En,
    Fr,
    De,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Head {
    title: String,
    #[serde(rename = "link", default)]
    links: Vec<Link>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Script {
    src: String,
    integrity: String,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Body {
    #[serde(rename = "script", default)]
    scripts: Vec<Script>,
}

#[derive(Debug, Deserialize, PartialEq)]
struct Html {
    lang: Option<String>,
    head: Head,
    body: Body,
}

#[test]
fn crates_io() {
let xml = r#"<!DOCTYPE html>
<html lang="en">
<head>
<title>crates.io: Rust Package Registry</title>
<link rel="manifest" href="/manifest.webmanifest">
<link rel="apple-touch-icon" href="/cargo-835dd6a18132048a52ac569f2615b59d.png" sizes="227x227">
</head>
<body>
<noscript>
<div id="main">
<div class='noscript'>
This site requires JavaScript to be enabled.
</div>
</div>
</noscript>
<script src="/assets/vendor-bfe89101b20262535de5a5ccdc276965.js" integrity="sha256-U12Xuwhz1bhJXWyFW/hRr+Wa8B6FFDheTowik5VLkbw= sha512-J/cUUuUN55TrdG8P6Zk3/slI0nTgzYb8pOQlrXfaLgzr9aEumr9D1EzmFyLy1nrhaDGpRN1T8EQrU21Jl81pJQ==" ></script>
<script src="/assets/cargo-4023b68501b7b3e17b2bb31f50f5eeea.js" integrity="sha256-9atimKc1KC6HMJF/B07lP3Cjtgr2tmET8Vau0Re5mVI= sha512-XJyBDQU4wtA1aPyPXaFzTE5Wh/mYJwkKHqZ/Fn4p/ezgdKzSCFu6FYn81raBCnCBNsihfhrkb88uF6H5VraHMA==" ></script>
</body>
</html>
"#;
    let html: Html = from_str(xml).unwrap();
    dbg!(&html); // borrow, so `html` can still be used in the assertion below
    assert_eq!(&html.head.title, "crates.io: Rust Package Registry");
    panic!("Just wanna see the dbg...");
}
This fails with:
thread 'crates_io' panicked at 'called `Result::unwrap()` on an `Err` value: Xml(EndEventMismatch { expected: "link", found: "head" })', tests/meow_tests.rs:70:22
Using Rust 1.43.1, quick-xml 0.18.1.
I believe the example is wrong: it should use valid XML/XHTML, not HTML.
The Serde implementation has check_end_names set to true on the Reader:
https://github.com/tafia/quick-xml/blob/303003f94ce4114fc8c4e4d146d171b3f2cad2b7/src/de/mod.rs#L159
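The parser expects the <link> tag to be closed by a matching </link> tag and returns an EndEventMismatch error when it finds </head> instead. The unwrap() in the test then turns that error into the panic shown above.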
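To make the end-name check concrete, here is a minimal std-only sketch. It is not quick-xml's implementation: the function below is a hypothetical stand-in that keeps a stack of open element names and reports the first closing tag that does not match, which is roughly what the reader does when end-name checking is enabled. The two sample documents are shortened versions of the XML above. The HTML-style one produces the same "expected link, found head" mismatch as the reported error, and the XHTML-style one with self-closing `<link ... />` tags passes.

```rust
/// Hypothetical, simplified model of end-tag name checking.
/// Returns Err((expected, found)) on the first mismatched closing tag.
/// Assumes attribute values never contain '>' (good enough for a sketch).
fn check_end_names(xml: &str) -> Result<(), (String, String)> {
    let mut open: Vec<String> = Vec::new(); // stack of currently open elements
    let mut rest = xml;

    while let Some(start) = rest.find('<') {
        let after = &rest[start + 1..];
        let end = match after.find('>') {
            Some(i) => i,
            None => break, // unterminated tag: out of scope for this sketch
        };
        let tag = &after[..end];
        rest = &after[end + 1..];

        if tag.starts_with('!') || tag.starts_with('?') {
            continue; // DOCTYPE, comments, processing instructions
        }

        if tag.starts_with('/') {
            // Closing tag: it must match the most recently opened element.
            let found = tag[1..].trim().to_string();
            match open.pop() {
                Some(expected) if expected == found => {}
                Some(expected) => return Err((expected, found)),
                None => return Err((String::new(), found)),
            }
        } else if tag.ends_with('/') {
            // Self-closing element such as <link ... />: nothing stays open.
            continue;
        } else {
            // Opening tag: remember its name (the first token, before attributes).
            let name = tag.split_whitespace().next().unwrap_or("");
            open.push(name.to_string());
        }
    }
    Ok(())
}

fn main() {
    // HTML style: <link> is a void element and is never closed.
    let html = r#"<html><head><title>t</title>
        <link rel="manifest" href="/manifest.webmanifest">
        </head></html>"#;

    // XHTML style: the same element, explicitly self-closed.
    let xhtml = r#"<html><head><title>t</title>
        <link rel="manifest" href="/manifest.webmanifest" />
        </head></html>"#;

    println!("{:?}", check_end_names(html));
    println!("{:?}", check_end_names(xhtml));
}
```

Applied to the test above, this suggests writing the `<link>` elements as `<link ... />` (or adding explicit `</link>` tags) so the input is well-formed XML.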
As for the "duplicate field" error, it's hard to say without more information, but if you're trying to deserialize a sequence of elements that have different namespaces, it could be #212.
Otherwise it could be https://github.com/RReverser/serde-xml-rs/issues/55. Despite that issue being on the serde-xml-rs repo, it's a general limitation of Serde's design so it applies to quick-xml's implementation as well.
Edit: Here's an issue for the Serde limitation from this repo: #177. There is also an issue on the Serde repo that mentions quick-xml: https://github.com/serde-rs/serde/issues/1725