xml-rs

[Question] Why is this library not part of the Serde ecosystem?

Open robinmoussu opened this issue 5 years ago • 2 comments

I am new to Rust, and thought that basically anything related to serialization/deserialization would be tied to the Serde crate one way or another. This one isn't, and that puzzles me. I ask purely out of curiosity, and not at all as a criticism.

robinmoussu avatar Jun 19 '20 08:06 robinmoussu

I can't speak for either xml-rs or serde, but I've investigated this a bit myself. I think the serde data model and XML's (see the variants of xml_rs::reader::XmlEvent) are different enough that it's frustrating to try to make them work together. serde did an amazing thing already to come up with a data model that's pretty good for several formats, but XML's data model is Not Like The Others.

(btw, not quite all the other formats use serde. There are three popular Rust protobuf implementations that don't use serde. There I think the main reason is just that you want to generate the definition from the language-independent .proto file rather than annotate existing Rust code, and then you might as well just generate what you want directly rather than try to output generated code that uses serde's procedural macros. It wouldn't surprise me if there were other reasons also.)

There are crates that try to combine XML and serde, e.g. serde-xml-rs looks pretty popular. But even ignoring parts of XML you're less likely to care about (say, processing instructions), it's not obvious how to handle things like distinguishing attributes from child elements, handling text interleaved with child elements, or the various complications around namespaces. The recently published xml_serde seems to make a decent stab at this, but I don't think it can (de)serialize all common/useful XML schemas, and you definitely couldn't use it to losslessly round-trip an arbitrary XML document.
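To make the attribute/child and interleaved-text problems concrete, here is a minimal std-only sketch of an XML-shaped tree. `Node`, `elem`, `text`, `escape` and `to_xml` are hypothetical names for illustration, not types or functions from xml-rs, serde, or any of the crates mentioned above, and namespaces are left out entirely.

```rust
/// Hypothetical XML-shaped node (illustration only, not an xml-rs or serde type).
#[derive(Debug, Clone, PartialEq)]
enum Node {
    /// A run of character data between tags.
    Text(String),
    /// An element with ordered attributes and ordered, possibly mixed, children.
    Element {
        name: String,
        attributes: Vec<(String, String)>,
        children: Vec<Node>,
    },
}

/// Convenience constructor for an element.
fn elem(name: &str, attributes: &[(&str, &str)], children: Vec<Node>) -> Node {
    Node::Element {
        name: name.to_string(),
        attributes: attributes
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
        children,
    }
}

/// Convenience constructor for a text node.
fn text(s: &str) -> Node {
    Node::Text(s.to_string())
}

/// Escape the characters that are special in text and attribute values.
fn escape(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

/// Serialize a node, preserving attribute order and child order.
fn to_xml(node: &Node) -> String {
    match node {
        Node::Text(t) => escape(t),
        Node::Element { name, attributes, children } => {
            let mut out = format!("<{}", name);
            for (k, v) in attributes {
                out.push_str(&format!(" {}=\"{}\"", k, escape(v)));
            }
            if children.is_empty() {
                out.push_str("/>");
            } else {
                out.push('>');
                for child in children {
                    out.push_str(&to_xml(child));
                }
                out.push_str(&format!("</{}>", name));
            }
            out
        }
    }
}
```

With this shape, `elem("user", &[("id", "1")], vec![])` and `elem("user", &[], vec![elem("id", &[], vec![text("1")])])` are different values that serialize to different documents, whereas a plain Rust struct with a single `id` field has no natural place to record which form was used. Likewise, the children of `<p>Hello <b>world</b>!</p>` are an ordered mix of text and elements, which does not map cleanly onto named struct fields.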

The yaserde crate depends on xml-rs and is deliberately similar to serde but doesn't actually use serde. Details aside, I think its approach of building a serde-for-just-XML crate on top of a just-streaming-XML crate is the best one, for several reasons:

  • Sometimes you just want streaming parsing/serialization, for example to pretty-print a huge XML document. I think this is possible with serde (see e.g. this doc), but it's not as straightforward as with a SAX-style interface. So it makes sense to me to expose the SAX-like layer.
  • Sometimes you want a general-purpose DOM that faithfully represents the underlying document. That's an easier problem than using schema-specific native Rust types, and there are separate crates like xmltree-rs that do this and share the SAX layer.
  • Not actually using serde means the data model can be just right for XML.
  • Additionally, finding the right API for a serde-like XML thing requires some iteration. Having it in a separate crate means a semver break in the serde-like layer doesn't mean an unnecessary semver break in the more stable SAX-like layer.
  • It can still look quite familiar to someone who uses serde, and that's helpful.
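
To illustrate the streaming point from the first bullet, here is a minimal std-only sketch of pretty-printing from a SAX-style event stream. The `Event` enum is a hypothetical, heavily simplified stand-in for the XmlEvent type in xml-rs (which, as far as I know, also has variants for attributes on start elements, comments, CDATA, processing instructions and so on), and `pretty_print` and `escape_text` are made-up helper names.

```rust
use std::io::{self, Write};

/// Hypothetical, simplified event type (illustration only).
/// Attributes, comments, CDATA, etc. are deliberately left out.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(String),
    Text(String),
    End(String),
}

/// Escape the characters that are special in character data.
fn escape_text(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Pretty-print a stream of events to `out`.
/// Only the current nesting depth is kept in memory, never the whole document.
fn pretty_print<I, W>(events: I, out: &mut W) -> io::Result<()>
where
    I: IntoIterator<Item = Event>,
    W: Write,
{
    let mut depth = 0usize;
    for event in events {
        match event {
            Event::Start(name) => {
                writeln!(out, "{}<{}>", "  ".repeat(depth), name)?;
                depth += 1;
            }
            Event::Text(t) => {
                // Trimming and re-indenting text is only safe when whitespace
                // is insignificant; this is an assumption of the sketch.
                writeln!(out, "{}{}", "  ".repeat(depth), escape_text(t.trim()))?;
            }
            Event::End(name) => {
                depth = depth.saturating_sub(1);
                writeln!(out, "{}</{}>", "  ".repeat(depth), name)?;
            }
        }
    }
    Ok(())
}
```

Because the function consumes any iterator of events and writes to any `Write`, it could in principle sit between a streaming parser and a file without ever materializing a DOM or a schema-specific Rust type, which is the use case where a serde-style interface gets awkward.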

scottlamb avatar Nov 12 '21 20:11 scottlamb

I am new to Rust, and thought that basically anything related to serialization/deserialization would be tied to the Serde crate one way or another. This one isn't, and that puzzles me. I ask purely out of curiosity, and not at all as a criticism.

Not a contributor to either, but xml-rs parses XML, which is a meta-language with more or less arbitrary semantics. That is, XML doesn't in and of itself have semantics; the application (or dialect) you define does.

Serde is a framework for serializing and deserializing Rust data structures, but in XML there's essentially an infinite number of ways to define the serialization of a Rust data structure, each as valid as the next. For instance, even if we restrict ourselves to JSON-ish semantics, off the top of my head I can cite 3 different, independent, and incompatible dialects providing roughly (or exactly) the same semantics:

  • XML-RPC (specifically the underlying serialization scheme)
  • macOS XML plists
  • JSONx, which literally exists as a way to get JSON through XML-processing pipelines without semantics alterations

Any of these could have a serde library (in fact at least two do, both using xml-rs: serde-xml-rpc and plist), but a single Serde implementation couldn't really cover all of them.
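
As a rough illustration of how the same value ends up with three incompatible encodings, here is a std-only sketch. `Value` and the three `to_*` functions are hypothetical names, and the element names are written as I remember them from each format, so treat them as assumptions to check against the respective specs. The output is fragments only: the plist wrapper element and the JSONx namespace declaration are omitted.

```rust
/// Hypothetical JSON-ish value (illustration only).
enum Value {
    Int(i32),
    Str(String),
    List(Vec<Value>),
}

/// Escape the characters that are special in character data.
fn escape(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// XML-RPC style: every value is wrapped in <value>, arrays in <array><data>.
fn to_xml_rpc(v: &Value) -> String {
    let inner = match v {
        Value::Int(n) => format!("<int>{}</int>", n),
        Value::Str(s) => format!("<string>{}</string>", escape(s)),
        Value::List(items) => {
            let body: String = items.iter().map(to_xml_rpc).collect();
            format!("<array><data>{}</data></array>", body)
        }
    };
    format!("<value>{}</value>", inner)
}

/// XML plist style: bare typed elements, no per-value wrapper.
fn to_plist(v: &Value) -> String {
    match v {
        Value::Int(n) => format!("<integer>{}</integer>", n),
        Value::Str(s) => format!("<string>{}</string>", escape(s)),
        Value::List(items) => {
            let body: String = items.iter().map(to_plist).collect();
            format!("<array>{}</array>", body)
        }
    }
}

/// JSONx style: JSON type names as elements under a `json:` prefix.
fn to_jsonx(v: &Value) -> String {
    match v {
        Value::Int(n) => format!("<json:number>{}</json:number>", n),
        Value::Str(s) => format!("<json:string>{}</json:string>", escape(s)),
        Value::List(items) => {
            let body: String = items.iter().map(to_jsonx).collect();
            format!("<json:array>{}</json:array>", body)
        }
    }
}
```

All three functions take the same `Value`, and none of the outputs can be read back by a deserializer written for one of the others, which is why a generic "serde for XML" has to pick (or invent) one mapping.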

masklinn avatar Jan 02 '22 16:01 masklinn

There is a crate for this:

https://lib.rs/crates/serde-xml-rs

I think it's fine that it's implemented in a separate crate.

kornelski avatar May 10 '23 22:05 kornelski