strong-xml
Support for namespaces: how hard?
I don't want to be pessimistic, but IMHO XML namespaces are a hard beast: you can mix nested default namespaces with named (prefixed) namespaces, which makes a document difficult for a human to grasp at first look. I suspect namespaces are also difficult to implement with a strong type system, but on the other hand in Rust we have powerful tools in our belt.
Do you think that implementing XML namespaces in strong-xml could be achieved in a reasonable amount of time (I am deliberately not asking about the short term)? Is it doable for the project, or is it out of scope, at least for now?
Now, here are some possible issues that come to mind.
Each struct deriving XmlRead and XmlWrite is independent of its parent (if A contains b: B, B is still independent of A), but during the read and write process namespaces are inherited from parent structs (I don't like parent/child terminology, which recalls OO polymorphism, but you get the idea in this context). Suppose that A contains c: C and B contains c: C, but A and B have different default namespaces: what is the intended behavior of C in the two contexts? The user would probably expect both namespaces to share the same inner structure (represented by C), so should C inherit its namespace from whichever struct contains it?
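To make the A/B/C question concrete, here is a minimal sketch of the "inherit from the containing struct" answer. Everything in it is an assumption for illustration: `NsContext`, `ExpandedName`, `child_name` and the `urn:example:*` URIs are hypothetical and are not part of strong-xml.

```rust
/// Namespace state forwarded from the enclosing element (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct NsContext<'a> {
    /// Default namespace URI currently in scope, if any.
    default_ns: Option<&'a str>,
}

/// An expanded name: namespace URI plus local name.
#[derive(Debug, PartialEq)]
struct ExpandedName<'a> {
    ns: Option<&'a str>,
    local: &'static str,
}

struct C;

impl C {
    /// C declares no namespace of its own, so its element name is resolved
    /// against whatever default namespace the containing struct put in scope.
    fn expanded_name<'a>(&self, ctx: NsContext<'a>) -> ExpandedName<'a> {
        ExpandedName { ns: ctx.default_ns, local: "c" }
    }
}

struct A { c: C }
struct B { c: C }

impl A {
    const NS: &'static str = "urn:example:a"; // made-up URI
    fn child_name(&self) -> ExpandedName<'static> {
        self.c.expanded_name(NsContext { default_ns: Some(Self::NS) })
    }
}

impl B {
    const NS: &'static str = "urn:example:b"; // made-up URI
    fn child_name(&self) -> ExpandedName<'static> {
        self.c.expanded_name(NsContext { default_ns: Some(Self::NS) })
    }
}
```

With this design the same C ends up with a different expanded name depending on where it is embedded, which is exactly the behavior that would need to be agreed on.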
What's the best way to work with namespaces? From one point of view, each XML document has a finite set of namespaces, so the simplest idea could be to make each namespace an enum variant, with the enum representing the whole set. In that case a trait would be needed so that the enum can be used as a set of namespaces. I am still doubtful about this solution, because I am not sure whether it would cause issues when different XML structures with different sets of namespaces share common structs.
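A rough sketch of the enum-plus-trait idea, just to show its shape. The trait `NamespaceSet`, the enum `DocNs`, the helper `qualified_name` and the URIs are all hypothetical names invented here; nothing like this exists in strong-xml as far as this thread shows.

```rust
/// Hypothetical trait: a closed set of namespaces for one kind of document.
trait NamespaceSet: Copy + Sized + 'static {
    /// Every namespace in the set.
    const ALL: &'static [Self];

    /// The namespace URI.
    fn uri(self) -> &'static str;

    /// The prefix to write; `None` means "use as the default namespace".
    fn prefix(self) -> Option<&'static str>;

    /// Look a namespace up by URI, as a reader would do on an `xmlns` declaration.
    fn from_uri(uri: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|ns| ns.uri() == uri)
    }
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DocNs {
    Main,
    Meta,
}

impl NamespaceSet for DocNs {
    const ALL: &'static [Self] = &[DocNs::Main, DocNs::Meta];

    fn uri(self) -> &'static str {
        match self {
            DocNs::Main => "urn:example:main", // made-up URI
            DocNs::Meta => "urn:example:meta", // made-up URI
        }
    }

    fn prefix(self) -> Option<&'static str> {
        match self {
            DocNs::Main => None,
            DocNs::Meta => Some("m"),
        }
    }
}

/// Build the qualified name written to the document for `local` in `ns`.
fn qualified_name<N: NamespaceSet>(ns: N, local: &str) -> String {
    match ns.prefix() {
        Some(p) => format!("{}:{}", p, local),
        None => local.to_string(),
    }
}
```

The doubt raised above shows up directly: a struct shared between two document types would have to be generic over `N: NamespaceSet` or be tied to one particular enum.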
Should attribute namespaces be handled? In theory it is possible to specify a namespace for each attribute, and I imagine the feature could be resolved entirely at compile time, with no runtime cost for attributes without an explicit namespace (which is the common case). Am I right?
That said, I would like to help implement the feature, but I have a limited amount of time. Surely the author has a better understanding of the effort required to implement XML namespaces. If this is doable and the time I can invest could lead to something useful, then I will be happy to contribute. :relaxed:
Sorry for replying late, I was out of town.
I rarely use the XML namespace feature, so I'm not quite sure about the real-world use cases. Could you provide some examples? What kind of XML file do you want to parse, and what result do you want to get?
The closest thing that I can provide now is adding some macro attributes for declaring and using namespaces:
#[derive(XmlWrite)]
#[xml(tag = "foo")]
#[xml(xmlns:a = "b")]
struct Foo {
    #[xml(ns = "a", attr = "bar")]
    bar: String,
    #[xml(ns = "a", flatten_text = "child")]
    baz: String,
}

let foo = Foo { bar: "bar".into(), baz: "baz".into() };
will result in
<foo xmlns:a="b" a:bar="bar"><a:child>baz</a:child></foo>
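For comparison, here is a hand-written stand-in for what such a derive might expand to. This is only a sketch under assumptions: `write_foo` is a hypothetical function, not generated by strong-xml, and it skips escaping of attribute and text content entirely.

```rust
struct Foo {
    bar: String,
    baz: String,
}

/// Hypothetical hand-written writer: declares prefix `a` for namespace `b`
/// on the root element, then uses that prefix on the attribute and the child.
/// No escaping is performed, so this is only safe for plain text values.
fn write_foo(foo: &Foo) -> String {
    let prefix = "a";
    let uri = "b";
    format!(
        "<foo xmlns:{p}=\"{u}\" {p}:bar=\"{bar}\"><{p}:child>{baz}</{p}:child></foo>",
        p = prefix,
        u = uri,
        bar = foo.bar,
        baz = foo.baz
    )
}
```

Calling `write_foo` on the `foo` value above yields the same string as the expected output.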
The main real-world case, unfortunately, is being able to perform SOAP requests, which generally involve specific usage of namespaces. I say unfortunately because there is an incredible number of better intercommunication protocols out there... but, you know, legacy... :roll_eyes: -- end of rant.
I am working on a crate to handle these problems, and the more I do, the less sure I am that I want this feature in strong-xml: XML namespaces do not work well with composition unless you forward some sort of namespace state to children, and this introduces some overhead (in theory, I think it could be possible to partially avoid this issue using complete const generics, but it would probably be kind of messy anyway). Moreover, for my use case I decided to bind the structs to a single namespaces enum, which is very questionable and not a good solution for a public API. On the other hand, if namespaces need to be checked at compile time (which I really want, because they are incredibly error-prone), it is not easy to find a design that does not end up being limiting or carrying avoidable runtime overhead.
I hope that in the future I will be allowed to release the source code of the ~mess~ crate I am developing. That would make it easier for you and other people to see, in practice, how I tackled the problem, the limitations of my solution and the mistakes I made, in order to think about a better design that could be used inside strong-xml or other major XML crates.
P.S.: at the end of my journey I will show you the resulting annotated struct. It may be useful for getting an idea of what the underlying code is expected to do.
Lots of standard formats, like EPUB or COLLADA, tend to use namespaces.
I totally forgot to write some updates about this.
I wrote a package for XML serialization and deserialization with namespace support. Unfortunately the project is private and I cannot publish anything related, but I can write about my experience.
First of all, it is definitely possible to support namespaces using struct and field attributes. On the other hand, it is incredibly painful, because there is a huge number of edge cases. For instance, if a struct B declares a default namespace and is included in another struct A that uses a default-namespace attribute on the member b: B, the member namespace should probably take precedence over the declaration on B. But if that namespace is already the default one in scope, the whole namespace override should be omitted for b.
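The precedence rule in that example can be written down as a small function. This is a sketch of the rule as described here, not code from the private package; `xmlns_to_emit` and its parameters are hypothetical names, and it does not cover un-declaring a default namespace with `xmlns=""`.

```rust
/// Decide which default-namespace declaration, if any, to write on a child element.
///
/// - `field_ns`: namespace set by an attribute on the member in the parent (e.g. on `b: B`)
/// - `type_ns`:  default namespace declared on the child's own type (e.g. on `B`)
/// - `in_scope`: default namespace already in scope from the enclosing element
fn xmlns_to_emit<'a>(
    field_ns: Option<&'a str>,
    type_ns: Option<&'a str>,
    in_scope: Option<&'a str>,
) -> Option<&'a str> {
    // The member-level namespace takes precedence over the type-level one.
    let wanted = field_ns.or(type_ns);
    match wanted {
        // Only emit a declaration when it differs from what is already in scope.
        Some(ns) if Some(ns) != in_scope => Some(ns),
        // Either nothing was requested (inherit) or it is already the default.
        _ => None,
    }
}
```

Even this tiny rule has four input combinations worth testing, which gives a feel for how quickly the edge cases multiply.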
I don't want to be pessimistic, but in my experience attribute-based namespace support for XML should be handled as a first-class feature, otherwise it could end up as a huge, unmaintainable mess. I don't know if it is "too late" for strong-xml; the feature would surely require a lot of work and a substantial rewrite of the codebase.
The usefulness is extremely relative: if you need to support something that depends heavily on namespaces (e.g. SOAP), anything only partially complete would probably not be sufficient. On the other hand, for most use cases, namespace support would be almost useless.
I am currently working on support for this. Nothing usable yet but I have a clear path forward.
Hey guys, I am back. I got an implementation about 70% complete on a separate branch of my fork, and I learned a lot about how I could do things. So, out of respect for the author's original code, I am going to do a rewrite that is much closer to the original. If we are lucky, this will potentially also include a way to make it easier to generate XSDs.
Also, my git skills are a little lacking, so sorry in advance.