Support XML templates

Open ghost opened this issue 7 years ago • 12 comments

I took a look at maud and wondered whether we could reuse its syntax for XML template support.

XML is a markup language much like HTML

ghost avatar Nov 14 '16 14:11 ghost

I'll be keen for a corresponding xml! macro. AFAIK the only difference is that empty elements end with /> instead of just >.
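To make that difference concrete, here is a minimal sketch in plain Rust. The `Flavor` enum and `empty_element` function are hypothetical names for illustration only; they are not part of maud, and attribute values are assumed to be escaped already.

```rust
/// Output flavor. Hypothetical; maud has no such type.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Flavor {
    Html,
    Xml,
}

/// Renders an element that has no children.
/// Assumes every attribute value is already escaped.
fn empty_element(flavor: Flavor, name: &str, attrs: &[(&str, &str)]) -> String {
    let mut out = String::from("<");
    out.push_str(name);
    for (key, value) in attrs {
        out.push(' ');
        out.push_str(key);
        out.push_str("=\"");
        out.push_str(value);
        out.push('"');
    }
    // The only flavor-specific part: how the empty element is terminated.
    out.push_str(match flavor {
        Flavor::Html => ">",
        Flavor::Xml => "/>",
    });
    out
}
```

Keeping the flavor as a value passed to the renderer is only one possible design; it does not by itself answer the question of how to stop HTML and XML fragments from being mixed.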

Not sure how we'd prevent mixing HTML and XML code together though.

lambda-fairy avatar Nov 15 '16 01:11 lambda-fairy

Marking as "hard", since there are some unanswered design questions (though I believe the actual coding will be easy once it's figured out).

lambda-fairy avatar Nov 17 '16 08:11 lambda-fairy

Would be really interested in this as well!

gauteh avatar Sep 13 '20 07:09 gauteh

Did a quick test with XML generation and maud seems already usable for many use cases. What is it that should be supported for XML?

  • Respecting namespaces. This is invalid XML because there are two attributes with the same local name and the same namespace URI:
    <foo xmlns:a="http://example.com" xmlns:b="http://example.com" a:bar="test" b:bar="toast"/>
    
    Maud currently happily outputs this, while it rightly refuses to produce something like
    <foo a="test" a="toast"/>
    
    But it's already usable. The user still needs to take some responsibility not to output invalid XML. This is also such an edge case that it should rarely pose a problem in practice. It might be more important to make sure that all used namespace prefixes are bound.
  • Arbitrary doctypes besides html
  • Processing instructions
  • Possibly CDATA sections, but that's not really necessary because a consuming application should not distinguish between text stored in CDATA and text that is ampersand-escaped.

I assume that escaping is already working as required. Doctypes with local element type declarations or entity declarations can probably be ignored.
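As a rough illustration of the smaller items on this list, the sketch below shows ampersand escaping, a processing instruction, and a non-HTML doctype in plain Rust. None of these functions exist in maud; `escape`, `processing_instruction`, and `doctype` are hypothetical names, and the validity checks are deliberately simplified.

```rust
/// Escapes text for element content or double-quoted attribute values.
fn escape(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            _ => out.push(ch),
        }
    }
    out
}

/// Builds `<?target data?>`. Returns `None` when the target is empty,
/// contains whitespace, or is the reserved name `xml` (in any case),
/// or when the data contains `?>`.
/// Simplified: a full check would validate `target` as an XML Name.
fn processing_instruction(target: &str, data: &str) -> Option<String> {
    let bad_target = target.is_empty()
        || target.chars().any(char::is_whitespace)
        || target.eq_ignore_ascii_case("xml");
    if bad_target || data.contains("?>") {
        return None;
    }
    Some(format!("<?{} {}?>", target, data))
}

/// Builds a doctype with a system identifier and no internal subset.
/// Returns `None` if the identifier cannot be double-quoted.
fn doctype(root: &str, system_id: &str) -> Option<String> {
    if system_id.contains('"') {
        return None;
    }
    Some(format!("<!DOCTYPE {} SYSTEM \"{}\">", root, system_id))
}
```

Because escaped text and CDATA are equivalent to a conforming parser, `escape` alone covers the cases where a CDATA section might otherwise be used.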

th-we avatar Jan 02 '21 22:01 th-we

Thinking about namespace validation: Here are examples demonstrating the challenges:

use maud::html as xml;
use maud::html as subtree;
use maud::PreEscaped;

fn subtree() -> PreEscaped<String> {
    subtree! {
        subtree ns1:bar="test" ns2:baz="toast" {}
    }
}

fn main() {
    let markup1 = xml! {
        foo xmlns:ns1="NS:1"  {
            (subtree())
        }
    };
    println!("{}", markup1.into_string());


    let reusable_subtree = subtree();
    let markup2 = xml! {
        root xmlns:ns1="NS:1" xmlns:ns2="NS:2" {
            child1 {
                (reusable_subtree)
            }
            child2 xmlns="NS:1" {
                (reusable_subtree)
            }
        }
    };
    println!("{}", markup2.into_string());
}

Problems here are:

  1. For markup1 we end up with an unbound prefix ns2.
  2. In markup2, the <subtree> will end up in two different namespaces. Inside <child1>, it will be interpreted as an element from the empty namespace, and inside <child2>, it will be interpreted as an element from the NS:1 namespace.
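The first problem can be detected once the whole tree is known. The following is a minimal sketch of such a check on a toy element tree; `Element`, `el`, and `unbound_prefixes` are hypothetical and unrelated to maud's internals, which render straight to a string and keep no tree.

```rust
use std::collections::HashMap;

/// Toy element tree, for illustration only.
struct Element {
    name: String,
    attrs: Vec<(String, String)>,
    children: Vec<Element>,
}

/// Convenience constructor for the toy tree.
fn el(name: &str, attrs: &[(&str, &str)], children: Vec<Element>) -> Element {
    Element {
        name: name.to_string(),
        attrs: attrs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
        children,
    }
}

/// Collects every prefix that is used on `node` or its descendants
/// without an `xmlns:prefix` declaration in scope.
fn unbound_prefixes(node: &Element, scope: &HashMap<String, String>, out: &mut Vec<String>) {
    // Declarations on this element extend the inherited scope.
    let mut scope = scope.clone();
    for (key, value) in &node.attrs {
        if let Some(prefix) = key.strip_prefix("xmlns:") {
            scope.insert(prefix.to_string(), value.clone());
        }
    }
    let names = std::iter::once(node.name.as_str())
        .chain(node.attrs.iter().map(|(key, _)| key.as_str()));
    for qname in names {
        if let Some((prefix, _)) = qname.split_once(':') {
            // `xml` and `xmlns` are pre-bound and never need a declaration.
            let reserved = prefix == "xml" || prefix == "xmlns";
            if !reserved && !scope.contains_key(prefix) && !out.iter().any(|p| p == prefix) {
                out.push(prefix.to_string());
            }
        }
    }
    for child in &node.children {
        unbound_prefixes(child, &scope, out);
    }
}
```

Run on a tree shaped like markup1, this reports `ns2`. It does not catch the second problem, because an unprefixed element is well-formed under any default namespace.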

Two alternative suggestions for how to fix these problems:

Messy but easy

It is enforced that all namespace prefixes and the default namespace (even if it's empty!) are re-declared in the subtree markup. Disadvantage: Produces messy XML and macro code with xmlns* attributes all over the place.

fn subtree() -> Subtree<String> {
    subtree! {
        subtree xmlns="" xmlns:ns1="NS:1" xmlns:ns2="NS:2"
            ns1:bar="test" ns2:baz="toast" {}
    }
}

Cleaner, but more involved

The subtree! macro returns a struct with information about all the unbound prefixes and the namespaces it expects them to be bound to. The runtime then has to make sure all the prefixes are bound as expected at the place of insertion. This means it either has to ensure that the existing bindings are OK or it has to create them dynamically.

As most of the time the same prefixes will be used everywhere, it's possible to declare a global constant for the mapping, maybe using a dedicated macro like so:

prefixes! {
    static PREFIXES = [
        ("", ""),
        ("ns1", "NS:1"),
        ("ns2", "NS:2"),
        ("ns3", "NS:3"),
    ]
}

fn subtree() -> Subtree<String> {
    subtree! PREFIXES {
        subtree ns1:bar="test" ns2:baz="toast" {}
    }
}

Because ns3 is not used by the subtree, the returned Subtree struct should only report that the bindings for ns1 and ns2 as well as the empty default namespace are required.
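A minimal sketch of how the required bindings could be computed from the names used in a subtree. The `PREFIXES` table mirrors the example above; `required_bindings` is a hypothetical helper, and in a real macro this would happen at expansion time rather than at runtime.

```rust
use std::collections::BTreeMap;

/// Global prefix table, mirroring the `prefixes!` example.
/// The empty prefix stands for the default namespace.
const PREFIXES: &[(&str, &str)] = &[
    ("", ""),
    ("ns1", "NS:1"),
    ("ns2", "NS:2"),
    ("ns3", "NS:3"),
];

/// Returns only the bindings that the given element and attribute names rely on.
/// Unprefixed elements rely on the default namespace; unprefixed attributes
/// are in no namespace and therefore need no binding.
fn required_bindings(
    elements: &[&str],
    attributes: &[&str],
) -> Result<BTreeMap<&'static str, &'static str>, String> {
    let mut used: Vec<&str> = Vec::new();
    for name in elements {
        used.push(name.split_once(':').map_or("", |(prefix, _)| prefix));
    }
    for name in attributes {
        if let Some((prefix, _)) = name.split_once(':') {
            used.push(prefix);
        }
    }
    let mut required = BTreeMap::new();
    for prefix in used {
        // Reserved prefixes are pre-bound and never reported.
        if prefix == "xml" || prefix == "xmlns" {
            continue;
        }
        match PREFIXES.iter().find(|(p, _)| *p == prefix) {
            Some(&(p, uri)) => {
                required.insert(p, uri);
            }
            None => return Err(format!("prefix `{}` is not declared in PREFIXES", prefix)),
        }
    }
    Ok(required)
}
```

For the `subtree` example this yields the default namespace plus `ns1` and `ns2`, and leaves out `ns3`.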

How could insertions at different places with different namespace bindings (especially different default namespaces) become possible? That might be solvable if, at the place of insertion, the runtime compares the bindings that are in effect to the bindings that are expected by the subtree and declares any missing bindings in the "root" element of the subtree.

The "root" element of the subtree can therefore not be finalized until it is inserted. For the above example this means that a string could be pre-built for ns1:bar="test" ns2:baz="toast"/>. When inserting into the context, the leading part has to be constructed as needed (e.g. <subtree , or <subtree xmlns="" , or <subtree xmlns="" xmlns:ns1="NS:1", ...).
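The deferred root element could look roughly like the sketch below. `Subtree`, its fields, `bindings`, and `render_into` are all hypothetical names, and namespace URIs are assumed to be safe to place inside double quotes without further escaping.

```rust
use std::collections::BTreeMap;

/// Hypothetical result of a `subtree!` macro.
struct Subtree {
    /// Name of the subtree's root element.
    root_name: String,
    /// Everything after the root element's name, pre-rendered,
    /// e.g. ` ns1:bar="test" ns2:baz="toast"/>`.
    rest: String,
    /// Bindings the subtree expects: prefix -> namespace URI.
    /// The empty prefix stands for the default namespace.
    required: BTreeMap<String, String>,
}

/// Convenience helper to build a binding map from string pairs.
fn bindings(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs.iter().map(|(p, u)| (p.to_string(), u.to_string())).collect()
}

impl Subtree {
    /// Finishes the root tag for one insertion point, declaring only the
    /// bindings that `in_scope` does not already provide.
    fn render_into(&self, in_scope: &BTreeMap<String, String>) -> String {
        let mut out = format!("<{}", self.root_name);
        for (prefix, uri) in &self.required {
            let bound = match in_scope.get(prefix) {
                Some(current) => current == uri,
                // With no default declaration in scope, the default namespace is empty.
                None => prefix.is_empty() && uri.is_empty(),
            };
            if !bound {
                if prefix.is_empty() {
                    out.push_str(&format!(" xmlns=\"{}\"", uri));
                } else {
                    out.push_str(&format!(" xmlns:{}=\"{}\"", prefix, uri));
                }
            }
        }
        out.push_str(&self.rest);
        out
    }
}
```

With this shape, the same `Subtree` value can be inserted under `child1` and `child2` from the earlier example and stays in the empty namespace in both places, at the cost of an extra `xmlns=""` in the second.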

The reserved prefixes xml:* and xmlns:* (bound to the namespaces http://www.w3.org/XML/1998/namespace and http://www.w3.org/2000/xmlns/) of course always have to be treated specially as they are reserved and pre-bound without declaration.

th-we avatar Jan 03 '21 01:01 th-we

I think the namespace problem is solvable (probably with an HList that keeps track of what namespaces are in scope).

I'm actually wondering more about use cases.

If the use case is generating SVG/MathML, then what you want isn't XML support, it's SVG/MathML support. Because:

  1. If we know it's SVG/MathML, then we can do more specific checks. (This will become important later when context-aware escaping is implemented.)
  2. When embedded in HTML, SVG/MathML are not parsed as XML, but as HTML foreign elements, which have their own parsing rules unlike both HTML and XML.

lambda-fairy avatar Apr 24 '21 11:04 lambda-fairy

My use case is making calls to a SOAP webservice.

dcampbell24 avatar Apr 24 '21 15:04 dcampbell24

@ElnuDev on #388:

This could potentially be behind a feature flag.

Feature flags are global, which I think we don't want for this feature, since you can embed SVG foreign content (which uses XML-ish syntax) within an HTML document.

lambda-fairy avatar Aug 27 '23 04:08 lambda-fairy

sooooo, what's the status on this?

FallBackITA27 avatar Oct 14 '23 19:10 FallBackITA27

I'm actually wondering more about use cases.

There are many. Rust lacks replacements for a lot of JS-oriented tooling (yes, I'm serious in saying this); things like D3.js haven't yet been replaced by Rust. I believe that if maud implemented XML support, it would be a lot easier to write replacements for such tooling.

If the use case is generating SVG/MathML, then what you want isn't XML support, it's SVG/MathML support.

I wonder what the issue would be with just implementing small features to at least call it "supported" (such as the option to output self-closing tags) and deferring more in-depth features (such as specific SVG checks) to a later date. All the bits are there to do everything XML requires, and given that MathML and SVG are just derivatives... What's the issue?

Yes, SVG in HTML is parsed differently, but realistically some people (e.g. me, tbh) would be using this just for standalone XML and SVG files. And in most cases you could just separate out the extra bits that differ from one to the other. For example, SVG elements in HTML don't require the doctype, so you could make it separate, like you did for the HTML doctype.

FallBackITA27 avatar Oct 14 '23 19:10 FallBackITA27

I think an xml! macro would be very nice. With this we could also generate XML or Atom feeds.

tobiastom avatar Dec 20 '23 14:12 tobiastom