markup.rs

Using Render trait for element and attribute names makes it easy to generate invalid HTML

Open utkarshkukreti opened this issue 1 year ago • 1 comment

Follow up to #28, which has been fixed now.

markup::define! {
    A { $"/" {} }
    B { div["=" = "c"] {} }
}

fn main() {
    println!("{}", A {});
    println!("{}", B {});
}

prints:

</><//>
<div =="c"></div>

This one is trickier to fix: element and attribute names cannot simply be escaped, because each is restricted to a specific set of valid characters.

Element names: ref

Tags contain a tag name, giving the element's name. HTML elements all have names that only use ASCII alphanumerics. In the HTML syntax, tag names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase, matches the element's tag name; tag names are case-insensitive.

But this does not cover custom elements; I'll have to check.

Attribute names: ref

Attributes have a name and a value. Attribute names must consist of one or more characters other than controls, U+0020 SPACE, U+0022 ("), U+0027 ('), U+003E (>), U+002F (/), U+003D (=), and noncharacters. In the HTML syntax, attribute names, even those for foreign elements, may be written with any mix of ASCII lower and ASCII upper alphas.
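As a rough sketch of what a check for the attribute-name rule quoted above might look like (the function names here are hypothetical and not part of markup.rs; "controls" is assumed to match Rust's `char::is_control`, and noncharacters are taken to be U+FDD0..=U+FDEF plus the last two code points of each plane):

```rust
/// Returns true if `name` is non-empty and every character is allowed
/// by the attribute-name rule quoted above. Hypothetical helper.
fn is_valid_attribute_name(name: &str) -> bool {
    !name.is_empty() && name.chars().all(is_valid_attribute_char)
}

/// Returns true if `c` may appear in an attribute name.
fn is_valid_attribute_char(c: char) -> bool {
    let cp = c as u32;
    // Noncharacters: U+FDD0..=U+FDEF, plus code points whose low 16 bits
    // are FFFE or FFFF (the last two code points of every plane).
    let is_noncharacter = (0xFDD0..=0xFDEF).contains(&cp) || (cp & 0xFFFE) == 0xFFFE;
    // Explicitly forbidden: space, double quote, single quote, '>', '/', '='.
    let is_forbidden = matches!(c, ' ' | '"' | '\'' | '>' | '/' | '=');
    !(c.is_control() || is_forbidden || is_noncharacter)
}
```

With this sketch, `is_valid_attribute_name("=")` is false, so the `B` example above would be rejected, while names such as `class` or `data-id` pass.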


There could be two ways to implement this check:

  1. Return an error on encountering an invalid element or attribute name.
  2. Strip invalid characters from element and attribute names. (But the remaining characters may still not be a valid name if the original name only consists of invalid characters.)
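To make the trade-off between the two options concrete, here is a minimal sketch. `InvalidName`, `check_name`, `strip_name`, and `is_element_char` are all hypothetical names, not markup.rs APIs, and the element rule used is only the "ASCII alphanumerics" wording quoted above, so it does not cover custom elements:

```rust
/// Hypothetical error type for option (1); carries the rejected name.
#[derive(Debug, PartialEq)]
struct InvalidName(String);

/// Option (1): return an error if the name is empty or contains
/// any character the given predicate rejects.
fn check_name(name: &str, is_valid_char: fn(char) -> bool) -> Result<&str, InvalidName> {
    if !name.is_empty() && name.chars().all(is_valid_char) {
        Ok(name)
    } else {
        Err(InvalidName(name.to_string()))
    }
}

/// Option (2): silently drop every character the predicate rejects.
/// The result may be empty, which is still not a valid name.
fn strip_name(name: &str, is_valid_char: fn(char) -> bool) -> String {
    name.chars().filter(|&c| is_valid_char(c)).collect()
}

/// Element-name rule as quoted above: ASCII alphanumerics only
/// (an assumption; custom elements are not handled).
fn is_element_char(c: char) -> bool {
    c.is_ascii_alphanumeric()
}
```

For the `A` example, `check_name("/", is_element_char)` returns an error, whereas `strip_name("/", is_element_char)` yields an empty string that would itself fail validation, which is the caveat noted in option (2).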

I'm leaning towards (1) after I figure out what characters exactly are allowed in element names.

utkarshkukreti, Mar 28 '23 06:03