syn-rsx

Some node names that include numbers are not parsed

Open gbj opened this issue 3 years ago • 3 comments

Per the HTML spec, an attribute name like data-id-25 is valid (albeit maybe unusual):

Attribute names must consist of one or more characters other than the space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027 APOSTROPHE ('), U+003E GREATER-THAN SIGN (>), U+002F SOLIDUS (/), and U+003D EQUALS SIGN (=) characters, the control characters, and any characters that are not defined by Unicode. link
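To make the quoted rule concrete, here is a minimal sketch of it as a predicate. The function name `is_valid_attribute_name` is hypothetical and not part of syn-rsx. It also assumes that the "not defined by Unicode" clause can be skipped, since the standard library has no check for unassigned code points.

```rust
/// Returns true if `name` satisfies the attribute-name rule quoted above.
/// Assumption: the "characters not defined by Unicode" clause is NOT
/// enforced, because std offers no way to test for unassigned code points.
fn is_valid_attribute_name(name: &str) -> bool {
    // "one or more characters"
    !name.is_empty()
        && name.chars().all(|c| {
            // HTML space characters: space, tab, LF, FF, CR.
            let is_space = matches!(c, ' ' | '\t' | '\n' | '\u{000C}' | '\r');
            // NULL, quotation mark, apostrophe, '>', '/', '='.
            let is_forbidden = matches!(c, '\0' | '"' | '\'' | '>' | '/' | '=');
            !(is_space || is_forbidden || c.is_control())
        })
}
```

Under this check, `data-id-25` and `class:mt-4` are accepted, while an empty name or `a=b` is rejected.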

This can't be parsed by syn-rsx at present because each segment of the node name needs to be an Ident, and 25 isn't a valid Rust identifier.
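A rough way to see which segments get in the way is sketched below. It deliberately avoids syn and uses a simplified ASCII-only identifier check (no Unicode identifiers, no keyword handling), so it only approximates what the parser accepts. The helper names and the choice of `-` and `:` as separators are assumptions for illustration.

```rust
/// Simplified stand-in for "is this a valid Rust identifier?".
/// Assumption: ASCII only, keywords are not considered.
fn is_ascii_ident(segment: &str) -> bool {
    let mut chars = segment.chars();
    match chars.next() {
        Some(first) if first == '_' || first.is_ascii_alphabetic() => {
            // A lone underscore is not an identifier.
            segment != "_" && chars.all(|c| c == '_' || c.is_ascii_alphanumeric())
        }
        // Empty segments and segments starting with a digit land here.
        _ => false,
    }
}

/// Returns the segments of `name` that could not be parsed as identifiers.
fn non_ident_segments(name: &str) -> Vec<&str> {
    name.split(|c: char| c == '-' || c == ':')
        .filter(|segment| !is_ascii_ident(segment))
        .collect()
}
```

For `data-id-25` this reports only the `25` segment, which matches the failure described above.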

You're probably more familiar with syn than I am but I wonder if changing the Punctuated variant of NodeName to something like

Punctuated(Punctuated<NodeNameFragment, Punct>),

where

pub enum NodeNameFragment {
  Ident(Ident),
  Number(u32)
}

would be a possibility. This only expands the possibilities to include numbers, and not other possibly-valid attribute names that aren't valid Rust identifiers, but would be a start.
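A minimal sketch of how such a fragment type might behave, written against plain strings so it runs without syn. `Ident(String)` stands in for syn's `Ident`, and `parse_fragments` is a hypothetical helper, not an existing syn-rsx function.

```rust
#[derive(Debug, PartialEq)]
enum NodeNameFragment {
    /// Stand-in for `syn::Ident`; a plain String keeps the sketch std-only.
    Ident(String),
    Number(u32),
}

/// Splits a dash-separated name into fragments.
/// Returns None for empty segments or numbers that overflow u32.
fn parse_fragments(name: &str) -> Option<Vec<NodeNameFragment>> {
    name.split('-')
        .map(|segment| {
            if segment.is_empty() {
                return None;
            }
            if segment.chars().all(|c| c.is_ascii_digit()) {
                // All-digit segments become numbers.
                segment.parse::<u32>().ok().map(NodeNameFragment::Number)
            } else {
                Some(NodeNameFragment::Ident(segment.to_string()))
            }
        })
        .collect()
}
```

One thing this sketch makes visible: storing the segment as a `u32` drops leading zeros, so a name like `mt-05` would not round-trip back to the same text unless the original digits are kept alongside the value.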

Here's an issue in my repo with an example of where someone would use it: a class: syntax that toggles a Tailwind-like CSS class name that includes a number after a dash.

gbj, Nov 12 '22 11:11

Turning the fragment into an enum sounds like a workaround that could work in general. However, a quick look at an implementation showed that adjusting the node_name_punctuated_ident_with_alternate method could get tricky.

An alternative would be to switch to an approach that also relies on https://github.com/rust-lang/rust/issues/54725, just like the quoted text feature. Basically: parse all token trees until we hit any of the disallowed characters. But given that's nightly-only, it's probably not something that would work as a solution for your consumers right now? Further expanding the node_name_punctuated_ident_with_alternate method, which is already kinda hack-ish, doesn't sound too appealing, though I'd be totally fine with doing so if we have to.
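The "consume until a disallowed character" idea can be sketched on a plain string. This is only an illustration of the scanning rule: a real proc-macro version would work on tokens and need their source text, which is the nightly-only part discussed here. `take_node_name` is a hypothetical name.

```rust
/// Splits `input` into (node name, rest) at the first character that the
/// HTML rule quoted earlier in the thread forbids in an attribute name.
/// Assumption: operates on source text, not on a proc-macro token stream.
fn take_node_name(input: &str) -> (&str, &str) {
    let end = input
        .find(|c: char| {
            c.is_whitespace()
                || c.is_control()
                || matches!(c, '"' | '\'' | '>' | '/' | '=')
        })
        // No disallowed character: the whole input is the name.
        .unwrap_or(input.len());
    input.split_at(end)
}
```

With this rule, `data-id-25="x"` splits into the name `data-id-25` and the remainder `="x"`, without any segment needing to be an identifier.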

stoically, Nov 24 '22 21:11

I'd be comfortable relying on nightly if people want to be able to use this particular kind of node name. (As long as the library also works without nightly and just works as it does currently!)

We already encourage nightly for some features it enables, with an opt-out feature for stable, so it would work for me at least.

gbj, Nov 24 '22 22:11

Didn't have a chance to continue working on this and probably won't for some time still. There's an unquoted-text branch, in case you're interested in giving it a look. I somehow didn't manage to get the correct line/column numbers for spans.

stoically, Dec 26 '22 21:12