simple-html-tokenizer
simple-html-tokenizer copied to clipboard
Throw syntax errors for invalid EndTags
This PR is in response to this comment PR https://github.com/glimmerjs/glimmer-vm/pull/982#discussion_r340068648.
Currently Simple HTML Tokenizer allows leading whitespace as well as attributes to be defined in End Tags. The comment linked above suggests that we throw syntax errors in these cases as they are ultimately invalid End Tags
The HTML Spec seems to be a bit inconsistent when it comes to attributes in End Tags, as the 12.1.2.2 End tags section says;
... 4. After the tag name, there may be one or more ASCII whitespace. 5. Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).
However 12.2.5.7 End tag open state says to enter the 12.2.5.8 Tag name state when an the first ASCII Alpha character is encountered. The "tag name state" is not specific to End Tags, and allows for entering the "before attribute name state" and the "self-closing start tag state", both of which aren't really valid in End Tags.
This PR assumes that we want to prevent entering these invalid states and adds syntax errors in the following scenarios:
- Leading whitespace before the tagname in an EndTag. i.e
</ div>
- Attributes after EndTag tagname. i.e
</div foo="bar">
- Self closing EndTags. i.e
</div/>
up
Ok, after some fighting with git/eol characters on Windows I've got a updated working version.
This will now throw the error 'closing tag must only contain tagname'
whenever a closing tag contains trailing or leading whitespace, which also covers attributes in closing tags since they would need to be prepended with a whitespace character.
Bumping this. Hoping to get a review.
I'm happy to close this PR if it isn't relevant anymore. Thanks!
Eeck, sorry @camerondubas, will try to review tomorrow :weary:
Just noting here that this appears to address https://github.com/emberjs/ember.js/issues/19703 and https://github.com/glimmerjs/glimmer-vm/issues/1309.
Edit: don't know what the state is here since its been sitting around for a bit, but if there are things to address I'm happy to help out.
I believe this is still pending a review. @rwjblue any chance this could get looked at again?