HTML attributes not highlighted if unquoted
Tempest version
2.11.4
PHP version
8.4
Operating system
Linux
Description
Consider this HTML:
<img class="a" alt src=example.png>
This library only highlights the attribute "class".
<<span style="color: #0000ff;">img</span> <span style="color: #795E26;">class</span>="a" alt src=example.png>
HTML only requires attribute values to be quoted if they contain a space, equals sign, or a few other characters. See https://html.spec.whatwg.org/multipage/syntax.html#attributes-2
So, in this case the empty alt and the unquoted src should both be highlighted.
Steps to reproduce
$highlighter = new \Tempest\Highlight\Highlighter();
echo $highlighter->parse('<img class="a" alt src=example.png>', 'html');
Thanks! Are you up for submitting a PR fix? Otherwise I'll try to look into it as soon as possible :)
Changing this:
https://github.com/tempestphp/highlight/blob/5a239a92ad6bd3e506ca86a0de3e99ac9dbcb0dd/src/Languages/Xml/Patterns/XmlAttributePattern.php#L21
To make the quote optional:
return '(?<match>[\w\-]+)="?';
This works for unquoted values but not for empty ones (an attribute with no `=` at all, like alt above), and it may accept markup that is not valid XML.
However, as this isn't a validator, it's probably fine.
If you're happy with that, I'll send a PR and will think on how to tackle the empty value case.
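For reference, here is a quick standalone check of what the change does to the example from the description. It uses plain preg_match_all rather than the library itself, and the "before" pattern is my assumption of what the linked line looks like with the quote required, so treat it as a sketch:

```php
<?php

// Sample input from the issue description.
$html = '<img class="a" alt src=example.png>';

// Assumed current behaviour: the attribute name must be followed by `="`.
$quoted = '/(?<match>[\w\-]+)="/';

// Proposed change: the quote after `=` becomes optional.
$optionalQuote = '/(?<match>[\w\-]+)="?/';

preg_match_all($quoted, $html, $before);
preg_match_all($optionalQuote, $html, $after);

print_r($before['match']); // ['class']
print_r($after['match']);  // ['class', 'src']
```

The unquoted src is picked up, while alt is still missed because it has no `=` for the pattern to anchor on.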
Yes, that should work!
Did some testing, and unfortunately it's not as simple as we thought: simply making " optional means the attribute-name part of the match can match too many things (anything made of \w or - characters that is followed by =). We'd need to fine-tune the regex to make sure it only matches words that actually are attributes (so within <> tags).
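To illustrate the over-matching, and one possible direction for a fix, here is a standalone sketch using plain preg_match_all. The `$insideTag` pattern is hypothetical and not what the library ships: it adds a lookahead requiring that the next angle bracket after the match is `>`, which is a rough stand-in for "we are inside a tag". It would still break on attribute values that contain `<` or `>`, and it does not address empty attributes such as alt.

```php
<?php

// Text content that merely looks like an attribute, followed by a real tag.
$html = '<p>set width=100 here</p><img class="a" alt src=example.png>';

// Optional-quote pattern from this thread: matches anywhere in the input.
$loose = '/(?<match>[\w\-]+)="?/';

// Hypothetical stricter variant: the lookahead only succeeds when the next
// angle bracket after the match is `>`, i.e. the match sits inside a tag.
$insideTag = '/(?<match>[\w\-]+)="?(?=[^<>]*>)/';

preg_match_all($loose, $html, $looseMatches);
preg_match_all($insideTag, $html, $strictMatches);

print_r($looseMatches['match']);  // ['width', 'class', 'src']
print_r($strictMatches['match']); // ['class', 'src']
```

The loose pattern highlights width even though it is plain text inside the paragraph; the lookahead variant skips it because the next angle bracket after width=100 is the `<` of the closing tag.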