htmlentities

Decode entities without semicolon when permissible

Open yulia-che opened this issue 7 years ago • 0 comments

I quite often run into deprecated HTML entities that lack the trailing semicolon. I'm attaching a PR that follows the note on character references in the HTML 4 specification:

Note. In SGML, it is possible to eliminate the final ";" after a character reference in some cases (e.g., at a line break or immediately before a tag). In other circumstances it may not be eliminated (e.g., in the middle of a word). We strongly suggest using the ";" in all cases to avoid problems with user agents that require this character to be present.

That is, an entity without a trailing semicolon would be decoded only if it is immediately followed by a newline character or an opening angle bracket.
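
To make the proposed rule concrete, here is a minimal sketch in Python. It is not the code from the attached PR and not the library's implementation. The `decode_entities` function, the `ENTITY_RE` pattern, and the tiny `NAMED_ENTITIES` table are all hypothetical names made up for illustration. The sketch assumes that "newline" means `\n` only, and that an unterminated entity at the very end of the input is left alone, since neither case is spelled out above.

```python
import re

# Illustrative subset only (assumption); a real decoder would carry the full entity table.
NAMED_ENTITIES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "quot": '"',
    "nbsp": "\u00a0",
    "copy": "\u00a9",
}

# A reference is accepted when it ends with ";" or, if the ";" is missing,
# when the next character is a newline or "<" (checked with a lookahead,
# so that character is not consumed).
ENTITY_RE = re.compile(
    r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*)(?:;|(?=[\n<]))"
)


def decode_entities(text):
    """Decode named and numeric references under the rule described above."""

    def replace(match):
        body = match.group(1)
        if body.startswith("#"):
            digits = body[1:]
            try:
                if digits[0] in "xX":
                    return chr(int(digits[1:], 16))
                return chr(int(digits))
            except (ValueError, OverflowError):
                # Code point out of range: keep the original text.
                return match.group(0)
        # Unknown names are left untouched.
        return NAMED_ENTITIES.get(body, match.group(0))

    return ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    # Semicolon missing, but a tag follows: decoded.
    print(decode_entities("Tom &amp<br>Jerry"))
    # Semicolon missing in the middle of a line: left as-is.
    print(decode_entities("Tom &amp Jerry"))
```

Using a lookahead rather than consuming the terminator keeps the newline or `<` in the output, so the surrounding markup is unchanged by decoding.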

yulia-che, Jul 26 '17 04:07