[webdav] Escape HTML entities by number

Open CodingKoopa opened this issue 3 years ago • 0 comments

This pull request addresses issue #2599 by escaping HTML entities by number (for example `&#xA0;`) rather than by the named shorthand identifiers (for example `&nbsp;`), which strict XML parsers don't understand.
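To illustrate what "by number" means here, the following is a minimal sketch of building a hexadecimal numeric character reference from a code point. The function name `numeric_entity` is hypothetical and is not taken from this PR, which uses a precomputed table instead.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the PR): formats a Unicode code point as a
// hexadecimal numeric character reference, e.g. 0xA0 -> "&#xA0;".
// Numeric references need no DTD, so strict XML parsers accept them.
std::string numeric_entity(std::uint32_t codepoint) {
    char buf[16];  // "&#x" + up to 8 hex digits + ";" + NUL fits in 16 bytes
    std::snprintf(buf, sizeof buf, "&#x%X;", static_cast<unsigned>(codepoint));
    return buf;
}
```

Unlike the generated table entries, this sketch does not zero-pad the number; both `&#xA0;` and `&#x000A0;` refer to the same character.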

To obtain the list, I:

  • Downloaded Blink's HTML entity list
  • Removed by hand the entries at the bottom that had two codepoints
  • Removed duplicate entries by replacing
^"(.+?)","U\+([0-9a-fA-F]{5})"\n"(.+?)","U\+(\2)"$

with

"$1","U+$2"
  • Converted the list to the source code format by replacing
^"(.+)?","U\+([0-9a-fA-F]{5})"$

with

        escapesec[$2] = "&#x$2;"; // $1
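The two replacements above were done in an editor, but they can also be sketched as a small C++ program for anyone who wants to regenerate the list. This is only an illustration under assumptions: the function names are hypothetical, the input is assumed to be one `"name","U+XXXXX"` entry per line, and duplicates are assumed to be adjacent, as the first regex implies. Unlike a single pass of that regex, this sketch collapses runs of any length, not just pairs.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: keeps the first entry of each run of adjacent lines
// that share a codepoint, mirroring the first replacement.
std::vector<std::string> drop_adjacent_duplicates(const std::vector<std::string>& lines) {
    static const std::regex entry(R"re(^"(.+?)","U\+([0-9a-fA-F]{5})"$)re");
    std::vector<std::string> kept;
    std::string previous;  // codepoint of the previous line, empty if it was not an entry
    for (const auto& line : lines) {
        std::smatch m;
        const bool is_entry = std::regex_match(line, m, entry);
        const std::string codepoint = is_entry ? m[2].str() : std::string();
        if (is_entry && codepoint == previous) continue;  // same codepoint as the line above
        kept.push_back(line);
        previous = codepoint;
    }
    return kept;
}

// Hypothetical helper: applies the second replacement to a single line.
// Lines that do not match the pattern are returned unchanged.
std::string to_source_line(const std::string& line) {
    static const std::regex entry(R"re(^"(.+)?","U\+([0-9a-fA-F]{5})"$)re");
    return std::regex_replace(line, entry, R"(        escapesec[$2] = "&#x$2;"; // $1)");
}
```

For an illustrative input line `"nbsp","U+000A0"`, `to_source_line` produces `escapesec[000A0] = "&#x000A0;"; // nbsp` (with the leading indentation), which matches the replacement text given above.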

#2596 also fixes the aforementioned issue, which I admittedly didn't notice until submitting this PR. I believe they are sufficiently different approaches to the problem though.

This approach pollutes utils.cpp considerably, so it may be better to split the table off into a separate file.

Thanks!

CodingKoopa, Dec 01 '21 07:12