Paragraph sanitization (e.g. img.alt) is too restrictive, disallows punctuation

Open palant opened this issue 1 year ago • 0 comments

This regexp is used to validate the alt text of images. It disallows common punctuation, which causes problems when alt text is copied from, for example, news articles or source code listings. As a result, the alt attribute is dropped, leaving the image inaccessible to visually impaired people. The author of the text is unlikely to even notice, since visually the result looks just fine.

Subset of common symbols (some used in non-English languages) currently forbidden by this regular expression: "„“”‘’«»#$§%‰&*+±–—:;=?‽¡¿@{}|~…°®™.

I’m not sure I understand the purpose of restricting to a specific character set here, as opposed to properly escaping special characters (which I believe bluemonday does automatically). Is the concern that the contents of the alt or title attribute might be taken as the HTML source of some pop-up? Wouldn’t it make more sense to blacklist only angle brackets then?

palant · Nov 23 '22 14:11