
Better HTML sanitization for Markdown content

Open riipah opened this issue 9 years ago • 0 comments

By spec, Markdown allows all HTML. Currently we HTML-encode all text before passing it to the Markdown parser. This prevents the most obvious XSS attacks, but not all of them. It would be better to sanitize the generated HTML against a whitelist of allowed tags. We can't simply HTML-encode everything the Markdown parser generates (or strip its HTML tags), because that would make using Markdown pointless to begin with.

The HtmlSanitizer library could possibly be used for this. There's also Microsoft's Web Protection Library, but I've heard it's not very good. The CsQuery HTML parsing library could also be used.
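To make the whitelist idea concrete, here is a minimal sketch in Python using only the standard library. It is not the API of HtmlSanitizer or of any other library mentioned above, and the names (`ALLOWED_TAGS`, `WhitelistSanitizer`, `sanitize`, `is_safe_url`) and the particular tag, attribute, and scheme lists are assumptions made for illustration. It would run on the HTML produced by the Markdown parser: whitelisted tags are kept, everything else is dropped, and text is re-encoded.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical whitelist; a real one would be tuned to what the site needs.
ALLOWED_TAGS = {"p", "em", "strong", "a", "ul", "ol", "li",
                "code", "pre", "blockquote", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
VOID_TAGS = {"br"}                 # tags that never get a closing tag
DROP_CONTENT = {"script", "style"}  # drop these tags together with their content


def is_safe_url(url):
    """Accept relative URLs and URLs whose scheme is whitelisted."""
    # Remove spaces and ASCII control characters, which browsers may ignore.
    cleaned = "".join(ch for ch in url if ch > " ")
    try:
        scheme = urlparse(cleaned).scheme
    except ValueError:
        return False
    return scheme == "" or scheme in ALLOWED_SCHEMES


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives decoded, so encode it again before output.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(sanitize('<p>Hi <strong>there</strong><script>alert(1)</script></p>'))
print(sanitize('<a href="javascript:alert(1)" onclick="x()">link</a>'))
```

Note that this sketch does not rebalance unclosed or mismatched tags and has not been hardened against every browser parsing quirk, so a maintained sanitizer library would still be preferable in production.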

riipah • Feb 28 '15 14:02