java-html-sanitizer
Is the <plaintext> element required?
Hi,
I have the text "<img src=x onerror=window.open('http://evil.test.com/');/>"
and the following policy:
PolicyFactory policy = new HtmlPolicyBuilder()
    .allowElements("a")
    .allowUrlProtocols("https", "http")
    .allowAttributes("href").onElements("a")
    .toFactory();
When I sanitize it, I don't get any output. Although when I add <plaintext>
String text = "<plaintext><img src=x onerror=window.open('http://evil.test.com/');/>";
to the text, it does get sanitized.
Why do I need to put the plaintext element here? Are there any alternatives to it?
Appreciate your help.
Thanks
In your code snippet, you are not whitelisting the img element or the attributes src and onerror, so yes, it is expected that all markup gets removed from your input. That's exactly the goal of sanitizing.
"Although when I add [<plaintext>] to the text, it does get sanitized."
What do you mean when you say "get sanitized" here? What is the output? I would expect the input to be returned essentially unmodified.
Adding <plaintext> will mark everything after that start tag (including things like </body>) as non-markup text. The sanitizer will ignore everything after that start tag, and if you tried to place such a string into a document, you would probably break the document. I doubt <plaintext> is something you want.
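As for alternatives: if the goal is to show the string as literal text on the page rather than to strip it, one common option is to HTML-escape it before output instead of prefixing it with <plaintext>. Here is a minimal sketch, assuming that is what you are after. HtmlEscapeSketch and escapeHtml are hypothetical names used for illustration, not part of java-html-sanitizer, and the sketch only targets text placed in element content or quoted attribute values.

```java
final class HtmlEscapeSketch {

    // Hypothetical helper (not a java-html-sanitizer API): replaces the
    // characters that HTML treats as markup with character references,
    // so a browser renders the input as plain text.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input string from the question, without the <plaintext> prefix.
        String text = "<img src=x onerror=window.open('http://evil.test.com/');/>";
        System.out.println(escapeHtml(text));
    }
}
```

With this approach the img tag is never parsed as markup, so the onerror handler cannot run, and the rest of the document is not swallowed the way it would be after a <plaintext> start tag.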