java-html-sanitizer
Doctype declaration is always removed
I'm using HtmlPolicyBuilder to build my HTML sanitization policy, and I noticed that the doctype declaration is always removed by sanitization. How can I build a PolicyFactory that retains the doctype declaration?
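    import org.owasp.html.HtmlPolicyBuilder;
    import org.owasp.html.PolicyFactory;

    // Policy that allows only the <html> element.
    PolicyFactory factory = new HtmlPolicyBuilder()
        .allowElements("html")
        .toFactory();
    String html = "<!doctype html><html></html>";
    String sanitizedHtml = factory.sanitize(html); // => "<html></html>" (doctype is dropped)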
Why do you want to do that?
The sanitizer produces a fragment that is safe to embed in a larger page, so it is typically not used with whole documents or their envelopes: doctypes and <html>, <head>, or <body> elements.
This should be stated somewhere in the README.
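Given that the sanitizer is meant for fragments, one possible workaround is to sanitize only the untrusted fragment and then add the document envelope yourself from trusted, fixed markup. The following is a minimal sketch under that assumption, not something the library provides: the class `DocumentEnvelope` and its `wrap` method are hypothetical names, and the sanitizer is passed in as a plain `UnaryOperator<String>` so the sketch compiles without the library on the classpath.

```java
import java.util.function.UnaryOperator;

/** Hypothetical helper: wraps a sanitized fragment in a fixed, trusted document envelope. */
public final class DocumentEnvelope {
    private DocumentEnvelope() {}

    /**
     * Runs the given sanitizer over the untrusted fragment only, then
     * concatenates the result into a hard-coded envelope. The doctype and
     * the html/head/body tags never pass through the sanitizer.
     */
    public static String wrap(String untrustedFragment, UnaryOperator<String> sanitizer) {
        String safeFragment = sanitizer.apply(untrustedFragment);
        return "<!doctype html><html><head></head><body>"
            + safeFragment
            + "</body></html>";
    }
}
```

With the policy from the question, the call would look something like `DocumentEnvelope.wrap(userHtml, factory::sanitize)`, where the policy allows only the elements expected inside the body rather than `html` itself.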