parse5
Outlook VML statements become commented
version: 7.1.2
This HTML is parsed as a single comment node:
<!--[if !vml]>
<span style="mso-ignore:vglayout;position:absolute;z-index:252065792;margin-left:371px;margin-top:154px;width:645px;height:1px">
<img width="430" height="1" style="width:4.4791in;height:.0069in" src="cid:[email protected]" v:shapes="Picture_x0020_2074">
</span>
<![endif]-->
This HTML is parsed as two independent comment nodes, and everything between them is parsed as regular, active HTML:
<![if !vml]>
<span style="mso-ignore:vglayout;position:absolute;z-index:252065792;margin-left:371px;margin-top:154px;width:645px;height:1px">
<img width="430" height="1" style="width:4.4791in;height:.0069in" src="cid:[email protected]" v:shapes="Picture_x0020_2074">
</span>
<![endif]>
When we serialize the document back, the serializer writes these VML markers out as regular comments, and the HTML renders differently in the Outlook Windows desktop client: <![if !vml]> becomes <!--[if !vml]--> and <![endif]> becomes <!--[endif]-->.
This is the code example:
const parse5 = require("parse5");
let doc = parse5.parse(`<!DOCTYPE html><html><body><![if !vml]>
<span style="mso-ignore:vglayout;position:absolute;z-index:252065792;margin-left:371px;margin-top:154px;width:645px;height:1px">
<img width="430" height="1" style="width:4.4791in;height:.0069in" src="cid:[email protected]" v:shapes="Picture_x0020_2074">
</span>
<![endif]></body></html>`);
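const res = parse5.serialize(doc);
// res now contains <!--[if !vml]--> and <!--[endif]--> where the input had <![if !vml]> and <![endif]>
console.log(res);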
The serialized HTML should look the same as the original. This issue was opened as a follow-up to https://github.com/cure53/DOMPurify/issues/819
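Until the round trip preserves these markers, one possible workaround is to post-process the serialized string. The sketch below is not part of parse5: `restoreDownlevelRevealed` is a hypothetical helper. It assumes that every comment whose entire text is `[if ...]` or `[endif]` started out as a downlevel-revealed marker, so a genuine `<!--[endif]-->` comment in the input would be rewritten too.

```javascript
// Hypothetical helper, not a parse5 API: turn comments of the form
// <!--[if ...]--> and <!--[endif]--> back into <![if ...]> and <![endif]>.
// Assumption: such comments only ever come from downlevel-revealed markers.
function restoreDownlevelRevealed(html) {
  // Match only comments whose whole body is "[if <condition>]" or "[endif]".
  // Downlevel-hidden blocks (<!--[if ...]> ... <![endif]-->) are left alone
  // because their opening "]" is followed by ">" rather than "-->".
  return html.replace(/<!--\[(if [^\]]*|endif)\]-->/g, "<![$1]>");
}

const serialized = "<body><!--[if !vml]--><span>x</span><!--[endif]--></body>";
console.log(restoreDownlevelRevealed(serialized));
// prints: <body><![if !vml]><span>x</span><![endif]></body>
```

This only patches the output string. The parsed tree still holds two comment nodes, so any sanitizer or transform that runs on the tree sees comments, not conditional markers.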
The behavior you're seeing with the two separate comment nodes is technically required per spec: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-opened-comment
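To check which nodes the parser produced for these markers, the tree can be walked for comment nodes whose text looks like a conditional marker. This is a rough sketch: `findConditionalMarkers` is a hypothetical helper, and it assumes the node shape of parse5's default tree adapter (`nodeName`, `data`, `childNodes`). The hand-built `tree` stands in for the result of `parse5.parse` so the snippet runs without the library.

```javascript
// Hypothetical helper: collect the text of comment nodes that look like
// conditional markers, i.e. exactly "[if <condition>]" or "[endif]".
function findConditionalMarkers(node, found = []) {
  if (node.nodeName === "#comment" && /^\[(if [^\]]*|endif)\]$/.test(node.data)) {
    found.push(node.data);
  }
  // Comment and text nodes have no childNodes, so fall back to an empty list.
  for (const child of node.childNodes || []) {
    findConditionalMarkers(child, found);
  }
  return found;
}

// Stand-in for a parsed <body>, using the assumed default tree adapter shape.
const tree = {
  nodeName: "body",
  childNodes: [
    { nodeName: "#comment", data: "[if !vml]" },
    { nodeName: "span", childNodes: [{ nodeName: "img", childNodes: [] }] },
    { nodeName: "#comment", data: "[endif]" },
  ],
};

// Logs the two marker strings found in the tree: "[if !vml]" and "[endif]".
console.log(findConditionalMarkers(tree));
```

A downlevel-hidden block such as `<!--[if !vml]> ... <![endif]-->` is a single comment whose text continues past the first `]`, so the pattern above does not report it.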