isomorphic-dompurify

Sanitize returns empty string when `PARSER_MEDIA_TYPE: application/xhtml+xml` and void tags

Open lucamerighi opened this issue 2 years ago • 6 comments

Bug

DOMPurify.sanitize returns an empty string when run on HTML documents containing void elements while PARSER_MEDIA_TYPE is set to application/xhtml+xml.

Version: 2.6.0

Input

<html lang="en">
<head>
    <title>Sample HTML5 Page</title>
</head>
<body>
    <p>Hello</p>
    <br>
</body>
</html>

Given output

Empty string

Expected output

<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
    <title>Sample HTML5 Page</title>
</head>
<body>
    <p>Hello</p>
    <br />
</body>
</html>

The outcome is the same for other void elements, such as meta or link tags.

lucamerighi avatar Apr 09 '24 12:04 lucamerighi

Thanks for reporting it @lucamerighi !

Can you provide the full code sample with DOMPurify.sanitize() parameters?

Did you try 2.7.0 ?

Because isomorphic-dompurify is just a wrapper around dompurify, it also makes sense to report the issue to the dompurify issue queue.

kkomelin avatar Apr 09 '24 14:04 kkomelin

Sure, here's the complete snippet

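import DOMPurify from "isomorphic-dompurify";

// Reported behavior: the call below returns an empty string.
DOMPurify.sanitize(
  `
<html lang="en">
<head>
    <title>Sample HTML5 Page</title>
</head>
<body>
    <p>Hello</p>
    <br>
</body>
</html>`,
  {
    PARSER_MEDIA_TYPE: "application/xhtml+xml",
  }
);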

Tried with 2.7.0 and got the same empty output. Removing the <br> gives the expected output.

lucamerighi avatar Apr 10 '24 07:04 lucamerighi

@lucamerighi Thanks. For XHTML, can you use self-closing tags like this <br/>?

kkomelin avatar Apr 10 '24 10:04 kkomelin

Yes, in XHTML you can use self-closing tags. In fact, you have to convert all void elements to their self-closing variant for the document to be valid.
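
For reference, the conversion step could be sketched roughly as below. This is only an illustration: `selfCloseVoidElements` is a hypothetical helper, not part of DOMPurify or isomorphic-dompurify, and it relies on a regular expression rather than a real parser, so it will also rewrite matching text inside comments, scripts, or CDATA sections. The list of void element names is taken from the HTML specification.

```javascript
// Hypothetical helper, not part of DOMPurify or isomorphic-dompurify.
// Void element names as listed in the HTML specification.
const VOID_ELEMENTS = [
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
];

// Matches the start tag of a void element, with or without a trailing slash.
// Group 1 is the tag name; group 2 is the raw attribute text. Quoted
// attribute values are consumed whole so a ">" inside them is not mistaken
// for the end of the tag.
const VOID_TAG = new RegExp(
  `<(${VOID_ELEMENTS.join("|")})(?=[\\s/>])((?:"[^"]*"|'[^']*'|[^'">])*?)\\s*/?>`,
  "gi"
);

// Rewrites every void element start tag into its self-closing form and
// lowercases the tag name, since XHTML element names are case-sensitive.
function selfCloseVoidElements(html) {
  return html.replace(
    VOID_TAG,
    (_match, name, attrs) => `<${name.toLowerCase()}${attrs} />`
  );
}
```

The result could then be passed to DOMPurify.sanitize with PARSER_MEDIA_TYPE set to "application/xhtml+xml". Whether that alone yields the expected output for a given document has not been verified here.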

lucamerighi avatar Apr 10 '24 12:04 lucamerighi

@lucamerighi Well, it's definitely a question/issue for dompurify developers. I'm afraid I'm not very familiar with internal stuff, so I can't help here.

And I should not suggest switching to the default PARSER_MEDIA_TYPE because you're probably using it for a reason.

kkomelin avatar Apr 10 '24 12:04 kkomelin

Opened the issue there 🤞 Yes, that's exactly why I need it: parsing HTML into valid XML, which among other things means fixing this kind of stuff.

lucamerighi avatar Apr 10 '24 15:04 lucamerighi

According to the dompurify maintainer, the behavior is by design (https://github.com/cure53/DOMPurify/issues/938#issuecomment-2048995982), so I'm closing this issue too.
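
Since the empty result is intended behavior, callers who keep the XHTML media type may want to detect it instead of silently passing an empty string along. A minimal sketch, assuming a hypothetical wrapper named `sanitizeXhtmlOrThrow`; the `sanitize` function is passed in as an argument so the sketch does not depend on how the library is imported. An empty result can also mean the input was legitimately stripped in full, so the error is only a hint.

```javascript
// Hypothetical guard, not part of DOMPurify or isomorphic-dompurify.
// `sanitize` is any function with the shape of DOMPurify.sanitize(dirty, config).
function sanitizeXhtmlOrThrow(sanitize, dirty) {
  const clean = sanitize(dirty, { PARSER_MEDIA_TYPE: "application/xhtml+xml" });

  // Non-empty input that comes back empty is the symptom reported in this
  // issue; surface it rather than returning the empty string.
  if (dirty.trim() !== "" && clean === "") {
    throw new Error(
      "Sanitizer returned an empty string; the input may not be well-formed XHTML"
    );
  }
  return clean;
}
```

It could be called as `sanitizeXhtmlOrThrow((html, cfg) => DOMPurify.sanitize(html, cfg), input)`.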

kkomelin avatar May 11 '24 14:05 kkomelin