sanitize-html icon indicating copy to clipboard operation
sanitize-html copied to clipboard

It is possible to add a disallowed closing tag with valid HTML as well as opening tag with invalid HTML markup.

Open ghost opened this issue 2 years ago • 3 comments

I found several variants of the library's incorrect behavior. In the examples below, it is possible to add any html tag (closing tag with valid HTML as well as opening tag with invalid HTML) if any tag is allowed.

Example 1:

const sanitizeHtml = require('sanitize-html');

const sanitizedString = sanitizeHtml('<b><div/', {
    allowedTags: ['b'],
});

Expected behavior

As a result of the execution I expect to see <b></b> or a empty line. However, I get <b></div>.

Example 2:

const sanitizeHtml = require('sanitize-html');
const HTMLParser = require('node-html-parser');

const sanitizedString = sanitizeHtml('<b><b<<div/', {
    allowedTags: ['b'],
});

console.log(sanitizedString) // <b></b<<div>

const unespectedDiv = HTMLParser.parse(sanitizedString).querySelector('div');

console.log(unespectedDiv);

Expected behavior

As a result of the execution I expect to see <b></b> or a empty line. However, I get <b></b<<div>. The resulting string contains a substring <div>, which is interpreted by some parsers as a valid html tag like node-html-parser (Browsers interpret it correctly).

ghost avatar May 03 '22 14:05 ghost

Since browsers interpret it correctly I don't regard it as a security issue, but I agree it is not ideal output. It would make more sense for the additional <'s to get escaped or something.

boutell avatar May 03 '22 17:05 boutell

Would you be interested in submitting a PR?

On Tue, May 3, 2022 at 10:09 AM Ilya @.***> wrote:

I found several variants of the library's incorrect behavior. In the examples below, it is possible to add any html tag if any tag is allowed. Example 1:

const sanitizeHtml = require('sanitize-html');

const sanitizedString = sanitizeHtml('<div/', { allowedTags: ['b'], });

Expected behavior

As a result of the execution I expect to see or a empty line. However, I get . This example is absolutely valid for any browser because valid html markup is generated.

Example 2:

const sanitizeHtml = require('sanitize-html'); const HTMLParser = require('node-html-parser');

const sanitizedString = sanitizeHtml('<b<<div/', { allowedTags: ['b'], });

console.log(sanitizedString) // </b<

const unespectedDiv = HTMLParser.parse(sanitizedString).querySelector('div');

console.log(unespectedDiv);

Expected behavior

As a result of the execution I expect to see or a empty line. However, I get </b<

. The resulting string contains a substring
, which is interpreted by some parsers as a valid html tag like node-html-parser (Browsers interpret it correctly).

— Reply to this email directly, view it on GitHub https://github.com/apostrophecms/sanitize-html/issues/549, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAH27OVGF6UC7Y4F2ZSQD3VIEXSXANCNFSM5U7AWCKQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

--

THOMAS BOUTELL | CHIEF TECHNOLOGY OFFICER APOSTROPHECMS | apostrophecms.com | he/him/his

boutell avatar May 03 '22 17:05 boutell

I completely agree with you and it is not a security risk, but in some scenarios it can lead to unexpected results. However, I'm not sure I'm the best candidate for fixing this behavior. I just wanted to share some research.

ghost avatar May 04 '22 10:05 ghost