html-escape icon indicating copy to clipboard operation
html-escape copied to clipboard

Script escaping is incorrect

Open lambda-fairy opened this issue 4 years ago • 8 comments

Thanks for maintaining this library @magiclen!

I found an issue with the script escaping code. It appears to escape </script> only, and leave <!-- and <script> unchanged.

This is incorrect: if the HTML parser encounters the string <!-- <script>, then it enters a double-escaped state, where it will take two </script> tags to close the element.

https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements

lambda-fairy avatar Nov 07 '21 04:11 lambda-fairy

BTW, I recommend submitting a RustSec advisory for this, so that users are aware of the issue :smile:

lambda-fairy avatar Nov 07 '21 04:11 lambda-fairy

Hi, I would like to know in which case that you need to use encode_script* functions with HTML comments?

magiclen avatar Dec 09 '22 00:12 magiclen

One (discouraged but common) use case is to embed JSON in a page using a <script> element.

If that JSON includes user data, then a malicious user could include <!-- <script> in their input, and then corrupt the resulting page. (That might lead to XSS as well, though I can't think of a proof of concept right now.)

lambda-fairy avatar Dec 17 '22 04:12 lambda-fairy

After some experiments, I found <!-- and --> need to appear in pairs even in a single <script> element. Or the text after <!-- will seem to be considered as a comment.

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
        alert("-->");
    </script>
    world!
</body>
</html>

The HTML above is correct and shows the alert dialog six times. However, if we remove -->, no alert dialog will be shown.

The following HTML code is also incorrect

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
    </script>
    world!
    <script>
        alert("-->");
    </script>
</body>
</html>

magiclen avatar Dec 19 '22 03:12 magiclen

<!-- can be replaced with <\!--.

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title></title>
</head>
<body>
    Hello
    <script>
        alert("<\!--");
        alert("123");
        alert("<script>");
        alert("<\/script>");
        alert("456");
    </script>
    world!
    <script>
        alert("-->");
    </script>
</body>
</html>

The HTML above is correct and shows the alert dialog six times.

magiclen avatar Dec 19 '22 03:12 magiclen

Internet Explorer has the same behavior.

magiclen avatar Dec 19 '22 03:12 magiclen

One (discouraged but common) use case is to embed JSON in a page using a .

To clarify, I think there are two use cases here:

  1. Embedding untrusted JSON.
  2. Inlining a trusted script (i.e. replacing <script src="foo.js"> with <script>[contents of foo.js]</script>.

For (1), I think something similar to your proposal would work. (Though I would ask a web security expert for review first.)

For (2), I'm not so sure. Because </script> might appear outside of a string. For example, this code is strange but still valid:

// Check if the number 1 is less than the regex "script>"
1 </script>/;

Escaping would be incorrect here.

So for arbitrary JavaScript (not just JSON), I think it would be safer to raise an error instead.

lambda-fairy avatar Jan 28 '23 10:01 lambda-fairy