HTML: Cap nesting depth
Almost all HTML parsers are vulnerable to stack overflows and denial-of-service attacks, because HTML parsing can easily be made to exhibit quadratic behavior by deep nesting of documents, and because naive tree-traversal algorithms are recursive.
I propose that the nesting depth of an HTML document be capped at 500. Any attempt to insert a node at a depth past that is erroneous — DOM APIs throw exceptions, while an implicit end tag is inserted if this is encountered during parsing.
I don't believe that preventing pages from crashing themselves is in our threat model, but maybe @whatwg/security can confirm.
The problem is that while every modern browser uses a multiprocess architecture, there are entities that parse HTML other than browsers, and these are subject to the same attacks. They should not need to parse HTML to different parse trees for security.
On Sun, Jun 3, 2018, 9:33 PM Domenic Denicola [email protected] wrote:
I don't believe that preventing pages from crashing themselves is in our threat model, but maybe @whatwg/security https://github.com/orgs/whatwg/teams/security can confirm.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/3732#issuecomment-394211720, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWB5aBSAvhujFJK2SuXNjUmwVE56HXks5t5I51gaJpZM4UYO5A .
Those parsers aren't covered by this specification; if that is your concern, you should file issues on those parsers instead of on the specification for web browsers.
Correct, they are not. But they are generally trying to follow it.
I know of no way that a server side application can safely parse HTML according to the specs and not be vulnerable to denial of service attacks.
Furthermore, Safari already has a nesting limit in its parser. Firefox and Chrome don’t. I think that there should be a single nesting limit that applies everywhere.
On Sun, Jun 3, 2018, 9:36 PM Domenic Denicola [email protected] wrote:
Those parsers aren't covered by this specification; if that is your concern, you should file issues on those parsers instead of on the specification for web browsers.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/3732#issuecomment-394212088, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWB_AbnHiQ4UueEEYsIOzGsodR006Zks5t5I86gaJpZM4UYO5A .
Thanks for the clarification. I'll close this, and you can file bugs on those other parsers if you want them to not follow the spec (which makes sense for non web browsers with different threat models), and on Safari if you want it to follow the spec.
Reopening as I forgot I wanted to wait a while to confirm the threat model with @whatwg/security. Please forgive the sloppy 4am bug-triaging :)
@domenic Firefox and Blink (as of Sept 2017 at least) also violate the current HTML parsing spec at large nesting depths, fwiw. They just violate it in a different way than Safari (and different versions of Firefox do it slightly differently).
Please see discussion and testcases in https://bugzilla.mozilla.org/show_bug.cgi?id=256180 and https://bugzilla.mozilla.org/show_bug.cgi?id=1188731 and https://groups.google.com/d/msg/mozilla.dev.platform/SUknMzK1ZAc/sMNBAjKCAgAJ (this last complete with @hsivonen saying "I'm rather unhappy about the prospect of having to examine another browser's HTML parser beyond reading the spec in order to achieve interop".... of course then he had to do just that with Blink, to achieve interop).
And to be clear, a hard limit of 500 with failures when it's reached is probably not web-compatible enough, but it's hard to say without more data.
I was thinking to fail if is added by JS, but otherwise to close the tag before adding a new one.
On Sun, Jun 3, 2018, 11:38 PM Boris Zbarsky [email protected] wrote:
And to be clear, a hard limit of 500 with failures when it's reached is probably not web-compatible enough, but it's hard to say without more data.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/3732#issuecomment-394226008, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWB6BSWzCoiEpxH_9y-Jz2Q6zhBhEWks5t5KuXgaJpZM4UYO5A .
Does any browser have the "fail if added by JS" behavior?
For the other, please check with @hsivonen what his investigations mentioned above showed.
The HTML parser already has some limits, I don't see a problem in principle to specify this one as well.
The Gecko-side status is:
- I want Gecko's parser limit to match Blink's parser limit exactly. Blink's magic number is 512, but if you put it in the spec, be sure to check that you count the same way as Blink and use the same comparison operator. (My Gecko patch has 511 as the magic number in order to get the same black-box behavior with different counting.) Please don't invent a new arbitrary number (i.e. please don't make it 500 exactly for no good reason).
- It's very important that the parser not throw away nodes but instead reparent them. See the
nodeFromStackWithBlinkCompat()method in my patch - The Gecko parser patch needs a corresponding layout patch to allow deeper frame construction.
- The patches haven't landed, because I failed to figure out how to increase the runtime stack on Android so as to not crash on Robohornet with the higher layout limits. Robohornet says "RoboHornet is a benchmark designed around performance pain points real web developers care about." Yet, it creates an unrealistically deep DOM tree. Still, the abandoned benchmark is out there and it would look bad if Firefox for Android crashed on it. So in the mean time, some webmail users don't see parts of emails from MUAs that generate bad HTML. But hey, at least we don't crash on an unrealistic benchmark.
I'm not convinced that we need to make DOM APIs throw at a certain depth limit. Has interop on that point been a problem in practice. (It is on the parser side.)
It is if we want to ensure that browsers do not stack overflow. Stack overflow can be exploited in practice — for instance, they can be used for a denial-of-service attack on in-process webviews (as found on iOS), and I believe they can be exploited for arbitrary code execution in certain cases.
On Mon, Jun 4, 2018, 7:55 AM Henri Sivonen [email protected] wrote:
The Gecko-side status is:
- I want Gecko's parser limit to match Blink's parser limit exactly. Blink's magic number is 512, but if you put it in the spec, be sure to check that you count the same way as Blink and use the same comparison operator. (My Gecko patch has 511 as the magic number in order to get the same black-box behavior with different counting.) Please don't invent a new arbitrary number (i.e. please don't make it 500 exactly for no good reason).
- It's very important that the parser not throw away nodes but instead reparent them. See the nodeFromStackWithBlinkCompat() method in my patch https://reviewboard.mozilla.org/r/178790/diff/21#index_header
- The Gecko parser patch needs a corresponding layout patch to allow deeper frame construction.
- The patches haven't landed, because I failed to figure out how to increase the runtime stack on Android so as to not crash on Robohornet with the higher layout limits. Robohornet says "RoboHornet is a benchmark designed around performance pain points real web developers care about." Yet, it creates an unrealistically deep DOM tree. Still, the abandoned benchmark is out there and it would look bad if Firefox for Android crashed on it. So in the mean time, some webmail users don't see parts of emails from MUAs that generate bad HTML. But hey, at least we don't crash on an unrealistic benchmark.
I'm not convinced that we need to make DOM APIs throw at a certain depth limit. Has interop on that point been a problem in practice. (It is on the parser side.)
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/3732#issuecomment-394329075, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWBwBz3G3Of19G09NPwShW7g_9UxSrks5t5SA5gaJpZM4UYO5A .
Stack overflow in the sense of "try to allocate stack and fail" is not the same as stack overflow in the sense of "try to write to the stack and write outside its bounds". The former is not exploitable to achieve code execution, to my knowledge, though I would love to see a reference if it is.
It can be in kernel contexts. It also can be in userspace, if a single very large stack allocation occurs and skips the guard page.
That said, I still think that the risk of denial of service attacks alone warrants a limit. A web page should not be able to crash the host application, even if it runs in-process. Consider a browser that gets its security from being written almost entirely in a memory safe language, but which cannot support sandboxing because its host environment does not.
On Thu, Jun 7, 2018, 11:16 PM Boris Zbarsky [email protected] wrote:
Stack overflow in the sense of "try to allocate stack and fail" is not the same as stack overflow in the sense of "try to write to the stack and write outside its bounds". The former is not exploitable to achieve code execution, to my knowledge, though I would love to see a reference if it is.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/whatwg/html/issues/3732#issuecomment-395633674, or mute the thread https://github.com/notifications/unsubscribe-auth/AGGWB7mHJImZfsRr0tKkB4d2Vk-P1qAwks5t6eyigaJpZM4UYO5A .
That said, I still think that the risk of denial of service attacks alone warrants a limit.
I disagree. There are so many ways to crash a browser engine in a safe way by resource exhaustion that it only makes sense to put limits for cases that get easily triggered by accident. Countering deliberate resource exhaustion attacks by means other than letting a content process crash is futile.
The Gecko-side status is
The fix to align the HTML parser in Gecko with the HTML parser in Blink has landed. The Java diff is the easiest one to map back to spec text. There are reftests.
Now that Gecko and Blink agree, I think the behavior seen in the above Java diff should be put in the spec.
Specifically, in places where the code calls nodeFromStackWithBlinkCompat(), instead of taking the most-recently-pushed element from the tree builder stack, element at index 511 is taken if there are more elements on the stack. Assumes zero-based indexing where the "html" element is at index 0.
Although I still don't really understand the motivation for why browsers are doing this, we should indeed align the spec with implementations. @hsivonen, would you be willing to submit a pull request?
Although I still don't really understand the motivation for why browsers are doing this,
Basically, to not crash any time you load e-mail stuff via the web. Unfortunately, (1) mail archive software tends to do a poor job of escaping the <[email protected]> bits in mail headers and just dumps it on the web as-is and (2) HTML mail creation tends to create hugely nested tag hierarchies because their many HTML editors are terrible.
And while preventing pages from crashing themselves deliberately is not in browsers' threat model, browsers do have incentive to not crash the gmail process when users click on something in their gmail inbox.
And while preventing pages from crashing themselves deliberately is not in browsers' threat model, browsers do have incentive to not crash the gmail process when users click on something in their gmail inbox.
Furthermore, browsers are not the only programs that need to parse HTML. Server-side HTML sanitizers do as well, and there denial of service is a legitimate attack vector.
The Gecko-side status is:
- I want Gecko's parser limit to match Blink's parser limit exactly. Blink's magic number is 512, but if you put it in the spec, be sure to check that you count the same way as Blink and use the same comparison operator. (My Gecko patch has 511 as the magic number in order to get the same black-box behavior with different counting.) Please don't invent a new arbitrary number (i.e. please don't make it 500 exactly for no good reason).
FWIW, the limit comes from a 9 year old code change in WebKit before Blink fork so WebKit shares the same limit. We might as well as integrate the limit into the spec given all three major browsers agree on the parser limit at this point.
(Our current limit seems to be 512, notwithstanding that commit starting at 4096).
would you be willing to submit a pull request?
Unfortunately, I expect not to have time for this in the near future. However, if someone want to volunteer, I still recommend looking at the Java patch.
Copy-pasting here what I first submitted as a new issue:
What is the issue with the HTML Standard?
Browsers do not support excessively nested DOM trees.
Consider this demo, which consists of a page with a <body> containing 600 <div> elements nested inside each other:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Deep nesting</title>
<style>
div {
border: 1px solid black;
margin: 1px;
min-height: 30px;
min-width: 30px;
}
html {
counter-reset: count;
}
* {
counter-increment: count;
}
div:has(+ div):not(div + div)::before {
content: counter(count);
}
</style>
</head>
<body>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
<div>
</body>
</html>
Each div has a border of 1px and a margin of 1px, so the expected rendering would be something like the following:
However, Chrome/Firefox/Safari render something like this:
You can see that at some point it stops nesting elements, and instead adds them at sibling. In all browser, there is some MAX_DEPTH, and if the stack of open elements has already a length of MAX_DEPTH then before pushing a new element to the stack they'll pop the previous one. Adding some CSS counters we can see that in Chrome&Firefox the max depth is 513, while in Safari it is 512.
This behavior does not seem to be defined (or at least, not defined in https://html.spec.whatwg.org/multipage/parsing.html#insert-a-foreign-element which is where I'd expect to see it).
There are three things that can be defined here:
- that there is a maximum depth
- what the maximum depth should be
- what to do when reaching it
I care especially about (3), since I hit a bug in a "spec-compliant HTML parsing library", whose code is basically a copy of the HTML parsing algorithms, where it was giving different results from browsers (https://github.com/jhy/jsoup/issues/2416).
I'm slowly investigating this, and I'm collecting some test cases in https://github.com/nicolo-ribaudo/deeply-nested-html, where the folders naming scheme is (last element before the limit)-(first element after the limit).
Other than the limit inconsistency (513 vs 512) I noticed another one when it comes to autogenerated elements.
When the last element before the limit is a <table>, and it contains a <tr>/<td>, Firefox will generate <table></table><tbody></tbody><tr></tr>, while Chrome/Safari generate <table></table><tbody><tr></tr></tbody> thus getting past the limit.
As a consequence of that, two sibling <td> elements might not end up as siblings in Chrome/Safari (see https://github.com/nicolo-ribaudo/deeply-nested-html/tree/main/tbody-2td). If we specify something, it should probably be Firefox's behavior.
Another difference, where Firefox/Chrome behave differently from Safari, is what to do when figuring out what is the opening element corresponding to a closing tag: https://github.com/nicolo-ribaudo/deeply-nested-html/tree/main/span-span-a-closed-span
Given an input of <span><span><a></a></span>, where <a> is put as a sibling of the innermost <span> due to the depth limit:
- Chrome/Firefox will consider the
</span>as closing the innermost span (effectively ignoring it?) - Safari will consider it as closing the outermost span, because the innermost one is already closed by pushing
<a>as a sibling of it
I think the Chrome/Firefox behavior is more intuitive, but it means that the stack size is actually unlimited.
Any limit needs to ensure that it does not create mutation XSS vulnerabilities. I’m not sure how to ensure that, though.
I'm working towards a concrete proposal for this.
@annevk Do you think Safari would be open to aligning with Chrome/Firefox's parsing approach (limited tree depth, but unlimited stack size for matching opening&closing tags)?
Maybe. Can you trigger a stack overflow in Chrome and Firefox?
Both Chrome and Firefox can handle a stack size of 2+218 (I'm testing with a document that just contains 2n nested divs). WebKit at lower numbers (I'm getting a "page unresponsive" popup, but I'm testing in Gnome Web and not in Safari right now).
Looking at Chrome's code the maximum stack size is represented by a u32, but it's impossible to actually reach it (https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/html/parser/html_element_stack.h;l=173;drc=97e7fa75720cf98a23b74404a0b14b9e7ac249d1). Even just assuming 3 bytes per tag, you'd need a 12GB document. It takes about 2 minutes to parse/render one with 218, so for one with 232 it would take days.
EDIT: 220 crashes the page in Chrome ("Aw, Snap"), and gets stuck forever in firefox.