
Remove the inert document from the HTML fragment parsing algorithm

Open foolip opened this issue 3 weeks ago • 2 comments

  • Remove the inert document from the HTML fragment parser
  • Add a target argument to the HTML/XML fragment parsing algorithms
  • [ ] At least two implementers are interested (and none opposed):
  • [ ] Tests are written and can be reviewed and commented upon at:
  • [ ] Implementation bugs are filed:
    • Chromium: …
    • Gecko: …
    • WebKit: …
    • Deno (only for timers, structured clone, base64 utils, channel messaging, module resolution, web workers, and web storage): …
    • Node.js (only for timers, structured clone, base64 utils, channel messaging, and module resolution): …
  • [ ] Corresponding HTML AAM & ARIA in HTML issues & PRs:
  • [ ] MDN issue is filed: …
  • [ ] The top of this comment includes a clear commit message to use.

(See WHATWG Working Mode: Changes for more details.)


/dynamic-markup-insertion.html ( diff ) /parsing.html ( diff ) /xhtml.html ( diff )

foolip, Nov 28 '25 18:11

This is speculative editing following https://github.com/whatwg/html/issues/11669#issuecomment-3589383629 to see what it would mean to remove the inert document from the HTML fragment parsing algorithm.

There are two questions, one small and one big.

Small: Is it necessary to put something on the stack of open elements to avoid violating assumptions elsewhere? At least Chromium and WebKit put a DocumentFragment on the stack of open elements, but that's not an element. In a quick survey of uses of "stack of open elements" I couldn't find anything that would break if the stack were allowed to be empty. If something does break, perhaps the context element, or a shallow copy of it, could be placed on the stack of open elements.

Big: What were the side effects of using an inert document that implementations might have achieved in some other way, and that also need to be spec'd?

The main reason for exploring this is to pave the way for streamHTMLUnsafe() to simply insert directly into the target node. It's not strictly necessary, though: the inert document could be kept around in the definition of existing APIs if changing them is too risky.
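To make the difference concrete, here is a minimal toy sketch in TypeScript. It is not spec text and does not use real DOM interfaces: `ToyDocument`, `ToyNode`, `parseViaInertDocument`, and `parseIntoTarget` are hypothetical names, and the "parser" just receives a list of tag names. It only illustrates that the inert-document approach implies a separate adoption step per node, while direct insertion creates nodes in the target's document from the start.

```typescript
// Toy model only: hypothetical stand-ins, not DOM interfaces or spec algorithms.
class ToyDocument {
  constructor(readonly name: string) {}
}

class ToyNode {
  children: ToyNode[] = [];
  constructor(readonly tag: string, public ownerDocument: ToyDocument) {}
  append(child: ToyNode): void {
    this.children.push(child);
  }
}

// Strategy A: create nodes in a throwaway document, then adopt and move them.
function parseViaInertDocument(tags: string[], target: ToyNode, log: string[]): void {
  const inert = new ToyDocument("inert");
  const parsed = tags.map((tag) => {
    log.push(`create <${tag}> in ${inert.name}`);
    return new ToyNode(tag, inert);
  });
  for (const node of parsed) {
    node.ownerDocument = target.ownerDocument; // the extra adoption step
    log.push(`adopt <${node.tag}> into ${target.ownerDocument.name}`);
    target.append(node);
  }
}

// Strategy B: create each node in the target's document and insert it right away.
function parseIntoTarget(tags: string[], target: ToyNode, log: string[]): void {
  for (const tag of tags) {
    log.push(`create <${tag}> in ${target.ownerDocument.name}`);
    target.append(new ToyNode(tag, target.ownerDocument));
  }
}

const main = new ToyDocument("main");
const targetA = new ToyNode("div", main);
const targetB = new ToyNode("div", main);
const logA: string[] = [];
const logB: string[] = [];
parseViaInertDocument(["p", "span"], targetA, logA);
parseIntoTarget(["p", "span"], targetB, logB);
```

In this toy, both targets end up with the same children; only the log differs. Any behavior that real implementations hang off the "created in an inert document" or "adopted" steps is exactly what the big question above is asking about.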

cc @zcorpan

foolip, Nov 28 '25 19:11

> I think I agree this is what we need to do, but I don't want to lose sight of the requirements for streamHTML() (and setHTML()) while we do this. For those cases we do still want to create in a separate document (and then maybe mutate) before moving things over.

Do you mean for the sanitizer, or are there other reasons to use an intermediate document? My thinking was that we'd integrate the sanitizer into the parser so that it's streaming, in order to support streamHTML(), and then setHTML() could probably just use the same setup.
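As a rough illustration of what "sanitizer integrated into the parser" could mean, here is a toy TypeScript sketch. Everything in it is an assumption for illustration: a token is just a tag name, the allow-list is a plain set, and `streamSanitized` is a hypothetical helper. Real sanitizer semantics (attributes, subtrees, configuration) are far richer. The point is only that filtering happens per chunk, and surviving tokens are inserted before later chunks arrive, with no intermediate document.

```typescript
// Toy model only: a "token" is a tag name; this is not the Sanitizer API.
type Token = string;

// Filters each chunk as it arrives and inserts survivors immediately.
// Returns the tokens that were dropped, in arrival order.
function streamSanitized(
  chunks: Token[][],
  allowed: Set<Token>,
  insert: (token: Token) => void,
  onChunkDone: () => void,
): Token[] {
  const dropped: Token[] = [];
  for (const chunk of chunks) {
    for (const token of chunk) {
      if (allowed.has(token)) {
        insert(token); // goes straight to the target, nothing is buffered
      } else {
        dropped.push(token);
      }
    }
    onChunkDone();
  }
  return dropped;
}

const inserted: Token[] = [];
const insertedAfterEachChunk: number[] = [];
const droppedTokens = streamSanitized(
  [["p", "script"], ["b", "iframe", "i"]],
  new Set(["p", "b", "i"]),
  (token) => { inserted.push(token); },
  () => { insertedAfterEachChunk.push(inserted.length); },
);
```

Here `insertedAfterEachChunk` records how many tokens had reached the target after each chunk, which is the property a buffered, separate-document approach would not have.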

foolip, Dec 01 '25 08:12