Coherent story for HTML-setting methods

Open noamr opened this issue 3 months ago • 34 comments

[Note: OP edited after some discussion]

What is the issue with the HTML Standard?

With the introduction of setHTML{Unsafe}, we have multiple ways of applying an HTML string into an existing document:

  • setHTML
  • setHTMLUnsafe
  • innerHTML and outerHTML setters
  • createContextualFragment
  • insertAdjacentHTML
  • Detached document streaming (document.write() into an inactive document, then inserting the root node into the active one)

With #2142 and #11542, we are about to introduce even more methods that insert HTML asynchronously using a stream. Some of these techniques support different levels of sanitization and slightly different behaviors.

So this is a good time to have a wider overview of these methods, and come up with a consistent structure for their API.

Specifically, the following variants apply when inserting HTML:

  • Does the HTML have to go through the sanitizer ("safe")
  • Does the HTML replace the whole contents of the element or just parts
  • Is the HTML passed as a string or as a stream (as per #2142)
  • Is the HTML parsed as a whole, or containing interleaved patches (as per #11542)
  • How does script execution work
  • How this works with TrustedTypes.

API shape

Treat unsafe/safe, stream/set, and replaceChildren/replaceWith/append/prepend/before/after as permutations that are best exposed as different methods (2 × 2 × 6 = 24 methods). We can tweak this, of course.

dictionary UnsafeHTMLSetterOptions  {
  (Sanitizer or SanitizerConfig)? sanitizer = null;
  boolean? runScripts = false;
};

dictionary SafeHTMLSetterOptions {
  // The user of this dictionary must ensure a sanitizer is provided.
  (Sanitizer or SanitizerConfig) sanitizer;
};

[Exposed=Window]
interface ParentNode {
  void setHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void setHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void beforeHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void beforeHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void afterHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void afterHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void appendHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void appendHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void prependHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void prependHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void replaceWithHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void replaceWithHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamHTML(SafeHTMLSetterOptions options);
  WritableStream streamHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamBeforeHTML(SafeHTMLSetterOptions options);
  WritableStream streamBeforeHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamAfterHTML(SafeHTMLSetterOptions options);
  WritableStream streamAfterHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamAppendHTML(SafeHTMLSetterOptions options);
  WritableStream streamAppendHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamPrependHTML(SafeHTMLSetterOptions options);
  WritableStream streamPrependHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamReplaceWithHTML(SafeHTMLSetterOptions options);
  WritableStream streamReplaceWithHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
};
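
To make the size of this surface concrete, the sketch below derives the 24 method names in the IDL above from the three axes (position, string vs. stream, safe vs. unsafe). The `methodName` helper and the `positions` array are hypothetical and exist only to illustrate the naming scheme; they are not part of the proposal.

```javascript
// Hypothetical helper: derives the method names listed in the IDL above
// from the three axes (position, string vs. stream, safe vs. unsafe).
// "set" stands for the replaceChildren-style position.
const positions = ["set", "before", "after", "append", "prepend", "replaceWith"];

const capitalize = (s) => s[0].toUpperCase() + s.slice(1);

function methodName(position, streaming, unsafe) {
  let base = position;
  if (streaming) {
    // The streaming form of "set" is just "stream"; others get a prefix.
    base = position === "set" ? "stream" : "stream" + capitalize(position);
  }
  return base + "HTML" + (unsafe ? "Unsafe" : "");
}

const allMethods = [];
for (const position of positions) {
  for (const streaming of [false, true]) {
    for (const unsafe of [false, true]) {
      allMethods.push(methodName(position, streaming, unsafe));
    }
  }
}

console.log(allMethods.length, allMethods.join(", "));
```

Every extra axis multiplies the list, which is the main argument for moving some axes into arguments or options.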

Alternative: 12 methods, with insertion mode (taken from #10122)


enum HTMLInsertionPoint { "before", "after", "start", "end" };

// Hypothetical interface where these methods would live,
// for example, on Element or ShadowRoot.
[Exposed=Window]
interface ParentNode {

  void setHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void setHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  void insertHTML((DOMString or TrustedHTML) html, HTMLInsertionPoint insertionPoint, SafeHTMLSetterOptions options);
  void insertHTMLUnsafe((DOMString or TrustedHTML) html, HTMLInsertionPoint insertionPoint, optional UnsafeHTMLSetterOptions options = {});
  void replaceWithHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);
  void replaceWithHTMLUnsafe((DOMString or TrustedHTML) html, optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamHTML(SafeHTMLSetterOptions options);
  WritableStream streamHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamInsertHTML(HTMLInsertionPoint insertionPoint, SafeHTMLSetterOptions options);
  WritableStream streamInsertHTMLUnsafe(HTMLInsertionPoint insertionPoint, optional UnsafeHTMLSetterOptions options = {});
  WritableStream streamReplaceWithHTML(SafeHTMLSetterOptions options);
  WritableStream streamReplaceWithHTMLUnsafe(optional UnsafeHTMLSetterOptions options = {});
};
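
One way to read the HTMLInsertionPoint values is as shorter names for the four positions that insertAdjacentHTML already accepts. The mapping below is an assumption made for illustration (the issue text does not spell it out), and `toAdjacentPosition` is a hypothetical helper.

```javascript
// Assumed correspondence between the proposed enum values and the
// position keywords accepted by the existing insertAdjacentHTML().
const ADJACENT_POSITION = Object.freeze({
  before: "beforebegin", // before the element itself
  start: "afterbegin",   // inside the element, before its first child
  end: "beforeend",      // inside the element, after its last child
  after: "afterend",     // after the element itself
});

function toAdjacentPosition(insertionPoint) {
  if (!Object.hasOwn(ADJACENT_POSITION, insertionPoint)) {
    throw new TypeError(`Unknown insertion point: ${insertionPoint}`);
  }
  return ADJACENT_POSITION[insertionPoint];
}
```

In a browser, `element.insertAdjacentHTML(toAdjacentPosition("end"), html)` would then roughly approximate an unsanitized `insertHTMLUnsafe(html, "end")`, without the sanitizer or runScripts options.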

Alternative: pass ReadableStream instead of returning a WritableStream:

[Exposed=Window]
interface ParentNode {
  Promise<void> streamHTML((ReadableStream or TrustedReadableStream) stream, SafeHTMLSetterOptions options);
  // ...
};

This is simpler from a trusted types perspective, as the stream has to be trusted as a whole rather than the chunks. OTOH the API feels a bit less "streamy" than returning a Writable.
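
The two shapes are close to interchangeable for callers. As a rough sketch, the ReadableStream-accepting variant can be layered on the WritableStream-returning one with `pipeTo()`. The `createFakeElement` stand-in and the behavior of its `streamHTMLUnsafe` (appending chunks to a string) are assumptions for illustration only; nothing is parsed or sanitized here.

```javascript
// Hypothetical stand-in for an element: "inserting" just appends to a string.
function createFakeElement() {
  const element = {
    html: "",
    // Shape A: the method returns a WritableStream that the caller writes into.
    streamHTMLUnsafe() {
      return new WritableStream({
        write(chunk) {
          element.html += chunk;
        },
      });
    },
  };
  return element;
}

// Shape B: the caller hands over a ReadableStream and gets a promise back.
// Here it is built on top of shape A.
function streamFrom(element, readable) {
  return readable.pipeTo(element.streamHTMLUnsafe());
}

// Small helper that turns an array of string chunks into a ReadableStream.
function chunksToStream(chunks) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}
```

With shape A, a `fetch()` response would typically be piped through a `TextDecoderStream` into the returned writable; with shape B the same pipeline is passed in as a single object, which is what makes it easier to treat as trusted as a whole.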

Other notes

For the interleaved case, as well as for trusted types, I suggest doing that via transform streams:

  • A TransformStream that resolves a raw HTML stream to patches as per #11542
  • A TransformStream that allows trusted types to add a control point in the middle of a stream, as per https://github.com/w3c/trusted-types/issues/594

The aforementioned issues include more details about these, but they are designed in a way that shouldn't add more complexity to the normal case.

noamr avatar Sep 17 '25 15:09 noamr

I like this!

There's another axis, which is whether script elements run when inserted. See https://github.com/whatwg/html/issues/10090 and demo https://software.hixie.ch/utilities/js/live-dom-viewer/saved/14067

Maybe the UnsafeHTMLSetterOptions can have boolean runScripts = false; -- or we could decide that this shouldn't be possible (with new methods at least).

zcorpan avatar Sep 18 '25 08:09 zcorpan

Good call! Added to the OP. I think we can have a design that handles all of these options and then reason about each of these things individually.

noamr avatar Sep 18 '25 08:09 noamr

insertAdjacentHTML is only broken in the sense of its specification, which on its own is not a good reason of course. https://github.com/whatwg/html/issues/10122 discusses some reasons for why we might want a replacement. Using positional arguments could be reasonable, but is quite a bit more verbose. We also don't really have any mutation operations that cleanly map to that so you'd end up with multiple mutation records as a result. That might be okay.

annevk avatar Sep 18 '25 08:09 annevk

> insertAdjacentHTML is only broken in the sense of its specification, which on its own is not a good reason of course. #10122 discusses some reasons for why we might want a replacement. Using positional arguments could be reasonable, but is quite a bit more verbose.

Yea it's "splice" rather than "insert", which is perhaps more verbose but is a common lower level primitive for mutations?

It's quite easy to implement the #10122 semantics on top of the proposal here, while keeping the consistency with passing the rest of the options (sanitizer, streaming, allow-scripts), or add them as sugar into the platform (e.g. position instead of before and after).

noamr avatar Sep 18 '25 09:09 noamr

... the other option for insertAdjacentHTML is to have insertHTML and insertHTMLUnsafe methods (so 6 instead of 4) that have the position argument instead of before + after and don't remove any element... but then if we want to add streaming to that we end up with 8 methods altogether, which starts to feel like a lot?

4 HTML-setting methods with splicing feels like a good balance to me with leaving it open to adding insert* variants in the future.

noamr avatar Sep 18 '25 09:09 noamr

Do we really need the explicit runScripts? You can use a lenient Sanitizer instance together with removeUnsafe.

evilpie avatar Sep 23 '25 09:09 evilpie

@evilpie runScripts is about whether or not scripts get executed post-insertion. They currently do not.

annevk avatar Sep 23 '25 11:09 annevk

Over in WordPress we’ve been stewing on this issue for some time now, ever since we built a pure-PHP HTML5 parser.

We are wanting to provide a convenient API for server-side code to understand and manipulate HTML, ideally to do things like stitch in HTML fragments from one document into another.

For us, this means we are fundamentally less concerned with script execution and more concerned with escaping outside of the context in which an edit is made. For instance, when inserting into the DOM it’s impossible to escape from the parent. It’s possible to construct invalid DOM trees which cannot be re-serialized into HTML. One example of this is creating a P element as a child inside of another P element.

But in the HTML world we have this conundrum where if we allow the insertion of such a <p> inside another open P element, we have allowed server-side code to create HTML which will parse differently once it reaches the browsers. This is why this issue has been delayed and held up for so long for us, because the fragment parsing algorithm isn’t relevant for this case, but it seems like the issue at hand with streaming HTML is identical or close enough to be relevant.

The fundamental question I would love to see specified and standardized is what happens when something appears which should breach a context. Maybe here it’s still left to the DOM and unrepresentable DOM trees are the answer, but I see three possible ways to handle this on the server:

  • Outright reject updates which would cause any open element to close, if that open element is a parent or ancestor of the node in which inner HTML was set or changed.
  • Remove elements which would trigger modifications to the open elements. This could lead to extreme behavior in edge cases and distorts updates.
  • Wrap the offending elements and reconstruct them via DOM operations once the page loads. E.g. if setting inner HTML to <h2> when already inside an H1 element, wrap it as <invalid-element data-tag-name=h2>.

This is pretty relevant for any system that brings together separate HTML snippets or templates, and I would imagine even those which heavily modify the page via JS. It’s nice that the DOM allows for unrepresentable constructions, but it would be ideal to be able to agree on the server and the client on what the rendered HTML will do.

Some API which provides for a kind of isolation of a parent element for inner markup changes would be extremely helpful. “No matter what HTML is set to the inside of this element, it will not bleed into the surrounding page.” This is helpful for proper understanding, it’s helpful for security issues, and it’s helpful for the resilience of sites which deal with user-supplied content.

dmsnell avatar Sep 23 '25 18:09 dmsnell

> Over in WordPress we’ve been stewing on this issue for some time now, ever since we built a pure-PHP HTML5 parser.

Thanks for sharing this in detail!

> But in the HTML world we have this conundrum where if we allow the insertion of such a <p> inside another open P element, we have allowed server-side code to create HTML which will parse differently once it reaches the browsers. This is why this issue has been delayed and held up for so long for us, because the fragment parsing algorithm isn’t relevant for this case, but it seems like the issue at hand with streaming HTML is identical or close enough to be relevant.

Perhaps this is what I don't understand. Why isn't the fragment parsing algorithm relevant here? It sounds like it's tackling exactly the use case you've described: parsing a piece of HTML in the context of where it is applied.

noamr avatar Sep 23 '25 19:09 noamr

> Perhaps this is what I don't understand. Why isn't the fragment parsing algorithm relevant here? Sounds like it's exactly tackling the use case you've described, of parsing a piece of HTML in the context of where it is applied to?

@noamr I think the following should illustrate the problem. Live DOM

<div>
   <h1>This is a page</h1>
   <p id=p1>And this is a paragraph.</p>
</div>
<script>
   document.getElementById( 'p1' ).innerHTML = 'And <p>another paragraph</p> is here now.';
</script>

In this setup we’re calling the fragment parsing algorithm, if I understand it properly, which runs the steps on the inner HTML and creates the DOM nodes accordingly. It creates an invalid DOM with nested paragraphs, which is fine. That works in the browser, but it also means that this edit operation is not possible via HTML. There is no way to represent it.

Making this same edit within the HTML domain would lead to the premature closing of the containing P element, were it allowed.


The problem extends beyond basic nesting rules, however. Because the fragment parsing algorithm creates a new Document and stack of open elements, there are parsing rules which depend on the stack of open elements and will behave differently in the fragment parsing context than in the normal context. Here is one where the containing H1 element is not present in the fragment context. Live DOM

<div>
   <h1>This is a <span id=p1>page</span></h1>
   <p>And this is a paragraph.</p>
</div>
<script>
   document.getElementById( 'p1' ).innerHTML = 'And <h2>another paragraph</h2> is here now.';
</script>

And in this case the DOM allows nesting an H2 inside an H1 which would be interpreted differently were the browser to see <h1>This is a <span id=p1>And <h2>another paragraph</h2> is here now.</h1>

In our experimentation we have found cases where the missing elements on the stack of open elements are relevant for tokens which impact the insertion mode, affect integration points and namespacing, and a number of other state values.

In WordPress/wordpress-develop#7777 we started looking at ways to extend the fragment parser to clone the stack of open elements and provide guards which would detect if the parsing algorithm attempted to close any open element. I have explored a weaker approach in a builder class which simply examines whether there are any remaining tokens in the new inner HTML after the outer-most stack is back to the depth it had when the parsing started.

In all of the work I’ve explored, the one simple rule that I think is relevant to isolate changes is to somehow freeze the stack of open elements, run the parse algorithm in place, and if there are changes or pops to the stack of open elements above the point of setting inner HTML, it’s a violation. Probably other things need to be examined as well, such as whether the parse changed the insertion mode, updated the form pointer, changed the document encoding, etc…
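
The "freeze the stack" rule can be sketched with a toy model. The function below is not the HTML parsing algorithm: it tokenizes tags with a regular expression and knows only one implied-end-tag rule (a `<p>` start tag closes an open `p`, checked anywhere on the stack rather than in button scope). `breachesContext` and its parameters are hypothetical names used only for illustration.

```javascript
// Elements that never stay on the stack in this toy model.
const VOID_ELEMENTS = new Set(["br", "img", "hr", "input", "meta", "link"]);

// Returns true if parsing `html` in place would pop any element that was
// already open when the edit started (the "frozen" part of the stack).
function breachesContext(openElements, html) {
  const stack = [...openElements];
  const frozenDepth = stack.length;
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>/g;

  for (const [, slash, rawName] of html.matchAll(tagPattern)) {
    const name = rawName.toLowerCase();
    if (slash) {
      const index = stack.lastIndexOf(name);
      if (index === -1) continue;            // stray end tag: ignored
      if (index < frozenDepth) return true;  // would close the context or an ancestor
      stack.length = index;                  // pop up to and including the match
    } else {
      if (name === "p") {
        // Simplified rule: a new <p> implicitly closes an open p.
        const openP = stack.lastIndexOf("p");
        if (openP !== -1) {
          if (openP < frozenDepth) return true;
          stack.length = openP;
        }
      }
      if (!VOID_ELEMENTS.has(name)) stack.push(name);
    }
  }
  return false;
}

// The nested-paragraph example from above: the inner <p> would close p#p1.
console.log(breachesContext(["div", "p"], "And <p>another paragraph</p> is here now."));
```

A real guard would have to cover the other state mentioned here as well (insertion mode, form pointer, encoding), which this sketch ignores.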


Thanks for asking. I hope this clarifies things instead of making them less clear. The entire problem remains that the DOM operations allow creating DOM trees which cannot be represented in HTML. Even the latest work to bring in DOM\HtmlDocument in PHP leaves us without reasonable tools to mix and match HTML from different sources because updates through its internal DOM don’t round-trip (we can’t send its internal DOM to a browser, so we have to serialize, which creates these situations where the browser reads a different DOM than we had on the server).

It seems like streaming into the DOM via HTML is going to present similar quandaries. Suppose we load in the HTML for a comment on a blog. Should that comment be allowed to interrupt the flow of the page? How can we ensure that it remains isolated and inert?

dmsnell avatar Sep 23 '25 22:09 dmsnell

I think for streaming we should guarantee you stay within a parent, although runScripts might run afoul of that. But generally enforcing idempotency would require much more invasive changes and I don't think is easily doable.

annevk avatar Sep 24 '25 06:09 annevk

If you fully want to match what insertAdjacentHTML does, why not use the same terms? Is there a universal understanding that it's a bad pattern that shouldn't be continued that I am not aware of?

For consistency, I would have considered a position key with the afterend, beforeend, afterbegin, beforebegin terms.

Never mind. I found https://github.com/whatwg/html/issues/10122.

mozfreddyb avatar Sep 24 '25 06:09 mozfreddyb

  void setHTML((DOMString or TrustedHTML) html, SafeHTMLSetterOptions options);

should be

  void setHTML((DOMString or TrustedHTML) html, optional SafeHTMLSetterOptions options);

We want to fall back to secure browser defaults if nothing is specified. Indeed, the happy route for most applications should be el.setHTML(html) and everything is good 🙂

mozfreddyb avatar Sep 24 '25 06:09 mozfreddyb

> I think for streaming we should guarantee you stay within a parent, although runScripts might run afoul of that. But generally enforcing idempotency would require much more invasive changes and I don't think is easily doable.

Agreed, it feels a bit beyond the scope of this. Perhaps open a separate issue?

noamr avatar Sep 24 '25 07:09 noamr

Has there been much consideration for the Document level parsing? We have parseHTMLUnsafe and will have parseHTML, should we add a streamHTML pair too?

lukewarlow avatar Oct 02 '25 12:10 lukewarlow

> Has there been much consideration for the Document level parsing? We have parseHTMLUnsafe and will have parseHTML, should we add a streamHTML pair too?

Streaming into a new document doesn't require most of the things in this issue, like appending/prepending, script execution (the document is not connected), so in many ways it's simpler.

Also the streamHTML methods would likely still return a Document rather than a Writable, and in some way they are already streamable because you can document.write into this new document.

So the more I dive into the detail, the more that feels like a separate issue altogether.

noamr avatar Oct 02 '25 12:10 noamr

See https://github.com/whatwg/html/issues/2142#issuecomment-3376495808 for more details about streaming conflict resolution.

noamr avatar Oct 07 '25 11:10 noamr

In terms of API design, I think it would help to summarize what variables we have, and which ones only apply based on others.

I’ll take a stab, feel free to correct this as needed:

  • Append vs replace
    • If append, where to insert content (outside/inside, start/end — or do we also need something more granular?)
    • If replace, what to replace (entire element vs contents)
  • What to insert: HTML string/DOM node(s)/stream
    • Tricky bit here is that in some of these cases we often want to treat strings as text nodes, but in others we want to treat strings of HTML as HTML. Kooky idea: Perhaps a native html template tag could work to mark a string as HTML, and as a bonus, it already makes the contents highlight as HTML in today's editors.
  • Safe vs unsafe vs custom
  • Whether to run scripts
  • Whether to do stable reparenting of any DOM nodes that are already connected
  • How this works with TrustedTypes

Am I missing any variables? Are any of these variables out of scope for what we're doing here?

In terms of the multiple methods vs options debate, some thoughts:

  • Different methods are generally better when the return types are different, or when vastly different overloads are desired.
  • Different methods work better when the use cases are significantly different
  • Options are more extensible: far easier to add a new option, than a new set of methods
  • The more the variables, the more the pendulum swings towards options, because:
    • With many variables, you end up getting combinatorial explosion if you try to encode the parameters in method names
    • Options are generally more discoverable than method names, as people look up the docs of the method they're using anyway, but not necessarily docs for adjacent methods
    • With more than one variable, it often becomes hard to remember the order in which parameters are encoded in the method name
    • Options work better when variables may be set dynamically (dynamic method names are possible, but more awkward)
  • Different methods work well when you want to allow an arbitrary number of arguments (like in element.append()) which precludes using an options bag. However, in many cases, the need to parameterize further via options comes up later, so I think designing an API that can accommodate an options bag (even if the MVP doesn't include it) tends to be a safer bet. The arbitrary number of arguments can often be accommodated via an iterable (e.g. an array of nodes).

LeaVerou avatar Oct 09 '25 14:10 LeaVerou

This is listed in the OP...

noamr avatar Oct 09 '25 15:10 noamr

> This is listed in the OP...

Oops, I had missed it, thanks for pointing me to it. I just added TrustedTypes to my list. Are there no other variables that have come up since?

LeaVerou avatar Oct 09 '25 16:10 LeaVerou

Just to reiterate what I said about script running at TPAC:

Take:

<script>
  document.currentScript.parentElement.append('\nhello from inline script');
</script>
<script src="script-1.js"></script>
<script src="script-2.js"></script>

And assume each script is:

document.currentScript.parentElement.append('\nhello from script-(n)');

The HTML parser will produce:

<script>…</script>
hello from inline script
<script src="script-1.js"></script>
hello from script-1
<script src="script-2.js"></script>
hello from script-2

I think it's important for streamHTMLUnsafe({ runScripts: true }) to produce the same result.

Right now, createContextualFragment will produce:

<script>…</script>
<script src="script-1.js"></script>
<script src="script-2.js"></script>
hello from inline script
hello from script-2
hello from script-1

…where the order of script-2 and script-1 is racy.
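
The difference between the two results can be modeled with a small simulation. This is a simplified sketch, not browser behavior: `simulate` is a hypothetical function, the `loadMs` latencies are invented, and "fragment" mode assumes external scripts run as soon as they finish loading.

```javascript
// Each entry models one <script>; loadMs is an invented network latency.
const scripts = [
  { label: "inline script", external: false },
  { label: "script-1", external: true, loadMs: 80 },
  { label: "script-2", external: true, loadMs: 30 },
];

// Returns the order of nodes in the parent after all scripts have run.
function simulate(list, mode) {
  const output = [];
  if (mode === "parser") {
    // Parser-blocking: each script runs before the next node is inserted,
    // so its appended text lands directly after it.
    for (const script of list) {
      output.push(`<script ${script.label}>`);
      output.push(`hello from ${script.label}`);
    }
    return output;
  }
  // Fragment insertion: all nodes are inserted first, inline scripts run
  // right away, external scripts run in whatever order they finish loading.
  for (const script of list) output.push(`<script ${script.label}>`);
  const inline = list.filter((s) => !s.external);
  const external = list.filter((s) => s.external).sort((a, b) => a.loadMs - b.loadMs);
  for (const script of [...inline, ...external]) {
    output.push(`hello from ${script.label}`);
  }
  return output;
}

console.log(simulate(scripts, "parser").join("\n"));
console.log(simulate(scripts, "fragment").join("\n"));
```

Swapping the two `loadMs` values flips the order of the last two lines in "fragment" mode, which is the race described above.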

My goal here is that people can take a site that's built from HTML includes, and stream them client side, without surprising different parsing behaviours breaking their content.

jakearchibald avatar Nov 16 '25 02:11 jakearchibald

@jakearchibald the current API design should not change the order of scripts (they should run as they are discovered, not at the end of the stream). In general, as the current proposals stand, there are very few changes to any kind of ordering, if any.

Though the particular use case in the comment would require a WPT to validate :)

noamr avatar Nov 17 '25 08:11 noamr

> @jakearchibald the current API design should not change the order of scripts (they should run as they are discovered, not at the end of the stream)

This would be different to setHTMLUnsafe etc, but I welcome it.

I guess the next question is, if you have <script src="script-1.js"></script><script src="script-2.js"></script>, would it support something like a 'preload scanner', or would they download in series?

jakearchibald avatar Nov 18 '25 00:11 jakearchibald

> > @jakearchibald the current API design should not change the order of scripts (they should run as they are discovered, not at the end of the stream)

> This would be different to setHTMLUnsafe etc, but I welcome it.

It's different by design. In setHTMLUnsafe all the scripts are added into a fragment first and that counts as their initial addition.

> I guess the next question is, if you have <script src="script-1.js"></script><script src="script-2.js"></script>, would it support something like a 'preload scanner', or would they download in series?

In series. An author could implement a preload scanner in a transform stream though.
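
As a very rough sketch of what such an author-side scanner might look like: the code below passes chunks through unchanged and reports script URLs as they become visible. All names (`scanChunk`, `createPreloadScanner`, `onUrl`) are hypothetical, and the regular expression is a naive stand-in for a tokenizer; it does not handle comments, unquoted attributes, or attribute values containing ">".

```javascript
const SCRIPT_SRC = /<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["'][^>]*>/gi;

// Scanner state: the text of a tag that was cut off at a chunk boundary.
function createScannerState() {
  return { tail: "" };
}

// Returns the script URLs that become visible once `chunk` has arrived.
function scanChunk(state, chunk) {
  const text = state.tail + chunk;
  const urls = [...text.matchAll(SCRIPT_SRC)].map((match) => match[1]);
  // Keep an unfinished tag (a "<" with no ">" after it) for the next chunk.
  const lastOpen = text.lastIndexOf("<");
  const unfinished = lastOpen !== -1 && text.indexOf(">", lastOpen) === -1;
  state.tail = unfinished ? text.slice(lastOpen) : "";
  return urls;
}

// Pass-through TransformStream that calls onUrl for each discovered URL.
function createPreloadScanner(onUrl) {
  const state = createScannerState();
  return new TransformStream({
    transform(chunk, controller) {
      for (const url of scanChunk(state, chunk)) onUrl(url);
      controller.enqueue(chunk);
    },
  });
}
```

In a page, `onUrl` could start a preload for each URL before the chunk reaches the HTML sink. The gaps listed above are a fair indication of how much of a real tokenizer an author would end up rebuilding.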

noamr avatar Nov 18 '25 07:11 noamr

> In series. An author could implement a preload scanner in a transform stream though.

It'd be nice to avoid having to do that, given it's already a feature of the document parser.

jakearchibald avatar Nov 19 '25 00:11 jakearchibald

Expecting web developers to implement a parser themselves without a SAX-like interface is not realistic and also not desirable. (We had this discussion before with Trusted Types.)

annevk avatar Nov 19 '25 09:11 annevk

> Expecting web developers to implement a parser themselves without a SAX-like interface is not realistic and also not desirable. (We had this discussion before with Trusted Types.)

Sure; I guess speculative HTML parsing can work here in the same way that it works for the main parser: the HTML sink can tee its input to the regular parser and the speculative parser, like today.

I don't know if streaming changes that considerably.

noamr avatar Nov 19 '25 09:11 noamr

While refactoring the HTML fragment parsing algorithm to support streamHTMLUnsafe(), a few issues have surfaced that I'd like to document here for discussion.

Inert document vs. document fragment

The spec uses an inert document as the parser document, and first inserts nodes into that document. The innerHTML setter then moves the nodes to a DocumentFragment, and finally uses replace all to finish the job.

No implementations seem to use an inert document, and there are observable side effects of this. @noamr has tested new Range().createContextualFragment('<img src="/bla.png">') and this fetches the image in all browsers, which wouldn't happen if an inert document was used.

My thinking is now to align the spec with implementations, so that the parser document is the main document and the fragment parser adds nodes to a DocumentFragment.

streamHTMLUnsafe() insertion point

If we do the above, a natural next step is to make the insertion point a parser argument/setting, so that the fragment parser can insert either into DocumentFragment or directly into an element in the document in the case of streamHTMLUnsafe().

runScripts details

Supporting runScripts for setHTMLUnsafe() and streamHTMLUnsafe() raises the question of parser-blocking scripts, i.e. when <script src="external.js"> would run. For streamHTMLUnsafe() I'd ideally like to match the main parser and let such scripts block further parsing. Since the API is async this is OK.

setHTMLUnsafe() and streamHTMLUnsafe() alignment

It would make a lot of sense if these two methods were identical apart from the streaming aspect. The above then implies that setHTMLUnsafe() should also insert nodes directly into the target and not use a DocumentFragment, and that parser-blocking scripts should block. However, the API is sync so this doesn't seem workable. How to align setHTMLUnsafe() and streamHTMLUnsafe() on this specific point is still TBD.

foolip avatar Nov 28 '25 13:11 foolip

I think it's ok for setHTMLUnsafe() and streamHTMLUnsafe() to be different in this case. I can't see a way to do it like the main parser but be synchronous. In fact, I thought that was one of the reasons why things like innerHTML have this divergent behaviour.

It'd be nice if setHTMLUnsafe() still resulted in scripts executing in the 'right' order, but I can't see how it can 'block' the parser.

Perhaps setHTMLUnsafe() could have an option that resulted in it being async, as in returning a promise?

jakearchibald avatar Nov 28 '25 16:11 jakearchibald

I agree, setHTMLUnsafe() can't behave like the main parser in this regard while being sync.

In order to minimize differences, however, I wonder if the default for both setHTMLUnsafe() and streamHTMLUnsafe() should be non-parser-blocking, and that streamHTMLUnsafe() could opt in to it to be more like the main parser?

I think letting setHTMLUnsafe() return a promise, especially based on a parameter, is perhaps giving a bit too much space in a new API to an old behavior that we regret in the platform. If it can be done by streamHTMLUnsafe() with an opt-in, then wrapping that in a helper that takes a single string and returns a promise would be trivial.
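
A minimal sketch of such a helper, assuming `streamHTMLUnsafe(options)` returns a WritableStream of strings as proposed in this issue. `setHTMLUnsafeAsync` is a hypothetical name, and since the name of any parser-blocking opt-in is undecided, options are passed through untouched.

```javascript
// Hypothetical helper: writes one HTML string through the proposed
// streaming API and resolves once the stream has been closed.
async function setHTMLUnsafeAsync(target, html, options = {}) {
  // Assumption: target.streamHTMLUnsafe() returns a WritableStream.
  const writer = target.streamHTMLUnsafe(options).getWriter();
  try {
    await writer.write(html);
    await writer.close();
  } finally {
    // Release the lock whether or not writing succeeded.
    writer.releaseLock();
  }
}
```

A caller would then write `await setHTMLUnsafeAsync(element, html, { runScripts: true })` and could rely on any blocking scripts having been handled by the time the promise resolves, to the extent the streaming API itself guarantees that.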

foolip avatar Nov 28 '25 17:11 foolip