svelte Buggy @html tag on page load/refresh

Describe the bug

The @html tag is buggy on initial page load/refresh.

For example, in the reproduction repo (https://github.com/kelvinsjk/sveltekitHTMLTag), we have const a = 1, b = 2, c = 3 and {a} {@html b} {c}.

The SSR rendered page correctly shows 123 but upon page load and hydration it becomes 13 instead.

On HMR it then works as intended, and certain combination of text around those tags could lead it to work as well, but other times it doesn't.

Reproduction

https://github.com/kelvinsjk/sveltekitHTMLTag

Logs

No response

System Info

System:
    OS: Linux 5.11 Pop!_OS 20.04 LTS
    CPU: (4) x64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
    Memory: 4.05 GB / 14.60 GB
    Container: Yes
    Shell: 5.0.17 - /bin/bash
  Binaries:
    Node: 16.9.0 - ~/.nvm/versions/node/v16.9.0/bin/node
    Yarn: 1.22.11 - ~/.nvm/versions/node/v16.9.0/bin/yarn
    npm: 7.21.1 - ~/.nvm/versions/node/v16.9.0/bin/npm
  Browsers:
    Firefox: 88.0.1
  npmPackages:
    @sveltejs/kit: next => 1.0.0-next.180 
    svelte: ^3.42.6 => 3.43.1

Severity

annoyance

Additional Information

Can be worked around by re-assigning the variable in action onMount, but rather annoying as I haven't been able to discern a pattern as to when this will strike.

Oct 09 '21 17:10 kelvinsjk

I cloned your repo and did a few tests. Here are my findings so far:

The HTML delivered by the SSR includes 2. This looks correct.
There's a claim_html_tag function in the Svelte code that takes an array of HTML nodes (a mix of element nodes and comment nodes), and is supposed to find those HTML_TAG_START and HTML_TAG_END comment nodes and strip them off before taking the rest of the nodes between those comments (which should be elements) and "claiming" them for the Svelte code to handle.
I added a console.log call at the start of claim_html_tag and it showed [ (comment node) "HTML_TAG_START", (comment node) "HTML_TAG_END", (text node) " " ]. That is, the last text node contained whitespace. By that point in the code flow, the text node containing "2" that should have been between the START and END comments had already disappeared, and it had grabbed the whitespace node after the END comment because it was supposed to grab three nodes. That fact will probably lead to finding the bug location, as the "2" node probably disappeared after the nodes-to-grab were counted .

I'm running out of time right now before I need to go do something else, but I'll come back to this. In the meantime, I'm posting these incomplete findings so that anyone else who wants to troubleshoot this will have a starting point to work from.

Oct 10 '21 02:10 rmunn

Okay, I'm pretty sure I've identified the precise cause of the bug. I'm not familiar enough with the Svelte code to fix the bug, but I know why it's happening, which will hopefully give the Svelte maintainers a head start on writing a fix.

When Svelte compiles your repro to Javascript code, it looks like this:

function create_fragment(ctx) {
	let t0;
	let t1;
	let html_tag;
	let t2;
	let t3;

	const block = {
		c: function create() {
			t0 = text(/*a*/ ctx[1]);
			t1 = space();
			html_tag = new HtmlTagHydration();
			t2 = space();
			t3 = text(/*c*/ ctx[0]);
			this.h();
		},
		l: function claim(nodes) {
			t0 = claim_text(nodes, /*a*/ ctx[1]);
			t1 = claim_space(nodes);
			html_tag = claim_html_tag(nodes);
			t2 = claim_space(nodes);
			t3 = claim_text(nodes, /*c*/ ctx[0]);
			this.h();
		},
		h: function hydrate() {
			html_tag.a = t2;
		}
        // Rest of block omitted for space
    }
}

When the component is created by client-side Svelte code (for example, when you load a different page and navigate to the page with this component, or after HMR), the create() function is called. It creates an HTML structure like this: text node "1", whitespace node, @html node (which will contain the text node "2"), whitespace node, text node "3". When the component takes server-side rendered HTML, the claim() function is called. It expects an HTML structure of the same shape as the structure the create() function would produce: text 1, space, @html containing text 2, space, text 3.

Here's the HTML that's produced by the server-side rendering, exactly as it was delivered to the browser (including the extra newlines and one line that's nothing but three tabs and a newline):

		<div id="svelte">


1 <!-- HTML_TAG_START -->2<!-- HTML_TAG_END --> 3



			
		</div>

There's more HTML, of course, but that's the important part for understanding the cause of this bug. You'd think that this would produce the same node structure as create(), i.e. text 1, space, @html, space, text 3. But in fact, when I set a breakpoint at the start of claim() and inspected the nodes array in the Firefox debugger, this is what it contained:

text node: "\n\n\n1 "
comment node: " HTML_TAG_START "
text node: "2"
comment node: " HTML_TAG_END "
text node: " 3\n\n\n\n\t\t\t\n\t\t"

Notice how the first and last text nodes don't just contain "1" and "3", they contain "(whitespace)1(whitespace)" and "(whitespace)3(whitespace)". That's the root cause of this bug.

The way the rest of the bug plays out is that the line t0 = claim_text(nodes, /*a*/ ctx[1]) looks at first glance to be doing exactly what it should. Before it runs, nodes = [text 1, HTML_TAG_START, text 2, HTML_TAG_END, text 3]. And after it runs, t0 = text 1 and nodes = [HTML_TAG_START, text 2, HTML_TAG_END, text 3]. It has claimed and removed the correct node. But in fact, claim_text has not done the right thing, because of the whitespace around the node. First let's look at the source of claim_text, and then we'll see how its implementation is failing in this case.

function claim_text(nodes, data) {
  return claim_node(nodes, (node) => node.nodeType === 3, (node) => {
    const dataStr = "" + data;
    if (node.data.startsWith(dataStr)) {
      if (node.data.length !== dataStr.length) {
        return node.splitText(dataStr.length);
      }
    } else {
      node.data = dataStr;
    }
  }, () => text(data), true);
}

When t0 = claim_text(nodes, ctx[1]) gets run, ctx[1] contains the value of a (the integer 1), so dataStr above becomes the string "1" with no space around it. The HTML node being claimed at that point is the text node "\n\n\n1 ", so node.data contains "\n\n\n1 ". Since that does not start with "1", the if block is false and the contents of that text node are set to "1" instead, with no whitespace before or after it. The whitespace after it, though, was important, as we'll see in just a couple of paragraphs.

What should have happened here was that the SSR should have returned a div with no extraneous whitespace around the contents. Instead of the HTML I linked above with several blank lines around the <div id="svelte">, it should have contained exactly this: <div id="svelte">1 2 3</div>. Because then, inside claim_text, the node would have been a text node containing exactly "1 " (a single trailing space after the 1). Then the node.data.startsWith(dataStr) test would have been true, the node.data.length !== dataStr.length test would have been true, and the node.splitText(dataStr.length) function would have run. That would have done the right thing, because then there would have been a text node containing "1" followed by a whitespace node. I.e., the nodes array after claim_text was finished should have contained [text " ", HTML_TAG_START, text 2, HTML_TAG_END, text 3].

But instead, because the splitText function never ran, the nodes array ended up containing [HTML_TAG_START, text 2, HTML_TAG_END, text 3]. And the next function call, claim_space, is where the bug's effects start to be visible. Because its implementation is just claim_text(nodes, " "). And claim_text calls claim_node looking for a node of nodeType 3 (a text node). I haven't shown you the claim_node implementation because it's long, but it moves forward through the node list looking for one that matches the predicate, which here is just "a text node". And therefore, when it finds the text node "2", it says "Okay, that's a text node, sot hat's the one I'm looking for". It then checks whether "2" starts with " " and when the answer is no, it runs the else block, which sets node.data (which was "2") to dataStr (which is " "). This is the part where the "2" disappears from your HTML, because the claim_text function has incorrectly converted it to a space.

But the root cause of the bug is the extraneous whitespace before the text node "1", because that made the claim_text function not split off the whitespace after the node "1". And therefore the whitespace node that should have been claimed by claim_space was missing, and so claim_space incorrectly claimed the text node "2".

To solve this bug, the SSR needs to not render extraneous whitespace around the contents of <div id="svelte">...</div>. Then this whole thing would have worked correctly.

Oct 10 '21 16:10 rmunn

Thank you for the excellent investigation @rmunn !!

CC @tanhauhau who has been tracking whitespace issues CC @hbirler who has made a number of hydration fixes

Oct 10 '21 18:10 benmccann

Hey, I haven't looked at Svelte in a while and I have not analyzed this issue in detail, but I feel this issue might be connected to a HTML_TAG_* claim issue that was discussed earlier https://github.com/sveltejs/svelte/issues/6463 https://github.com/sveltejs/svelte/issues/6463#issuecomment-889430684 https://github.com/sveltejs/svelte/pull/6602#issuecomment-892212885 . Was that issue addressed?

@rmunn if I remember correctly, the claim_* functions do not need to be precise. Actually, Svelte hydration (at least back when I worked on it a couple months ago) never assumes anything about the html page that it receives as input. Svelte should still work properly even if it is called to hydrate an entirely different html page of an entirely different website. So, claims not functioning properly due to extra spacing is likely something that could be improved upon, but not something that should cause correctness problems in page rendering. Thus, I believe a fix for this issue should likely target another part of Svelte that might be erroneously assuming that claims always do their job correctly.

Additionally, errors in claims generally don't result in correctness problems (but of course, this might not be the case with this particular issue). After the claim phase, all nodes that were originally on the page but were left unclaimed are deleted, and nodes that we want to claim but were unable to find on the page are newly created (and placed in the positions that we expected them to be). So whitespace issues as you describe generally only lead to performance problems (having to delete a node that we could have claimed, and creating a new node since we couldn't claim any node) rather than correctness problems (after nodes have been deleted, created, and repositioned, the page looks as we expect).

Your description of wrong nodes getting claimed, erased, changed, reordered is actually expected behavior (even though it is weird and not that performant). It would, however, certainly be cool to have even better claim implementations that handle more edge-cases so that less deleted/reorders happen.

What is weird here in this issue is that 2 is not filled back in after it has been deleted. Does @html not hydrate its content? Does it need the claims to function perfectly? If so, we may need to rewrite claims such that HTML_TAGs are claimed first so that no other claims touch nodes within HTML_TAGs. Or we should change HTML_TAGs to be proper nodes (maybe we can use <meta itemprop>?), where their contents become child nodes (not sure if this would break other things).

Oct 10 '21 20:10 hbirler

Is this issue fixes? I encountered a similar problem using svelte 4

Jul 12 '23 23:07 elron

Svelte 4.0.4 added some hydration fixes, so if you're using Svelte 4, please make sure you're testing with the latest version and let us know if it works with that specific version

Jul 13 '23 11:07 benmccann

I'm using the latest version: svelte 4.0.5

Here's a replicate: https://www.sveltelab.dev/tfpbg025yxmqvpd Check out +page.svelte

is this svelte or the library @macfja/svelte-persistent-store?

Update: This seems to be a problem with svelte and not the library. Here's how to reproduce: https://www.sveltelab.dev/iziz4nvzfq2g6l9

Jul 13 '23 12:07 elron

Is this issue a duplicate of https://github.com/sveltejs/svelte/issues/8213 ?

Jul 14 '23 17:07 elron

Svelte 4.0.4 added some hydration fixes, so if you're using Svelte 4, please make sure you're testing with the latest version and let us know if it works with that specific version

Have updated the minimal reproduction https://github.com/kelvinsjk/sveltekitHTMLTag to run on svelte 4.0.5 and the whitespace claiming hydration problem as described in the original issue ticket is still present

Jul 18 '23 04:07 kelvinsjk

Svelte 4.0.4 added some hydration fixes, so if you're using Svelte 4, please make sure you're testing with the latest version and let us know if it works with that specific version

Have updated the minimal reproduction https://github.com/kelvinsjk/sveltekitHTMLTag to run on svelte 4.0.5 and the whitespace claiming hydration problem as described in the original issue ticket is still present

Here's a live demo of your repository: https://www.sveltelab.dev/0lxbt6lnamtzq1f

Jul 18 '23 13:07 elron

I also have this problem in combination with an i18n solution which returns HTML. Translations for normal elements are coming but HTML is missing often.

Hopefully this could be fixed, soon.

Aug 22 '23 09:08 alinex

svelte svelte copied to clipboard

Buggy @html tag on page load/refresh

Describe the bug

Reproduction

Logs

System Info

Severity

Additional Information

svelte
svelte copied to clipboard