svelte icon indicating copy to clipboard operation
svelte copied to clipboard

Better whitespace handling

Open Rich-Harris opened this issue 7 years ago • 59 comments

While fixing #178 it occurred to me that there are some nodes, like <datalist>, where it doesn't make sense to have text node children. The browser agrees:

document.body.innerHTML = `
  <div>
    <span>a</span>
    <span>b</span>
  </div>
`;

console.log( document.body.querySelector( 'div' ).childNodes );
// [ text, span, text, span, text ]

document.body.innerHTML = `
  <datalist>
    <option value='a'/>
    <option value='b'/>
  </datalist>
`;

console.log( document.body.querySelector( 'datalist' ).childNodes );
// [ text, option, option ]

Not sure what the first text node is doing there in the second case. Anyway, Svelte should be smart enough not to create text nodes inside elements where they're meaningless.

Additionally, we could collapse excess whitespace between nodes that aren't inside a <pre> element, since these are equivalent:

<p>one</p> <p>two</p>
<p>  one  </p>

    <p>  two  </p>

(That's not strictly true, since it's dependent on CSS, so there probably needs to be a preserveWhitespace: true option if we did that.)

Rich-Harris avatar Dec 11 '16 19:12 Rich-Harris

I personally would like to avoid all inter-element whitespace as much as possible.

Maybe do something similar to pug and jsx (at least I think thats what it does), where whitespace is only added if the elements are on the same line (inline). Tags with newlines in between get that whitespace stripped completely.

Pug even has special syntax for inline tags: https://pugjs.org/language/interpolation.html

So for example:

<ul>
  <li>
    Some <strong>inline</strong> tags.
  </li>
</ul>

would generate the following html:

<ul><li>Some <strong>inline</strong> tags.</li></ul>

Swatinem avatar Dec 12 '16 09:12 Swatinem

That actually happens already – whitespace inside either end of an element is removed: https://svelte.technology/repl/?gist=f4657520185203c009a9116568ac5ba2

The problems start when you have siblings separated by newlines:

<ul>
  <li>
    Some
    <strong>inline</strong>
    <em>newline separated</em>
    tags.
  </li>
</ul>

In that case you can't remove whitespace before the <strong>, or between the <strong> and the <em>, or after the <em>. You could collapse the intermediate whitespace to a single space character and it would behave identically (assuming no white-space: pre CSS or similar), but you can't remove it. You can probably remove it if the elements are block-level, but it's unsafe because it depends on display: block.

Rich-Harris avatar Dec 12 '16 13:12 Rich-Harris

I thought about using a whitelist of inline elements as well, but I don’t think that’s a good idea. I rather think we should do the same as react/jsx and pug do here and remove all whitespace when elements are on a newline. If you explicitly want to have it on a newline, one could render the text as an explicit snippet <span>{{" like so, with whitespace "}}</span>. I have seen this pattern a few times in react codebases, although I admit its ugly.

Swatinem avatar Dec 12 '16 13:12 Swatinem

Trouble is pug and JSX are different languages, so they can get away with different semantics. Svelte templates are just HTML (plus tags, blocks and directives), so from the user's perspective if markup behaves differently then it's a bug. Maybe it could be an opt-in thing? whitespace: 'aggressive' or something?

If we do collapse whitespace to a single character, then when we get round to implementing helpers it won't be quite so bad:

// instead of this...
var text4 = document.createTextNode( '\n\t\t\t' );

// ...this
var text4 = whitespace();

Rich-Harris avatar Dec 12 '16 13:12 Rich-Harris

Its really a question what the user expects I guess… More often things are breaking because of unintended whitespace. And depending on the users background, they might be used to template system that have a more aggressive whitespace handling.

Swatinem avatar Dec 12 '16 14:12 Swatinem

@Rich-Harris wrote;

Not sure what the first text node is doing there in the second case.

I'm quite sure it is a line break.

Django and Twig have a spaceless environment tag.

Ryuno-Ki avatar Dec 13 '16 20:12 Ryuno-Ki

What I'm currently doing is sending my templates through https://github.com/kangax/html-minifier before sending them through svelte -

HTMLMinifier.minify(html, {
	caseSensitive: true,
	collapseWhitespace: true,
	conservativeCollapse: true,
	ignoreCustomFragments: [ /{{[^]*?}}/ ],
	keepClosingSlash: true,
})

which seems to work well enough. conservativeCollapse makes runs of whitespace get collapsed down to one space instead of zero. caseSensitive and keepClosingSlash are necessary to keep svelte happy with the template. And ignoreCustomFragments is necessary to keep html-minifier from trying to parse tags.

Conduitry avatar Jan 08 '17 17:01 Conduitry

I think if you look at the HTML 5 spec they have an interesting way of handling this. You can have different insertion modes where character tokens are handled differently. See "in-select" insertion mode for example: https://www.w3.org/TR/html5/syntax.html#parsing-main-inselect

camwest avatar Jan 11 '17 05:01 camwest

Removing whitespace between tags is important for layouts using display: inline-block.

I agree that preserving the HTML expectations is important, so I think that whitespace removal should be explicit. As @Ryuno-Ki mentioned, the Twig spaceless tag is a great solution.

{% spaceless %}
    <div>
        <strong>foo</strong>
    </div>
{% endspaceless %}

{# output will be <div><strong>foo</strong></div> #}

cristinecula avatar Mar 13 '17 09:03 cristinecula

Could we do something like

export default {
  keepWhitespace: true
}

then check that at compile time?

PaulBGD avatar Mar 28 '17 17:03 PaulBGD

It seems we can no longer naively pass Svelte components through html-minifier because of the new Svelte-specific tags (eg. <:Head>).

Does anyone have a workaround for that?

mrkishi avatar Jan 22 '18 20:01 mrkishi

My comment here could probably be extended to use ignoreCustomFragments to also ignore <:Head>. Another regex could be added, something like /<:Head>[^]*?<\/:Head>/. I'm no longer using the method from that comment, however, so I haven't tried this.

Conduitry avatar Jan 22 '18 21:01 Conduitry

First time user, obvious question: why not just copy React?

@Rich-Harris referring back to a comment you made in 2016:

Svelte templates are just HTML (plus tags, blocks and directives), so from the user's perspective if markup behaves differently then it's a bug.

I'm skeptical -- maybe because other frameworks have changed my expectations. I'm new to Svelte (and love it): I think of it as a transpiler. I don't feel I'm writing HTML: I feel I'm writing JavaScript with an abundance of < and > characters.

When I first used React (very recently), I had no preconceptions about whitespace. I learned to my delight that JSX is not HTML.

And now I've just learned Svelte, and I see myself writing huge statements on one line. Here's code from my second-ever Svelte component, which happens to use the dreaded <pre> tag:

<pre ref:pre class={{wrapText ? 'wrap' : ''}}>{{#each plainAndHighlightedPairs as [ plain, highlighted ], i}}{{plain}}{{#if highlighted}}<em class={{highlightIndex === i ? 'current' : ''}}>{{highlighted}}</em>{{/if}}{{/each}}</pre>

Svelte is getting in the way. Can that line be sensible?

If Svelte followed React's whitespace rules, it would be legible:

<pre ref:pre class={{wrapText ? 'wrap' : ''}}>
  {{#each plainAndHighlightedPairs as [ plain, highlighted ], i}}
    {{plain}}
    {{#if highlighted}}
      <em class={{highlightIndex === i ? 'current' : ''}}>{{highlighted}}</em>
    {{/if}}
  {{/each}}
</pre>

Svelte's choice is between "pretend to be HTML" and "pretend to be easy." They're mutually exclusive. People choose Svelte because HTML is too complicated: it could be easier here. And yes, changing whitespace rules is backwards-incompatible; but hey, React did it.

This is an awfully long comment, so I'll finish with a rhetorical flourish: can any reader here describe HTML's whitespace rules from memory?

adamhooper avatar Feb 06 '18 18:02 adamhooper

Just rediscovered this issue via #1236. @adamhooper you make a strong case; the <pre> example is gnarly. I have a counter-example though — I see this sort of thing a lot in React codebases:

<p>
  click
  {' '}
  <a href={`/foo/${this.props.bar}`}>here</a>
  {' '}
  for more information
</p>

The Svelte/HTML equivalent:

<p>
  click
  <a href='/foo/{{bar}}'>here</a>
  for more information
</p>

Having to insert {' '} in order to preserve spaces around the <a> is utterly grotesque, and I'd argue that the JSX rules (whitespace is significant unless it happens to contain a newline) are ultimately no more intuitive than HTML rules.

Not suggesting that we have to comply strictly with HTML, just that we need to weigh up the consequences of deviating from it.

Rich-Harris avatar Mar 14 '18 11:03 Rich-Harris

I agree with @adamhooper on this and would love to see Svelte's HTML output with collapsed whitespace as React does it. Things like inline-block menus, tab headers, and breadcrumbs become a big mess of code when you want the whitespace eliminated. For example:

<ul class="breadcrumbs">
{#each breadcrumbs as item, i}<li>{#if i !== breadcrumbs.length - 1}<a href="{item.url}">{item.title}</a>{:else}{item.title}{/if}</li>{/each}
</ul>

morleytatro avatar May 02 '18 06:05 morleytatro

Having to insert {' '} [in JSX] in order to preserve spaces around the <a> is utterly grotesque

I type it out all on one line and then let Prettier do the {' '} for me ¯\_(ツ)_/¯

lydell avatar Apr 24 '19 19:04 lydell

Angular has similar feature of removing whitespaces. https://angular.io/api/core/Component#preserving-whitespace

Key takeaways:

  • ngPreserveWhitespaces directive that affects whole element subtree can be applied
  • To force single space, use &ngsp; entity - that's certainly better than {' '}
  • There's also global and per component flag preserveWhitespace. I see that's it's supported already in Svelte
  • #2258 says that <pre> tags are handled specially. Angular documentation also mentions <textarea> and more tags (if any) could be looked up in their repo.

I'd like to note that as a longtime user of Angular I never had problems stemming from this particular feature. It's enabled by default and optimises generated code a lot. Behaviour becomes natural like export let for props and on rare occasion if something doesn't look right you notice it during development and apply appropriate override.

tomblachut avatar Apr 26 '19 21:04 tomblachut

Having to insert {' '} in order to preserve spaces around the <a> is utterly grotesque

Spaces between elements seems to be the exception rather than the rule, so you don't see too much {' '}.

I'd argue that the JSX rules (whitespace is significant unless it happens to contain a newline) are ultimately no more intuitive than HTML rules.

They're very intuitive. I've been caught out many times by extraneous spaces between and within HTML elements. I've had no such surprises with JSX.

steve-taylor avatar Apr 28 '19 14:04 steve-taylor

It looks like this one is still heavily under discussion. I can echo @tomblachut 's experience from Angular. What Angular does "just works" for almost all cases, producing efficient output by default. "Efficient by default" is an important trade-off against what @Rich-Harris mentioned earlier, behaving as much as possible like ordinary HTML by default.

Every once in a while it causes trouble - rarely enough that switching to "preserve whitespace" for a subtree or entire component fixes it easily with only a tiny impact on an overall applications compiled output size.

kylecordes avatar Apr 28 '19 14:04 kylecordes

I saw this but I'm asking if this is the same for my problem: https://github.com/sveltejs/svelte/issues/2745

frederikhors avatar May 13 '19 00:05 frederikhors

I opened an issue on VSCode Svelte plugin repo (https://github.com/UnwrittenFun/svelte-vscode/issues/50).

The problem is I need this code to stay like this:

<div>
  Test (<span class="color">one</span>)
</div>

It becomes this instead:

<div>
  Test (
  <span class="color">one</span>
  )
</div>

Is it related to Svelte compiler or an issue with ghost spaces added by vscode plugin?

What can I do?

I'm not understanding.

frederikhors avatar Jun 08 '19 10:06 frederikhors

@frederikhors It sounds like the VSCode Svelte plugin's formatter is not following the current whitespace rules of Svelte (basically how whitespace works in HTML). I think this is the issue you encounter: https://github.com/UnwrittenFun/prettier-plugin-svelte/issues/24

This thread is about potentially changing how whitespace works in Svelte in the future.

lydell avatar Jun 08 '19 13:06 lydell

I found a ready to use preprocessor for that - https://www.npmjs.com/package/@minna-ui/svelte-preprocess-markup


import svelte from 'rollup-plugin-svelte';
import resolve from '@rollup/plugin-node-resolve';
import minify from 'rollup-plugin-babel-minify';
import sveltePreprocess from 'svelte-preprocess';
import preprocessMarkup from '@minna-ui/svelte-preprocess-markup';


export default {
  input: 'src/index.js',
  output: {
    name: 'viewflow',
    file: 'dist/viewflow-components.js',
    format: 'iife',
  },
  plugins: [
    svelte({
      include: 'src/*.svelte',
      preprocess: [sveltePreprocess(), {'markup': preprocessMarkup()}],
      customElement: true,
    }),
    resolve(),
    minify(),
  ],
};

kmmbvnr avatar Dec 19 '19 01:12 kmmbvnr

I've managed to hack it abit to remove all the spaces between the tags with this:

const tagsRegex1 = /(>)[\s]*([<{])/g;
const tagsRegex2 = /({[/:][a-z]+})[\s]*([<{])/g;
const tagsRegex3 = /({[#:][a-z]+ .+?})[\s]*([<{])/g;
const tagsRegex4 = /([>}])[\s]+(<|{[/#:][a-z][^}]*})/g;
const tagsReplace = '$1$2';

const opts = {
  preprocess: {
    style: processStyle,
    markup({content}) {
      const code = content
        .replace(tagsRegex1, tagsReplace)
        .replace(tagsRegex2, tagsReplace)
        .replace(tagsRegex3, tagsReplace)
        .replace(tagsRegex4, tagsReplace)
      ;

      return {code};
    },
  },
};

But when I run the app, it still creates those text nodes.. this is a major problem WordPress on IE11 since they use some kind of MutationObserver to replace emoji with concatenating all adjacent text nodes even if it's empty and crashes.

https://github.com/WordPress/WordPress/blob/19fd7a0da93bcbab94396fb7bc2f26e56a4c1668/wp-includes/js/wp-emoji.js#L141

shirotech avatar Feb 14 '20 07:02 shirotech

I've finally nailed it down to the root cause of the empty text nodes, a suggested fix is here https://github.com/sveltejs/svelte/issues/4423

shirotech avatar Feb 17 '20 01:02 shirotech

Is there currently a workaround to force svelte to preserve whitespace? Server-side rendering currently correctly preserves whitespace when doing

<pre>foo

</pre>
<p>bar</p>

but strips a newline during hydration.

This is especially frustrating dealing with code-highlighting where

- script: |
    npm pack
    mv dom-accessibility-api-*.tgz dom-accessibility-api.tgz
  displayName: "Create tarball"

becomes

<span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
      npm pack
      mv dom-accessibility-api-*.tgz dom-accessibility-api.tgz
</span>    <span class="hljs-attr">displayName:</span> <span class="hljs-string">&#39;Create tarball&#39;</span>

The result changes between hydration. Before: Screenshot from 2020-05-16 22-07-39 After: Screenshot from 2020-05-16 22-08-26

eps1lon avatar May 16 '20 20:05 eps1lon

Just stumbled across this issue. FWIW I'm using a regex to collapse whitespace which reduces the JS bundle size (minified, uncompressed) from 40.22kB to 39.82kB. I might give svelte-preprocess-markup a shot though.

nolanlawson avatar Jul 24 '20 13:07 nolanlawson

Can we just have an option (per-component like immutable, or whole project), which prevents something like t1 = space(); from being added to the JS output of the compiler? It feels like doing this in the preprocess step with regex is asking for a trouble. Something which works now could break later, when we will have more special syntax.

Alternatively, if that compiler option is undesirable, could we have community approved preprocessing minifier config to get rid of all unnecessary whitespace while introducing no minify-related bugs?

Here is the code to test in repl.

<script>
	const log = node => console.log(node.childNodes);
</script>

<div use:log>
	<div on:click={() => 'preventMeSomeInnerHTML'}>haha</div>
	<div>haha</div>
</div>

Personally, I would choose stripping whitespace and entering it with {' '} anyday.

non25 avatar Jul 24 '20 15:07 non25

https://github.com/firefish5000/svelte-trim Can we adopt something like this in the compiler? A customizable solution, which will avoid generating AST twice. It will improve DX a lot.

non25 avatar Jan 16 '21 14:01 non25

Hello,

If svelte would – in the HTML part – remove <newline><all whitespace until next non whitespace> but not the whitespace before <newline>, then one could write ...

<span>
    1
    <span>2</span>
    <span>3</span>
    4
</span>

... and it would output 1234 (currently 1 2 3 4).

If he would write ...

<span>1 <span>2</span> <span>3</span> 4</span>

... or (_ means whitespace) ...

<span>
    1_
    <span>2</span>_
    <span>3</span>_
    4
</span>

... he would get 1 2 3 4.

Looks like the simplest solution to me.

nilslindemann avatar Feb 28 '21 03:02 nilslindemann