stage0 icon indicating copy to clipboard operation
stage0 copied to clipboard

Question: purpose of whitespace pruning

Open nicketson opened this issue 5 years ago • 3 comments

I am currently working on some fixes that will allow stage0 to work in IE11 and I have a few questions.

  1. What is the purpose for the whitespace pruning in h()? Is this simply an optimization/cleanup to remove useless TextNodes and produce a cleaner DOM structure?

  2. What is the purpose of the .replace(/\n\s+/g, '<!-- -->') replacement introduced in this commit: https://github.com/Freak613/stage0/commit/856e732fb978110e11fe4c22bb08660c1a615832 what bug does this fix?

Thanks! I'm loving stage0 so far!

nicketson avatar Jun 20 '19 15:06 nicketson

Thanks)

  1. Yes, whitespace pruning is required to remove unnecessary TextNodes. It's performance optimization. Developer doesn't have intention to produce them using spacing only for formatting purposes.
// I'm writing
h`
<div>
  <div>
    <div></div>
  </div>
</div>
`

// What I want
"<div><div><div></div></div></div>"

// What I get without pruning
"<div><Text "  "/><div><Text "    "/><div></div><Text "  "/></div></div>"
  1. It was for splitting mixed text and tags identifiers. To get working something like:
h`
<div>
  My name is
  #name
</div>
`

Without this code it will produce

"<div><Text "My name is"><Text "   #name"></div>"

And it will break parser, that's skipping text nodes that doesn't start with #.

Also, worth noting that stage0 doesn't support inlined # tags.

// Won't work
h`
<div>
  My name is #name
</div>
`

Freak613 avatar Jun 23 '19 22:06 Freak613

Here is an improvement suggestion:

export function compile(node) {
  // backwards compatibility: if only one node child then unwrap the element from fragment
  if (node.nodeType === 11 && node.childNodes.length === 1) {
    node = node.firstChild
  }
  node._refPaths = genPath(node)
  node.collect = walker
  // because we might unwrap firstChild from fragment we have to tell which element was used
  return node
}

const compilerTemplate = document.createElement('template')
export function h(strings, ...args) {
  const template = String.raw(strings, ...args)
    // remove whitespace after line changes when not within < and >, and all empty lines at end
    // use https://regex101.com/ if you need to understand better
    .replace(/^[\n\s]+|\n(\s+|(?=<)|\n{0,}$)(?![^<]*>)/g, '')
    // surround all # tags with comment elements within element contents
    .replace(/#[^#<>\n\s]+(?![^<]*>)/g, '<!---->$&<!---->')
    // remove unnecessary comments at: beginning of element contents, next to another comment, end of element contents
    // however will preserve comments at very beginning and end of template (so there can be # tags)
    .replace(/><!---->/g, '>')
    .replace(/<!----></g, '<')
  compilerTemplate.innerHTML = template
  return compile(compilerTemplate.content)
}

This will then allow for syntax like this:

const view = h`
  #hello
  <div #element>
    I can write #stuff here and it goes #exactly where I want.
  </div>
  #world
`

Resulting HTML that is passed to template:

<!---->#hello<div #element="">I can write <!---->#stuff<!----> here and it goes <!---->#exactly<!----> where I want.</div>#world<!---->

This gives exact control of whitespace while also allowing for inline tags. And you can still omit whitespace just like before by having tags on their own line:

const view = h`
<div>
  #hello
  #world
</div>
`
<div>#hello<!---->#world</div>

The comment tags also prevent any browser from collapsing the resulting text nodes together, even if .normalize is called - although it does remove empty text nodes.

An issue with the above suggestion is that <style>, <script>, <noscript> and <template> element contents are also processed which will lead to invalid CSS/JS or unintended comment tags to noscript and template. I don't want to know how complex handling those via regex would be! :D In the other hand those elements are likely to be out-of-scope for this library anyway.

Merri avatar Nov 17 '19 17:11 Merri

Yeah, this is exactly what lit-html doing. I don't like idea of filling with comment nodes, as they're also DOM nodes, it becomes pretty easy to create simply looking but awfully organized inside template. And will hurt performance significantly, which is the main point of this library.

As for inline tags, I found

const view = h`
  <div #element>
  </div>
`
...
view.element.textContent = `I can write ${stuff} here and it goes ${exactly} where I want.`

better, as it will result in only one dom update without introducing complex logic for handling text insertions. Most probably, if you will need to insert some dynamic strings in text, on the next step you will want to style it, that will require wrapping such text in its own dom node and removing need for insertions one more time.

Freak613 avatar Feb 12 '20 11:02 Freak613