Proposal to improve the DOM creation api

Open straker opened this issue 9 years ago • 38 comments

I'm sorry if this topic has been discussed already, I tried to do due diligence but couldn't find a similar proposal on the forums or the html or dom github repos.

The DOM creation api is a bit cumbersome to work with. Creating a single element with several attributes requires several lines of code that repeat the same thing. The DOM selection api has received needed features that allow developers to do most DOM manipulation without needing a library. However, the DOM creation api still leaves something to be desired, which sways developers away from using it.
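
To make the repetition concrete, here is a minimal sketch of creating one link with three attributes using today's api. The helper name `createNavLink` and the attribute values are made up for illustration, and the document is passed in as a parameter (an assumption, so the snippet does not rely on a global).

```javascript
// Hypothetical helper: builds
//   <a href="/docs" class="nav-link" title="Documentation">Docs</a>
// `doc` is any Document, e.g. window.document in a browser.
function createNavLink(doc) {
  const link = doc.createElement('a');
  // One call per attribute, each repeating the same receiver and method.
  link.setAttribute('href', '/docs');
  link.setAttribute('class', 'nav-link');
  link.setAttribute('title', 'Documentation');
  link.textContent = 'Docs';
  return link;
}
```

In a browser this would be used as `document.body.appendChild(createNavLink(document))`.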

There are several use cases where the api is cumbersome to use. I have compiled a gist of just a few of them. It shows several use cases where the current api requires an awkward solution, and demonstrates a few common hacks of working around the native api to get a more manageable result. It also provides several examples of how popular libraries handle the same use case, usually in a simpler manner.

I would like to propose that the DOM creation api be improved so that developers have a cleaner interface into DOM creation and no longer need libraries to do it.

straker avatar Jan 14 '16 06:01 straker

There has been discussion here: https://lists.w3.org/Archives/Public/www-dom/2011OctDec/thread.html#msg20. There was also http://www.hixie.ch/specs/e4h/strawman though TC39 didn't really like it. There is https://github.com/domenic/element-constructors by @domenic which should be finished at some point.

I don't think libraries have really converged on something here either, other than innerHTML-like approaches (which would suggest the E4H approach, perhaps amended to use template strings).

Also note that the libraries that use object-like notation often confuse content attributes and IDL attributes (aka JavaScript properties), which makes this rather tricky.
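
As a small illustration of that mismatch, the sketch below maps a few JavaScript property names to the content attribute they reflect. The table is partial and hand-written, the helper name `toAttributeName` is hypothetical, and the lowercase fallback is an assumption that does not hold for every property.

```javascript
// Partial table: IDL attribute (JavaScript property) -> content attribute.
const PROPERTY_TO_ATTRIBUTE = {
  className: 'class',
  htmlFor: 'for',
  httpEquiv: 'http-equiv',
  acceptCharset: 'accept-charset'
};

// Hypothetical helper: guess the content attribute name for a property key.
function toAttributeName(key) {
  if (Object.prototype.hasOwnProperty.call(PROPERTY_TO_ATTRIBUTE, key)) {
    return PROPERTY_TO_ATTRIBUTE[key];
  }
  // Assumption: many properties only differ by case (tabIndex -> tabindex).
  return key.toLowerCase();
}

const fromClassName = toAttributeName('className'); // 'class'
const fromTabIndex = toAttributeName('tabIndex');   // 'tabindex'
```

An object-notation api has to decide, for each key, whether `{className: 'x'}` or `{class: 'x'}` is the accepted spelling, and whether the value is set as a property or as an attribute.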

annevk avatar Jan 14 '16 11:01 annevk

Thanks for those resources, they were very enlightening.

From those discussions, it seems that template strings (or quasi-literal templates from the public archives) seem to be where the discussions keep coming back to, and seem to have generated the most consensus. With ES2015 template strings appearing to fulfill the role of the quasi-literal template proposal, I'm guessing the discussion should be focused on how to use the template string on the parent node?

Domenic's element constructors sounds a bit like Dart's solution to create a new constructor for elements (Dart went a bit farther and made one for each element). It looks like he uses the object-like notation to define namespaces or attribute declarations, so would it suffer from the same problem of confusing content attributes and IDL attributes?

straker avatar Jan 14 '16 18:01 straker

Template strings seem like a way forward, though making them work with the HTML parser seems tricky. I think that is why nobody has tried thus far. Element constructors might indeed have that same problem, though they are also solving a more fundamental problem, that these classes don't have constructors. Which I think is why we want them either way, even if they didn't have convenient syntax for attributes and such.

annevk avatar Jan 14 '16 18:01 annevk

Ok. I'll go play with how browsers interpret different template strings and the HTML parser and see where it doesn't behave as expected. I'll post my findings here and focus on HTML parsing of template strings. Once a consensus on how to handle HTML parsing of template strings has been reached, the discussion could then turn into how to implement a convenient syntax for using the template string on the parent node.

Do you know of any more resources to discussions on template strings and the HTML parser?

straker avatar Jan 14 '16 19:01 straker

Searching around I found https://lists.w3.org/Archives/Public/public-script-coord/2013JanMar/thread.html#msg263 and https://lists.w3.org/Archives/Public/public-script-coord/2013JanMar/thread.html#msg297.

annevk avatar Jan 15 '16 10:01 annevk

It seems that the discussion around E4H and template strings is more complicated than I would like this proposal to address. In the end, both template strings and E4H rely on the already existent DOM apis of appendChild or innerHTML to work with the DOM and don't introduce any new ones. https://lists.w3.org/Archives/Public/www-dom/2011OctDec/thread.html#msg20 eventually wound up going towards template strings, but it looked like they were proposing a new browser implemented function called html that would convert a string using the HTML parser (which reverts back to the E4H and template string debate).

With that in mind, I would like to propose adding new DOM apis that make these use cases much cleaner and easier to carry out:

  1. creating a single node with attributes
    • if this could also handle the use case of creating a single node and its children in a single step, that'd be great
  2. creating a series of sibling nodes to be appended to a parent

These apis could then be used by either E4H or template strings, whichever one wins out in the end. If this new api is the html function, it shouldn't go through the HTML parser if that causes problems such as the one https://lists.w3.org/Archives/Public/www-dom/2011OctDec/0170.html describes. Instead, it should just accept the string it's given and create a tr with its children, regardless of whether the node has context.

(a very simple approach that mimics the behavior would be)

function html(str) {
  var match = str.match(/<([^>]*)>/);
  var rootStr = match[1].trim().split(/\s/);
  var tagName = rootStr[0];
  var root = document.createElement(tagName);

  // attributes
  for (var i = 1; i < rootStr.length; i++) {
    var attr = rootStr[i];
    var name = attr.substring(0, attr.indexOf('='));
    var value = attr.match(/=['"]?([^'"]*)/)[1];

    root.setAttribute(name, value);
  }

  // children
  root.innerHTML = str.replace(match[0], '');  // kinda hacky as it will ignore the orphaned closing tag

  return root;
}

console.log(html(`<tr class="foo" data-config=bar>
  <td class="hello">
    <span class="foo">bar</span>
  </td>
</tr>`));
console.log(html(`<tr>`));

straker avatar Jan 19 '16 20:01 straker

So I stumbled upon tagged template strings, which makes it seem that template strings and E4H can live in harmony. It would seem that you could remove (deprecate, etc.) the innerHTML function that takes a string and instead strictly use an HTML tagged function. The function can now understand the insertion points of the template string and can do E4H style processing to return safe DOM. This tagged function could also be used to just generate dom by itself.

This would be the new API for creating DOM that would satisfy my use cases.

var text = "foo";

// `html` is the tagged template function that runs E4H

// create a single node with attributes
document.body.appendChild( html`<div class="bar">${text}</div>` );

// create a series of sibling nodes
var element = html`<div>${text}</div><div>bar</div>`;
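
What makes this workable is that a tag function receives the static markup and the substituted values separately, so it knows exactly where the insertion points are. A minimal sketch (the tag name `inspect` is hypothetical, and it returns its inputs rather than building DOM):

```javascript
// Hypothetical tag: returns what a tag function is handed by the language.
function inspect(strings, ...values) {
  return { strings: Array.from(strings), values };
}

const text = 'foo';
const parts = inspect`<div class="bar">${text}</div>`;
// parts.strings -> ['<div class="bar">', '</div>']
// parts.values  -> ['foo']
```

A real `html` tag could treat `strings` as trusted author markup and treat each entry of `values` as untrusted data to be escaped or inserted as a text node.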

straker avatar Feb 09 '16 00:02 straker

Yeah, I think that kind of API would be ideal. @freddyb, any sanitizing library should look like the above ^^. Making this work with the HTML parser is a lot of work unfortunately.

annevk avatar Feb 09 '16 11:02 annevk

I thought one of the benefits of E4H was that you didn't have to go through the HTML parser. Either way, I would be happy to help get something like this working. What can I do to help?

straker avatar Feb 09 '16 16:02 straker

That is true, E4H had a much simpler grammar. I don't know if folks would find that acceptable though. They probably expect similar parsing rules to <template> so you can write <img> as <img> and not <img/>, omit </td>, etc.

I don't know how familiar you are with the HTML parser, but figuring out what adjustments would need to be made to make html`...` work would be a good first step. Coupled perhaps with a basic algorithm for that template string function.

annevk avatar Feb 09 '16 17:02 annevk

Personally I think the best first step is to produce a library with the desired semantics and have it get reasonable adoption, before we consider standardizing it. It's still early days for template strings and to me it doesn't make sense to talk about standardizing a template string tag yet.

domenic avatar Feb 09 '16 17:02 domenic

Yeah, maybe. At some point we need to make sanitizing easier and add it to browsers. That will require a similar API of sorts. And also, this is hugely complicated to get right. @straker building a prototype on top of https://github.com/inikulin/parse5 (or similar library) might be a good first step here.

annevk avatar Feb 10 '16 08:02 annevk

Sounds good, I'll get to work getting a prototype using a tagged template function built on top of a parsing library. I'll also see if I can incorporate some of the E4H ideals into it just as a proof of concept.

straker avatar Feb 10 '16 18:02 straker

I'm not sure why you need a parsing library exactly? Can't you just use

const template = document.createElement("template");
template.innerHTML = passedInStringAfterSubstitutions;
return template.content;

Maybe the problem is safely generating passedInStringAfterSubstitutions. But maybe it's not; just do HTML escaping and then concatenation. See e.g. https://github.com/domenic/count-to-6/blob/master/lib/exercises/tagged_template_strings/solution/solution.js
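
A minimal sketch of "HTML escaping and then concatenation" as a tagged template. The names `escapeHtml` and `escapedHtml` are hypothetical, and the result is a string that would still have to be assigned to `template.innerHTML`:

```javascript
// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tag: static chunks pass through, every substitution is escaped.
function escapedHtml(strings, ...values) {
  return strings.reduce(
    (out, chunk, i) => out + chunk + (i < values.length ? escapeHtml(values[i]) : ''),
    ''
  );
}

const userInput = '<img src=x onerror=alert(1)>';
const markup = escapedHtml`<p>${userInput}</p>`;
// markup -> '<p>&lt;img src=x onerror=alert(1)&gt;</p>'
```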

domenic avatar Feb 10 '16 18:02 domenic

If you didn't care that you always escaped the substituted DOM that would work, but what happens when I trust the substitution (e.g. I created it) and don't want to escape it so it will create DOM instead of string escaped text?

I'm guessing that this contributes to what makes this problem hugely complicated to get right.

straker avatar Feb 10 '16 18:02 straker

I guess that's where contextual auto escaping comes in.

domenic avatar Feb 10 '16 18:02 domenic

@domenic you want to be able to set an attribute value without having to account for whether or not the passed in value included " or ' or whitespace (in case it's unquoted). Similar for the contents of an element and such.
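
To illustrate why the context matters, the sketch below compares text-style escaping with attribute-style escaping for an unquoted attribute. The helper names are hypothetical, and this only covers one context, not full contextual auto escaping.

```javascript
// Text-context escaping: enough for element content, not for attributes.
function escapeText(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Attribute-context escaping: also escape double quotes and always emit
// a double-quoted value, so whitespace cannot end the attribute.
function quoteAttribute(value) {
  return '"' + escapeText(value).replace(/"/g, '&quot;') + '"';
}

const title = 'x onmouseover=alert(1)';

const unsafe = '<div title=' + escapeText(title) + '>';
// unsafe -> '<div title=x onmouseover=alert(1)>' (a second attribute sneaks in)

const safer = '<div title=' + quoteAttribute(title) + '>';
// safer -> '<div title="x onmouseover=alert(1)">'
```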

annevk avatar Feb 10 '16 19:02 annevk

Alright, here is the initial draft of the html tagged template https://github.com/straker/html-tagged-template.

I combined the best principles of E4H and contextual auto escaping to prevent XSS attacks, and it turned out pretty well if I do say so myself. What I would love now is help from security experts, like Mike Samuel who wrote about contextual auto escaping, to further the XSS prevention since I don't have a lot of experience in that area.

Also, I'm not sure the best way to allow HTML variable substitution to be marked as safe so it isn't escaped when added to the DOM.
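
One possible shape for that, sketched under assumptions: wrap trusted markup in a marker object and escape everything else. `TrustedMarkup`, `trusted` and `renderMarkup` are hypothetical names, not part of html-tagged-template, and the tag returns a string rather than DOM.

```javascript
// Hypothetical marker type: only values wrapped in it skip escaping.
class TrustedMarkup {
  constructor(markup) {
    this.markup = String(markup);
  }
}

function trusted(markup) {
  return new TrustedMarkup(markup);
}

function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tag: escapes by default, passes TrustedMarkup through as-is.
function renderMarkup(strings, ...values) {
  return strings.reduce((out, chunk, i) => {
    if (i >= values.length) return out + chunk;
    const value = values[i];
    const piece = value instanceof TrustedMarkup ? value.markup : escapeHtml(value);
    return out + chunk + piece;
  }, '');
}

const label = trusted('<em>hi</em>');
const out = renderMarkup`<p>${label} ${'<b>'}</p>`;
// out -> '<p><em>hi</em> &lt;b&gt;</p>'
```

Because the marker is an object, a plain string coming from user input cannot opt out of escaping by accident.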

straker avatar Feb 20 '16 00:02 straker

This does look pretty cool. I hope people use it.

My one big problem is that I don't like the overloaded return type (sometimes a node, sometimes an array). I think I would probably prefer a DocumentFragment all the time. Or maybe two helpers, one that throws if more than one element is parsed, and one that always gives an array? or always gives a document fragment?

domenic avatar Feb 20 '16 00:02 domenic

So I'm not sure how to proceed from here. There's been some nice discussions on the repo, but I feel that unless something changes, it won't go much further than where it currently is. Do you have any suggestions?

straker avatar Mar 02 '16 08:03 straker

I think the main hurdle is for it to become a popular way to create elements, maybe even the de facto way. And if we were to add an API like that, I'd like the default to be safer than innerHTML, e.g., have XSS filtering by default and require the usage of unsafeHTML`...` to do what the library does now.

(Then there's also other activity around the HTML parser that likely takes priority for implementers, such as making custom elements work and providing a streaming API around the HTML parser (once we have sorted out network streaming).)

annevk avatar Mar 02 '16 08:03 annevk

I find string templates pretty bad as soon as it's on the client side (where you more likely need to keep references, add/remove event listeners, ...).

I'd really like document.createElement to become like React.createElement

and be able to do: (with const h=document.createElement below)

h('ul', {className:'something'},
  items.map( ({text})=> h('li', {onClick: e=>{/*..*/}}, 
      text,
      h('span', {onClick: close}, '✕')
    )
  )
)

caub avatar Aug 11 '16 12:08 caub

That seems way worse than

h`<ul class="something">${items.map(({text}) => h`
  <li onclick=${e => {/*..*/}}>
    ${text}
    <span onclick=${close}>✕</span>
  </li>`)}
</ul>`

domenic avatar Aug 11 '16 12:08 domenic

Ah thanks, indeed, https://github.com/straker/html-tagged-template (is this the right link?) looks interesting, similar to jsx. I'm just sceptical about how events are added.

After seeing https://github.com/straker/html-tagged-template/blob/master/index.js I don't see anything for event listeners, and all those regexes feel quite dirty (ofc a jsx parser might be similar). My approach seems unnatural at first sight, but it's pretty practical, if not more so, after a bit of practice.

caub avatar Aug 11 '16 12:08 caub

The proposal only dealt with the creation of DOM. Adding event listeners can still be done through the normal ways once the DOM is created.

var dom = html`<div>
  <button>Click me</button>
</div>`

dom.querySelector('button').addEventListener('click', function() { /* ... */ });

straker avatar Aug 11 '16 22:08 straker

I saw your implementation, here's a short one that does Domenic's example https://gist.github.com/caub/da489a286b0098d0fcd799b66a252196#file-h-js

caub avatar Aug 12 '16 00:08 caub

I've been working on a similar library recently and was pointed to this issue as a place that might be interested.

The library is called lit-html. It uses template literals, not to create Nodes immediately, but to create templates that can be efficiently updated with new data later.

https://github.com/PolymerLabs/lit-html

The syntax is quite similar, though the templates are usually going to be in a function:

const template = (data) => html`<ul>${data.map(d=>html`<li>${d}</li>`)}</ul>`;

The big difference is that the result of the tag is a TemplateResult containing a Template and the expression values. This can then be rendered multiple times to the same container:

function render(data) {
  const result = html`<ul>${data.map(d=>html`<li>${d}</li>`)}</ul>`;
  result.renderTo(document.body);
}

You can call render() multiple times and only the dynamic parts/expressions will be updated.
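
A rough sketch of why that is possible: the `strings` array a tag function receives is the same object every time a given template literal is evaluated, so it can identify the template, and only the `values` need comparing. This is not lit-html's implementation; `tpl`, `lastRender` and `changedParts` are hypothetical names, and a string stands in for the container.

```javascript
// Hypothetical tag: keeps the static strings and the dynamic values apart.
function tpl(strings, ...values) {
  return { strings, values };
}

// Hypothetical store: container -> result of the previous render.
const lastRender = new Map();

// Returns the indexes of the expressions that would need a DOM update.
function changedParts(container, result) {
  const prev = lastRender.get(container);
  lastRender.set(container, result);
  const indexes = result.values.map((_, i) => i);
  // First render, or a different template: everything has to be created.
  if (!prev || prev.strings !== result.strings) return indexes;
  // Same template: only expressions whose value changed need work.
  return indexes.filter((i) => !Object.is(result.values[i], prev.values[i]));
}

const view = (name, count) => tpl`<p>${name} has ${count} items</p>`;

const firstPass = changedParts('body', view('Ann', 1));  // [0, 1]
const secondPass = changedParts('body', view('Ann', 2)); // [1]
```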

The updating strategy and API are based on the discussion on standardizing <template> expressions at https://github.com/whatwg/html/issues/2254. When a template is cloned/instantiated, it creates both nodes and Part objects, which can be updated independently of the rest of the template instance.

lit-html is really a layering of two parts which could be looked at for platform support:

  1. Parsing special JS template literals as <template> with placeholders for expressions.
  2. Something like https://github.com/whatwg/html/issues/2254 to allow efficient updates of <template> clones.

I've tried to make the design extensible so that additional opinionated features can be layered on top. There's an included extension that allows templates to set properties on elements by default instead of attributes, and supports declarative event handlers.

const button = (data) => html`
  <my-button
      class$="${data.isPrimary ? 'primary' : 'secondary'}"
      on-click=${_=>data.onClick}
      someProperty=${data.state}>
    ${data.label}
  </my-button>
`;

(you can see that here: https://github.com/PolymerLabs/lit-html/blob/master/src/labs/lit-extended.ts )

I also prototyped stateful template helpers, with an implementation of a repeat() function that renders a list of keyed data and will reuse and reorder the DOM nodes from previous renders:

const list = (items) => html`
  <ul>
    ${repeat(items, (i) => i.uniqueId, (i, index) => html`
      <li>${index}. ${i.title}</li>
    `)}
  </ul>
`;

(repeat is here: https://github.com/PolymerLabs/lit-html/blob/master/src/labs/repeat.ts )
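
The idea behind keyed reuse can be sketched without any DOM. This is not the linked repeat implementation; `reuseByKey` is a hypothetical name and plain objects stand in for DOM nodes.

```javascript
// `previous` maps key -> node from the last render.
function reuseByKey(previous, items, keyFn, create) {
  const next = new Map();
  for (const item of items) {
    const key = keyFn(item);
    // Reuse the node rendered for this key last time, otherwise create one.
    next.set(key, previous.has(key) ? previous.get(key) : create(item));
  }
  // Map iteration order follows `items`, i.e. the new order.
  // Keys missing from `items` are dropped.
  return next;
}

const byId = (item) => item.uniqueId;
const makeNode = (item) => ({ label: item.title }); // stand-in for an <li>

let nodes = reuseByKey(new Map(), [
  { uniqueId: 1, title: 'a' },
  { uniqueId: 2, title: 'b' }
], byId, makeNode);
const firstNode = nodes.get(1);

nodes = reuseByKey(nodes, [
  { uniqueId: 2, title: 'b' },
  { uniqueId: 1, title: 'a' },
  { uniqueId: 3, title: 'c' }
], byId, makeNode);
// nodes.get(1) is the same object as firstNode; the key order is now 2, 1, 3.
```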

Definitely looking for feedback. As far as standardization, I know we'd need to get this to be popular first, which I think is possible...

/cc @rniwa

justinfagnani avatar Jul 26 '17 00:07 justinfagnani

Template Literals are awesome, but I have a non-template string proposal at https://github.com/whatwg/dom/issues/477 which may align more closely with how this discussion started.

jonathantneal avatar Jul 26 '17 01:07 jonathantneal

Is it this, more or less?

/*
// html helper, usage
var span = $('span', 'Hello')
var div = $('div', {id:'test', onClick:console.log}, span, 'test2')
*/

const safeAttrs = new Set(['textContent', 'id', 'className', 'htmlFor', 'disabled', 'checked', 'autocomplete', 'crossorigin', 'async', 'innerHTML']); // probably missing some

function $(tag, ...o){
	const el = document.createElement(tag);
	// index of the first child argument; everything before it is a props object
	const firstChild = o.findIndex(x => typeof x=='string' || typeof x=='number' || x instanceof Node);
	const childrenIndex = firstChild == -1 ? o.length : firstChild;
	const props = Object.assign({}, ...o.slice(0,childrenIndex));
	for (var k in props) {
		const value = props[k];
		if (typeof value=='function') {
			// onClick -> click
			const name = k.slice(2).toLowerCase();
			el.addEventListener(name, value);
		} else {
			if (k=='style') Object.assign(el.style, value);
			else if (safeAttrs.has(k)) el[k] = value;
			else if (typeof value=='string') el.setAttribute(k, value);
		}
	}
	// slice from childrenIndex (not +1) so the first child is not dropped
	el.append(...o.slice(childrenIndex));
	return el;
}

caub avatar Jul 26 '17 01:07 caub

@caub, in a few ways it’s similar, like the node name as the first argument and recognizing functions as events. The business with safe attributes to cleverly assign CSS seems like the stuff of DOM libraries. I don’t want clever, and I separated my proposal into parts just in case I had anything similarly over-reaching.

Everyone’s got a gimmick now

I just want to easily create elements. Elements typically host attributes, events, and other nodes. That’s it. Doing this natively right now is a behemoth.

This proposal doesn’t even solve for namespaced tags or namespaced attributes. And CSS can use selectors.

.append() is a fantastic example of a well constructed API. The motivation is simple — I want to append nodes easily. Nodes are typically elements or text.

I think the similarities in libraries do represent a cowpath. I just also think libraries suffer because they are always trying to be so clever.

jonathantneal avatar Jul 26 '17 03:07 jonathantneal