dom Consider specifying document.evaluate and document.createNSResolver

These are currently left to DOM 3 XPath. However, that spec is (a) very old, and thus wrong in a lot of ways; (b) not very large. It could maybe be subsumed and thus give implementations an actual non-crazy reference.

XPathEvaluator.prototype.evaluate has ~0.7% usage and isn't going anywhere. XPathEvaluator.prototype.createNSResolver has ~0.04% and so is also likely here to stay. However XPathEvaluator.prototype.createExpression is at 0.001% and could probably be left out. Which is great because that means we can very likely kill XPathExpression.

Other features of the spec that don't seem to be implemented are XPathException and XPathNamespace.

According to a comment in Blink's source code, XPathEvaluator has a constructor in reality, even if not in the spec.

Credit to @sideshowbarker for bringing this up.

Sep 03 '15 02:09 domenic

See https://wiki.whatwg.org/wiki/DOM_XPath.

Sep 03 '15 03:09 annevk

Also see https://blogs.windows.com/msedgedev/2015/03/19/improving-interoperability-with-dom-l3-xpath/

Feb 24 '16 12:02 zcorpan

> XPathEvaluator.prototype.evaluate has ~0.7% usage and isn't going anywhere. XPathEvaluator.prototype.createNSResolver has ~0.04% and so is also likely here to stay. However XPathEvaluator.prototype.createExpression is at 0.001% and could probably be left out. Which is great because that means we can very likely kill XPathExpression.

More modern numbers seem to be ~2.3%, ~0.3%, ~0.1%, all vastly higher than four years ago. (Interestingly, createExpression seems to have gone up from ~0.01% to ~0.1% over the past few months very suddenly; some major site now using it?)

That said, to try and write up some to-do list:

[ ] Try and get consensus on what the WebIDL should look like (WebKit has contextNode optional, defaulting to document; Blink changed this to match Gecko a while back because the contextNode default was totally non-obvious and undocumented)
[ ] Define the DOM -> XPath data model (note this includes several intentional violations of the XPath data model, as the wiki page notes)
[ ] Define what each WebIDL operation/attribute does
[ ] Integrate the HTML DOM special case into the DOM spec from HTML
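
To make the contextNode item concrete, here is a minimal sketch of a wrapper that always passes contextNode explicitly, so calling code does not depend on whether an engine treats that argument as optional. The helper name `evaluateAll` and its `doc` parameter are hypothetical, introduced only for illustration; 7 is the numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.

```javascript
// Hypothetical helper (not part of any spec): evaluates `query` and returns
// the matching nodes as an array, always passing contextNode explicitly.
const ORDERED_NODE_SNAPSHOT_TYPE = 7; // same value as XPathResult.ORDERED_NODE_SNAPSHOT_TYPE

function evaluateAll(doc, query, contextNode = doc, resolver = null) {
  const result = doc.evaluate(
    query,
    contextNode, // never omitted, regardless of engine defaults
    resolver,
    ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}
```

In a page this would be called as `evaluateAll(document, "//a[@href]")`, or with a third argument to scope the query to a subtree.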

May 29 '19 11:05 gsnedders

I made https://github.com/whatwg/dom/pull/763 today. Didn't know about this issue, but I'll link it.

May 29 '19 13:05 foolip

@gsnedders in https://github.com/whatwg/dom/pull/763#discussion_r288570832:

XPathException is gone (just use DOMException)

The query is matched against the DOM, and therefore contrary to the XPath 1.0 data model the root element has a parent (the Document) and text nodes can be adjacent to one another. [wilful violation]

Aug 28 '19 09:08 foolip

I've updated https://wiki.whatwg.org/wiki/DOM_XPath to point at https://dom.spec.whatwg.org/#xpath for the Web IDL definitions.

Aug 30 '19 09:08 foolip

FWIW, I've just refactored some code from this:

function notifyIfMatchesXPath(query, notify) {
  const flag = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE;
  const callback = () => {
    const result = document.evaluate(query, document, null, flag, null);
    for (let i = 0, {snapshotLength} = result; i < snapshotLength; i++)
      notify(result.snapshotItem(i));
  };
  new MutationObserver(callback).observe(
    document,
    {characterData: true, childList: true, subtree: true}
  );
  callback();
}

to this:

function notifyIfMatchesXPath(query, notify) {
  const evaluator = new XPathEvaluator();
  const expression = evaluator.createExpression(query, null);
  const flag = XPathResult.ORDERED_NODE_SNAPSHOT_TYPE;
  const callback = () => {
    const result = expression.evaluate(document, flag, null);
    for (let i = 0, {snapshotLength} = result; i < snapshotLength; i++)
      notify(result.snapshotItem(i));
  };
  new MutationObserver(callback).observe(
    document,
    {characterData: true, childList: true, subtree: true}
  );
  callback();
}

on the assumption that this removes the cost of parsing and validating the XPath query from every mutation callback. I only recently discovered XPathEvaluator and createExpression, and I'm sure that if other developers knew about them, their usage would be closer to that of document.evaluate, which is now at 2.x%.

If the direction is to nuke XPathEvaluator though, I would rather know that before landing such a refactoring, thanks.
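
For code that wants to hedge against that uncertainty, a minimal sketch of a feature-detecting wrapper might look like the following. The name `compileXPath` and the injectable `Evaluator` parameter are hypothetical, added only so the two paths can be exercised separately. Whether precompiling actually saves time is the assumption stated above, not something this sketch measures.

```javascript
// Hypothetical helper: compile `query` once when an XPathEvaluator constructor
// is available, otherwise fall back to doc.evaluate, which takes the query
// string on every call.
function compileXPath(doc, query, resolver = null, Evaluator = globalThis.XPathEvaluator) {
  if (typeof Evaluator === "function") {
    const expression = new Evaluator().createExpression(query, resolver);
    return (contextNode, type) => expression.evaluate(contextNode, type, null);
  }
  return (contextNode, type) => doc.evaluate(query, contextNode, resolver, type, null);
}
```

The returned function takes a context node and a result type, for example `compileXPath(document, query)(document, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE)`, so it can stand in for either call inside the MutationObserver callback.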

Oct 13 '20 11:10 WebReflection

Hmm, createExpression looks to have hit 0.3% recently: V8XPathEvaluator_CreateExpression_Method.

DocumentXPathCreateExpression is also up to about 0.025%.

I'm not quite sure what the deal is with these two feature values; it's best I post this before I look too far down that rabbit hole or I might forget entirely ...

Feb 27 '22 22:02 SamB

An attempt at a DOM to XPath XDM has been made at https://qt4cg.org/specifications/xpath-functions-40/Overview.html#html. Feedback is welcome.

Note that this needs to add a conversion mode for not processing namespaces, so that with namespaces enabled it correctly maps XHTML documents and with namespaces disabled it correctly maps HTML documents. The exact mechanism for this (likely an additional processing option) has not been agreed yet.

It attempts to deal with the various willful violations of the XPath/XML data model in a way that it can handle the different flavours of (X)HTML in the corresponding fn:parse-html function. This includes things like the non-conforming handling of template elements.

Jul 17 '23 08:07 rhdunn

> An attempt at a DOM to XPath XDM has been made at https://qt4cg.org/specifications/xpath-functions-40/Overview.html#html. Feedback is welcome.

Note existing behaviour in browsers cannot be described by the DOM -> XDM conversion alone.

For example, matching //FooBar will match a (http://www.w3.org/1999/xhtml, foobar) element, but it won't match a (http://www.w3.org/1999/xhtml, FooBar) element.
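
As a toy model of that one example (a sketch, not a description of what any engine implements), the rule can be written as a pure function over plain records. The function name `unprefixedNameTestMatches` and the `{namespaceURI, localName}` record shape are hypothetical. The fallback branch assumes ordinary XPath 1.0 name-test semantics, which may not match browsers either.

```javascript
const XHTML_NS = "http://www.w3.org/1999/xhtml";

// Toy model only: `element` is a plain {namespaceURI, localName} record, and
// the rule is inferred from the //FooBar example above.
function unprefixedNameTestMatches(nameTest, element, inHTMLDocument) {
  if (inHTMLDocument && element.namespaceURI === XHTML_NS) {
    // The name test is lowercased before comparison; the element's name is not.
    return element.localName === nameTest.toLowerCase();
  }
  // Assumption: otherwise an unprefixed name test only matches elements in no
  // namespace, compared case-sensitively, as in XPath 1.0.
  return element.namespaceURI === null && element.localName === nameTest;
}

const lower = { namespaceURI: XHTML_NS, localName: "foobar" };
const mixed = { namespaceURI: XHTML_NS, localName: "FooBar" };
console.log(unprefixedNameTestMatches("FooBar", lower, true)); // true
console.log(unprefixedNameTestMatches("FooBar", mixed, true)); // false
```

The point of the model is that the lowercasing applies to the expression, not to the tree, so it cannot be expressed purely as a DOM -> XDM mapping.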

Oct 24 '23 17:10 gsnedders

See https://github.com/qt4cg/qtspecs/issues/296 for making the XPath matching side of this work, given an HTML document mapped to the XDM.

Oct 24 '23 17:10 rhdunn

It might be worth moving at least the XPath section of https://html.spec.whatwg.org/multipage/infrastructure.html#interactions-with-xpath-and-xslt into DOM, because it's really about matching XPath against HTML Documents, which we nowadays define in DOM.

Oct 24 '23 22:10 gsnedders
