incubator-annotator

Interleaving selections within the DOM

Open BigBlueHat opened this issue 6 years ago • 6 comments

Finding selections within the DOM and even wrapping them in an element is easy enough, and most developers just "roll their own" highlighter/selector for things like that--hence, they don't "shop" for tools like Apache Annotator for that.

However, juggling interleaved selections in the DOM is tricky and not standardized.

The DOM is a tree. Selections point at regions all over that tree, often intermixed.

We should build tooling to handle that interleaving to manage the display, removal, eventing, etc, for such selections.

See also #45 and #22.

Example:

<div>
<mark id="a1">Call me <mark id="a2">Ishmael</mark></mark>. Some years ago—never mind how long precisely—<mark id="a3">having little or <mark id="a4">no money in my purse</mark></mark><mark id="a4">, and nothing particular to interest me on shore</mark>, I thought I would sail about a little and see the watery part of the world.
</div>
  • a2 is within a1 and so will have eventing and display related trickiness
  • a4 is made up of 2 marks, but is currently invalid as they share an id--which conceptually "relates" them as a unit, but the DOM doesn't work that way.
    • both sets of <mark/> elements would need shared events, display, removal, etc.
  • a3 also includes 1 part of a4, but not all of it, so weird eventing and display issues again
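One way to explore this, sketched under stated assumptions: treat each annotation as a pair of character offsets into the plain text instead of a pair of positions in the tree, then cut the text at every annotation boundary. Each resulting segment knows which annotations cover it, so no nesting and no duplicated ids are needed. The function name `segmentText` and the `{ id, start, end }` shape are hypothetical and not part of Apache Annotator.

```javascript
// Hypothetical helper, not an Apache Annotator API.
// Each annotation is { id, start, end }: character offsets into `text`,
// with `start` inclusive and `end` exclusive.
function segmentText(text, annotations) {
  // Collect every offset where the set of covering annotations can change.
  const cuts = new Set([0, text.length]);
  for (const { start, end } of annotations) {
    cuts.add(start);
    cuts.add(end);
  }
  const points = [...cuts]
    .filter((p) => p >= 0 && p <= text.length)
    .sort((a, b) => a - b);

  // Emit one flat segment per pair of neighbouring cut points.
  const segments = [];
  for (let i = 0; i < points.length - 1; i++) {
    const from = points[i];
    const to = points[i + 1];
    const ids = annotations
      .filter((a) => a.start <= from && a.end >= to)
      .map((a) => a.id);
    segments.push({ text: text.slice(from, to), ids });
  }
  return segments;
}

// "Ishmael" ends up in a single segment covered by both a1 and a2.
const ishmaelSegments = segmentText('Call me Ishmael.', [
  { id: 'a1', start: 0, end: 15 },
  { id: 'a2', start: 8, end: 15 },
]);
```

Each segment could then be rendered as one flat element that lists its annotation ids (for instance in a data attribute), so that events, display, and removal are looked up per id and not per element. That rendering step is left out here because it depends on the highlighter in use.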

Solving this (or even just exploring it) is something developers know they need, so likely it should be near the top of our list to solve. 😄

BigBlueHat avatar Nov 16 '18 14:11 BigBlueHat

@BigBlueHat You're thinking here about sections that can be described by element boundaries, but not arbitrary indexing (character-by-character) into texts, right?

ajs6f avatar Nov 16 '18 14:11 ajs6f

@ajs6f arbitrary indexing.

tilgovi avatar Nov 20 '18 20:11 tilgovi

@ajs6f mainly selections that traverse multiple element boundaries--i.e. selecting part of A and part of B. Trees don't do that so good.

BigBlueHat avatar Nov 26 '18 14:11 BigBlueHat

This seems pretty challenging. I did some work in a similar area years ago but it was simple and text-only, and it wasn't completely trivial. Is this really meant for text (or text-y) documents, or anything that could be addressed by the DOM (SVG, other things like that)?

ajs6f avatar Jan 02 '19 21:01 ajs6f

Originally posted this as a separate issue in #78 (but closing and moving here to keep @Treora sane 😉):

https://www.w3.org/TR/intersection-observer/

Might help with highlighter and other anchoring implementations in the DOM.

BigBlueHat avatar Jun 04 '20 15:06 BigBlueHat

We recently added a simple highlighter, which wraps text nodes in <mark> elements and ignores any existing <mark>s to allow for nested use. I just created some simple tests (see PR #84), including ones inspired by the examples above, that deal with situations like this: '<b>lorem ipsum <mark>dolor <mark2>am</mark2></mark><mark2>et yada</mark2> yada</b>'.

Note there is a difficulty with how a Range behaves when the DOM is modified: running highlight with our current highlighter can mess up other Range objects that point at the same text nodes. So this can cause trouble:

const range1 = anchor(annotation1);
const range2 = anchor(annotation2);
highlight(range1); // modifies the DOM, which may mess up range2
highlight(range2); // may highlight some unintended target

A solution is to anchor and highlight as a single action, treating the Range as an ephemeral pointer:

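const range1 = anchor(annotation1);
highlight(range1); // range1 is used immediately, then discarded
const range2 = anchor(annotation2); // anchored against the already modified DOM
highlight(range2);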
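For more than two annotations, the same ordering can be wrapped in a small loop. This is only a sketch: `anchorAndHighlightAll` is a hypothetical name, and `anchor` and `highlight` are passed in as parameters because their exact signatures are assumed here, not taken from the library.

```javascript
// Hypothetical wrapper, not an Apache Annotator API.
// `anchor(annotation)` is assumed to return a Range-like object, and
// `highlight(range)` is assumed to modify the document.
function anchorAndHighlightAll(annotations, anchor, highlight) {
  const results = [];
  for (const annotation of annotations) {
    // Anchor against the current state of the DOM...
    const range = anchor(annotation);
    // ...and consume the Range right away, before any other highlight
    // gets a chance to modify the nodes it points at.
    results.push(highlight(range));
  }
  // Whatever `highlight` returns (if anything) is collected per annotation.
  return results;
}
```

The point of the design is that no Range outlives a DOM modification: each one is created, used, and dropped within a single loop iteration.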

Our generator-based approach to anchoring should help to do this right, but still it is a pitfall that I’m not very happy with. Some ideas for avoiding this problem:

  • Using a highlighter that does not modify the document content; some highlighter approaches add an (svg) element to the end of the <body> and display it on the right spot using absolute positioning. While solving this issue, it does create others (e.g. need to reposition the element when text reflows).
  • Stop using Range for our ‘hydrated’ selector, i.e. as our way to point at a part of the DOM. We could, for example, try to implement a ‘RobustRange’ that updates its startContainer, startOffset, endContainer, and endOffset as needed in one way or another.
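As a rough illustration of the second idea (one possible direction, not a design decision): a pointer could store character offsets into a root element's text content and resolve them to a node and offset only when needed. Wrapping text nodes in <mark> elements does not change the text content, so such offsets survive highlighting. The name `resolveTextOffset` is hypothetical; the sketch relies only on the `nodeType`, `data`, and `childNodes` properties that DOM nodes have, so plain objects can stand in for them outside a browser.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Hypothetical helper, not an Apache Annotator API.
// Maps a character offset within `root`'s text content to a
// { node, offset } boundary point by walking text nodes in document order.
// An offset that falls exactly between two text nodes resolves to the end
// of the earlier one.
function resolveTextOffset(root, offset) {
  let remaining = offset;
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      if (remaining <= node.data.length) {
        return { node, offset: remaining };
      }
      remaining -= node.data.length;
    } else {
      // Push children in reverse so they are popped in document order.
      const children = Array.from(node.childNodes || []);
      for (let i = children.length - 1; i >= 0; i--) {
        stack.push(children[i]);
      }
    }
  }
  throw new RangeError('Offset exceeds the length of the text content');
}
```

In a browser, the two resolved boundary points could be fed to `range.setStart(node, offset)` and `range.setEnd(node, offset)` immediately before use. This does not cover edits that change the text itself, which a real ‘RobustRange’ would also have to handle.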

For the time being, or if we decide not to fix this, we should probably warn users in any documentation and examples that Ranges are perishable.

Note that the behaviour of Range actually differs between the current jsdom and web browsers, so it is important to run relevant tests in a browser (use our yarn start command) to ensure the tests pass there too.

On the bright side, I added tests to check if one can remove highlights in arbitrary order, and that seems to work as intended. This should give the freedom to ignore the tree structure and treat highlights as being independent from each other. Please suggest/write other scenarios that we should include in our tests.

Treora avatar Jul 24 '20 12:07 Treora