
annotate text in SVG documents

Open milahu opened this issue 1 month ago • 3 comments

annotating text in SVG documents is currently not possible with the hypothesis browser extension

challenge: visualize annotated text parts in SVG. to make this consistent with annotations in HTML, we would have to add a hypothesis-highlight rectangle behind the SVG text element
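
a minimal sketch of the rectangle idea, not the hypothesis client's code. the helper names `highlightRectAttrs` and `insertHighlightBehind` and the `pad` parameter are made up for illustration. it assumes the text element has no `transform` attribute of its own, so the box returned by `getBBox()` is valid in the parent's coordinate system:

```javascript
// Hypothetical helper: compute <rect> attributes that cover a bounding box.
// `bbox` is any {x, y, width, height} object, e.g. the result of getBBox().
// `pad` is an assumed overfill in user units, applied equally on every side.
function highlightRectAttrs(bbox, pad = 0) {
  return {
    x: bbox.x - pad,
    y: bbox.y - pad,
    width: bbox.width + 2 * pad,
    height: bbox.height + 2 * pad,
  };
}

// Hypothetical helper: create the rect and insert it before the text element.
// SVG paints siblings in document order, so an earlier sibling ends up behind.
function insertHighlightBehind(textEl, bbox, pad = 0) {
  const ns = "http://www.w3.org/2000/svg";
  const rect = textEl.ownerDocument.createElementNS(ns, "rect");
  const attrs = highlightRectAttrs(bbox, pad);
  for (const [name, value] of Object.entries(attrs)) {
    rect.setAttribute(name, String(value));
  }
  rect.setAttribute("class", "hypothesis-highlight");
  rect.setAttribute("fill", "yellow");
  textEl.parentNode.insertBefore(rect, textEl);
  return rect;
}
```

in a browser this could be called as `insertHighlightBehind(textEl, textEl.getBBox(), 1)`. inserting the rect as a sibling keeps it outside of the text element, which matters because a rect inside a text element is not rendered.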

challenge: handle annotations not aligning with SVG text element boundaries
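
one possible approach, sketched under assumptions: instead of wrapping the selection in its own element, measure the selected characters directly. `getNumberOfChars()` and `getExtentOfChar()` are standard `SVGTextContentElement` methods. the function name `charRangeBox` is hypothetical, and the sketch assumes the selection sits on a single unrotated line (a multi-line selection would need one box per line):

```javascript
// Hypothetical helper: union of the per-character extents for the
// characters [start, end) of an SVG text element.
// `textEl` is assumed to implement SVGTextContentElement.
function charRangeBox(textEl, start, end) {
  const count = textEl.getNumberOfChars();
  const from = Math.max(0, start);
  const to = Math.min(count, end);
  if (from >= to) return null; // empty or out-of-range selection

  let minX = Infinity;
  let minY = Infinity;
  let maxX = -Infinity;
  let maxY = -Infinity;
  for (let i = from; i < to; i++) {
    const r = textEl.getExtentOfChar(i); // box of one character, in user units
    minX = Math.min(minX, r.x);
    minY = Math.min(minY, r.y);
    maxX = Math.max(maxX, r.x + r.width);
    maxY = Math.max(maxY, r.y + r.height);
  }
  return { x: minX, y: minY, width: maxX - minX, height: maxY - minY };
}
```

because the box is computed from character offsets, the annotation can start or end in the middle of a tspan, and the text itself does not have to be split into extra elements.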

challenge: handle darkmode (prefers-color-scheme, darkreader, ...)
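
a rough sketch for the `prefers-color-scheme` part only. the color values and the names `HIGHLIGHT_STYLES`, `highlightStyle` and `currentHighlightStyle` are assumptions for illustration, not taken from the hypothesis client. extensions like darkreader rewrite colors on their own and are not detected by this media query:

```javascript
// Assumed highlight styles. A semi-transparent fill keeps the text
// readable on both light and dark backgrounds.
const HIGHLIGHT_STYLES = {
  light: { fill: "yellow", "fill-opacity": "0.4" },
  dark: { fill: "gold", "fill-opacity": "0.3" },
};

// Hypothetical helper: pick the style for a given color scheme.
function highlightStyle(prefersDark) {
  return prefersDark ? HIGHLIGHT_STYLES.dark : HIGHLIGHT_STYLES.light;
}

// Hypothetical helper: read the user's preference from a window-like
// object through the standard matchMedia API.
function currentHighlightStyle(win) {
  const query = win.matchMedia("(prefers-color-scheme: dark)");
  return highlightStyle(query.matches);
}
```

in a browser, `currentHighlightStyle(window)` returns the attributes to set on the highlight rect.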

use case: hocr2epubfxl - EPUB-FXL ebooks based on SVG
example book: the preparation: hocrepub

test.svg solutions.svg source

test.svg
<?xml version="1.0" encoding="UTF-8"?>
<svg version="1.1" viewBox="0 0 200 50"
 xmlns="http://www.w3.org/2000/svg"
 xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:cc="http://creativecommons.org/ns#"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>

<metadata>
  <rdf:RDF>
    <cc:Work rdf:about="">
      <!-- canonical URL -->
      <!-- <html><head><link rel="canonical" href="https://milahu.github.io/hypothesis-annotations-svg-text/test.svg"/> -->
      <dc:identifier>https://milahu.github.io/hypothesis-annotations-svg-text/test.svg</dc:identifier>
      <dc:source>https://milahu.github.io/hypothesis-annotations-svg-text/test.svg</dc:source>
    </cc:Work>
  </rdf:RDF>
</metadata>

<g fill="#000000" font-family="sans-serif" transform="translate(0 10)">

<text x="10" y="10">some annotated text</text>

</g>
</svg>
solutions.svg
<?xml version="1.0" encoding="UTF-8"?>
<svg version="1.1" viewBox="0 0 200 100"
 xmlns="http://www.w3.org/2000/svg"
 xmlns:xlink="http://www.w3.org/1999/xlink"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:cc="http://creativecommons.org/ns#"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>

<metadata>
  <rdf:RDF>
    <cc:Work rdf:about="">
      <!-- canonical URL -->
      <!-- <html><head><link rel="canonical" href="https://milahu.github.io/hypothesis-annotations-svg-text/solutions.svg"/> -->
      <dc:identifier>https://milahu.github.io/hypothesis-annotations-svg-text/solutions.svg</dc:identifier>
      <dc:source>https://milahu.github.io/hypothesis-annotations-svg-text/solutions.svg</dc:source>
    </cc:Work>
  </rdf:RDF>
</metadata>

<defs>
  <filter id="text-bg" x="0" y="0" width="1" height="1">
    <feFlood flood-color="yellow"/>
    <feComposite in2="SourceGraphic" operator="over"/>
  </filter>
</defs>

<g fill="#000000" font-family="sans-serif" transform="translate(0 10)">

<text x="10" y="10">some annotated text</text>

<!-- NOTE rect must be outside of the text element -->
<rect id="rect12345" class="hypothesis-highlight" x="0" y="0" fill="yellow" />
<text x="10" y="30">
 <tspan>some</tspan>
 <a class="hypothesis-highlight" href="javascript:alert('todo show annotation')">
  <tspan id="tspan12345">annotated</tspan>
  <script>
    var r = document.getElementById("rect12345");
    var t = document.getElementById("tspan12345").getBBox();
    var d = 0; // overfill: padding added on every side of the text box
    console.log("r", r);
    console.log("t", t);
    r.setAttribute('x', t.x - d);
    r.setAttribute('y', t.y - d);
    r.setAttribute('width', t.width + 2*d);
    r.setAttribute('height', t.height + 2*d);
  </script>
 </a>
 <tspan>text</tspan>
</text>

<g transform="translate(10 50)">
 <text x="0" y="0">some</text>
 <a class="hypothesis-highlight" href="javascript:alert('todo show annotation')">
  <rect id="rect123456" class="hypothesis-highlight" x="0" y="0" fill="yellow"/>
  <!-- FIXME moving the annotated word to a separate text element
   - introduces linebreaks between words
   - breaks text search across word boundaries. example: search for "some annotated"
   some
   annotated
   text
  -->
  <text id="text123456" x="45" y="0">annotated</text>
  <script>
    var r = document.getElementById("rect123456");
    var t = document.getElementById("text123456").getBBox();
    var d = 0; // overfill: padding added on every side of the text box
    console.log("r", r);
    console.log("t", t);
    r.setAttribute('x', t.x - d);
    r.setAttribute('y', t.y - d);
    r.setAttribute('width', t.width + 2*d);
    r.setAttribute('height', t.height + 2*d);
  </script>
 </a>
 <text x="120" y="0">text</text>
</g>

<text x="10" y="70">
 <tspan>some</tspan>
 <a class="hypothesis-highlight" href="javascript:alert('todo show annotation')">
  <!-- FIXME this hides some text -->
  <!-- FIXME this fills the whole text element, not just the tspan element -->
  <tspan id="tspan1234567" filter="url(#text-bg)">annotated</tspan>
 </a>
 <tspan>text</tspan>
</text>

</g>
</svg>

milahu, Nov 29 '25 13:11

You can use image annotations to annotate SVGs: https://web.hypothes.is/image-annotations/

acelaya, Dec 01 '25 08:12

With our new image annotation feature, users can place pins on specific parts of images — just like highlighting text.

https://www.youtube.com/watch?v=mzfei71rtew&t=27

the image must be in PDF format

SVG/HTML/EPUB are better than PDF documents because they support more image formats, which allows better compression, so EPUB-FXL is the future of scanned ebooks

in my hocr2epubfxl i use SVG as the default "text format", because chrome browsers allow text search across word boundaries in SVG tspans: <text><tspan>some</tspan><tspan>word</tspan></text>. this does not work with absolute-positioned HTML spans: <div><span>some</span><span>word</span></div> (also: clipboard text, triple-click behavior)

ideally, hypothesis annotations should work for HTML/PDF/SVG documents (standalone SVG documents, not SVG images in HTML documents)

milahu, Dec 01 '25 12:12

Sorry, my mistake. I forgot we never got to finish and enable image annotations in non-PDF documents.

Unfortunately it's also not part of our short or mid term plan.

acelaya, Dec 01 '25 13:12