mirador-textoverlay

Looking for IIIF 3.0 annotation example

Open joesong168 opened this issue 3 years ago • 11 comments

I've tried the following AnnotationPage with no luck:

{
    "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c",
    "type": "AnnotationPage",
    "items": [
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/043a7bb7-d77b-44bd-9517-71bf6f551a1a",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorA",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=958,5101,493,493"
        },
        {
            "id": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8/ap/c/a/24453f8f-9c8b-4c82-8ae9-ffa8a779f8a6",
            "type": "Annotation",
            "motivation": "supplementing",
            "body": {
                "type": "TextualBody",
                "value": "authorB",
                "format": "text/plain",
                "language": "zh-Hants"
            },
            "target": "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=968,4544,490,462"
        }
    ],
    "@context": "http://iiif.io/api/p/3/context.json"
}
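
For reference, each `target` above packs the canvas URI and the annotated region into a single string. A minimal sketch of pulling the two apart, assuming plain string targets with an `xywh` fragment (`parseTarget` is a made-up name for illustration, not something from the plugin; object-style targets are not handled):

```javascript
// Hypothetical helper: split a target URI into its canvas id and xywh region.
function parseTarget(target) {
  const [canvasId, fragment = ""] = target.split("#");
  // Accepts "xywh=x,y,w,h" and the explicit "xywh=pixel:x,y,w,h" form.
  const match = /^xywh=(?:pixel:)?(\d+),(\d+),(\d+),(\d+)$/.exec(fragment);
  if (!match) return { canvasId, region: null };
  const [x, y, width, height] = match.slice(1).map(Number);
  return { canvasId, region: { x, y, width, height } };
}

const parsed = parseTarget(
  "https://api.dev.etu.wiki/c/3/9a2febe6-8569-4ab3-b4e1-19e0ad53cbd8#xywh=958,5101,493,493"
);
console.log(parsed.region); // { x: 958, y: 5101, width: 493, height: 493 }
```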

joesong168, Sep 20 '21 15:09

Thanks for reporting. I think I've only really tested it with v2 annotations so far, although the code already has paths for v3 support. It should probably only be a matter of fixing some small bugs and inconsistencies. Can you also provide a Manifest URL for your fixture so I can test it end-to-end?

jbaiter, Sep 29 '21 19:09

Hello, have you had time to look into this? I am also having issues with the Presentation API v3 semantics for OCR data (supplementing annotations): nothing is displayed, not even the textoverlay toolbox. You can find a complete manifest example in the official IIIF Cookbook: https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_1-manifest.json (https://iiif.io/api/cookbook/recipe/0068-newspaper/)

HenryH09, Oct 18 '21 13:10

Thanks for providing the full manifest. I'll try to find time to work on this, this week or next.

jbaiter, Oct 18 '21 13:10

Hi there,

In the hope of helping @jbaiter find and fix the culprits faster, here are some findings from debugging what goes wrong when using IIIF 3.0:

  • The plugin assumes the existence of a resources array on annotations, which was renamed to items in IIIF 3.0. See the corresponding definition.
  • In [saga.js:83] there is a version-specific @id (IIIF v2), which should be replaced by seeAlso.id ?? seeAlso['@id'].
  • fetchExternalAnnotationResources is also very specific to IIIF v2, in particular in its use of resources and resource instead of items and body.
  • Last but not least, as far as I understand the IIIF v3 documentation, Annotations with the supplementing motivation can also link directly to external OCR resources (e.g. an hOCR file). However, processTextsFromAnnotations assumes that the annotation directly contains the OCR text, i.e. the condition anno.motivation === 'supplementing' is too weak for this particular use.
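
To illustrate the first three points, here is a rough sketch of version-agnostic accessors. The function names are made up for this comment and are not taken from the plugin's code; only the property names (`id`/`@id`, `items`/`resources`, `body`/`resource`) come from the two Presentation API versions.

```javascript
// Hypothetical accessors; none of these names come from the plugin's code.

// v3 uses `id`, v2 uses `@id`.
const getId = (resource) => resource.id ?? resource["@id"];

// A v3 AnnotationPage has `items`, a v2 AnnotationList has `resources`.
const getAnnotations = (page) => page.items ?? page.resources ?? [];

// A v3 Annotation has `body`, a v2 one has `resource`. Always returns an array.
const getBodies = (anno) => [anno.body ?? anno.resource ?? []].flat();

const v2List = {
  "@id": "https://example.org/list/1",
  resources: [{ "@id": "https://example.org/anno/1", resource: { chars: "foo" } }],
};
const v3Page = {
  id: "https://example.org/page/1",
  items: [{ id: "https://example.org/anno/1", body: { value: "foo" } }],
};

console.log(getId(v2List), getAnnotations(v3Page).length);
```

With accessors like these, the rest of the code would not need to know which version of the API a given document follows.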

sauterl, Oct 28 '21 11:10

I'm not entirely sure whether this is proper IIIF v3 usage, so please bear with me, but after adding a seeAlso to each canvas with an id pointing to an external OCR resource (e.g. hOCR or ALTO), the plugin works as intended, given some fixes as outlined in my previous comment.

In particular, what did the trick for our use case and manifest was tweaking the IIIF v3 condition.

@stone12379 For your use case, I guess the findings from the previous comment should already help a lot. However, as far as I can tell, more robust IIIF v3 support requires a stricter condition in processTextsFromAnnotations (as written above) in order to filter out external OCR resources, which it currently does not do.
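
As a rough idea of what such a stricter condition could look like (a sketch only, with a made-up function name; the plugin's actual check may need to be different), one could require an embedded TextualBody in addition to the supplementing motivation:

```javascript
// Hypothetical predicate; the plugin's real condition may differ.
function hasInlineText(anno) {
  // `motivation` and `body` may each be a single value or an array.
  const motivations = [anno.motivation ?? []].flat();
  if (!motivations.includes("supplementing")) return false;
  const bodies = [anno.body ?? []].flat();
  return bodies.some(
    (body) => body.type === "TextualBody" && typeof body.value === "string"
  );
}

const inline = {
  motivation: "supplementing",
  body: { type: "TextualBody", value: "authorA" },
};
const external = {
  motivation: "supplementing",
  body: { id: "https://example.org/ocr/page1.hocr", type: "Text", format: "text/vnd.hocr+html" },
};

console.log(hasInlineText(inline), hasInlineText(external)); // true false
```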

sauterl, Oct 28 '21 12:10

So, some long overdue updates on this front. Sorry it took so long, and thanks to everybody for the feedback!

The example from the IIIF Cookbook now renders the annotations, but:

  • By default it will use the ALTO in seeAlso for rendering ('proper' OCR is always preferred over annotations in this plugin).
  • The Annotations are not line-level, and thus text rendering is pretty much broken by design, since we rely on the text being structured at least into lines for some rendering hints that make text selection in SVG work. Additionally, the segmentation in the annotations is not even at the word level; some annotations contain parts of multiple words.
  • The annotations do not match the canvas, so the overlay does not match the underlying image. For example, the first word annotation is 84 at xywh=182,476,59,43, but it's actually at approximately xywh=143,377,51,39. I assume this happens because the annotations were generated 1:1 from the ALTO, which targets a 4562x6282 image, while the IIIF Canvas is 3602x5000. The plugin scales down the coordinates when it renders text from the ALTO XML, so that renders just fine. As per the spec, annotations are always relative to the dimensions of the canvas they target, so this adjustment is not done for annotations. tl;dr: The Annotations in the cookbook example are broken and should be fixed.
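
To make the coordinate mismatch concrete, here is a small sketch of the kind of rescaling needed to map a box from the ALTO's pixel space onto the canvas. The function name is made up and this is not the plugin's code; the canvas height is taken as 5000, the value reported later in this thread.

```javascript
// Hypothetical helper: rescale a box from OCR pixel space to canvas space.
function scaleToCanvas(box, source, canvas) {
  const sx = canvas.width / source.width;
  const sy = canvas.height / source.height;
  return {
    x: Math.round(box.x * sx),
    y: Math.round(box.y * sy),
    width: Math.round(box.width * sx),
    height: Math.round(box.height * sy),
  };
}

const scaled = scaleToCanvas(
  { x: 182, y: 476, width: 59, height: 43 }, // first word annotation
  { width: 4562, height: 6282 }, // image size the ALTO targets
  { width: 3602, height: 5000 } // canvas size (height assumed to be 5000)
);
console.log(scaled); // { x: 144, y: 379, width: 47, height: 34 }
```

The scaled x and y land close to the approximate position given above, which supports the assumption that the annotations were copied from the ALTO without rescaling.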

For comparison, here are two screenshots, one showing the text rendering from the ALTO and one with the annotations:

ALTO text: [screenshot]

Annotation text: [screenshot]

I have pushed my changes to the iiifv3 branch. Could you please test this version with your manifests, @sauterl @joesong168?

jbaiter, Jun 22 '22 06:06

Hi @jbaiter, we've recently updated the Newspaper recipe with the following changes:

  • (hopefully) Fixed the annotations
  • Moved the Alto to rendering rather than seeAlso
  • Changed the target of the annotations to include a link to the Manifest

Let us know if you spot any further problems.

Also, would it be possible to add a iiif-content parameter to your demo so that we can pass in a manifest and include a link from the cookbook to your plugin? For reference, Mirador uses the following:

https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json
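
A demo page could pick up the parameter roughly as follows. This is only a sketch: the helper name is made up, and it assumes the value is a plain manifest URL rather than an encoded content state.

```javascript
// Hypothetical helper for a demo page; not taken from Mirador or the plugin.
function manifestFromLocation(href, fallback) {
  // Read the `iiif-content` query parameter, falling back to a default manifest.
  const manifest = new URL(href).searchParams.get("iiif-content");
  return manifest || fallback;
}

const fromQuery = manifestFromLocation(
  "https://projectmirador.org/embed/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_title-collection.json",
  "https://example.org/default-manifest.json"
);
console.log(fromQuery);
```

In a browser, `href` would typically be `window.location.href`.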

glenrobson, Aug 07 '23 16:08

Thanks for letting me know! I've updated the code to also look at rendering to discover referenced OCR files and fixed some other IIIF3 stuff related to annotations along the way. The iiif-content parameter is now included in the demo as well.
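
Roughly, the discovery step amounts to something like the sketch below. This is not the actual plugin code: the function names are made up, and the heuristics for recognising OCR (looking for "alto" or "hocr" in format or profile) are assumptions for illustration.

```javascript
// Hypothetical helpers; the OCR heuristics are assumptions, not the plugin's rules.
function looksLikeOcr(link) {
  const format = link.format ?? "";
  const profile = typeof link.profile === "string" ? link.profile : "";
  return [format, profile].some(
    (value) => value.includes("alto") || value.includes("hocr")
  );
}

// Collect OCR candidates from both `seeAlso` and `rendering` on a canvas.
function findOcrLinks(canvas) {
  const candidates = [canvas.seeAlso ?? [], canvas.rendering ?? []].flat();
  return candidates.filter(looksLikeOcr).map((link) => link.id ?? link["@id"]);
}

const canvas = {
  id: "https://example.org/canvas/1",
  rendering: [
    {
      id: "https://example.org/ocr/page1.xml",
      type: "Text",
      format: "application/xml",
      profile: "http://www.loc.gov/standards/alto/",
    },
    { id: "https://example.org/page1.pdf", type: "Text", format: "application/pdf" },
  ],
};
console.log(findOcrLinks(canvas)); // [ 'https://example.org/ocr/page1.xml' ]
```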

The ALTO from the Cookbook example, however, doesn't fully match up with the Canvas anymore, something's off:

[screenshot]

https://iiifv3--mirador-textoverlay.netlify.app/?iiif-content=https://iiif.io/api/cookbook/recipe/0068-newspaper/newspaper_issue_2-manifest.json

jbaiter, Aug 08 '23 22:08

Thanks for including the iiif-content link and looking at the rendering! I'll see if I can figure out what is going on with the ALTO. It was generated using Tesseract, but maybe I used the wrong-sized image or something.

glenrobson, Aug 09 '23 10:08

Found the problem: There's a mismatch between the Canvas size and the Image and OCR size:

  • Canvas: 3602x5000px
  • Image: 3517x5000px
  • OCR: 3517x5000px
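
A quick way to catch this kind of mismatch is to compare aspect ratios, since a canvas and its image may legitimately differ in absolute size. The sketch below assumes the v3 structure (canvas.items are AnnotationPages containing painting Annotations with an image body); the function name and the tolerance value are made up.

```javascript
// Hypothetical check; the 0.005 tolerance is an arbitrary choice.
function findAspectMismatches(canvas, tolerance = 0.005) {
  const canvasRatio = canvas.width / canvas.height;
  const mismatches = [];
  for (const page of canvas.items ?? []) {
    for (const anno of page.items ?? []) {
      const body = anno.body;
      // Only painting annotations with a sized body are relevant here.
      if (anno.motivation !== "painting" || !body?.width || !body?.height) continue;
      if (Math.abs(body.width / body.height - canvasRatio) > tolerance) {
        mismatches.push({ id: body.id, width: body.width, height: body.height });
      }
    }
  }
  return mismatches;
}

const newspaperCanvas = {
  width: 3602,
  height: 5000,
  items: [
    {
      type: "AnnotationPage",
      items: [
        {
          type: "Annotation",
          motivation: "painting",
          body: { id: "https://example.org/image/1", type: "Image", width: 3517, height: 5000 },
        },
      ],
    },
  ],
};
console.log(findAspectMismatches(newspaperCanvas).length); // 1
```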

jbaiter, Aug 09 '23 19:08

That is weird! But thank you, I'll look at updating the ALTO (and annotations).

glenrobson, Aug 10 '23 00:08