iiif-stories

I would like to access the OCR text at a specific granularity

Open altomator opened this issue 7 years ago • 2 comments

Description

IIIF exposes OCR text as annotations on images. However, OCR systems generally produce text with a hierarchical structure: character, word, line, and paragraph levels.

--> I would like to get the OCR text at a specific level of that structure.

Example: a marginal note (http://gallica.bnf.fr/ark:/12148/bpt6k96006893/f20) is recognized by the OCR as two paragraphs:

http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/529,1076,287,203/full/0/native.jpg

http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/526,1281,287,123/full/0/native.jpg
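For readers unfamiliar with the URLs in this example: they follow the IIIF Image API pattern `{base}/{region}/{size}/{rotation}/{quality}.{format}`, where the region is `x,y,w,h` in pixels. The sketch below only rebuilds such a URL from a bounding box. The function name `region_url` is hypothetical, and the `full/0/native.jpg` suffix is simply copied from the URLs in this issue.

```python
def region_url(base: str, x: int, y: int, w: int, h: int) -> str:
    """Build an image-region URL for one OCR zone (hypothetical helper).

    The size, rotation, quality and format segments ("full/0/native.jpg")
    are taken as-is from the example URLs in this issue.
    """
    return f"{base}/{x},{y},{w},{h}/full/0/native.jpg"


# Base URL of the page used in the example above.
BASE = "http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20"

# The two paragraphs the OCR detected for the marginal note.
first_paragraph = region_url(BASE, 529, 1076, 287, 203)
second_paragraph = region_url(BASE, 526, 1281, 287, 123)
```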

Proposed Solutions

Give the user the ability to choose the granularity of the OCR text.
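As an illustration of what choosing a granularity could mean on the OCR side, here is a minimal sketch that assumes the OCR is available as ALTO XML, where `TextBlock`, `TextLine` and `String` elements roughly correspond to paragraph, line and word. The `LEVELS` mapping, the `ocr_text` function and the sample document are all hypothetical; this is not an existing IIIF or Gallica API, and character-level output is left out.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from a requested granularity to an ALTO element name.
LEVELS = {"paragraph": "TextBlock", "line": "TextLine", "word": "String"}


def local(tag: str) -> str:
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def ocr_text(alto_xml: str, level: str) -> list[str]:
    """Return one text string per element at the requested granularity."""
    target = LEVELS[level]
    root = ET.fromstring(alto_xml)
    results = []
    for el in root.iter():
        if local(el.tag) != target:
            continue
        # Collect the words under this element (iter() includes el itself,
        # so the "word" level returns each String's own CONTENT).
        words = [s.get("CONTENT", "") for s in el.iter() if local(s.tag) == "String"]
        results.append(" ".join(words))
    return results


# Made-up sample: a marginal note split into two blocks by the OCR.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1">
      <TextLine><String CONTENT="Marginal"/><SP/><String CONTENT="note"/></TextLine>
      <TextLine><String CONTENT="first"/><SP/><String CONTENT="part"/></TextLine>
    </TextBlock>
    <TextBlock ID="b2">
      <TextLine><String CONTENT="second"/><SP/><String CONTENT="part"/></TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""

paragraphs = ocr_text(SAMPLE, "paragraph")
lines = ocr_text(SAMPLE, "line")
words = ocr_text(SAMPLE, "word")
```

With this sample, the paragraph level yields two strings, the line level three, and the word level six. Each result could then be published as its own annotation.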

Additional Background

Ongoing work on ALTO and IIIF.

altomator, Mar 01 '17 09:03