iiif-stories
I would like to access the OCR text at a specific granularity
Description
IIIF exposes OCR text as annotations on images. However, OCR systems generally produce text with a structure at several levels: character, word, line, and paragraph.
I would like to retrieve the OCR text at one specific level.
Example: a marginal note (http://gallica.bnf.fr/ark:/12148/bpt6k96006893/f20) is recognized by the OCR as 2 paragraphs:
http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/529,1076,287,203/full/0/native.jpg
http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20/526,1281,287,123/full/0/native.jpg
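For reference, the two URLs in the example follow the IIIF Image API pattern `{base}/{region}/{size}/{rotation}/{quality}.{format}`, where the region is `x,y,w,h` in pixels. A minimal sketch of how such a URL could be built from an OCR block's bounding box; the function name `iiif_region_url` and its defaults are hypothetical, chosen only to match the example above:

```python
def iiif_region_url(base, x, y, w, h,
                    size="full", rotation=0, quality="native", fmt="jpg"):
    """Build a IIIF Image API URL for a rectangular region given in pixels."""
    region = f"{x},{y},{w},{h}"
    return f"{base}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Base URI of the page image used in the example above.
BASE = "http://gallica.bnf.fr/iiif/ark:/12148/bpt6k96006893/f20"

# The two paragraphs the OCR found in the marginal note.
first = iiif_region_url(BASE, 529, 1076, 287, 203)
second = iiif_region_url(BASE, 526, 1281, 287, 123)
print(first)
print(second)
```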
Proposed Solutions
Give the user the ability to choose the granularity of the OCR text.
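To illustrate what choosing a granularity could mean on the OCR side, here is a rough sketch that reads an ALTO-like document and returns the text at word, line, or paragraph level. It is not a proposal for the IIIF API itself. The mapping of "paragraph" to ALTO `TextBlock`, the function name `text_at`, and the sample document (invented text, namespace omitted for brevity) are all assumptions made for the illustration; character level is left out.

```python
import xml.etree.ElementTree as ET

# Invented sample in the shape of an ALTO layout (real files carry a namespace).
SAMPLE_ALTO = """<alto><Layout><Page><PrintSpace>
  <TextBlock ID="b1">
    <TextLine ID="l1"><String CONTENT="first"/><SP/><String CONTENT="line"/></TextLine>
    <TextLine ID="l2"><String CONTENT="second"/><SP/><String CONTENT="line"/></TextLine>
  </TextBlock>
  <TextBlock ID="b2">
    <TextLine ID="l3"><String CONTENT="another"/><SP/><String CONTENT="block"/></TextLine>
  </TextBlock>
</PrintSpace></Page></Layout></alto>"""

# Assumed mapping from the requested granularity to ALTO element names.
LEVEL_TAGS = {"word": "String", "line": "TextLine", "paragraph": "TextBlock"}


def local(tag):
    """Strip an XML namespace prefix such as '{uri}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def text_at(root, level):
    """Return one string per element of the requested level, in document order."""
    wanted = LEVEL_TAGS[level]
    results = []
    for element in root.iter():
        if local(element.tag) != wanted:
            continue
        # Collect every word contained in this element (or the word itself).
        words = [s.get("CONTENT", "") for s in element.iter()
                 if local(s.tag) == "String"]
        results.append(" ".join(words))
    return results


root = ET.fromstring(SAMPLE_ALTO)
for level in ("word", "line", "paragraph"):
    print(level, text_at(root, level))
```

With the sample above, the same page yields six strings at word level, three at line level, and two at paragraph level, which is the kind of choice the user would be making.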
Additional Background
ALTO and IIIF on-going work