dinglehopper
Feature request: status line text with segment IDs
To make navigation in the source annotation easier, show the current TextRegion / TextLine / Word / Glyph ids in the browser's status bar.
Yes, that would be useful. This feature is somewhat connected to #10 and #5 as I need to do text extraction differently to retain the segment ids.
Alternatively, you could wait for the PAGE-XML DOM to provide upward references in the hierarchy. See OCR-D/core#313 and the discussion there.
The latest master now has a tooltip that displays the segment id.

This is currently the region id for PAGE and the text line id for ALTO, as these are the levels we are currently extracting from. (Upcoming feature #5 will give more options.)
Support for displaying this for the word differences is also upcoming. Until then, this issue should stay open.
(This feature took a while because the internal text representation needed some new plumbing.)
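For readers wondering what "retaining the segment ids" during text extraction could look like, here is a minimal sketch in Python. It is not dinglehopper's actual implementation: the names `Segment`, `extract_region_segments` and `segment_id_at` are hypothetical, and it assumes region-level PAGE-XML where each TextRegion carries its text in a direct TextEquiv/Unicode child.

```python
import xml.etree.ElementTree as ET
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """Hypothetical pairing of a segment id with its extracted text."""
    segment_id: str
    text: str


def local_name(tag: str) -> str:
    # Strip the "{namespace}" prefix so any PAGE schema version matches.
    return tag.rsplit("}", 1)[-1]


def extract_region_segments(page_xml: str) -> List[Segment]:
    """Collect (id, text) for every TextRegion, in document order."""
    root = ET.fromstring(page_xml)
    segments = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextRegion":
            continue
        text = ""
        # Only look at the region's own TextEquiv, not those of nested lines.
        for child in elem:
            if local_name(child.tag) == "TextEquiv":
                for sub in child:
                    if local_name(sub.tag) == "Unicode":
                        text = sub.text or ""
        segments.append(Segment(elem.get("id", ""), text))
    return segments


def segment_id_at(segments: List[Segment], pos: int, sep: str = "\n") -> Optional[str]:
    """Return the id of the segment covering character `pos` in the
    separator-joined text, or None for separators and out-of-range positions."""
    offset = 0
    for seg in segments:
        end = offset + len(seg.text)
        if offset <= pos < end:
            return seg.segment_id
        offset = end + len(sep)
    return None


SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
<Page>
<TextRegion id="r1"><TextEquiv><Unicode>Hello</Unicode></TextEquiv></TextRegion>
<TextRegion id="r2"><TextEquiv><Unicode>world</Unicode></TextEquiv></TextRegion>
</Page></PcGts>"""

segments = extract_region_segments(SAMPLE)
```

With the sample above, a character position in the joined text "Hello\nworld" can be mapped back to a region id via `segment_id_at(segments, pos)`, which is roughly the lookup a tooltip would need. The same idea would apply to text lines in ALTO, with different element names.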