
open .hocr files

Open milahu opened this issue 4 months ago • 0 comments

continues #729, part of #438

ideally i want to open and edit the hocr files like this

gimagereader-qt6 001.hocr 001.jpg
gimagereader-qt6 002.jpg 002.hocr 003.jpg 003.hocr

this already works with .html files but .hocr files are ignored

gimagereader-qt6 001.hocr.html
gimagereader-qt6 002.hocr.html 003.hocr.html
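
Until `.hocr` is accepted directly, one possible stopgap is to give each `.hocr` file an `.html` twin and open that instead. The sketch below assumes that a plain copy named `NNN.hocr.html` is handled like any other `.html` file, as in the commands above. The helper names `html_alias` and `make_html_aliases` are made up for illustration and are not part of gImageReader.

```python
from pathlib import Path
import shutil


def html_alias(hocr_path: Path) -> Path:
    """Return the hypothetical alias name, e.g. 001.hocr -> 001.hocr.html."""
    return hocr_path.with_name(hocr_path.name + ".html")


def make_html_aliases(directory: Path) -> list[Path]:
    """Copy every *.hocr file in `directory` to a sibling *.hocr.html file.

    Existing aliases are left untouched, so the function is safe to re-run.
    Returns the list of aliases that were newly created.
    """
    created = []
    for hocr in sorted(directory.glob("*.hocr")):
        alias = html_alias(hocr)
        if not alias.exists():
            shutil.copyfile(hocr, alias)
            created.append(alias)
    return created
```

Since these are copies, any edits saved from the editor end up in the `.hocr.html` file and would have to be copied back to the `.hocr` file by hand.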

extra image files passed on the command line are counted as separate pages, but the page images referenced in the hocr files are the ones that are actually used

<div class='ocr_page' id='page_1' title='image "001.tiff"; bbox ...'>

> Would actually be trivial to also allow the .hocr file extension, but I'm not sure that's actually a standardized extension?

sounds like you're waiting for the central committee of file extensions to allow this use case... ; )

see also https://github.com/kba/hocr-spec/issues/115

milahu, Aug 19 '25 10:08