Mike Gerber
> But how to save a global JSON report in the METS? It would not "manifest a physical page" which OCR-D seems to demand for any file @kba @bertsky @cneud...
(Closing the issue was an accident, I often hit the wrong buttons)
I do think that COMBINING LATIN SMALL LETTER O and COMBINING RING ABOVE are entirely different characters and not equivalent. However, I do think that the normalization behaviour should be...
> @cneud, yes, the issue can be solved with substitutions which can be configured by the users. Exactly. I aim to support (Unicode terms) canonical equivalence (= using NFC consistently)...
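To illustrate the distinction discussed above, here is a minimal sketch using only Python's standard `unicodedata` module. It checks canonical equivalence by comparing NFC forms, and shows how a user-configured substitution could bridge characters that are not canonically equivalent. The helper names `canonically_equivalent` and `apply_substitutions` and the substitution table are hypothetical and for illustration only; this is not dinglehopper's actual implementation.

```python
import unicodedata

COMBINING_RING_ABOVE = "\u030A"
COMBINING_LATIN_SMALL_LETTER_O = "\u0364"


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFC forms are equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def apply_substitutions(text: str, table: dict) -> str:
    """Hypothetical helper: apply user-configured substitutions, then NFC."""
    for source, target in table.items():
        text = text.replace(source, target)
    return unicodedata.normalize("NFC", text)


# Decomposed and precomposed "a with ring above" are canonically equivalent.
print(canonically_equivalent("a" + COMBINING_RING_ABOVE, "\u00E5"))  # True

# The two combining marks are different characters; NFC does not unify them.
print(canonically_equivalent("u" + COMBINING_LATIN_SMALL_LETTER_O,
                             "u" + COMBINING_RING_ABOVE))  # False

# A user who wants to treat "u + combining small o" as "ü" has to say so
# explicitly (assumed example table, not a recommended default).
user_table = {"u" + COMBINING_LATIN_SMALL_LETTER_O: "\u00FC"}
print(apply_substitutions("Bu\u0364cher", user_table))  # Bücher
```

Keeping the substitutions in a separate, explicit table means the default comparison only relies on canonical equivalence, and anything beyond that is a visible user decision.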
I agree mostly with what @bertsky and @cneud said. I just want to throw in some doubt on the belief that CERs are somehow comparable when produced by different tools....
The header is in `TextRegion` `r3`, but the `ReadingOrder` only includes the main text in `r1`, so dinglehopper only extracts the main text. This means: The file is buggy,...
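The effect described above can be reproduced with a small sketch. It assumes a simplified PAGE XML document (made up for this example) and a flat `ReadingOrder` with `RegionRefIndexed` entries only; nested groups are ignored. The function `region_texts_in_reading_order` is a hypothetical name, and this is not how dinglehopper itself is implemented.

```python
import xml.etree.ElementTree as ET

# Made-up minimal PAGE XML: region r3 (header) is missing from ReadingOrder.
PAGE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="example.png" imageWidth="100" imageHeight="100">
    <ReadingOrder>
      <OrderedGroup id="ro1">
        <RegionRefIndexed index="0" regionRef="r1"/>
      </OrderedGroup>
    </ReadingOrder>
    <TextRegion id="r3"><TextEquiv><Unicode>Header</Unicode></TextEquiv></TextRegion>
    <TextRegion id="r1"><TextEquiv><Unicode>Main text</Unicode></TextEquiv></TextRegion>
  </Page>
</PcGts>
""".strip()


def region_texts_in_reading_order(xml_text):
    """Return region-level texts, following only what ReadingOrder references."""
    root = ET.fromstring(xml_text)
    # Take the namespace from the root tag so any PAGE version works.
    ns = {"pc": root.tag[1:root.tag.index("}")]}
    regions = {r.get("id"): r for r in root.iterfind(".//pc:TextRegion", ns)}

    reading_order = root.find(".//pc:ReadingOrder", ns)
    if reading_order is None:
        return []
    refs = sorted(
        reading_order.iterfind(".//pc:RegionRefIndexed", ns),
        key=lambda ref: int(ref.get("index")),
    )

    texts = []
    for ref in refs:
        region = regions.get(ref.get("regionRef"))
        if region is None:
            continue  # dangling reference, skipped in this sketch
        unicode_el = region.find("pc:TextEquiv/pc:Unicode", ns)
        if unicode_el is not None and unicode_el.text:
            texts.append(unicode_el.text)
    return texts


print(region_texts_in_reading_order(PAGE_XML))  # ['Main text']
```

Because `r3` is never referenced in the `ReadingOrder`, an extractor that follows the reading order never sees the header text.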
I will definitely support line level because I'd like to display line images for OCR errors (a feature that would save me a lot of time), so this will come...
Unfortunately, I did some unmerged work on the text extraction due to #9, so I will need to reimplement your changes once my changes are merged. Could you provide some...
> Do you have any "real" examples for word level extraction? @wrznr I'd really like some real world example for this :) I understand the feature request but I'd _really_...
@wrznr I understand the use case, and I will implement it. I've done some work on the extraction due to #9 that I need to finish, next step is implementing...