Robert Sachunsky
> Perhaps we should open an issue in core for the general scenario of early downsampling (as a derived image) and then re-using that image instead of the original (with...
@kba Point 3 was about representation of the _output_ of multi-OCR alignment on the line level (not on the page level), so it's about word segmentation (not line segmentation). @finkf...
Some considerations which might help to swing the decision between a specialised PAGE annotation (with deviating semantics) and a new customised XML format: pro PAGE: 1. We already use it...
Sorry to get back so late, but this problem seems to be a Gordian knot of sorts. Getting good real-life example data entails having some OCR which can already give...
Sure! So here is what the above (artificial) example could look like: ```XML m n r i r y v p , o a e y my pay ```  from the latter. @kba I updated (with forced push) the lattice extension...
@cneud thanks for bringing this to attention. Yes, I am aware of CITlab's confmat approach/format. (In fact, I have [linked](https://github.com/OCR-D/spec/issues/72#issuecomment-469019453) to it above and mentioned the possibility to extend PAGE...
I am much in favour of a uniform parameter name for this in the spec. I would even argue that usually processors should provide that parameter (with a default) even...