Konstantin Baierer

Results 183 issues of Konstantin Baierer

While checking https://github.com/OCR-D/core/pull/1066, I noticed that we have the rule in the validator but AFAICT not in the specs that the `@pcGtsId` of a PAGE document should be the same...

We briefly talked about those in the Tech Call today and decided to make these part of the spec, hence this PR. I took the liberty of updating the list...

More references you might want to include:

- [Leifert & Labahn 2019: End-to-End Measure for Text Recognition](https://ieeexplore.ieee.org/abstract/document/8978155) (on CER and derived metrics, analysis of reading order, segmentation and geometry influences)...

_Originally posted by @bertsky in https://github.com/OCR-D/spec/pull/225#discussion_r1086173671_

> Speaking of: IMHO it would be quite relevant to offer a CER metric under level-2 (or even level-1) equivalency. Not exclusively (because this...

During debugging bertsky/ocrd_detectron2#14 I realized that my assumption that every archive would only contain a single resource was wrong. The detectron2 models consist of a pytorch NN and a YAML...

We forgot to formally specify the behavior here, cf. OCR-D/core#929

The meta-documentation is a bit meagre at the moment. Adding (better) descriptions to all fields will help implementers write more expressive ocrd-tool.json files. https://github.com/OCR-D/spec/pull/121#discussion_r309177809 ff.

We need to specify how these constructs are related, which one to use, and how to handle contradictions.

discussion

Basis for #134 and OCR-D/core#376. Will require a new major version and some seriously frantic pull requesting to all the processors, but it's worth it for sustainability IMHO.