Konstantin Baierer issues

Results: 183 issues of Konstantin Baierer

Specify the "mets:file[@ID] should match pc:Page[@pcGtsId]" rule

While checking https://github.com/OCR-D/core/pull/1066, I noticed that we have the rule in the validator but AFAICT not in the specs that the `@pcGtsId` of a PAGE document should be the same...
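The rule named in the title can be illustrated with a minimal sketch. It is not the validator code from OCR-D/core; the function names `find_pcgts_id` and `pcgts_id_matches` are hypothetical, and the sketch simply assumes the PAGE document is available as a string and carries a `pcGtsId` attribute on some element.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_pcgts_id(page_xml: str) -> Optional[str]:
    """Return the first pcGtsId attribute found in a PAGE document, or None.

    Hypothetical helper: it scans every element rather than assuming
    which element carries the attribute.
    """
    root = ET.fromstring(page_xml)
    for elem in root.iter():
        if "pcGtsId" in elem.attrib:
            return elem.attrib["pcGtsId"]
    return None


def pcgts_id_matches(mets_file_id: str, page_xml: str) -> bool:
    """Check the rule: the mets:file @ID equals the PAGE document's @pcGtsId."""
    return find_pcgts_id(page_xml) == mets_file_id
```

A caller would pass the `@ID` of the `mets:file` entry together with the content of the PAGE file it references, and report a validation error when the function returns `False`.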

Specs for Continuous Integration and Code Reviews

We briefly talked about those in the Tech Call today and decided to make these part of the spec, hence this PR. I took the liberty of updating the list...

Additional literature for QA Spec

More references you might want to include:

- [Leifert & Labahn 2019: End-to-End Measure for Text Recognition](https://ieeexplore.ieee.org/abstract/document/8978155) (on CER and derived metrics, analysis of reading order, segmentation and geometry influences)...

GT-level specific metrics

_Originally posted by @bertsky in https://github.com/OCR-D/spec/pull/225#discussion_r1086173671_

> Speaking of: IMHO it would be quite relevant to offer a CER metric under level-2 (or even level-1) equivalency. Not exclusively (because this...

ocrd_tool: allow object for path_in_archive of resources

During debugging bertsky/ocrd_detectron2#14 I realized that my assumption that every archive would only contain a single resource was wrong. The detectron2 models consist of a pytorch NN and a YAML...

Add --dump-module-dir to the spec

We forgot to formally specify the behavior here, cf. OCR-D/core#929

ocrd-tool schema: Describe "description" and the semantics of in/output_file_grp

The meta-documentation is a bit meagre at the moment; adding (better) descriptions to all fields will help implementers write more expressive ocrd-tool.json files. https://github.com/OCR-D/spec/pull/121#discussion_r309177809 ff.

Relation of METS and PAGE ReadingOrder

We need to specify how these constructs are related, which one to use, how to handle contradictions.

discussion

define Standard Parameters for dpi, input-level, output-level

Basis for #134 and OCR-D/core#376. Will require a new major version and some seriously frantic pull requesting to all the processors, but it's worth it for sustainability IMHO.

add 1st draft line GT/training specs

enhancement