spec

Specification of the @OCR-D technical architecture, interface definitions and data exchange format(s)

Results: 53 spec issues

Currently, we only specify how to describe the hierarchy of pages (represented by a set of files under `mets:structMap/mets:div/mets:div`) and their order. But nothing so far on logical structure **across...

It would be very useful to introduce a new `mets:fileGrp` specifically for ground truth datasets. This new group would include both region-level and line-level segmentations. Suggested name: ```xml ```

While checking https://github.com/OCR-D/core/pull/1066, I noticed that we have the rule in the validator but AFAICT not in the specs that the `@pcGtsId` of a PAGE document should be the same...

We briefly talked about those in the Tech Call today and decided to make these part of the spec, hence this PR. I took the liberty of updating the list...

instead of https://github.com/OCR-D/ocrd-website/pull/354

The specification currently gives no guidance on how to handle two or more consecutive whitespace characters.

From my and @bertsky's discussion at https://github.com/qurator-spk/eynollah/issues/67: >> Yes, it should be possible to skip pages marked as certain types in the logical structmap – not just in any one...

According to this [discussion on the Processing Server implementation](https://github.com/OCR-D/core/pull/974#discussion_r1138901846), we should simplify here. (But it must be clear at all times what is a workflow job ID and what is...