Robert Sachunsky
Even though we lack a free GUI, we should strive to generate output conforming to [the layouteval schema](http://schema.primaresearch.org/PAGE/eval/layout/2019-07-15) from [PAGE](https://github.com/PRImA-Research-Lab/PAGE-XML/blob/master/layout-evaluation/schema/layouteval.xsd) as our evaluation report. Currently, our output looks like this: example report ```JSON {...
We should have heuristics to check for
- polygon containment (overlapping regions, word outside line etc.)
- artifacts from annotation like point or line-like regions
- lines with (way) too...
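The first two heuristics above could be sketched roughly as follows. This is a minimal pure-Python illustration, not the validator's actual implementation: the function names, the `min_area` threshold, and the plain coordinate tuples are all assumptions made for this example. Note that checking vertices only is a necessary condition for containment, not a sufficient one, when the outer polygon is non-convex.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(points: List[Point]) -> float:
    """Absolute polygon area via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def is_degenerate(points: List[Point], min_area: float = 1.0) -> bool:
    """Flag point-like or line-like outlines.

    `min_area` is a hypothetical threshold (in square pixels).
    """
    return len(set(points)) < 3 or polygon_area(points) < min_area


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray casting (even-odd rule).

    Points lying exactly on the boundary may be classified either way.
    """
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def vertices_outside(inner: List[Point], outer: List[Point]) -> List[Point]:
    """Vertices of `inner` that fall outside `outer` (e.g. a word outside its line)."""
    return [p for p in inner if not point_in_polygon(p, outer)]


# Hypothetical coordinates: one text line and two words.
line = [(0, 0), (100, 0), (100, 20), (0, 20)]
word_ok = [(10, 5), (40, 5), (40, 15), (10, 15)]
word_bad = [(90, 5), (120, 5), (120, 15), (90, 15)]
```

With these inputs, `vertices_outside(word_bad, line)` reports the two vertices that stick out of the line, while `word_ok` yields an empty list. A production check would more likely rely on a geometry library for exact containment and validity tests.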
- supports recursive ReadingOrder (can be disabled via param order=document)
- supports setting the hierarchy level to extract from (default level=highest behaves as before)
- supports setting line/paragraph boundary strings...
There seem to have been upstream changes recently that prevent our usual deployments for Python version 3.6, with `libgeos` (which we need for `shapely` in `ocrd_validators`) being the culprit. See...
The normal convention for new file IDs in OCR-D is (due to `make_file_id` implementation) the pattern `grp + '_' + page`. But the current bulk-add behaviour automatically assigns `file_id` based...
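To make the stated convention concrete, here is a deliberately simplified sketch. It only mirrors the `grp + '_' + page` pattern quoted above and is not the actual `make_file_id` implementation; the helper names and the sample file entries are hypothetical.

```python
from typing import Iterable, List, Tuple


def conventional_file_id(grp: str, page: str) -> str:
    """Build an ID following the pattern grp + '_' + page (simplified)."""
    return grp + "_" + page


def find_unconventional(files: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Return file IDs that deviate from the convention.

    `files` holds hypothetical (file_id, grp, page) triples.
    """
    return [
        file_id
        for file_id, grp, page in files
        if file_id != conventional_file_id(grp, page)
    ]


# Hypothetical entries: the second ID does not follow the pattern.
entries = [
    ("OCR-D-BIN_0001", "OCR-D-BIN", "0001"),
    ("0002_OCR-D-BIN", "OCR-D-BIN", "0002"),
]
```

A check like `find_unconventional(entries)` would single out IDs assigned by some other scheme.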
The spec states that `--page-id` is _both_ a multi-value option (i.e. comma-separated) and a range option (i.e. allowing an ellipsis). On top of that, core also implements the `//` prefix for regex values....
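The three kinds of values described above can be illustrated with a small resolver. This is only a sketch and not core's implementation: the function name is hypothetical, the range operator is assumed to be written as `..`, ranges are assumed to be resolved by position in the document's page order, and regex values are assumed to require a full match.

```python
import re
from typing import List


def select_pages(spec: str, all_pages: List[str]) -> List[str]:
    """Resolve a page-id spec against the known page IDs (in document order).

    Limitation of this sketch: a regex that itself contains a comma
    would be split apart by the multi-value handling.
    """
    selected: List[str] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("//"):
            # Regex value: everything after the '//' prefix is the pattern.
            pattern = re.compile(token[2:])
            matches = [p for p in all_pages if pattern.fullmatch(p)]
        elif ".." in token:
            # Range value (assumed syntax): inclusive, by position in all_pages.
            start, end = token.split("..", 1)
            lo, hi = all_pages.index(start), all_pages.index(end)
            matches = all_pages[lo:hi + 1]
        else:
            # Single literal page ID.
            matches = [token] if token in all_pages else []
        for page in matches:
            if page not in selected:  # avoid duplicates, keep first-seen order
                selected.append(page)
    return selected


# Hypothetical page IDs for illustration.
pages = ["PHYS_0001", "PHYS_0002", "PHYS_0003",
         "PHYS_0004", "PHYS_0005", "PHYS_0006"]
```

For example, `select_pages("PHYS_0001..PHYS_0003,PHYS_0005", pages)` combines a range with a single value, and `select_pages("//PHYS_000[56]", pages)` uses a regex. Unknown range endpoints raise `ValueError` here; whether to fail or warn in that case is a design decision this sketch does not settle.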
All our processors now add provenance information about the run to the METS via `mets:agent`. This is useful for diagnosis and reconstruction of the workflow later on. However, the METS actions...
fixes #916
Since #904 we look up … https://github.com/OCR-D/core/blob/71d295ac1fccbeb4164e230bd584e1920b9ab3c8/ocrd/ocrd/processor/base.py#L246-L250 … to determine the moduledir location. But that fails if the processor module itself is distributed via PEP-420 namespace packages (so the top...
A typical output of `ocrd workspace remove` looks like this:
```
today at 14:28:02Oct 4 12:28:01 ocrd-manager 12:28:01.985 WARNING ocrd.workspace.remove_file - File not locally available
today at 14:28:02Oct 4 12:28:01...
```