Robert Sachunsky
Since the last update I am getting ``` 09:12:50.405 INFO eynollah - INPUT FILE PHYS_0001 (1/3) 09:12:50.972 INFO eynollah - Resizing and enhancing image... 09:12:50.972 INFO eynollah - Detected 300...
This reverts commit fd56b86acf55677dc7a8bfb9e2737c3cc167327a, reversing changes made to ea792d1e4ac4a722770b82dc91e71f84d5beb212. This is the second attempt, same reasoning as in #107 (creating an upstream ref for ocrd_all), but different target: We want...
The only documentation for this kwarg is in the standalone CLI: > in general, eynollah uses RGB as input but if the input document is strongly dark, bright or for...
Sometimes one does have prior knowledge about the overall page layout of a document. For example, for historical monographs, there will often be _no columns_, but perhaps a few _tables_...
Currently, running `ocrd workspace find --download` will download files, but print `None` for every file it downloaded (instead of the new `local_filename`). The cause is that `ret_entry` does not get...
AFAICS the ocrd-tool.json's `parameter_usage` is not used anywhere (except in a log message), and resolving resources `without-extension` does not work. It looks like instead of `os.path.exists`, in this case one would...
How about setting an optional upper [limit on memory usage](https://docs.python.org/3.9/library/resource.html#resource.RLIMIT_RSS) for processors via another environment variable, say `OCRD_MAX_RSS`? This could also be done externally (e.g. via Docker in [ocrd_all](https://github.com/OCR-D/ocrd_all/issues/280)), but...
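A minimal sketch of what such a limit could look like, using only the standard-library `resource` module. The variable name `OCRD_MAX_RSS` is the one proposed above; reading its value as MiB and the helper names `parse_max_rss` / `apply_max_rss` are assumptions for illustration, not existing OCR-D API. Note also that current Linux kernels accept `RLIMIT_RSS` but do not enforce it, so a real implementation might have to fall back to `RLIMIT_AS`.

```python
import os
import resource


def parse_max_rss(environ=os.environ):
    """Return the limit in bytes from OCRD_MAX_RSS, or None if it is unset.

    Assumption: the value is an integer number of MiB.
    """
    raw = environ.get("OCRD_MAX_RSS", "").strip()
    if not raw:
        return None
    mib = int(raw)
    if mib <= 0:
        raise ValueError("OCRD_MAX_RSS must be a positive integer (MiB)")
    return mib * 1024 * 1024


def apply_max_rss(limit_bytes):
    """Lower the soft RLIMIT_RSS of the current process (hypothetical helper)."""
    _, hard = resource.getrlimit(resource.RLIMIT_RSS)
    # An unprivileged process cannot exceed its hard limit, so clamp to it.
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_RSS, (limit_bytes, hard))


if __name__ == "__main__":
    limit = parse_max_rss()
    if limit is not None:
        apply_max_rss(limit)
```

A processor entry point could call these two helpers once at startup, before any models are loaded, so that the limit covers the whole run.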
From a `workspace validate` I got: ```XML METS has no unique identifier Validation aborted with exception: Traceback (most recent call last): File "/data/ocr-d/ocrd_all/venv38/lib/python3.8/site-packages/ocrd_validators/workspace_validator.py", line 149, in _validate self._validate_mets_files() File "/data/ocr-d/ocrd_all/venv38/lib/python3.8/site-packages/ocrd_validators/workspace_validator.py",...
Somehow the recent changes made `ocrd log` calls take several seconds to complete, rendering usage in scripts like ocrd-import prohibitive.
We now have OCR-D GT on [GitHub](https://github.com/OCR-D/gt_structure_text/releases) (the old KIT repo has been down for a while, so this is the only place to get the data). It gets created...