core

Collection of OCR-related Python tools and wrappers from @OCR-D

Results 215 core issues

@bertsky I could not completely reproduce what has been mentioned here (https://github.com/OCR-D/core/pull/770): > take the default ocrd_utils/ocrd_logging.conf and modify it in some places, say by defining a logger for qualname=PIL...

Python 3.6 will reach end-of-life at the end of 2021 and more and more libraries and tools that we rely on in OCR-D stop supporting Python 3.6 in favor of...

With this in place, users can use a URL directly for parameter values:

```
ocrd-tesserocr-recognize -P model https://github.com/tesseract-ocr/tessdata_best/raw/main/bos.traineddata
```

and it should download on demand the first time it encounters and...
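
The download-on-first-use behaviour described above could look roughly like the following. This is only a sketch and not the code from the PR: `resolve_parameter_value`, the `cache_dir` argument and the injectable `fetch` callable are hypothetical names chosen for illustration, and the cache file naming (URL hash plus basename) is an assumption.

```python
import hashlib
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def resolve_parameter_value(value, cache_dir, fetch=urllib.request.urlretrieve):
    """Return a local path for `value`, downloading it once if it is an http(s) URL.

    Hypothetical helper: non-URL values are passed through unchanged.
    """
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https"):
        return value  # ordinary parameter value, nothing to download
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Prefix the basename with a short hash of the URL so that files with the
    # same basename from different URLs do not overwrite each other.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
    basename = Path(parsed.path).name or "resource"
    target = cache_dir / f"{digest}-{basename}"
    if not target.exists():
        fetch(value, str(target))  # network access only on the first encounter
    return str(target)
```

Passing `fetch` as an argument keeps the sketch testable without network access; a second call with the same URL returns the cached path without fetching again.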

Until now, the `ocrd resmgr download --overwrite` flag only had an effect if it was not set: Download would abort because the file already exists. This PR adds an `overwrite`...
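
The intended semantics of such a flag can be sketched as follows. This is an illustration under assumptions, not the resource manager's implementation: `store_resource` is a hypothetical function, and raising `FileExistsError` stands in for whatever the real command does when it aborts.

```python
from pathlib import Path


def store_resource(target, data, overwrite=False):
    """Write `data` to `target`; refuse to replace an existing file unless asked.

    Hypothetical helper illustrating the flag's two outcomes.
    """
    target = Path(target)
    if target.exists() and not overwrite:
        # Without the flag: abort instead of silently replacing the file.
        raise FileExistsError(f"{target} already exists, pass overwrite=True to replace it")
    target.parent.mkdir(parents=True, exist_ok=True)
    # With the flag (or for a new file): write the content.
    target.write_bytes(data)
    return target
```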

As an idea to make sharing knowledge about (combinations of) external resources easier: It would be great if `processor.resolve_resource` would not only search the locally installed resources, but also try...

enhancement

PAGE-XML uses the XMLSchema `ID` type for all structure segments, which precludes duplicates (makes such usage invalid). However, when doing segmentation on an `OcrdPage` instance, if some segments already exist...
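
One way to avoid such collisions is to collect the IDs already in use before assigning new ones. The sketch below uses only the standard library on a simplified, namespace-free document; `collect_ids` and `make_unique_id` are hypothetical helpers, not part of the OCR-D API, and the numbering scheme is an assumption.

```python
import xml.etree.ElementTree as ET


def collect_ids(root):
    """Return the set of all `id` attribute values below (and including) `root`."""
    return {el.get("id") for el in root.iter() if el.get("id") is not None}


def make_unique_id(prefix, used):
    """Return the first `prefix<N>` not in `used`, and record it as used."""
    n = 1
    while f"{prefix}{n}" in used:
        n += 1
    new_id = f"{prefix}{n}"
    used.add(new_id)
    return new_id


# Simplified stand-in for a page that already contains two segments.
root = ET.fromstring('<Page><TextRegion id="r1"/><TextRegion id="r2"/></Page>')
used = collect_ids(root)
new_region = ET.SubElement(root, "TextRegion", id=make_unique_id("r", used))
```

Because `make_unique_id` records each ID it hands out, repeated calls during one segmentation run stay collision-free as well.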

Note: In METS, the labels are a flat sequence of `gt:state` elements with `@prop` from the above-mentioned schema file, one per page. ```XML ``` These are then referenced under...

```
ocrd resmgr download -n ocrd-detectron2-segment ...
ocrd resmgr list-installed -e ocrd-detectron2-segment
- All_X152.yaml @ data (https://layoutlm.blob.core.windows.net/tablebank/model_zoo/detection/All_X152/All_X152.yaml)
  Found at data on 2022-01-19 12:44:17.192397
- model_final_01ca85.pkl @ data (???)
  Found at...
```

Currently `ocrd workspace find --download` is the way to download the files of a workspace. I propose aliasing this to an - arguably - more user-friendly `ocrd workspace download` command.
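
Such an alias could be a thin subcommand that delegates to the existing handler with the download option forced on. The sketch below uses `argparse` from the standard library purely for illustration, so it does not reflect how the `ocrd` CLI is actually built; `find_files` is a hypothetical stand-in handler, and the single `--file-grp` option is included only to show arguments being passed through.

```python
import argparse


def find_files(file_grp=None, download=False):
    """Stand-in for the real lookup; just reports what it was asked to do."""
    return {"file_grp": file_grp, "download": download}


def build_parser():
    parser = argparse.ArgumentParser(prog="workspace")
    sub = parser.add_subparsers(dest="command", required=True)

    # Existing command: downloading is opt-in via --download.
    find = sub.add_parser("find")
    find.add_argument("--file-grp")
    find.add_argument("--download", action="store_true")
    find.set_defaults(handler=lambda a: find_files(a.file_grp, a.download))

    # Proposed alias: same handler, download always enabled.
    download = sub.add_parser("download")
    download.add_argument("--file-grp")
    download.set_defaults(handler=lambda a: find_files(a.file_grp, download=True))
    return parser


def run(argv):
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

Here `run(["download", "--file-grp", "OCR-D-IMG"])` and `run(["find", "--download", "--file-grp", "OCR-D-IMG"])` reach the same handler with the same arguments, which is the point of the alias.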

enhancement