
Collection of OCR-related python tools and wrappers from @OCR-D

Results 215 core issues

Noticed while fixing the broken tests in https://github.com/OCR-D/ocrd_kraken/pull/42: Here, we use `Resolver.workspace_from_url` without `download`, which copies the `mets.xml` and nothing else.
```python
@pytest.fixture()
def workspace(tmpdir):
    if os.path.exists(tmpdir):
        shutil.rmtree(tmpdir)
    workspace =...
```

Some service providers (e.g. VL) use authentication for web access to digitisations that are still being processed and not ready for public access. Without authentication, such digitisations are not visible...

The 2.30 release of requests is now based on the urllib3 release 2.0, which brought some breaking changes with it. For now, I've pinned requests to < 2.30 because google-auth...

We now use `mets:note` to record the workflow provenance, in accordance with [this proposal](https://github.com/OCR-D/spec/issues/108). Alas, we currently define the namespace prefix `ocrd="https://ocr-d.de"` ad hoc on each leaf element. This can...

For running jobs, we agreed there is a need for some kind of timeout anyway, universally. It gets enforced by
- Processor Server model: the calling Processing Server (currently [with...

From https://github.com/OCR-D/core/pull/974#issuecomment-1483588724 @MehmedGIT:
> Okay, another note that will potentially be raised soon. Currently, the supported processing-worker CLI is still the old one:
>
> ```shell
> # 1.
> ...
> ```

In the Processing Server, we currently add jobs to the queue unconditionally, without checking whether any job is already running on the respective workspace: https://github.com/OCR-D/core/blob/b7130307ce68c5f074c1ceea24b66ae9ee9ef289/ocrd_network/ocrd_network/processing_server.py#L281 Obviously, this will create inconsistent...
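
The check described above could look roughly like the following. This is only a minimal sketch, not the Processing Server's actual code: the class `WorkspaceJobGate` and its methods `submit` and `finish` are hypothetical names, and jobs are treated as opaque objects. It assumes a single process where an in-memory set is enough to track which workspaces are busy.

```python
import threading
from collections import defaultdict, deque


class WorkspaceJobGate:
    """Hypothetical helper: allow at most one running job per workspace."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()                # workspace IDs with a job in flight
        self._pending = defaultdict(deque)   # held-back jobs per workspace

    def submit(self, workspace_id, job):
        """Return the job if it may start now, otherwise hold it back and return None."""
        with self._lock:
            if workspace_id in self._running:
                self._pending[workspace_id].append(job)
                return None
            self._running.add(workspace_id)
            return job

    def finish(self, workspace_id):
        """Mark the running job as done; return the next held-back job, if any."""
        with self._lock:
            queue = self._pending.get(workspace_id)
            if queue:
                # The workspace stays marked as running for the next job.
                return queue.popleft()
            self._running.discard(workspace_id)
            self._pending.pop(workspace_id, None)
            return None
```

A caller would only push a job to the real queue when `submit` returns it, and would call `finish` from whatever callback reports job completion, pushing the returned job (if any) next.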

Currently, our precaution of only downloading to those `resource-locations` which the processor supports resolving is quite user-unfriendly:
```
ERROR ocrd.cli.resmgr - The selected --location {location} is not in the {this_executable}'s...
```

![image](https://user-images.githubusercontent.com/273367/228526558-bf531997-4d15-45ef-8b59-fb9227037b6e.png)
@bertsky:
> METS server implementation is still ongoing, but if you follow my [advice](https://github.com/OCR-D/core/pull/966#pullrequestreview-1261544355), and agree with my interpretation…
>
> > I believe the METS server is an...

IMO we should strive to support much more validation and repair features in ocrd_validators.page_validator – esp. functionality known from [PRImA Converter and Validator](https://primaresearch.org/tools/PAGEConverterValidator) (PCV) and [HTR United VX](https://github.com/HTR-United/HTRVX) (HTRVX). From...