Konstantin Baierer
> A question. Is the --allow-enhancement option resultant image intended to be used in an OCR-D workflow?

No, we currently do not use the intermediate results of eynollah, only the...
If you want to try out more OCR-D tools, then you're really best off with a dedicated Python 3.7 environment for that because there are other processors that will not...
> could something like `pip > 19` be added to `requirements.txt`?

It could but then you still have to run it twice because the first time round you're still using...
[@mikegerber](https://gitter.im/OCR-D/Lobby?at=5f197bbc15e5e72fb36edf7a) > here is my collection of METS/PAGE file fixer scripts, as mentioned in the call: https://github.com/mikegerber/sbb-useful-hacks/tree/master/mets-fixers - not to be used lightly, no warranty, you have been warned 🚧...
> Should these use case groups maybe be put into two separate processors/tools?

Yes, probably. Or even task-specific processors (`ocrd-sanitize-prune-filegroups`, `ocrd-sanitize-textequiv` ...)
Of interest in this context: https://github.com/tboenig/AletheiaTools
Another useful operation: Assign `pcGtsId` from the `mets:file/@ID`
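
A rough sketch of what that operation could look like, using only the standard library. The helper names `file_ids_by_href` and `assign_pcgts_id` are made up for illustration and are not part of any OCR-D tool; the sketch also assumes each `mets:file` has a single `mets:FLocat` with an `xlink:href`, and that the PAGE root element is `PcGts`:

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def file_ids_by_href(mets_root):
    """Map each FLocat href to the ID of its enclosing mets:file (hypothetical helper)."""
    mapping = {}
    for file_el in mets_root.iterfind(".//mets:file", METS_NS):
        flocat = file_el.find("mets:FLocat", METS_NS)
        if flocat is not None and flocat.get(XLINK_HREF):
            mapping[flocat.get(XLINK_HREF)] = file_el.get("ID")
    return mapping


def assign_pcgts_id(page_root, file_id):
    """Set pcGtsId on the PAGE root element (hypothetical helper)."""
    # Match on the local name only, so any PAGE schema version is accepted.
    if not page_root.tag.endswith("PcGts"):
        raise ValueError("expected a PcGts root element, got %r" % page_root.tag)
    page_root.set("pcGtsId", file_id)
    return page_root


# Made-up sample data, for illustration only.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                            xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec><mets:fileGrp USE="OCR-D-GT">
    <mets:file ID="OCR-D-GT_0001" MIMETYPE="application/vnd.prima.page+xml">
      <mets:FLocat LOCTYPE="OTHER" xlink:href="OCR-D-GT/0001.xml"/>
    </mets:file>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""

SAMPLE_PAGE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="0001.tif"/>
</PcGts>"""

ids = file_ids_by_href(ET.fromstring(SAMPLE_METS))
page = assign_pcgts_id(ET.fromstring(SAMPLE_PAGE), ids["OCR-D-GT/0001.xml"])
```

In a real workspace the PAGE files would be read from and written back to disk; that part is left out here.
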
> Something related: extract METS/MODS from xml_doc created from OAI-Response like this:
>
> ```
> mets_root_el = xml_root.find('.//mets:mets', XMLNS)
> if mets_root_el is not None:
>     return ET.ElementTree(mets_root_el)
> ...
> ```
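
For reference, a self-contained version of the quoted idea might look like the following. This is only a sketch: it assumes `ET` is the standard library's `xml.etree.ElementTree` and that `XMLNS` maps the `mets` prefix to the METS namespace; the function name `extract_mets`, the extra check on the root element, and the sample response are additions for illustration, not part of the quoted snippet.

```python
import xml.etree.ElementTree as ET

METS_URI = "http://www.loc.gov/METS/"
XMLNS = {"mets": METS_URI}  # assumed content of the XMLNS mapping


def extract_mets(xml_root):
    """Return the first mets:mets element as its own ElementTree, or None."""
    # './/' only searches below xml_root, so check the root itself first.
    if xml_root.tag == "{%s}mets" % METS_URI:
        return ET.ElementTree(xml_root)
    mets_root_el = xml_root.find(".//mets:mets", XMLNS)
    if mets_root_el is not None:
        return ET.ElementTree(mets_root_el)
    return None


# Made-up, heavily shortened OAI-PMH response for illustration.
OAI_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <mets:mets xmlns:mets="http://www.loc.gov/METS/" OBJID="example-1">
      <mets:dmdSec ID="DMD1"/>
    </mets:mets>
  </metadata></record></GetRecord>
</OAI-PMH>"""

mets_tree = extract_mets(ET.fromstring(OAI_RESPONSE))
```

The resulting tree can then be serialized with `mets_tree.write(...)` like any other `ElementTree`.
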
> does not cost much

It's not a question of performance, just maintainability and documentation. But I hear you, we'll keep it available for now and revisit the issue in...
I suspect that during validation, data is persisted at the wrong place. We had that problem before with [spurious `TEMP` folders inside workspaces after validation](https://github.com/OCR-D/core/issues/324). But I haven't yet investigated,...