Konstantin Baierer comments

Results: 279 comments of Konstantin Baierer

Use of image from --allow-enhancement option in OCR-D workflow

> A question. Is the `--allow-enhancement` option resultant image intended to be used in an OCR-D workflow?

No, we currently do not use the intermediate results of eynollah, only the...

Eynollah on Python 3.8

If you want to try out more OCR-D tools, then you're really best off with a dedicated Python 3.7 environment for that, because there are other processors that will not...

Eynollah on Python 3.8

> could something like `pip > 19` be added to `requirements.txt`?

It could, but then you still have to run it twice because the first time round you're still using...

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D workspaces

[@mikegerber](https://gitter.im/OCR-D/Lobby?at=5f197bbc15e5e72fb36edf7a)

> here is my collection of METS/PAGE file fixer scripts, as mentioned in the call: https://github.com/mikegerber/sbb-useful-hacks/tree/master/mets-fixers - not to be used lightly, no warranty, you have been warned 🚧...

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D workspaces

> Should these use case groups maybe be put into two separate processors/tools?

Yes, probably. Or even task-specific processors (`ocrd-sanitize-prune-filegroups`, `ocrd-sanitize-textequiv`, ...)

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D workspaces

Of interest in this context: https://github.com/tboenig/AletheiaTools

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D workspaces

Another useful operation: Assign `pcGtsId` from the `mets:file/@ID`
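
As a rough illustration of that operation, here is a minimal sketch using only the Python standard library. The helper names `file_ids_by_href` and `assign_pcgtsid`, as well as the sample METS and PAGE documents, are made up for this example and are not part of OCR-D core or any existing tool.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def file_ids_by_href(mets_xml):
    """Map each mets:FLocat/@xlink:href to the @ID of its parent mets:file."""
    root = ET.fromstring(mets_xml)
    mapping = {}
    for file_el in root.iter("{%s}file" % METS_NS):
        flocat = file_el.find("{%s}FLocat" % METS_NS)
        if flocat is not None:
            mapping[flocat.get("{%s}href" % XLINK_NS)] = file_el.get("ID")
    return mapping


def assign_pcgtsid(page_xml, file_id):
    """Return the PAGE document with pcGtsId on the root set to file_id."""
    root = ET.fromstring(page_xml)
    if not root.tag.endswith("PcGts"):
        raise ValueError("root element is not PcGts")
    root.set("pcGtsId", file_id)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, only here to make the sketch runnable.
SAMPLE_METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-GT">
      <mets:file ID="OCR-D-GT_0001" MIMETYPE="application/vnd.prima.page+xml">
        <mets:FLocat LOCTYPE="OTHER" OTHERLOCTYPE="FILE"
                     xlink:href="OCR-D-GT/0001.xml"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""

SAMPLE_PAGE = (
    '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">'
    '<Page imageFilename="0001.tif" imageWidth="1" imageHeight="1"/>'
    "</PcGts>"
)

ids = file_ids_by_href(SAMPLE_METS)
fixed = assign_pcgtsid(SAMPLE_PAGE, ids["OCR-D-GT/0001.xml"])
```

Note that `ElementTree` rewrites namespace prefixes on serialization, so a real tool would probably want to register the PAGE namespace first or use a library that preserves the original prefixes.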

RFC: ocrd-sanitize script to preprocess/postprocess OCR-D workspaces

> Something related: extract METS/MODS from xml_doc created from OAI-Response like this:
>
> ```
> mets_root_el = xml_root.find('.//mets:mets', XMLNS)
> if mets_root_el is not None:
>     return ET.ElementTree(mets_root_el)
> ...
> ```
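
The quoted snippet is cut off, so the following is only a self-contained sketch of the same idea, not the original script. The function name `extract_mets`, the contents of the `XMLNS` mapping, and the sample OAI-PMH response are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Assumed prefix mapping; the quoted snippet does not show how XMLNS is defined.
XMLNS = {"mets": "http://www.loc.gov/METS/"}


def extract_mets(xml_root):
    """Return the first nested mets:mets element as its own tree, or None."""
    mets_root_el = xml_root.find(".//mets:mets", XMLNS)
    if mets_root_el is not None:
        return ET.ElementTree(mets_root_el)
    return None


# Hypothetical, heavily shortened OAI-PMH GetRecord response.
SAMPLE_OAI = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <metadata>
        <mets:mets xmlns:mets="http://www.loc.gov/METS/">
          <mets:dmdSec ID="DMD1"/>
        </mets:mets>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>"""

mets_tree = extract_mets(ET.fromstring(SAMPLE_OAI))
no_mets = extract_mets(ET.fromstring("<empty/>"))
```

The `.//` search only looks at descendants, so a document whose root element already is `mets:mets` yields `None` here and would need a separate check.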

[RFC] Drop workspace backup mechanism?

> does not cost much

It's not a question of performance, just maintainability and documentation. But I hear you, we'll keep it available for now and revisit the issue in...

ocrd workspace validate: strange behaviour with symbolic links

I suspect that during validation, data is persisted at the wrong place. We had that problem before with [spurious `TEMP` folders inside workspaces after validation](https://github.com/OCR-D/core/issues/324). But I haven't yet investigated,...