Robert Sachunsky
@MehmedGIT what's the status of your OcrdMets profiling experiment?
@MehmedGIT I don't understand: the [benchmark-mets](https://github.com/OCR-D/core/tree/benchmark-mets) branch does not seem to contain any actual changes to the modules, only additional tests in `benchmarks` (i.e. outside the normal test set). Also,...
@MehmedGIT I see, thanks.
But we first need the extraction Python API in core. Currently, it is only in `common` of OCR-D/ocrd_tesserocr, right? (related: OCR-D/ocrd_tesserocr#56)
Also, bashlib needs to offer more than just _extraction_: we also have to create PAGE-XML output with Olena binarization (referencing the new file in `AlternativeImage` and in the METS). So...
I fully agree. FYI, for Olena I am in the middle of a PR that will allow querying from and appending to PAGE's `AlternativeImage` with xmlstarlet – tentatively only one...
@kba, see bashlib-related FIXMEs in OCR-D/ocrd_olena#5
> can this be closed?

I don't think so. We should at least re-visit. What we have now as a proof of concept in `ocrd-olena-binarize` and `ocrd-im6convert` is based on...
> If we had a [generic image processor in Python](https://github.com/OCR-D/core/issues/385), we could probably reduce the need for that shell API greatly.

We do have that (as part of [ocrd_wrap](https://github.com/bertsky/ocrd_wrap)'s `ocrd-preprocess-image`),...
There's also ocrd_im6convert. And I don't think we should restrict bashlib in any way just because there are no more processors using it right now. In fact, I think it's...