Robert Sachunsky

Results 272 issues of Robert Sachunsky

We build the Docker image by setting `ocrd/core:$CORE_VERSION` as the base stage, with `CORE_VERSION` being the ref currently checked in as a submodule. That's not ideal, since we then `make all`, which installs core...

Besides addressing the currently open issues…

- [x] #335
- [ ] #318
- [ ] #317
- [ ] #302
- [ ] #297
- [x] #249
- [x]...

Among the [published models](https://github.com/doc-analysis/DocBank/blob/master/MODEL_ZOO.md#models), [the one for Detectron2](https://layoutlm.blob.core.windows.net/docbank/model_zoo/X101.zip) recently stopped working. The server now responds with `409 Public access is not permitted on this storage account.` Could this be hosted...

Since `backtraceFrom` is implemented by recursion (instead of iteration), calling the aligner on "long" sequences (more than 1000 items) results in a `RecursionError` with Python defaults. Extending stack depth limit...
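To illustrate the recursion-versus-iteration point, here is a minimal sketch of a global alignment whose backtrace is a plain `while` loop, so its depth does not depend on the sequence length. This is not the library's `backtraceFrom`: the function name `align`, the unit edit costs, and the full cost table are assumptions made only for this example.

```python
def align(a, b):
    """Globally align two sequences with unit costs (hypothetical helper).

    Returns a list of (x, y) pairs, where None marks a gap on that side.
    """
    n, m = len(a), len(b)
    # cost[i][j] = edit distance between a[:i] and b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Iterative backtrace: walk from the bottom-right corner to the origin
    # in a loop instead of recursing once per step.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((a[i - 1], None))  # item only in a
            i -= 1
        else:
            pairs.append((None, b[j - 1]))  # item only in b
            j -= 1
    pairs.reverse()
    return pairs
```

With this shape, sequences longer than Python's default recursion limit can be backtraced without touching `sys.setrecursionlimit`. The quadratic cost table is still there, so this addresses only the stack depth, not memory use.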

I am not sure whether there even exists a metric for this: if there is a segment where both sequences deviate a lot in their length, then the current algorithm...

When globally aligning sequences that deviate a lot, combinatorial explosion can quickly lead to excessive runtime memory consumption in the current implementation. And it is not always easy to detect those...

I am trying to `pixConvertToPdf` a 1bpp image (with cmap I think), but the created PDF file seems to be invalid: ``` GPL Ghostscript 9.26 (2018-11-20) Copyright (C) 2018 Artifex...

bug

See details under https://github.com/sirfz/tesserocr/issues/274

https://github.com/wincentbalin/pytesstrain/blob/b6a85dec3a02b878f8cee7d8170a75e7dabaeca6/pytesstrain/metrics/cer.py#L6 This definition is common, but flawed IMHO: the numerator being a Levenshtein distance, i.e. a sum of costs along a path through the confusion matrix, the natural denominator for...