sbb_binarization

Document Image Binarization

Results 10 sbb_binarization issues

When processing a document of 1.5k pages of medium size (1-2 MP each), I am observing a slow but steady increase in RSS from 4 GB up to 14 GB...

adds the option to use a directory as input for batch processing

```
$ ocrd resmgr download ocrd-sbb-binarize default-2021-03-09
14:29:34.198 INFO ocrd.cli.resmgr - Downloading registered resource 'default-2021-03-09' (https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_2021_03_09.zip)
```

```
$ ocrd-sbb-binarize -P model default-2021-03-09 -I OCR-D-IMG -O TEST-OCRD-SBB-BINARIZE
Traceback (most recent...
```

The minimum TensorFlow version is still pinned at 2.4, but it should be 2.12.1 to be in line with [eynollah](https://github.com/qurator-spk/eynollah/tree/main).

I would like to fine-tune the model towards the data that I will be feeding it. My pipeline would be to binarize the images using sbb_binarize, then manually edit them...

```
❯ sbb_binarize --model-dir saved_model_2021_03_09 actevedef_718448162.first-page/OCR-D-IMG/OCR-D-IMG_00000024.tif test.tif
Traceback (most recent call last):
  File "/home/b-mg106/.virtualenvs/sbb_binarization_issue-47/bin/sbb_binarize", line 8, in <module>
    sys.exit(main())
[...]
  File "/home/b-mg106/.virtualenvs/sbb_binarization_issue-47/lib/python3.9/site-packages/tensorflow/python/saved_model/loader_impl.py", line 116, in parse_saved_model
    raise IOError(
OSError: SavedModel file does...
```

adds a hybrid CNN-Transformer model

In order to batch-binarize thousands of images, I've rewritten the prediction script to allow us to predict around 1500-2000 images per hour on a decent machine with two GPUs. The...

I have material with typewritten forms that is very challenging (to any binarization method), because the typewriter sometimes fades out, while the printing ink near it blasts in a dark...

question

When the model is applied in patch mode (the default), a loop runs over the windows (on CPU / in NumPy), and each window is passed to `model.predict()` as a single image...

enhancement
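
The patch-mode loop described in the last issue can be contrasted with a batched variant. The following is only a rough sketch of that idea, not the project's actual code: `predict_stub`, `extract_patches`, `stitch`, and `binarize_batched` are hypothetical names, `predict_stub` stands in for `model.predict()`, and the non-overlapping windows with edge padding are simplifying assumptions.

```python
import numpy as np


def predict_stub(batch):
    """Hypothetical stand-in for model.predict(): a per-pixel threshold."""
    return (batch.mean(axis=-1, keepdims=True) > 0.5).astype(np.float32)


def extract_patches(image, size):
    """Pad to a multiple of `size`, then cut into non-overlapping patches."""
    h, w = image.shape[:2]
    pad_h, pad_w = (-h) % size, (-w) % size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="edge")
    rows, cols = padded.shape[0] // size, padded.shape[1] // size
    patches = (
        padded.reshape(rows, size, cols, size, -1)
        .swapaxes(1, 2)
        .reshape(rows * cols, size, size, -1)
    )
    return patches, (rows, cols)


def stitch(patches, grid, out_shape):
    """Reassemble patches into one image and crop away the padding."""
    rows, cols = grid
    size = patches.shape[1]
    full = (
        patches.reshape(rows, cols, size, size, -1)
        .swapaxes(1, 2)
        .reshape(rows * size, cols * size, -1)
    )
    return full[: out_shape[0], : out_shape[1]]


def binarize_batched(image, predict, size=4, batch_size=8):
    """Call `predict` once per batch of patches instead of once per patch."""
    patches, grid = extract_patches(image, size)
    outputs = [
        predict(patches[i : i + batch_size])
        for i in range(0, len(patches), batch_size)
    ]
    return stitch(np.concatenate(outputs), grid, image.shape[:2])
```

With a real model, `predict` would be something like `model.predict`, and `size` would have to match the model's input window. Whether batching helps in practice depends on GPU memory and on how much time the per-call overhead actually costs.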