Provide ability to disable DebugPixa

Open robertknight opened this issue 2 years ago • 4 comments

I am working on a WebAssembly build of Tesseract targeted at browsers. To reduce the download size I have compiled Leptonica with all optional image formats disabled, since browsers already include code to read images in various formats.

Even though I am not using DebugPixa, this code still gets invoked and it produces warnings about missing PNG support. To work around this I have patched Tesseract to avoid instantiating the DebugPixa object entirely. I'm happy to tidy up the relevant parts of the patch and submit it as a PR, but I wanted to check how it should work first. Is avoiding instantiating DebugPixa if tessedit_dump_pageseg_images is not set the right approach, or is there a better way?
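To make the question concrete, the runtime approach could look roughly like the sketch below. This is not the actual patch: `DebugPixaSketch` and `PageSegSketch` are hypothetical stand-ins for the real classes, and the `dump_pageseg_images` flag only mirrors the role of tessedit_dump_pageseg_images. The point is that the debug object is created lazily, so nothing image-related runs unless the flag is set.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for DebugPixa. The real class collects Leptonica
// images; this one only records captions so the sketch stays self-contained.
class DebugPixaSketch {
 public:
  void AddPix(const std::string &caption) { captions_.push_back(caption); }
  std::size_t size() const { return captions_.size(); }

 private:
  std::vector<std::string> captions_;
};

// Hypothetical owner of the debug object (assumed shape, not Tesseract's API).
class PageSegSketch {
 public:
  explicit PageSegSketch(bool dump_pageseg_images)
      : dump_pageseg_images_(dump_pageseg_images) {}

  // Records a debug image only when dumping is enabled. The debug object is
  // created on first use, so it never exists when the flag is off.
  void RecordDebugImage(const std::string &caption) {
    if (!dump_pageseg_images_) {
      return;
    }
    if (!pixa_debug_) {
      pixa_debug_ = std::make_unique<DebugPixaSketch>();
    }
    pixa_debug_->AddPix(caption);
  }

  bool has_debug_pixa() const { return pixa_debug_ != nullptr; }
  std::size_t debug_image_count() const {
    return pixa_debug_ ? pixa_debug_->size() : 0;
  }

 private:
  bool dump_pageseg_images_;
  std::unique_ptr<DebugPixaSketch> pixa_debug_;
};
```

With the flag off, `RecordDebugImage` returns before anything is allocated, which is what would avoid the PNG warnings in a build without image format support.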

This relates to https://github.com/tesseract-ocr/tesseract/projects/1#card-78973352 on the project board.

robertknight, May 31 '22 07:05

The card on the project board suggests a different solution which should also work for you. It requires a configure option to disable DebugPixa, plus conditional code which removes the relevant code when DebugPixa is disabled.
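A rough sketch of what such conditional code might look like follows. The macro name `DISABLE_DEBUG_PIXA` and the class `DebugPixaBuildSketch` are assumptions made for illustration only; neither is taken from the Tesseract sources or from the project card. When the macro is defined, the class collapses to an empty stub so callers compile unchanged while the debug code drops out of the binary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// DISABLE_DEBUG_PIXA is a hypothetical macro that a configure option could
// define, for example by adding -DDISABLE_DEBUG_PIXA to the compiler flags.
#ifndef DISABLE_DEBUG_PIXA

constexpr bool kDebugPixaEnabled = true;

// Full variant: keeps every caption it is given (stand-in for real images).
class DebugPixaBuildSketch {
 public:
  void AddPix(const std::string &caption) { captions_.push_back(caption); }
  std::size_t size() const { return captions_.size(); }

 private:
  std::vector<std::string> captions_;
};

#else

constexpr bool kDebugPixaEnabled = false;

// Stub variant: same interface, no storage and no work, so call sites need
// no #ifdef of their own.
class DebugPixaBuildSketch {
 public:
  void AddPix(const std::string &) {}
  std::size_t size() const { return 0; }
};

#endif
```

Keeping the interface identical in both branches confines the preprocessor checks to one place. An alternative would be to wrap each call site in its own `#ifndef`, which removes even the empty calls but scatters the conditionals through the code.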

stweil, May 31 '22 09:05

https://github.com/janis91/ocr is another project which uses WebAssembly via tesseract.js.

stweil, May 31 '22 09:05

> It requires a configure option to disable DebugPixa and conditional code which removes the relevant code if DebugPixa is disabled.

Yes, either a runtime or a build-time option works for me.

robertknight, May 31 '22 11:05

I'd prefer the build-time variant, since minimized code also means minimized runtime resources.

stweil, May 31 '22 11:05