Automatically downscale large input images

Open robertknight opened this issue 1 year ago • 8 comments

Input images from cameras and similar sources often have a much higher resolution than is needed to read the text. Downscaling the image can often produce the same output in much less time, because every step in the pipeline that works directly on the input image has less memory to move around and less computation to do when the image is smaller.

As an example, I ran ocrs on an invoice I'd recently received from a tradesman. The photo of the invoice was 2479 x 3337 pixels, and ocrs took about 1.5s to process it on my Intel Mac. Downsizing to 30% of the original input size produced the same extracted output but ran nearly twice as fast (800-900ms).

In some cases the input image really does need high resolution to make the text legible, so some mechanism to control this would be useful.
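
A minimal sketch of what such a control could look like, purely for illustration. The `Downscale` enum and `target_size` function are hypothetical names, not part of ocrs, and the sketch assumes that "30%" above means scaling each dimension by 0.3. It only computes the size to resize to; the actual resampling would be done by whatever image code runs before the pipeline.

```rust
/// Hypothetical setting for automatic downscaling (not an existing ocrs API).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Downscale {
    /// Keep the input at its original resolution, e.g. for very small text.
    Off,
    /// Shrink the image so that its longest side is at most this many pixels.
    MaxSide(u32),
    /// Scale both dimensions by a fixed factor, clamped to the range 0.0..=1.0.
    Factor(f64),
}

/// Compute the (width, height) to resize to before running OCR.
/// The result is never larger than the input, and the aspect ratio is
/// preserved up to rounding.
pub fn target_size(width: u32, height: u32, mode: Downscale) -> (u32, u32) {
    let scale = match mode {
        // Opting out returns the input size unchanged.
        Downscale::Off => return (width, height),
        Downscale::MaxSide(max_side) => {
            // Guard against division by zero for degenerate inputs.
            let longest = width.max(height).max(1);
            (max_side as f64 / longest as f64).min(1.0)
        }
        Downscale::Factor(factor) => factor.clamp(0.0, 1.0),
    };
    // Round to the nearest pixel, but never collapse a dimension to zero.
    let scaled = |dim: u32| ((dim as f64 * scale).round() as u32).max(1);
    (scaled(width), scaled(height))
}
```

Under these assumptions, `target_size(2479, 3337, Downscale::Factor(0.3))` gives 744 x 1001 for the invoice photo, which is roughly 9% of the original pixel count. `Downscale::Off` covers the case where the text is only legible at full resolution.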

robertknight · Jan 07 '24 12:01