Customizing DPI value for OCR

Open harinisri2001 opened this issue 10 months ago • 4 comments

@dolfim-ibm How can I pass a custom DPI value for OCR?

harinisri2001 avatar Jan 30 '25 10:01 harinisri2001

No, currently this is not exposed as a parameter. Each OCR engine has a given image scaling factor in its implementation.

In most cases it is:

self.scale = 3  # multiplier for 72 dpi == 216 dpi

dolfim-ibm avatar Jan 30 '25 10:01 dolfim-ibm
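
For readers following the arithmetic in the comment above: PDF page coordinates are measured in points, at 72 points per inch, so a scale of 3 corresponds to 3 × 72 = 216 dpi. The sketch below only illustrates that conversion. The helper names (`scale_to_dpi`, `dpi_to_scale`, `rendered_size`) are made up for this example and are not part of docling.

```python
# Illustrative helpers only; none of these names exist in docling.
BASE_DPI = 72  # PDF user space: 72 points per inch


def scale_to_dpi(scale: float, base_dpi: int = BASE_DPI) -> float:
    """Effective rendering resolution for a given scale multiplier."""
    return scale * base_dpi


def dpi_to_scale(dpi: float, base_dpi: int = BASE_DPI) -> float:
    """Scale multiplier needed to reach a target resolution."""
    return dpi / base_dpi


def rendered_size(width_pt: float, height_pt: float, scale: float) -> tuple[int, int]:
    """Pixel dimensions of a page (given in points) rendered at `scale`."""
    return round(width_pt * scale), round(height_pt * scale)


if __name__ == "__main__":
    print(scale_to_dpi(3))             # the hard-coded value quoted above
    print(dpi_to_scale(300))           # multiplier a 300 dpi target would need
    print(rendered_size(612, 792, 3))  # US Letter page in points
```

A target of 300 dpi would therefore need a multiplier of about 4.17, which is the value one would have to change in the backend code while the setting remains hard-coded.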

@dolfim-ibm Thank you for the quick reply. Is it not possible even with TesseractCliOcrOptions()?

harinisri2001 avatar Jan 30 '25 11:01 harinisri2001

It is possible to add it as an option to all OCR models, but at the moment the value is hard-coded.

dolfim-ibm avatar Jan 30 '25 13:01 dolfim-ibm
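
As a rough illustration of what exposing the value as an option could look like, here is a minimal sketch. It is purely hypothetical: `HypotheticalOcrOptions`, its `scale` field, and `from_dpi` are invented for this example and do not exist in docling, and the default of 3 simply mirrors the hard-coded multiplier quoted earlier in the thread.

```python
from dataclasses import dataclass

BASE_DPI = 72  # PDF user space: 72 points per inch


@dataclass
class HypotheticalOcrOptions:
    """Illustrative only; NOT a docling class."""

    # Default mirrors the hard-coded multiplier quoted in this thread.
    scale: float = 3.0

    def __post_init__(self) -> None:
        # Reject values that cannot produce a rendered image.
        if self.scale <= 0:
            raise ValueError("scale must be positive")

    @property
    def dpi(self) -> float:
        """Effective rendering resolution implied by the scale."""
        return self.scale * BASE_DPI

    @classmethod
    def from_dpi(cls, dpi: float) -> "HypotheticalOcrOptions":
        """Build options from a target DPI instead of a multiplier."""
        return cls(scale=dpi / BASE_DPI)


if __name__ == "__main__":
    print(HypotheticalOcrOptions().dpi)
    print(HypotheticalOcrOptions.from_dpi(300).scale)
```

Each OCR backend would still need to read such a field instead of its own constant, so this shows only the shape of the option, not a working integration.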

More precisely, the DPI value is hardcoded individually per OCR backend, since every OCR backend has its own expectations regarding resolution.

cau-git avatar Jan 30 '25 14:01 cau-git

@harinisri2001 Closing this issue, as it appears there is no further follow-up required. Please re-open if you have further input.

cau-git avatar May 23 '25 08:05 cau-git