
Isolated character problem

Open kvcuong opened this issue 2 years ago • 1 comment

Hi,

Many isolated characters are lost in version 4.1, and even in the latest version. Maybe this is because they are relatively small (at 200 DPI).

Do you know where the problem is? Is it in segmentation or in connected component filtering?
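One way to narrow this down might be to run the same image under several page segmentation modes and compare the output. Below is a minimal sketch, assuming the `tesseract` executable is on the PATH and that the image is saved as `isolated_char.png` (a hypothetical file name). It uses only the documented `--psm` and `--dpi` command-line options. The helper names `build_command` and `compare_psm_modes` are made up for illustration.

```python
import shutil
import subprocess

# Page segmentation modes worth comparing (descriptions from `tesseract --help-psm`).
PSM_CANDIDATES = {
    6: "single uniform block of text",
    7: "single text line",
    11: "sparse text, no particular order",
}


def build_command(image_path, psm, dpi=None):
    """Build a tesseract CLI call that prints the recognized text to stdout."""
    cmd = ["tesseract", image_path, "stdout", "--psm", str(psm)]
    if dpi is not None:
        # --dpi tells tesseract the input resolution when the file lacks it.
        cmd += ["--dpi", str(dpi)]
    return cmd


def compare_psm_modes(image_path, dpi=None):
    """Run tesseract once per candidate mode and return {psm: text}.

    Returns an empty dict if tesseract is not installed.
    """
    if shutil.which("tesseract") is None:
        return {}
    results = {}
    for psm in PSM_CANDIDATES:
        proc = subprocess.run(
            build_command(image_path, psm, dpi),
            capture_output=True,
            text=True,
        )
        results[psm] = proc.stdout.strip()
    return results


if __name__ == "__main__":
    # "isolated_char.png" is an assumed file name for the image in this issue.
    for psm, text in compare_psm_modes("isolated_char.png", dpi=200).items():
        print(f"--psm {psm} ({PSM_CANDIDATES[psm]}): {text!r}")
```

If the trailing "1" shows up under `--psm 11` but not under the default mode, that would suggest the character is dropped during layout analysis rather than by the recognizer itself, although this is only a heuristic and not a confirmed diagnosis.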

Environment

Tesseract Version: 4.1, Windows 10

Current Behavior:

(image: isolated_char)

Output text: No. TAG page 07396

Expected Behavior:

No. TAG page 07396 1

Suggested Fix:

kvcuong, Jul 18 '22 20:07

I'm also facing the same problem. Increasing the DPI doesn't help. In my case, if I highlight only the word "Serie:" (image), it does work.

But if I highlight "Serie" together with the letter "B", it doesn't read anything (image).

danielscatigno-ncpc, Aug 10 '23 14:08