
Lines with small line height not recognized

Open torava opened this issue 4 years ago • 3 comments

Lines with low line spacing are missing from the output, and I haven't found a way to get them recognized. If a line is not separated enough from its neighbors, it won't be detected. Could Tesseract support recognizing text in dense lines as well?

Tesseract recognizes none of this text, which has a barcode close to it:

86_cropped

torava avatar May 04 '20 18:05 torava

I am having the same problem (one year later). An example image is attached. The last three lines of the table have cramped spacing, with the last two lines actually touching. These three lines are not correctly recognized as separate lines.

sss88_p_186

carolinering avatar May 12 '21 16:05 carolinering

Tesseract's layout analysis module can't handle touching blocks or text lines.

amitdo avatar Jun 13 '22 00:06 amitdo

This issue is unlikely to be fixed in the foreseeable future.

amitdo avatar Jun 17 '22 14:06 amitdo
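
Since layout analysis can't separate touching lines, one possible workaround is to split the lines yourself before OCR. Below is a minimal sketch, not anything Tesseract does internally: it assumes the image is already binarized into rows of 0/1 values (1 = ink), and the names `row_ink_profile`, `split_text_lines`, and `valley_ratio` are made up for illustration. It looks for rows with very little ink (a horizontal projection profile) and cuts there, so lines that touch at only a few pixels still get separated.

```python
def row_ink_profile(image):
    """Count ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def split_text_lines(image, valley_ratio=0.2):
    """Return (top, bottom) row ranges, bottom exclusive, one per text band.

    A row is treated as a gap when its ink count is at most
    valley_ratio times the ink count of the densest row, so lines
    that touch at only a few pixels are still split apart.
    valley_ratio is a hypothetical tuning knob, not a Tesseract option.
    """
    profile = row_ink_profile(image)
    peak = max(profile, default=0)
    if peak == 0:
        return []  # blank image: no text bands

    threshold = peak * valley_ratio
    bands = []
    start = None
    for y, ink in enumerate(profile):
        if ink > threshold and start is None:
            start = y  # entering a text band
        elif ink <= threshold and start is not None:
            bands.append((start, y))  # leaving a text band
            start = None
    if start is not None:
        bands.append((start, len(profile)))  # band runs to the bottom edge
    return bands


if __name__ == "__main__":
    # Two "lines" of text that touch at a single pixel in row 3.
    demo = [
        [0] * 10,
        [1] * 8 + [0] * 2,
        [1] * 10,
        [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
        [1] * 9 + [0],
        [1] * 10,
    ]
    print(split_text_lines(demo))
```

Each band could then be cropped and passed to Tesseract in single-line mode (`--psm 7`, which treats the image as a single text line), so that it no longer has to find the lines on its own. This will not help when lines overlap heavily, because then there is no low-ink row to cut at.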