tesseract
Lines with small line height not recognized
Lines with low line spacing are missing from the output, and I haven't found a way to get them recognized. If a line is not separated enough from its neighbors, it is not detected. Could Tesseract support recognizing text in densely spaced lines as well?
Tesseract recognizes none of this text, which has a bar code close to it:
I am having the same problem (one year later). An example image is attached. The last three lines of the table have cramped spacing, with the last two lines actually touching. These three lines are not correctly recognized as separate lines.
Tesseract's layout analysis module can't handle touching blocks or text lines.
This issue is unlikely to be fixed in the foreseeable future.
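Since the layout analysis itself is not expected to change, one possible workaround is to split the lines yourself before calling Tesseract. The sketch below is not part of Tesseract. It is a plain horizontal projection profile over an already binarized image, and the names `row_ink_profile`, `split_text_lines`, and `gap_ratio` are made up for illustration. It assumes the text is horizontal and not skewed, and that touching lines share only a few pixels (for example a descender meeting an ascender), so rows with very little ink can be treated as gaps.

```python
def row_ink_profile(image):
    """Count ink pixels per row. `image` is a list of rows of 0/1 values (1 = ink)."""
    return [sum(row) for row in image]


def split_text_lines(image, gap_ratio=0.2):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line band.

    A row is treated as a gap when its ink count is at most gap_ratio * peak,
    so lines that touch in only a few pixels are still separated.
    gap_ratio is a hypothetical tuning knob, not a Tesseract parameter.
    """
    profile = row_ink_profile(image)
    peak = max(profile, default=0)
    if peak == 0:
        return []  # blank image, no text bands
    threshold = gap_ratio * peak

    bands = []
    start = None
    for y, ink in enumerate(profile):
        if ink > threshold and start is None:
            start = y                 # entering a text band
        elif ink <= threshold and start is not None:
            bands.append((start, y))  # leaving a text band
            start = None
    if start is not None:
        bands.append((start, len(profile)))  # band runs to the bottom edge
    return bands


# Two "lines" that touch through a single pixel in row 3.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]
print(split_text_lines(page))  # [(1, 3), (4, 6)]
```

Each band could then be cropped (probably with a little padding) and passed to Tesseract on its own with page segmentation mode 7 (`--psm 7`, treat the image as a single text line), which avoids relying on layout analysis to separate the lines. This will not help when lines overlap heavily, and the threshold would need tuning per document.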