
worse results than previous version

Open u6aab opened this issue 2 years ago • 1 comments

I'm getting a bunch of random letters and symbols included in my results after upgrading from 3.3.1 to 3.4.0, both times using default options on Windows 10. It seems to have something to do with image quality (maybe sensitivity to grain/noise?), since running a screenshot of a PDF through doesn't produce the same issues.

[Screenshots: results with 3.4.0 and 3.3.1]

u6aab avatar Apr 09 '22 15:04 u6aab

I am encountering exactly the same problem. I use gImageReader for OCR of old German (Fraktur) texts. I expected some improvement with version 3.4.0, but while I get almost 100% correct results with version 3.3.1, version 3.4.0 delivers lots of mistakes - with the same image to read, the same *.traineddata file and the same program options. Is there a possible reason for this, e.g. the change from tessdata 4 to tessdata 5? Result with 3.3.1:

[Screenshot: OCR result with 3.3.1]

Result with 3.4.0:

[Screenshot: OCR result with 3.4.0]

Jossi2 avatar Aug 16 '22 18:08 Jossi2

Same issue for me. About 90% of all output is random letters. File, file type, and DPI make no difference. It occurs every time.

ToxicSmurf avatar Sep 20 '22 15:09 ToxicSmurf

I can't tell from the screenshots: are you using the Windows version? I use version 3.4 on Linux and have no issues, but the portable Windows version produces the bad OCR results with the same source material.

hendrack avatar Dec 28 '22 09:12 hendrack

Issue seems to be solved with version 3.4.1. Almost faultless results again. Thank you for taking the trouble to further improve this excellent program!

Jossi2 avatar Jan 29 '23 20:01 Jossi2

Thanks for your feedback!

manisandro avatar Jan 29 '23 20:01 manisandro