Can you recommend the best traineddata for numbers and Latin letters?

Open JenyaKirmizaTripTop opened this issue 8 years ago • 4 comments

I'm using Tesseract to read MRZ codes, and it sometimes returns incorrect symbols, e.g. "1" instead of "I", or "S" instead of "5".

Feb 03 '18 13:02 JenyaKirmizaTripTop

Which version of tesseract and traineddata are you using?

Mar 20 '18 15:03 Shreeshrii

I'm using tess2 and I want to choose the traineddata that fits best, because I need to read the MRZ code from passports. It contains the symbols '<,>', numbers and Latin letters. I see mistakes when reading <,> and also when reading Latin letters. Sometimes it replaces numbers with letters, for example 5 is read as S.
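Since each MRZ field has a fixed type (digits only, or letters only), this kind of mix-up can often be repaired after OCR, whichever traineddata is used. Below is a minimal Python sketch of that idea, not part of Tesseract or tess2. The confusion tables and the helper names (`fix_numeric`, `fix_alpha`, `fix_date_field`) are assumptions made for illustration. The check digit follows the ICAO 9303 scheme (weights 7, 3, 1; A=10 to Z=35; `<` counts as 0).

```python
# Illustrative, non-exhaustive confusion tables (assumptions, tune for your data).
LETTER_TO_DIGIT = str.maketrans({"O": "0", "I": "1", "Z": "2", "S": "5", "B": "8"})
DIGIT_TO_LETTER = str.maketrans({"0": "O", "1": "I", "2": "Z", "5": "S", "8": "B"})


def char_value(ch):
    """Numeric value of one MRZ character for the check digit calculation."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    if ch == "<":
        return 0
    raise ValueError(f"not an MRZ character: {ch!r}")


def check_digit(field):
    """ICAO 9303 check digit: weighted sum with weights 7, 3, 1, modulo 10."""
    weights = (7, 3, 1)
    return sum(char_value(c) * weights[i % 3] for i, c in enumerate(field)) % 10


def fix_numeric(field):
    """Map letters that look like digits to digits (for digit-only fields)."""
    return field.translate(LETTER_TO_DIGIT)


def fix_alpha(field):
    """Map digits that look like letters to letters (for letter-only fields)."""
    return field.translate(DIGIT_TO_LETTER)


def fix_date_field(field, check):
    """Repair a 6-digit date field and report whether its check digit matches."""
    fixed = fix_numeric(field)
    check = fix_numeric(check)
    is_digit = len(check) == 1 and "0" <= check <= "9"
    valid = is_digit and all("0" <= c <= "9" for c in fixed) \
        and check_digit(fixed) == int(check)
    return fixed, valid
```

For example, `fix_date_field("74O8I2", "2")` repairs the field to `"740812"` and confirms it against the check digit. Fields that mix letters and digits, such as the document number, cannot be fixed by a blind substitution; there the check digit can only flag a line as suspicious.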

Mar 20 '18 17:03 JenyaKirmizaTripTop

You may get a better response to such questions on the tesseract-ocr forum.

Or open an issue for tesseract.

You can also search for mrz.traineddata. It won't be in the official repo, but it is available as a user contribution.
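Whichever traineddata you pick, it may also help to restrict recognition to the MRZ alphabet (A-Z, 0-9 and `<`). A rough Python sketch follows, assuming the pytesseract wrapper and Pillow are installed. `tessedit_char_whitelist` is a Tesseract config variable, but whether it is honored depends on the Tesseract version and engine, so treat this as something to try rather than a guaranteed fix. The `lang` value is a placeholder for whatever traineddata file is present in your tessdata directory.

```python
import string

# The MRZ alphabet: uppercase Latin letters, digits and the filler character.
MRZ_ALPHABET = string.ascii_uppercase + string.digits + "<"


def mrz_config(psm=6):
    """Build a Tesseract config string limiting output to the MRZ alphabet.

    --psm 6 treats the image as a single uniform block of text, which is an
    assumption that suits a cropped MRZ region.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={MRZ_ALPHABET}"


def read_mrz(image_path, lang="eng"):
    """Run OCR on a cropped MRZ image (requires pytesseract and Pillow)."""
    # Imported here so the config helper above works without these packages.
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(
        Image.open(image_path), lang=lang, config=mrz_config()
    )
```

With a user-contributed MRZ model installed, the call would look like `read_mrz("mrz_crop.png", lang="mrz")`, where both the file name and the language name are hypothetical.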

Mar 20 '18 17:03 Shreeshrii

See https://github.com/tesseract-ocr/tesseract/wiki/AddOns

Mar 21 '18 04:03 Shreeshrii