tessdata_best

Can you recommend the best traineddata for numbers and Latin letters?

Open JenyaKirmizaTripTop opened this issue 7 years ago • 4 comments

I'm using Tesseract to read MRZ codes, and it sometimes returns incorrect symbols, e.g. "1" instead of "I", or "S" instead of "5".

JenyaKirmizaTripTop avatar Feb 03 '18 13:02 JenyaKirmizaTripTop

Which version of tesseract and traineddata are you using?

Shreeshrii avatar Mar 20 '18 15:03 Shreeshrii

I'm using tess2, and I want to choose the traineddata that fits best, because I need to read the MRZ code from passports. It contains symbols such as '<,>', numbers, and Latin letters. I see mistakes when reading <,> and also when reading Latin letters. Sometimes it replaces numbers with letters, for example 5 is read as S.

JenyaKirmizaTripTop avatar Mar 20 '18 17:03 JenyaKirmizaTripTop

You may get a better response to such questions on the tesseract-ocr forum.

Or open an issue for tesseract.

You can also search for mrz.traineddata. It won't be in the official repo, but it is available as a user contribution.

Shreeshrii avatar Mar 20 '18 17:03 Shreeshrii

See https://github.com/tesseract-ocr/tesseract/wiki/AddOns

Shreeshrii avatar Mar 21 '18 04:03 Shreeshrii
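
Independent of which traineddata is used, confusions such as 5/S or 1/I in the second MRZ line of a passport can often be repaired after OCR, because each position in that line allows only digits or only letters. The following is a minimal sketch of that idea, not something proposed in the thread: it assumes the 44-character TD3 (passport) layout from ICAO Doc 9303, the function names `fix_td3_line2` and `check_digit` are hypothetical, and the confusion tables are illustrative and incomplete.

```python
# Sketch: position-aware cleanup of the second MRZ line of a TD3 passport.
# Assumptions (not from the thread): 44-character TD3 layout per ICAO Doc 9303;
# the confusion tables below are illustrative, not exhaustive.

TO_DIGIT = str.maketrans({"O": "0", "I": "1", "Z": "2", "S": "5", "G": "6", "B": "8"})
TO_LETTER = str.maketrans({"0": "O", "1": "I", "2": "Z", "5": "S", "6": "G", "8": "B"})

# (start, end) slices that must hold digits: document-number check digit,
# birth date + check digit, expiry date + check digit, final two check digits
# (position 42 may also be '<', which the translation leaves untouched).
DIGIT_FIELDS = [(9, 10), (13, 20), (21, 28), (42, 44)]
# Slices that must hold letters (or '<'): nationality and sex.
LETTER_FIELDS = [(10, 13), (20, 21)]


def fix_td3_line2(line):
    """Replace look-alike characters in fields where only one class is valid."""
    if len(line) != 44:
        return line  # not a TD3 second line; leave it unchanged
    chars = list(line)
    for start, end in DIGIT_FIELDS:
        chars[start:end] = line[start:end].translate(TO_DIGIT)
    for start, end in LETTER_FIELDS:
        chars[start:end] = line[start:end].translate(TO_LETTER)
    return "".join(chars)


def check_digit(field):
    """MRZ check digit: weights 7, 3, 1 repeating; A-Z count as 10-35, '<' as 0."""
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif ch == "<":
            value = 0
        else:
            value = ord(ch) - ord("A") + 10
        total += value * (7, 3, 1)[i % 3]
    return str(total % 10)


if __name__ == "__main__":
    # Specimen-style line with three OCR errors injected (0 for O, I for 1, S for 5).
    noisy = "L898902C36UT07408I22F12041S9ZE184226B<<<<<10"
    fixed = fix_td3_line2(noisy)
    print(fixed)
    # Compare the recomputed birth-date check digit with the one in the line.
    print(check_digit(fixed[13:19]) == fixed[19])
```

The document number and personal number fields are alphanumeric, so this sketch leaves them alone; for those, comparing the recomputed check digit against the printed one is a way to detect (though not automatically fix) a misread character.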