Provide more information on LANG_TYPE in the documentation

Open giri-kum opened this issue 2 years ago • 2 comments

Is there a lookup table of LANG_TYPE values for all the languages that Tesseract supports?

giri-kum avatar Aug 17 '22 17:08 giri-kum

Please give more details. What do you mean by "lookup table of LANG_TYPE"?

stweil avatar Aug 17 '22 17:08 stweil

@stweil By that I meant: which LANG_TYPE is used for each language during training? The documentation here says that https://github.com/tesseract-ocr/tesstrain defines LANG_TYPE, which can be Indic, RTL, or blank.

I assume it is blank for English, Indic for Hindi, and RTL for Arabic. When fine-tuning, it would be helpful to have this as a lookup table covering all the traineddata files in the tessdata repository.
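
To make the request concrete, here is a minimal Python sketch of what such a lookup table could look like. It contains only the three guesses above, which are unverified assumptions rather than values confirmed by tesstrain. The names `ASSUMED_LANG_TYPE` and `lang_type_for` are hypothetical, and `eng`, `hin`, and `ara` are assumed to be the matching traineddata codes.

```python
# Hypothetical lookup table: traineddata code -> LANG_TYPE.
# These three entries are unverified guesses, not confirmed tesstrain settings.
ASSUMED_LANG_TYPE = {
    "eng": "",       # English: assumed blank
    "hin": "Indic",  # Hindi: assumed Indic
    "ara": "RTL",    # Arabic: assumed RTL
}


def lang_type_for(code: str) -> str:
    """Return the assumed LANG_TYPE for a traineddata code.

    Raises KeyError for any code that is not in the table, so a missing
    entry is never silently treated as blank.
    """
    try:
        return ASSUMED_LANG_TYPE[code]
    except KeyError:
        raise KeyError(f"no LANG_TYPE recorded for {code!r}") from None


if __name__ == "__main__":
    # Format the value as a make-style variable assignment.
    print(f"LANG_TYPE={lang_type_for('hin')}")
```

Raising on an unknown code is deliberate: blank is itself a valid LANG_TYPE, so falling back to an empty string would hide gaps in the table.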

giri-kum avatar Aug 18 '22 18:08 giri-kum