tesstrain

How to prepare the data for new tessdata images in the Khmer language

Open: mengleang-ngoun opened this issue 2 years ago • 1 comment

How do I prepare the data for new tessdata images in Khmer?

mengleang-ngoun, Aug 10 '22 10:08

Khmer or Arabic?

In principle, it's the same process for every language: you need pairs of line images and their transcriptions. AFAIK Khmer is a left-to-right script with distinct characters, so you basically only need ground truth data, and you can either train from scratch or fine-tune the existing Khmer model.
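To make the "pairs of line images and transcriptions" part concrete, here is a minimal sketch of a sanity check you could run before training. It assumes the layout described in the tesstrain README as I understand it: each line image (`.tif`, `.png`, `.bin.png` or `.nrm.png`) sits next to a single-line, UTF-8 transcription with the same base name and the extension `.gt.txt`. The function name `check_ground_truth` and the directory you pass to it are made up for illustration; they are not part of tesstrain.

```python
from pathlib import Path
from typing import List, Optional

# Image extensions assumed from the tesstrain README; longest suffixes first
# so that "line.bin.png" is matched as ".bin.png" rather than ".png".
IMAGE_SUFFIXES = (".bin.png", ".nrm.png", ".png", ".tif")


def base_name(image: Path) -> Optional[str]:
    """Return the file name without its image suffix, or None if not an image."""
    for suffix in IMAGE_SUFFIXES:
        if image.name.endswith(suffix):
            return image.name[: -len(suffix)]
    return None


def check_ground_truth(directory: str) -> List[str]:
    """Hypothetical helper: list problems found in a ground truth folder."""
    problems = []
    for image in sorted(Path(directory).iterdir()):
        stem = base_name(image)
        if stem is None or not image.is_file():
            continue  # skip transcriptions and anything that is not a line image
        gt = image.with_name(stem + ".gt.txt")
        if not gt.exists():
            problems.append(f"{image.name}: missing {gt.name}")
            continue
        text = gt.read_text(encoding="utf-8").strip()
        if not text:
            problems.append(f"{image.name}: transcription is empty")
        elif "\n" in text:
            problems.append(f"{image.name}: transcription has more than one line")
    return problems
```

You would call it as `check_ground_truth("data/khmer_new-ground-truth")`, where `khmer_new` is a placeholder model name, and fix whatever it reports. After that, training should come down to running `make training` with `MODEL_NAME` set to that name, plus `START_MODEL=khm` if you want to fine-tune the existing Khmer model rather than train from scratch. Check the README for the exact variables in your tesstrain version.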

kba, Aug 10 '22 15:08

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Nov 02 '22 01:11