tesstrain
How to prepare the data for new tessdata images in the Khmer language
Khmer or Arabic?
In principle, it's the same process for every language: you need pairs of line images and their transcriptions. AFAIK Khmer is a left-to-right script with distinct characters, so you basically only need ground truth data, and you can either train from scratch or fine-tune the existing Khmer model.
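Before training, it can help to confirm that every line image really has a transcription next to it. Below is a minimal sketch of such a check. It assumes the naming convention described in the tesstrain README (line images ending in `.tif`, `.png`, `.bin.png` or `.nrm.png`, each with a transcription of the same base name ending in `.gt.txt`). The function name `check_ground_truth` and the helper `stem_of` are made up for this example and are not part of tesstrain.

```python
from pathlib import Path

# Image extensions assumed from the tesstrain README. Longest first, so that
# "line.bin.png" maps to the base name "line" rather than "line.bin".
IMAGE_SUFFIXES = (".bin.png", ".nrm.png", ".png", ".tif")
GT_SUFFIX = ".gt.txt"


def stem_of(name, suffixes):
    """Return name without the first matching suffix, or None if none match."""
    for suffix in suffixes:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return None


def check_ground_truth(directory):
    """Return (images without a transcription, transcriptions without an image).

    Both results are sorted lists of base names.
    """
    images, texts = set(), set()
    for path in Path(directory).iterdir():
        if not path.is_file():
            continue
        gt_stem = stem_of(path.name, (GT_SUFFIX,))
        if gt_stem is not None:
            texts.add(gt_stem)
            continue
        image_stem = stem_of(path.name, IMAGE_SUFFIXES)
        if image_stem is not None:
            images.add(image_stem)
    return sorted(images - texts), sorted(texts - images)
```

You would call it on your ground truth folder, for example `check_ground_truth("data/khm_custom-ground-truth")` (the model name `khm_custom` is only a placeholder), and fix whatever base names come back in either list. Once the pairs are complete, training is normally started through the tesstrain Makefile, with `MODEL_NAME` set to your model name and, for fine-tuning, `START_MODEL=khm`. Check the README of your tesstrain version for the exact variables.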