stanza
Syllable+character tokenizer (for, e.g., Thai)
Description
This PR adds the Thai syllable+character tokenizer model framework to stanza. It does not replace the existing models in the library.
Potential change required: removing the use of pythainlp's syllable_segmenter in models/syllabletok/data.py.
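For illustration only, here is a minimal sketch of how the syllable segmenter could be passed in as a plain callable, so that the data code would not need to import pythainlp directly. This is not the code in this PR: `char_chunks` and `syllable_char_features` are hypothetical names, and the fixed-size chunker is only a stand-in for a real syllable segmenter.

```python
from typing import Callable, List, Tuple

# Hypothetical type alias: a segmenter maps raw text to a list of syllables.
Segmenter = Callable[[str], List[str]]


def char_chunks(text: str, size: int = 2) -> List[str]:
    """Stand-in segmenter: fixed-size chunks, NOT linguistically real syllables."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def syllable_char_features(
    text: str, segmenter: Segmenter = char_chunks
) -> List[Tuple[str, int, bool]]:
    """Pair each character with its syllable index and a syllable-initial flag.

    Assumes the segmenter's pieces concatenate back to the input text.
    """
    features = []
    for syl_idx, syllable in enumerate(segmenter(text)):
        for pos, ch in enumerate(syllable):
            features.append((ch, syl_idx, pos == 0))
    return features
```

With this shape, any function that takes a string and returns a list of syllables could be supplied as `segmenter`, and the default could be swapped out without touching the feature code.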
@FTdiscovery What is the corpus's source?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity.