Syllable+character tokenizer (for, e.g., Thai)

Open gsychi opened this issue 2 years ago • 2 comments

Description

This push adds the Thai syllable+character tokenizer model framework to Stanza. It does not replace the existing models in the library.

Potential change required: removing the pythainlp syllable_segmenter dependency in models/syllabletok/data.py
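
For readers unfamiliar with the idea, here is a minimal sketch of what a combined syllable+character input could look like. It is purely illustrative and not taken from this PR: the function name `char_features` and the B/I tagging scheme are assumptions, and the syllable split is written by hand rather than produced by any segmenter.

```python
from typing import List, Tuple


def char_features(syllables: List[str]) -> List[Tuple[str, int, str]]:
    """Pair each character with its syllable index and a B/I position tag.

    Hypothetical helper: "B" marks the first character of a syllable,
    "I" marks every following character in the same syllable.
    """
    features = []
    for syl_idx, syllable in enumerate(syllables):
        for pos, ch in enumerate(syllable):
            tag = "B" if pos == 0 else "I"
            features.append((ch, syl_idx, tag))
    return features


# Hand-written syllable split of "ภาษาไทย" (assumed input, not segmenter output).
for ch, syl_idx, tag in char_features(["ภา", "ษา", "ไทย"]):
    print(repr(ch), syl_idx, tag)
```

A character-level model could consume the syllable index or the B/I tag as an extra feature next to each character. Whether the code in this push does anything similar is not stated in the thread.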

gsychi, Sep 20 '21 19:09

@FTdiscovery What is the corpus's source?

wannaphong, Sep 09 '22 21:09

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Aug 11 '23 22:08

This issue has been automatically closed due to inactivity.

stale[bot], Apr 22 '24 09:04