Does segment support splitting Traditional Chinese into words?

Open · ShawTim opened this issue 4 years ago · 1 comment

As the title says.

Or do I need to convert it to Simplified Chinese first, and then convert it back?

ShawTim · Dec 26 '20 11:12

It probably won't work very well for traditional characters because the segmentation library used (jieba) is trained on simplified texts. For now you'll probably have to convert to simplified first.

peterolson · Dec 26 '20 11:12
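
A minimal sketch of the "convert to simplified first" workaround, for anyone landing here. Rather than converting the segments back to traditional afterwards (which can be lossy), it slices the original traditional string at the same offsets as the simplified segments. `segmentTraditional` is a hypothetical helper, not part of hanzi-tools, and it assumes the simplifier maps characters one-to-one so the string length does not change; it throws if that assumption fails. The `simplify` and `segment` functions are passed in, so nothing here depends on a particular library's API.

```javascript
// Hypothetical helper (not part of hanzi-tools): segment Traditional Chinese
// by segmenting its Simplified form and reusing the word boundaries.
// Assumption: simplify() is length-preserving (one character in, one out).
function segmentTraditional(text, simplify, segment) {
  const simplified = simplify(text);
  if (simplified.length !== text.length) {
    throw new Error("simplify() changed the text length; offsets cannot be mapped back");
  }

  const words = segment(simplified);
  if (words.join("") !== simplified) {
    throw new Error("segment() output does not reassemble its input");
  }

  // Walk the original text, cutting it at the same offsets as the
  // simplified segments, so the original characters are returned.
  let offset = 0;
  return words.map((word) => {
    const original = text.slice(offset, offset + word.length);
    offset += word.length;
    return original;
  });
}

// Possible usage, assuming the library exposes `simplify` and `segment`
// with these shapes (string -> string, string -> array of strings):
//   const hanzi = require("hanzi-tools");
//   segmentTraditional(someTraditionalText, hanzi.simplify, hanzi.segment);
```

Since the output is cut from the original input, no traditional characters are altered by a round trip through simplified. The segmentation quality is still only as good as the segmenter's handling of the simplified text.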