johnfarina
The same is true for Chinese and Korean as well: sacremoses splits every character into its own token. Here's some Chinese:

```
>>> from sacremoses import MosesTokenizer
>>> mt = MosesTokenizer(lang='zh')
>>> mt.tokenize("记者 应谦 美国")
['记', '者', '应', '谦', '美', '国']
```
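For completeness, here's a minimal sketch of the same check for Korean. The Hangul input string is my own example, and the expected output assumes the same character-splitting behavior reported above:

```
>>> from sacremoses import MosesTokenizer
>>> mt = MosesTokenizer(lang='ko')
>>> # "기자 미국" = "reporter USA" (hypothetical example input)
>>> mt.tokenize("기자 미국")
['기', '자', '미', '국']  # assumed: each Hangul syllable split into a separate token
```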
Oh wow, comment on a github issue, go to bed, wake up, bug is fixed! Thanks so much @alvations !!