sacremoses

can't tokenise the period properly

Open gdxie1 opened this issue 3 years ago • 0 comments

    text = "will not be the true meaning. always remember that our mind"
    print(moses_tokenizer.tokenize(text, escape=False))

I get the following output:

    ['will', 'not', 'be', 'the', 'true', 'meaning.', 'always', 'remember', 'that', 'our', 'mind']

The period adjacent to "meaning" was not split off into its own token.
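Until the tokenizer behaviour is clarified, one possible stopgap is to post-process the token list and detach a trailing period from any token that ends in a letter followed by `.`. The sketch below is only an illustration under that assumption: `split_trailing_periods` is a hypothetical helper, not part of sacremoses, and it does not call the library at all. It operates on the token list copied from the output reported in this issue.

```python
import re

# Hypothetical helper (not part of sacremoses): matches a token whose last
# character before the final period is a letter, e.g. "meaning." but not
# "3." or "...".
_TRAILING_PERIOD = re.compile(r"^(.*[^\W\d_])\.$")


def split_trailing_periods(tokens):
    """Return a new token list with sentence-final periods split off."""
    out = []
    for tok in tokens:
        match = _TRAILING_PERIOD.match(tok)
        if match:
            # Emit the word and the period as two separate tokens.
            out.extend([match.group(1), "."])
        else:
            out.append(tok)
    return out


# Token list copied from the output reported above.
tokens = ['will', 'not', 'be', 'the', 'true', 'meaning.',
          'always', 'remember', 'that', 'our', 'mind']
print(split_trailing_periods(tokens))
# ['will', 'not', 'be', 'the', 'true', 'meaning', '.',
#  'always', 'remember', 'that', 'our', 'mind']
```

Note that this blunt rule also splits abbreviations such as `Mr.` or `e.g.`, so it is only suitable when the text is known to contain no abbreviations, or when an exception list is added.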

gdxie1, May 05 '21 19:05