python-bpe

Consider using `tok` as tokenizer; faster and more customizable

Open: kootenpv opened this issue 4 years ago • 1 comment

A simple example would be to import `word_tokenize` from `tok` instead of from `nltk`.

See: https://github.com/kootenpv/tok
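
A rough sketch of what the suggested swap might look like at a call site. It assumes `tok` is installed and exposes `word_tokenize` as the issue describes. The regex fallback is a hypothetical stand-in so the snippet still runs without `tok`; it is not the behavior of `tok` or `nltk`.

```python
import re

try:
    # Proposed replacement for: from nltk import word_tokenize
    from tok import word_tokenize
except ImportError:
    # Hypothetical stand-in used only when tok is not installed;
    # it splits text into runs of word characters and single punctuation marks.
    def word_tokenize(text):
        return re.findall(r"\w+|[^\w\s]", text)

tokens = word_tokenize("Hello, world!")
print(tokens)
```

Because the function name stays the same, the rest of the calling code would not need to change, though the exact tokens may differ between tokenizers.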

kootenpv, Jul 04 '19 18:07

Hey @kootenpv, thanks for the suggestion! Care to put up a PR?

soaxelbrooke, Jul 07 '19 17:07