words_counted
Support Japanese grammar
Good code.
require 'words_counted'

counter = WordsCounted.count("私の见た寿司の神の映画、だから私も恋に落ちて寿司")
puts counter.token_frequency
It then prints:
だから私も恋に落ちて寿司 1
私の见た寿司の神の映画 1
[Finished in 0.1s]
Thanks for the feature request. I'll have to have a look to see how the tokeniser works again. But I don't think it should be an issue. If you use a regular expression, can't you tokenise the string as desired?
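As a rough illustration of the regular-expression approach, the sketch below splits the string into runs of the same script (kanji, hiragana, katakana) using plain Ruby and counts each run. The names `SCRIPT_RUN`, `tokenise_by_script`, and `token_frequency` are hypothetical helpers written for this example and are not part of words_counted. Splitting on script boundaries is only an approximation of Japanese word boundaries; real segmentation would need a morphological analyser.

```ruby
# Hypothetical pattern: one run of kanji, hiragana, or katakana.
# The long-vowel mark "ー" is added explicitly because it is not
# classified as Katakana by the Unicode Script property.
SCRIPT_RUN = /\p{Han}+|\p{Hiragana}+|[\p{Katakana}ー]+/

# Split text into same-script runs; punctuation such as "、" is dropped.
def tokenise_by_script(text)
  text.scan(SCRIPT_RUN)
end

# Count how often each token occurs, returning a Hash of token => count.
def token_frequency(text)
  tokenise_by_script(text).each_with_object(Hash.new(0)) do |token, counts|
    counts[token] += 1
  end
end

text = "私の见た寿司の神の映画、だから私も恋に落ちて寿司"
token_frequency(text).each { |token, count| puts "#{token} #{count}" }
```

With this pattern, 寿司 is counted twice and の three times, instead of each half of the sentence being treated as a single token. Whether the same pattern can be passed straight to the gem depends on the installed version and the options it accepts, so treat that as something to check in the README.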
Thank you for your reply. I know there are some simple regular-expression libraries for Python, so Ruby must have them too. Thank you.