words_counted

support Japanese grammar

Open iostreamatlab opened this issue 6 years ago • 2 comments

Good code.

require 'words_counted'

counter = WordsCounted.count("私の见た寿司の神の映画、だから私も恋に落ちて寿司")
puts counter.token_frequency

This prints:

だから私も恋に落ちて寿司 1
私の见た寿司の神の映画 1
[Finished in 0.1s]

iostreamatlab • Nov 08 '18 15:11

Thanks for the feature request. I'll have to have a look to see how the tokeniser works again. But I don't think it should be an issue. If you use a regular expression, can't you tokenise the string as desired?
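
A minimal sketch of what a regular-expression approach could look like in plain Ruby, outside the gem. It splits the text into runs of the same script (kanji, hiragana, katakana) and counts them. The names `SCRIPT_RUN`, `script_run_tokens`, and `script_run_frequency` are hypothetical helpers, not part of words_counted, and script runs are only a rough stand-in for real Japanese word segmentation (for example, 落ちて is split into 落 and ちて).

```ruby
# Hypothetical helpers for illustration; not part of the words_counted API.
# Matches a run of kanji, a run of hiragana, or a run of katakana.
# The long-vowel mark "ー" is added explicitly because Unicode assigns it
# to the Common script rather than Katakana.
SCRIPT_RUN = /\p{Han}+|\p{Hiragana}+|[\p{Katakana}ー]+/

# Returns the script runs found in the text; punctuation such as "、" is skipped.
def script_run_tokens(text)
  text.scan(SCRIPT_RUN)
end

# Returns [token, count] pairs, most frequent first.
def script_run_frequency(text)
  counts = script_run_tokens(text).each_with_object(Hash.new(0)) do |token, acc|
    acc[token] += 1
  end
  counts.sort_by { |_token, count| -count }
end

text = "私の见た寿司の神の映画、だから私も恋に落ちて寿司"
script_run_frequency(text).each { |token, count| puts "#{token} #{count}" }
```

Whether the same pattern can be handed to `WordsCounted.count` directly depends on the gem's options and should be checked against its README. For linguistically correct tokens, a morphological analyser would be needed rather than a regular expression.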

abitdodgy • Nov 26 '18 10:11

Thank you for your reply. I know there are some simple regular-expression libraries for Python, so Ruby must have them too. Thank you.

iostreamatlab • Nov 26 '18 15:11