Allow tokenizer to split emoji without space delimiter

Open thisandagain opened this issue 8 years ago • 1 comment

PR #74 initially incorporated a change that used spliddit's hasPair method to split emoji that are not separated by a space or other delimiter. While useful, this change had fairly serious performance consequences. I think it's worthwhile to include this functionality, but it should be optional.

thisandagain, Mar 29 '17 13:03

I think the modification in PR #74 needs a better implementation anyway. We won't want to split only on surrogate pairs; sometimes we'll probably want to split when there isn't a pair.

thorbert, Mar 31 '17 23:03
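
The two points raised in this thread (make the splitting opt-in, and don't rely on surrogate pairs alone) could be sketched roughly as follows. This is not the library's tokenizer and not the code from PR #74: the function name `tokenize` and the option name `splitEmoji` are hypothetical, and it assumes a runtime with Unicode property escapes in regular expressions (for example a recent Node.js). Emoji joined by a zero-width joiner would still be split into their parts, so treat this as a starting point only.

```javascript
// Hypothetical sketch: `tokenize` and `splitEmoji` are illustrative names,
// not part of the sentiment API.

// Matches one pictographic character, optionally followed by the
// emoji variation selector (U+FE0F). This covers emoji encoded as
// surrogate pairs (e.g. U+1F600) and emoji that fit in a single
// UTF-16 code unit (e.g. U+263A), so no pair check is needed.
const EMOJI = /\p{Extended_Pictographic}\uFE0F?/gu;

function tokenize(input, options = {}) {
  let text = String(input).toLowerCase();

  // Opt-in: the extra pass over the string only runs when requested,
  // so the default path stays a plain whitespace split.
  if (options.splitEmoji) {
    text = text.replace(EMOJI, ' $& ');
  }

  // Split on runs of whitespace and drop empty tokens.
  return text.split(/\s+/).filter(Boolean);
}

console.log(tokenize('great😀job'));
console.log(tokenize('great😀job', { splitEmoji: true }));
console.log(tokenize('I ☺you', { splitEmoji: true }));
```

With the option off, an emoji glued to a word stays part of that word's token; with it on, each emoji becomes its own token regardless of whether it is encoded as a surrogate pair.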