langutils
Period not correctly tokenized?
Here's an example:
LANGUTILS> (tokens-for-ids (vector-document-words (vector-tag "Hello world. I'm here.")))
("Hello" "world." "I" "'" "m" "here.")
I think the periods should be split off as separate tokens:
("Hello" "world" "." "I" "'" "m" "here" ".")
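As a stopgap, a post-processing pass over the token strings produces the output I expect. This is only a rough sketch: split-trailing-period is a hypothetical helper of my own, not part of langutils, and it assumes the tokens are plain strings as returned by tokens-for-ids above. It is also naive, since it would split abbreviations such as "Mr." as well.

```lisp
;; Hypothetical workaround, NOT part of langutils:
;; split a trailing period off every token longer than one character.
(defun split-trailing-period (tokens)
  "Return a fresh list where each token ending in a period is replaced
by the token without the period, followed by the string \".\"."
  (loop for token in tokens
        for len = (length token)
        if (and (> len 1)
                (char= (char token (1- len)) #\.))
          ;; "world." becomes the two tokens "world" and "."
          append (list (subseq token 0 (1- len)) ".")
        else
          ;; everything else (including a lone ".") passes through unchanged
          collect token))
```

Applied to the result above, (split-trailing-period '("Hello" "world." "I" "'" "m" "here.")) gives the token list shown in the expected output.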