spacyr
Hyphenated words
The spaCy tokenizer splits hyphenated words into separate tokens, as if a space had been inserted before and after the hyphen. For example, "eye-opening" becomes "eye - opening". Is there a way to keep hyphenated words together, as the quanteda tokenizers can? (@JBGruber: any idea? :))
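
To make the desired behaviour concrete, here is a minimal sketch of a post-processing workaround in plain Python. It does not use any spaCy or spacyr API: the function name `merge_hyphenated` is hypothetical, and it assumes the tokens are already available as a list of strings in which the hyphen appears as its own `"-"` token.

```python
def merge_hyphenated(tokens):
    """Rejoin tokens that were split around a standalone hyphen.

    Hypothetical helper: assumes `tokens` is a list of strings in which
    a hyphenated word such as "eye-opening" arrives as "eye", "-", "opening".
    """
    merged = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        # Merge only when the hyphen sits between two tokens that end and
        # start with a letter or digit, so stray dashes next to punctuation
        # are left alone.
        if (
            tok == "-"
            and merged
            and i + 1 < len(tokens)
            and merged[-1][-1:].isalnum()
            and tokens[i + 1][:1].isalnum()
        ):
            merged[-1] = merged[-1] + "-" + tokens[i + 1]
            i += 2  # skip the hyphen and the token that was glued on
        else:
            merged.append(tok)
            i += 1
    return merged


print(merge_hyphenated(["an", "eye", "-", "opening", "result"]))
```

One caveat with this approach: once the text has been tokenized, the list no longer records whether the original had spaces around the hyphen, so something written as "3 - 2" would also be merged into "3-2". A fix inside the tokenizer itself would avoid that.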