segtok
Bug: split_contractions fails for certain apostrophe patterns
For example:

```python
>>> split_contractions(word_tokenizer("OʼHaraʼs"))
['OʼHara', 'O', 'ʼHaraʼs']
```
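The token above uses the modifier letter apostrophe ʼ (U+02BC), which, unlike the ASCII apostrophe, is a Unicode letter (category Lm) and is therefore matched by Python's `\w`. A minimal splitter that treats U+02BC as an apostrophe and only splits a trailing contraction suffix might look like this (a hypothetical sketch for illustration, not segtok's actual implementation):

```python
import re

# Apostrophe-like characters: ASCII ', right single quote ’ (U+2019),
# and modifier letter apostrophe ʼ (U+02BC).
APOSTROPHES = "'\u2019\u02bc"

# Match an apostrophe followed by a common English contraction suffix
# at the very end of a token (e.g. ʼs, ʼt, ʼll).
SUFFIX = re.compile(r"([%s])(s|d|ll|m|re|ve|t)$" % APOSTROPHES, re.IGNORECASE)

def split_contractions(tokens):
    """Split trailing contraction suffixes off tokens, leaving
    word-internal apostrophes (as in OʼHara) untouched."""
    result = []
    for token in tokens:
        match = SUFFIX.search(token)
        # Only split if the apostrophe is not the first character,
        # so tokens like ʼs on their own are kept whole.
        if match and match.start() > 0:
            result.append(token[:match.start()])
            result.append(token[match.start():])
        else:
            result.append(token)
    return result
```

With this sketch, `split_contractions(["OʼHaraʼs"])` yields `['OʼHara', 'ʼs']`: the internal apostrophe in OʼHara survives and only the possessive ʼs is split off, which is the behavior the report above expects.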