divec
This is analogous to [**sentencex-js**/pull/3](https://github.com/wikimedia/sentencex-js/pull/3): both remove the `terminators` codepoint list module and instead use an external library that tracks the Unicode Character Database.
Thanks @santhoshtr. If we maintain the list of terminators within sentencex, then we probably also need to maintain the list of close punctuation and spaces in order to [expose boundary...
> The terminators collection was originally sourced from the [wiki-nlp-tools repo](https://gitlab.wikimedia.org/repos/research/wiki-nlp-tools). I re-ran the unichars program and I don't see these Amharic/Ethiopic chars in it. That must be a discrepancy in...
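
On the point about close punctuation and spaces: as a rough illustration of what deriving those sets from the Unicode Character Database could look like, here is a minimal sketch using only Python's standard `unicodedata` module. The helper name `chars_in_categories` and the category choices (`Pe`/`Pf` for close punctuation, `Zs` for spaces) are my own assumptions for illustration. They only approximate the `Close` and `Sp` classes of UAX #29, and this is not what sentencex or this PR actually does.

```python
import sys
import unicodedata

# Assumption: approximate "close punctuation" and "spaces" with
# General_Category values. This is NOT the exact UAX #29 Close/Sp definition.
CLOSE_CATEGORIES = {"Pe", "Pf"}  # close punctuation, final quote punctuation
SPACE_CATEGORIES = {"Zs"}        # space separators


def chars_in_categories(categories):
    """Return every character whose General_Category is in `categories`.

    Hypothetical helper: scans the whole code space using the UCD version
    bundled with the running Python (see unicodedata.unidata_version).
    """
    return {
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) in categories
    }


close_punctuation = chars_in_categories(CLOSE_CATEGORIES)
spaces = chars_in_categories(SPACE_CATEGORIES)
```

The resulting sets follow whatever UCD version the interpreter ships with, so nothing has to be re-generated by hand when Unicode adds characters.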