wordninja
[Feature Request] Exception List
Hello,
I have an idea: it would be very useful to have an exception list (whitelist) of words that should never be split. For example, "tensorflow" is currently split into "tensor" and "flow". If we could provide the entities we don't want split, they would stay intact after the algorithm runs.
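To make the request concrete, here is a rough sketch of how this could work as a wrapper today, without changes to the library. The name `split_with_exceptions` and its parameters are hypothetical, not part of wordninja. It assumes the splitter is passed in as a function (for example `wordninja.split`) and that exceptions are matched case-insensitively as plain substrings.

```python
import re
from typing import Callable, Iterable, List


def split_with_exceptions(
    text: str,
    exceptions: Iterable[str],
    split_fn: Callable[[str], List[str]],
) -> List[str]:
    """Split `text` with `split_fn`, keeping every entry of `exceptions` whole.

    Hypothetical helper: not part of wordninja's API.
    """
    # Longest entries first, so a longer exception wins over a shorter
    # one that starts at the same position.
    protected = sorted({e for e in exceptions if e}, key=len, reverse=True)
    if not protected:
        return split_fn(text)

    # One capturing group: re.split then returns the matched exceptions
    # at odd indexes and the text between them at even indexes.
    pattern = re.compile(
        "(" + "|".join(re.escape(e) for e in protected) + ")",
        re.IGNORECASE,
    )

    words: List[str] = []
    for index, chunk in enumerate(pattern.split(text)):
        if not chunk:
            continue  # empty piece before, after, or between matches
        if index % 2 == 1:
            words.append(chunk)  # protected entry, kept as-is
        else:
            words.extend(split_fn(chunk))  # everything else is split normally
    return words
```

It could be called as `split_with_exceptions("ilovetensorflow", ["tensorflow"], wordninja.split)`. One limitation of this approach is that it matches substrings before any splitting happens, so a short exception can also match inside a longer, unrelated word. Support inside the algorithm itself would handle that better.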
This would be a great feature addition.