Santhosh Thottingal
Hi @divec, sorry for the delay in review; I was on holiday. All the tests are passing with this pull request, but I noticed a performance regression. Running `accuracy.py` in the benchmarks folder of...
The terminators collection was originally sourced from the [wikinlp tools repo](https://gitlab.wikimedia.org/repos/research/wiki-nlp-tools). I re-ran the unichars program and I don't see these Amharic/Ethiopic chars in it. That must be a discrepancy in the...
Thanks a lot for this initiative. :heart: FYI, the JS port is available at https://github.com/wikimedia/sentencex-js (I should update the docs with a link to this). What is the meaning of...
As per https://phabricator.wikimedia.org/T372753, removing cxserver-related routes is fine.
Thanks for reporting. I guess I need to prepare platform-specific builds for that `index.node`. The one I published on npm is based on a Linux build. Need to set up CI...
Platform-specific builds were done in pull request #26. Not published on npm yet.
Good catch!
`man` is given as an abbreviation in the English abbreviations list: https://github.com/wikimedia/sentencex/blob/master/src/languages/abbrev/en.txt#L107 I am considering a re-review to remove abbreviations that are also full words, like 'wash', 'man', 'mass'.
If a word is a valid full word and an abbreviation at the same time, we will need a completely different strategy to classify it as an abbreviation or a word. That...
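To illustrate why a full word in the abbreviation list is a problem, here is a minimal sketch of a toy splitter. It is not the sentencex implementation; the function `split_sentences`, the `ABBREVIATIONS` set, and the sample sentences are all hypothetical and only assume that a period after a listed abbreviation suppresses a sentence break.

```python
import re

# Hypothetical abbreviation set for illustration only.
# 'man' mirrors the entry discussed above; it is also an ordinary word.
ABBREVIATIONS = {"dr", "mr", "man"}


def split_sentences(text, abbreviations=ABBREVIATIONS):
    """Toy splitter: break after '.', '!' or '?' followed by whitespace,
    unless the word right before a '.' is in the abbreviation set."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        words = text[start:match.start()].split()
        last_word = words[-1].lower() if words else ""
        if text[match.start()] == "." and last_word in abbreviations:
            continue  # period is treated as part of an abbreviation
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "He is a good man. She agreed."
print(split_sentences(text))                 # 'man' listed: break is missed
print(split_sentences(text, {"dr", "mr"}))   # 'man' removed: break is found
```

With 'man' in the set, the toy splitter returns the text as a single sentence; with it removed, the same text splits in two, while "Dr." is still handled as an abbreviation. A word that is both a full word and an abbreviation cannot be resolved by list membership alone in a scheme like this.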
I installed 0.4.1 using the `curl -fsSL https://install.danklinux.com | sh` method on Ubuntu 25.10. `dms version` shows 0.4.1, even after I ran `dms update` or `dms` and chose update options with...