Adrien Barbaresi
I'll now close the PR for the sake of clarity and move on with the rest. We can come back to it later if you find a solution which does...
I guess we can use an alias and import it during init. This was a question we discussed with @juanjoDiaz but something must have got lost along the way.
This has already been mentioned in #64.
- The `0.9.1` readme says: `from simplemma.langdetect import in_target_language, lang_detector`
- The current readme says: `from simplemma import in_target_language, lang_detector`

So we could...
See also https://github.com/adbar/simplemma/commit/58b3ee7430568738f306a5386085eda6628c47d4: `from simplemma.langdetect` → `from simplemma.language_detector`
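The alias idea mentioned above could look roughly like the following. This is only a sketch of the general technique, not Simplemma's actual `__init__.py`: the package `demo_pkg` and the placeholder function body are hypothetical and built in memory so the snippet runs on its own. The point is that registering the renamed module under its old name in `sys.modules` keeps old-style imports working.

```python
import sys
import types

# Hypothetical stand-in package built in memory (not the real simplemma).
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # an empty __path__ marks the module as a package

# The module under its new name, with a placeholder function.
new_mod = types.ModuleType("demo_pkg.language_detector")
new_mod.in_target_language = lambda text, lang: 1.0  # placeholder logic only

sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.language_detector"] = new_mod
pkg.language_detector = new_mod

# What a package __init__ could do to stay backward compatible:
pkg.langdetect = new_mod                          # attribute alias
sys.modules["demo_pkg.langdetect"] = new_mod      # import-path alias
pkg.in_target_language = new_mod.in_target_language  # top-level re-export

# All three import styles now resolve to the same function object.
from demo_pkg.langdetect import in_target_language as old_style
from demo_pkg.language_detector import in_target_language as new_style
from demo_pkg import in_target_language as top_level
```

In a real package the last three assignments would live in `__init__.py`, possibly with a `DeprecationWarning` for the old path.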
Tough one, this is an absolute borderline case since multiple matches are usually not present in lists and they may be annotated differently. Concerning the "noun vs. verb" issue this...
The approach you suggest would probably give better results but memory is already a concern for the available dictionaries. One way or the other there is always a tradeoff between...
I am now seriously planning the next release before the summer, or even earlier if everything goes well. The remaining issues will (hopefully) be addressed further along the way, from...
Additional note: dictionaries can be more compact if keys and values are `bytes` instead of `str`. This would be a first step to decrease memory footprint.
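A rough way to check the `bytes` vs. `str` claim is to compare object sizes with `sys.getsizeof`. The sketch below assumes CPython (object sizes are implementation details) and uses a tiny made-up word list, not the real language data; the helper names `total_size` and `lookup` are illustrative only.

```python
import sys

# Tiny illustrative sample, not taken from the actual dictionaries.
pairs = {"houses": "house", "größer": "groß", "домами": "дом"}


def total_size(mapping):
    """Sum the sizes of all key and value objects (dict overhead excluded)."""
    return sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in mapping.items())


# Same data with UTF-8 encoded keys and values.
as_bytes = {k.encode("utf-8"): v.encode("utf-8") for k, v in pairs.items()}

str_size = total_size(pairs)
bytes_size = total_size(as_bytes)
print(f"str objects: {str_size} bytes, bytes objects: {bytes_size} bytes")


def lookup(token):
    """Look up a str token in the bytes-keyed dictionary."""
    value = as_bytes.get(token.encode("utf-8"))
    return value.decode("utf-8") if value is not None else None
```

The cost is an encode/decode step on every lookup, so the memory gain would have to be weighed against that overhead.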
@Dunedan First I'd like to say I play 0 A.D. from time to time so it is really nice to see you find Simplemma useful in this context. Thanks for...
Concerning the first point and before you draft a PR: how about using pure-Python tries? Maybe the slowdown is acceptable considering the portability of this solution? Breaking down language data...
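For reference, a pure-Python trie can be as small as nested dictionaries. This is a minimal sketch of the data structure only, with a hypothetical `Trie` class and sample entries; it makes no claim about memory use or speed, both of which would need to be measured against the current flat dictionaries.

```python
class Trie:
    """Minimal word -> lemma trie built from nested dicts."""

    _END = ""  # sentinel key; cannot collide with single-character keys

    def __init__(self):
        self.root = {}

    def insert(self, word, lemma):
        node = self.root
        for ch in word:
            # Walk down one level per character, creating nodes as needed.
            node = node.setdefault(ch, {})
        node[self._END] = lemma

    def get(self, word, default=None):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return default
        return node.get(self._END, default)


# Illustrative entries only.
trie = Trie()
for word, lemma in [("houses", "house"), ("house", "house"), ("hours", "hour")]:
    trie.insert(word, lemma)

print(trie.get("houses"))  # lemma stored for "houses"
print(trie.get("hou"))     # prefix without an entry falls back to the default
```

Shared prefixes such as `hou` are stored once, which is where a trie can save space, although per-node `dict` overhead in Python may offset that gain.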