Eric Kafe
This PR is now superseded by #3245.
Converted to draft, as it seems possible to handle more of the issues related to WordNetLemmatizer.
The historical WordNet lemmatizer is Morphy, so many users would intuitively expect more standard behaviour from WordNetLemmatizer.lemmatize(). Instead, that wrapper is defined by non-standard features: it defaults to...
WordNetLemmatizer.lemmatize() always returns the shortest possible lemma. Admittedly, this is often not ideal, but it is _the_ defining feature of that lemmatizer. So rather than changing it, I would prefer...
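For illustration, the shortest-candidate rule can be sketched in a few lines (a hypothetical helper, not the actual NLTK implementation; `shortest_lemma` and its arguments are made-up names):

```python
def shortest_lemma(candidates, word):
    """Mimic the defining behaviour described above: among all
    candidate lemmas, return the shortest one, falling back to
    the input word when no candidate is found."""
    return min(candidates, key=len) if candidates else word

# Among several morphological analyses, the shortest wins:
print(shortest_lemma(["axes", "axe", "axis"], "axes"))  # -> axe
print(shortest_lemma([], "dogs"))                       # -> dogs
```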
The docstring for the Lemma class still claims that lemmas can directly return the relations defined for their synset. However, this is not true in practice, and such calls only...
On the other hand, most people probably think of hypernymy as a relation between words, rather than synsets. WordNet could be easier to use, and more widely accepted, if less...
Anyway, just editing the docstring would not be sufficient for fixing this issue. Currently, all the relations are defined at the level of the WordnetObject class, so the Lemma class...
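To make the idea concrete, here is a minimal sketch of how word-level relation calls could delegate to the synset (toy classes for illustration only, not the actual nltk.corpus.reader.wordnet code):

```python
class Synset:
    def __init__(self, name, hypernyms=()):
        self.name = name
        self._hypernyms = list(hypernyms)

    def hypernyms(self):
        return self._hypernyms


class Lemma:
    def __init__(self, name, synset):
        self.name = name
        self._synset = synset

    def synset(self):
        return self._synset

    def hypernyms(self):
        # Delegate the synset-level relation, so that a word-level
        # call like lemma.hypernyms() behaves as users expect.
        return self._synset.hypernyms()


entity = Synset("entity.n.01")
dog = Synset("dog.n.01", hypernyms=[entity])
lemma = Lemma("dog", dog)
print([s.name for s in lemma.hypernyms()])  # -> ['entity.n.01']
```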
Here's a small experiment to verify the unintuitive behaviour of the recursion limit in Python 3:

```python
from sys import setrecursionlimit

setrecursionlimit(10)

def recurse(n):
    print(n)
    recurse(n + 1)

recurse(1)
```

1 2...
Strangely, CI failed only with Python 3.11 on macos-latest.
This doesn't solve the full problem: _word_tokenize_ would still fail to split contractions followed by [\a\b\v]. So the first step would be trying to find the extent of the problem....
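As a quick way to start scoping it, one can check which of these control characters Python's regex `\s` even treats as whitespace (just a probe, independent of the tokenizer code):

```python
import re

# \v is regex whitespace in Python (\s is [ \t\n\r\f\v]), while
# \a (bell) and \b (backspace) are not, so patterns relying on \s
# treat the three characters differently.
for label, ch in [(r"\a", "\a"), (r"\b", "\b"), (r"\v", "\v")]:
    print(label, bool(re.fullmatch(r"\s", ch)))
# -> \a False, \b False, \v True
```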