Eric Kafe
This PR is now superseded by #3245.
Converted to draft, as it seems possible to handle more of the issues related to WordNetLemmatizer.
The historical WordNet lemmatizer is Morphy, so many users would intuitively expect more standard behaviour from WordNetLemmatizer.lemmatize(). Instead, that wrapper is defined by non-standard features: it defaults to...
WordNetLemmatizer.lemmatize() always returns the shortest possible lemma. Admittedly, this is often not ideal, but it is _the_ defining feature of that lemmatizer. So rather than changing it, I would prefer...
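For illustration, the shortest-candidate rule can be sketched in a few lines (a hypothetical helper, not the actual NLTK implementation; `shortest_lemma` and its arguments are made-up names):

```python
def shortest_lemma(candidates, word):
    """Mimic the defining behaviour described above: among all
    candidate lemmas, return the shortest one, falling back to
    the input word when no candidate is found."""
    return min(candidates, key=len) if candidates else word

# Among several morphological analyses, the shortest wins:
print(shortest_lemma(["axes", "axe", "axis"], "axes"))  # -> axe
print(shortest_lemma([], "dogs"))                       # -> dogs
```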
The docstring for the Lemma class still claims that lemmas can directly return the relations defined for their synset. However, this is not true in practice, and such calls only...
On the other hand, most people probably think of hypernymy as a relation between words, rather than synsets. WordNet could be easier to use, and more widely accepted, if less...
Anyway, just editing the docstring would not be sufficient for fixing this issue. Currently, all the relations are defined at the level of the WordnetObject class, so the Lemma class...
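To make the idea concrete, here is a minimal sketch of how word-level relation calls could delegate to the synset (toy classes for illustration only, not the actual nltk.corpus.reader.wordnet code):

```python
class Synset:
    def __init__(self, name, hypernyms=()):
        self.name = name
        self._hypernyms = list(hypernyms)

    def hypernyms(self):
        return self._hypernyms


class Lemma:
    def __init__(self, name, synset):
        self.name = name
        self._synset = synset

    def synset(self):
        return self._synset

    def hypernyms(self):
        # Delegate the synset-level relation, so that a word-level
        # call like lemma.hypernyms() behaves as users expect.
        return self._synset.hypernyms()


entity = Synset("entity.n.01")
dog = Synset("dog.n.01", hypernyms=[entity])
lemma = Lemma("dog", dog)
print([s.name for s in lemma.hypernyms()])  # -> ['entity.n.01']
```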
Here's a small experiment to verify the unintuitive behaviour of the recursion limit in Python 3:

```python
from sys import setrecursionlimit

setrecursionlimit(10)

def recurse(n):
    print(n)
    recurse(n + 1)

recurse(1)
```

1 2...
Strangely, CI failed only with Python 3.11 on macos-latest.
This doesn't solve the full problem: _word_tokenize_ would still fail to split contractions followed by [\a\b\v]. So the first step would be trying to find the extent of the problem....
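As a quick way to start scoping it, one can check which of these control characters Python's regex `\s` even treats as whitespace (just a probe, independent of the tokenizer code):

```python
import re

# \v is regex whitespace in Python (\s is [ \t\n\r\f\v]), while
# \a (bell) and \b (backspace) are not, so patterns relying on \s
# treat the three characters differently.
for label, ch in [(r"\a", "\a"), (r"\b", "\b"), (r"\v", "\v")]:
    print(label, bool(re.fullmatch(r"\s", ch)))
# -> \a False, \b False, \v True
```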