Eric Kafe

Results: 106 comments by Eric Kafe

@Higgs32584, I don't know the full problem scope, there could be more... Neither do I know the best place to do the substitution, but I have verified that it works...

The cause of the problem is that the last two lines under ENDING_QUOTE are handling contractions, using a regular expression that requires the contraction to be followed by a plain...
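As a rough illustration of this kind of failure, here is a minimal sketch using only the standard `re` module. The two patterns are simplified, hypothetical stand-ins and are not copied from NLTK. The sketch assumes that the rule demands a literal space character after the contraction, and the "relaxed" variant is only one possible alternative, not necessarily the fix that was adopted.

```python
import re

# Hypothetical, simplified contraction rule (NOT NLTK's actual pattern):
# the contraction must be followed by a literal space character.
STRICT = re.compile(r"([^' ])('[sS]|'[mM]|'[dD]|'ll|'re|'ve|n't) ")

# Hypothetical variant: accept any whitespace or end of string after the
# contraction, using a lookahead so the following character is preserved.
RELAXED = re.compile(r"([^' ])('[sS]|'[mM]|'[dD]|'ll|'re|'ve|n't)(?=\s|$)")


def split_strict(text):
    """Split contractions only when a literal space follows them."""
    return STRICT.sub(r"\1 \2 ", text)


def split_relaxed(text):
    """Split contractions followed by any whitespace or by end of string."""
    return RELAXED.sub(r"\1 \2", text)


print(repr(split_strict("it's fine")))     # contraction is split
print(repr(split_strict("it's\tfine")))    # tab instead of space: no split
print(repr(split_relaxed("it's\tfine")))   # split, tab preserved
```

With the strict rule, any text where the contraction is followed by a tab, a newline, or the end of the string passes through unchanged, which is the behaviour described above.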

Thanks @Higgs32584, this looks good. Test cases are always much appreciated everywhere.

@alvations and @53X, a more consistent interpretation of pos=None could be nice, but in that case, the default should not be "n", but rather "Any pos". Please consider the morphy()...

Ideally, to get a consistent behaviour across the WordNet Morphy-related wrappers, "WordNetLemmatizer.lemmatize()" could just be an alias for the morphy() wrapper from wordnet.py. Actually, I find that the name "WordNetLemmatizer"...

[PR #3225](https://github.com/nltk/nltk/pull/3225#issuecomment-1890890747) proposes to add two standard "morphy" modes to the WordNetLemmatizer class, for users who want a standard _morphy_ lemmatizer with a more consistent pos argument. On the...

Yes @ndvbd, the "use_morphy" argument is not even in the latest NLTK version, though it was proposed in [issue 18](https://github.com/nltk/wordnet/pull/18#issue-513042920l), and sounds like a good idea...

Actually, the problem is not whether to use morphy, but rather how to prevent morphy from recursively stripping the same suffix many times. PR #3225 fixes it.
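To make the difference concrete, here is a toy sketch of repeated versus single-pass suffix stripping. It is not NLTK's morphy code: the rules, the one-word lexicon and both function names are hypothetical, and the sketch makes no claim about how PR #3225 implements the fix.

```python
# Hypothetical detachment rules (old suffix, replacement) and toy lexicon.
RULES = [("ses", "s"), ("s", "")]
LEXICON = {"glass"}


def apply_rules(form):
    """Apply every matching rule once and return the candidate forms."""
    return [form[: -len(old)] + new for old, new in RULES if form.endswith(old)]


def lemmatize_recursive(form):
    """Keep re-applying the rules to the candidates until a lexicon hit.

    Every rule shortens the string, so the loop always terminates.
    """
    forms = apply_rules(form)
    while forms:
        found = [f for f in forms if f in LEXICON]
        if found:
            return found
        forms = [g for f in forms for g in apply_rules(f)]
    return []


def lemmatize_once(form):
    """Apply the rules a single time and keep only lexicon entries."""
    return [f for f in apply_rules(form) if f in LEXICON]


print(lemmatize_recursive("glassss"))  # strips "s" twice before matching
print(lemmatize_once("glassss"))       # one pass only: no lexicon match
print(lemmatize_once("glasses"))       # regular plural still resolves
```

In this toy setup the recursive version maps the non-word "glassss" to "glass" by stripping the same suffix twice, while the single-pass version rejects it and still handles the regular plural.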

_word_tokenize_ also fails to split contractions followed by [\a\b\v].
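When deciding what a tokenizer rule should accept after a contraction, it may help to check how Python itself classifies these three characters. The snippet below uses only the standard library and is not NLTK code. The dictionary name `is_whitespace` is just an illustrative choice.

```python
import re

# Characters to inspect: bell, backspace, vertical tab, and a plain space.
CANDIDATES = {"\\a": "\a", "\\b": "\b", "\\v": "\v", "space": " "}

# Record whether the regex class \s matches each character.
is_whitespace = {
    name: bool(re.fullmatch(r"\s", ch)) for name, ch in CANDIDATES.items()
}

for name, flag in is_whitespace.items():
    print(f"{name}: matched by \\s = {flag}")
```

Only the vertical tab counts as whitespace here, so a rule that looks for `\s` after the contraction would cover `\v` but not `\a` or `\b`.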

This needs more work in order to return the lemmas of the synset targets, as in the last example from [this comment](https://github.com/nltk/nltk/issues/1970#issue-301709671) by @marcevrard. Alternatively, depending on the consensus, the...