nlp-js-tools-french
a couple of problematic results from lemmatizer
- I'm not sure what's happening here, but I was trying to lemmatize the word "écœurante" with config set to { tagTypes: ['adj', 'ver', 'nom'], strictness: false, minimumLength: 3, debug: true }. I tried with strictness set to true first, then false, but it doesn't seem to matter. The result I get from
var nlpToolsFr = new NlpjsTFr(s, config); var lemmatizedWords = nlpToolsFr.lemmatizer();
is [{"id":0,"word":"urante","lemma":"urante"}], with the "écœ" at the beginning removed. I can't tell why. Other words beginning with "é" seem OK.
- For "épaules" I get [{"id":0,"word":"épaules","lemma":"épaules"}]. Shouldn't the lemma be "épaule"? This was with the same config object as above.
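One possible explanation for the first result, offered only as a guess and not checked against the library's source: if the tokenizer matches letters with a class limited to a-z plus the Latin-1 accented range, the ligature "œ" (U+0153) falls outside it and splits the word into "éc" and "urante". With minimumLength: 3, the two-letter piece "éc" would then be dropped, leaving exactly "urante". The sketch below reproduces that behaviour. The functions naiveTokenize and unicodeTokenize are hypothetical names made up for this illustration, and the assumption that minimumLength drops tokens shorter than it is mine.

```javascript
// Hypothetical tokenizers, NOT taken from nlp-js-tools-french. They only
// illustrate how a letter class that omits "œ" could produce "urante".

// Assumed buggy variant: a-z plus Latin-1 letters (U+00E0..U+00FF).
// "œ" is U+0153, so it is outside this class and acts as a separator.
function naiveTokenize(text, minimumLength) {
  const tokens = text.toLowerCase().match(/[a-z\u00e0-\u00ff]+/g) || [];
  // Assumption: tokens shorter than minimumLength are discarded.
  return tokens.filter((token) => token.length >= minimumLength);
}

// Possible fix: \p{L} matches any Unicode letter (needs the "u" flag).
function unicodeTokenize(text, minimumLength) {
  const tokens = text.toLowerCase().match(/\p{L}+/gu) || [];
  return tokens.filter((token) => token.length >= minimumLength);
}

// "écœurante", written with escapes so the code points are unambiguous.
const word = "\u00e9c\u0153urante";

console.log(naiveTokenize(word, 3));   // ["urante"]
console.log(unicodeTokenize(word, 3)); // ["écœurante"]
```

If the real tokenizer works along these lines, words containing "œ" or "æ" anywhere (for example "cœur" or "sœur") would be affected as well, which would be an easy way to confirm or rule out this guess.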
Hello mcthulhu,
- It kind of makes sense, since I didn't anticipate this specific case. I'll patch it soon :)
- Weird, I'll have a look.
I'll keep you informed.
Bastien