pke

Add lemmatization option for normalizing loaded documents

Open yetra opened this issue 2 years ago • 0 comments

According to #75, there used to be a lemmatization option for the load_document() method's normalization parameter.

This no longer seems to be the case: either stemming is applied or word surface forms are used as stems, even though lemmas are extracted during text loading.

I'm (re)adding the lemmatization option, as it would be very useful for, e.g., TF-IDF.
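
To illustrate the idea, here is a minimal, self-contained sketch of how the three normalization modes could differ. It is not pke's actual code: `Sentence`, `toy_stem`, and `normalize` are hypothetical names made up for this example, the `"lemmatization"` value is the proposed option, and the crude suffix stripper only stands in for a real stemmer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sentence:
    """Hypothetical container: surface forms plus lemmas found at load time."""
    words: List[str]
    lemmas: List[str]


def toy_stem(word: str) -> str:
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def normalize(sentence: Sentence, normalization: Optional[str] = "stemming") -> List[str]:
    """Return the normalized form of each word in the sentence."""
    if normalization == "stemming":
        return [toy_stem(w.lower()) for w in sentence.words]
    if normalization == "lemmatization":
        # Reuse the lemmas that were already extracted during loading.
        return [lemma.lower() for lemma in sentence.lemmas]
    # Any other value: fall back to lowercased surface forms.
    return [w.lower() for w in sentence.words]


sentence = Sentence(words=["Mice", "were", "running"],
                    lemmas=["mouse", "be", "run"])
for mode in ("stemming", "lemmatization", None):
    print(mode, normalize(sentence, mode))
```

With lemmas as the normalized forms, inflected variants such as "mice" and "mouse" would be counted as the same term, which is why this matters for TF-IDF.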

yetra, Mar 30 '22 08:03