pke
Add lemmatization option for normalizing loaded documents
According to #75, there used to be a lemmatization option for the load_document() method's normalization parameter.
This no longer seems to be the case: either stemming is applied or word surface forms are used as stems, even though lemmas are extracted during text loading.
I'm (re)adding the lemmatization option, as it would be very useful to have for e.g. TF-IDF.
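
To illustrate the intended behaviour, here is a minimal, self-contained sketch of how a normalization switch could choose between stems, lemmas, and surface forms. It is not pke's actual code: normalize_words, toy_stemmer, and the "lemmatization" value are hypothetical names used for illustration, and the lemmas are assumed to have already been extracted during text loading.

```python
from typing import Callable, List, Optional


def toy_stemmer(word: str) -> str:
    """Crude suffix stripping for illustration only; a real stemmer behaves differently."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def normalize_words(
    words: List[str],
    lemmas: List[str],
    normalization: Optional[str] = "stemming",
    stemmer: Callable[[str], str] = toy_stemmer,
) -> List[str]:
    """Return one normalized form per word (hypothetical helper, not pke's API)."""
    if len(words) != len(lemmas):
        raise ValueError("words and lemmas must have the same length")
    if normalization == "stemming":
        # Stem the lowercased surface forms.
        return [stemmer(w.lower()) for w in words]
    if normalization == "lemmatization":
        # Reuse the lemmas that were already extracted at load time.
        return [lemma.lower() for lemma in lemmas]
    # Any other value (e.g. None): fall back to lowercased surface forms.
    return [w.lower() for w in words]


words = ["Studies", "were", "running"]
lemmas = ["study", "be", "run"]  # assumed output of the loader's lemmatizer

print(normalize_words(words, lemmas, "lemmatization"))  # ['study', 'be', 'run']
print(normalize_words(words, lemmas, "stemming"))
print(normalize_words(words, lemmas, None))
```

With lemmas as the normalized forms, inflected variants such as "studies" and "study" would be counted as the same term, which is the kind of grouping that makes the option useful for TF-IDF.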