Add stemming and lemmatisation section
According to the List_of_unsolved_problems_in_computer_science, this is still an open question:
Is there any perfect stemming algorithm in the English language?
I believe that lemmatization is not solved either.
It would be wonderful to add the state of the art for both tasks. BTW, lemmatization consists, for example, of transforming an inflected verb such as jumped into its base form (lemma): jump.
Is there a tool that takes a word as one argument, e.g. quick, and the requested part of speech as another argument, e.g. adverb, and outputs quickly? In fact, stemming and lemmatization are special cases of the NLP task I need. If it exists, does someone know what it's called? Where could I ask? Sorry for the digression.
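To make the difference between the two tasks concrete, here is a minimal sketch in plain Python. It is a toy only: `toy_stem`, `toy_lemmatize`, the `SUFFIXES` list and the `LEMMAS` dictionary are all made up for illustration and are not taken from any real stemmer or lemmatizer.

```python
# Toy illustration only: neither function is a real stemmer or lemmatizer.

# Assumed, deliberately tiny rule set for suffix stripping.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical lemma dictionary keyed by (word form, part of speech).
LEMMAS = {
    ("jumped", "VERB"): "jump",
    ("went", "VERB"): "go",
    ("better", "ADJ"): "good",
}


def toy_lemmatize(word: str, pos: str) -> str:
    """Look the form up; fall back to the word itself when it is unknown."""
    return LEMMAS.get((word, pos), word)


for form, pos in [("jumped", "VERB"), ("went", "VERB"), ("better", "ADJ")]:
    print(form, toy_stem(form), toy_lemmatize(form, pos))
```

The rule-based stemmer handles jumped but leaves irregular forms such as went and better unchanged, while the lookup maps them to go and good. It only covers forms it has seen, though, which is roughly why neither approach is "perfect".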
Benchmarks: http://universaldependencies.org/conll18/results-lemmas.html
BTW, great writeup at https://towardsdatascience.com/state-of-the-art-multilingual-lemmatization-f303e8ff1a8
So if en means English, the SOTA results are: en_ewt: 97.23, en_gum: 96.18, en_lines: 96.56, en_pud: 96.39,
which is not all that accurate...
Thanks for the note! Would you mind taking the lead on this, i.e. adding some state-of-the-art results for lemmatization and/or stemming? I think the task that you're looking for is morphological reinflection. Note that you need not only the part of speech but also the remaining morphosyntactic features (otherwise the problem is underspecified).
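A small sketch of why the part of speech alone leaves the problem underspecified. The `PARADIGMS` table, the feature labels and the `candidates` helper are hypothetical and hand-written for this example. A real reinflection system would learn the mapping instead of looking it up.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical paradigm table: (lemma, feature set) -> inflected form.
# The feature labels are illustrative, not an official tag set.
PARADIGMS: Dict[Tuple[str, FrozenSet[str]], str] = {
    ("jump", frozenset({"VERB", "PAST"})): "jumped",
    ("jump", frozenset({"VERB", "PRES", "3", "SG"})): "jumps",
    ("jump", frozenset({"VERB", "PRES", "PART"})): "jumping",
    ("jump", frozenset({"NOUN", "PL"})): "jumps",
}


def candidates(lemma: str, features: Set[str]) -> List[str]:
    """Return every form whose feature set contains all requested features."""
    return sorted(
        {
            form
            for (lem, feats), form in PARADIGMS.items()
            if lem == lemma and features <= feats
        }
    )


# Part of speech only: several forms remain possible.
print(candidates("jump", {"VERB"}))
# Part of speech plus tense: a single form remains.
print(candidates("jump", {"VERB", "PAST"}))
```

With only `VERB` requested, the lookup returns three candidate forms. Adding `PAST` narrows it down to one.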