natural
POS Tagger accuracy?
Do you have data on POS tagging performance?
- en-pos is 96.43% https://github.com/FinNLP/en-pos
- Stanford POS is around 97% https://nlp.stanford.edu/~manning/papers/CICLing2011-manning-tagging.pdf
No, but we can run it on a corpus to see how it performs. Do you have a suggestion for such a data set?
Hugo
Good question. Never ran tests myself, but the test collections are referenced here https://aclweb.org/aclwiki/POS_Tagging_(State_of_the_art)#Test_collections
There are problematic tags (for example, "digital marketing is great" will tag "marketing" as a verb), but in a large corpus it should not matter too much. In smaller texts, it will be problematic.
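For anyone who wants to run the corpus check suggested above, here is a minimal sketch of a token-level accuracy measurement. It is not part of natural: `evaluateTagger`, `tagFn`, and the toy data are all made-up names for illustration. It assumes the gold corpus is already loaded as arrays of `[token, tag]` pairs and that the tagger is wrapped in a function that takes an array of tokens and returns an array of tags of the same length.

```javascript
// Token-level accuracy: fraction of tokens whose predicted tag equals the gold tag.
// `tagFn` is a hypothetical adapter (tokens in, tags out); wrap whichever
// tagger you want to evaluate so that it matches this shape.
function evaluateTagger(tagFn, goldSentences) {
  let correct = 0;
  let total = 0;
  const confusions = new Map(); // "gold->predicted" => count of that mistake

  for (const sentence of goldSentences) {
    const tokens = sentence.map(([token]) => token);
    const predicted = tagFn(tokens);
    if (predicted.length !== tokens.length) {
      throw new Error(
        'Tagger returned ' + predicted.length + ' tags for ' + tokens.length + ' tokens'
      );
    }
    sentence.forEach(([, goldTag], i) => {
      total += 1;
      if (predicted[i] === goldTag) {
        correct += 1;
      } else {
        const key = goldTag + '->' + predicted[i];
        confusions.set(key, (confusions.get(key) || 0) + 1);
      }
    });
  }

  const accuracy = total === 0 ? 0 : correct / total;
  return { correct, total, accuracy, confusions };
}

// Toy gold data, made up for illustration (Penn Treebank-style tags).
const gold = [
  [['digital', 'JJ'], ['marketing', 'NN'], ['is', 'VBZ'], ['great', 'JJ']],
];

// Stand-in tagger that reproduces the "marketing as a verb" mistake.
const toyTags = { digital: 'JJ', marketing: 'VBG', is: 'VBZ', great: 'JJ' };
const toyTagger = (tokens) => tokens.map((t) => toyTags[t] || 'NN');

const result = evaluateTagger(toyTagger, gold);
console.log(result.accuracy);                // 0.75
console.log([...result.confusions.entries()]); // [ [ 'NN->VBG', 1 ] ]
```

The confusion counts are there because a single accuracy figure hides which tags go wrong; sorting them by count on a real test collection should show quickly whether noun/verb mix-ups like the one above are common or rare.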
@giorgio79 did you ever get any updates on this? I'd also like to use an accurate POS tagger.