universal-pos-tags

Spanish (eagle) and French (paris) Mappings updated based on rules, plus Crabbe and Candito FR POS tag mappings

Open • sujitpal opened this issue 8 years ago • 1 comment

I was trying to convert the POS tags in the Freeling dictionaries [1] for Spanish and French words to the universal tags, using the es-eagle.map and fr-paris.map files respectively, and I found many words whose POS tags were missing from those files. So I used the Spanish (eagle) and French (paris) tagset descriptions on the Freeling website [2, 3] to generate an exhaustive list of POS tag mappings for these two tagsets. The code to do so is pretty trivial, but I have included it in the generators subdirectory.

[1] http://nlp.lsi.upc.edu/freeling/node/12
[2] https://www.sketchengine.co.uk/spanish-freeling-part-of-speech-tagset/
[3] https://www.sketchengine.co.uk/french-freeling-part-of-speech-tagset/
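For readers who want a feel for the rule-based approach, here is a minimal sketch. It is not the code in the generators subdirectory. It assumes that the first character of an EAGLES-style tag names its category, that each category maps to a single universal tag, and that a .map file holds one `tag<TAB>universal` pair per line. The table `EAGLES_PREFIX_TO_UNIVERSAL` and the helpers `map_tag` and `to_map_lines` are hypothetical names chosen for illustration.

```python
# Assumption: the first letter of an EAGLES tag identifies its category.
# The category-to-universal choices below are illustrative and are not
# copied from es-eagle.map.
EAGLES_PREFIX_TO_UNIVERSAL = {
    "A": "ADJ",   # adjective
    "C": "CONJ",  # conjunction
    "D": "DET",   # determiner
    "F": ".",     # punctuation
    "I": "X",     # interjection
    "N": "NOUN",  # noun
    "P": "PRON",  # pronoun
    "R": "ADV",   # adverb
    "S": "ADP",   # adposition
    "V": "VERB",  # verb
    "W": "NUM",   # date (assumed to behave like a numeral)
    "Z": "NUM",   # number
}


def map_tag(tag, default="X"):
    """Map one fine-grained tag to a universal tag using its first letter."""
    return EAGLES_PREFIX_TO_UNIVERSAL.get(tag[:1].upper(), default)


def to_map_lines(tags):
    """Return sorted, de-duplicated 'tag<TAB>universal' lines."""
    return [f"{tag}\t{map_tag(tag)}" for tag in sorted(set(tags))]


# Example: a handful of tags as they might appear in a dictionary dump.
for line in to_map_lines(["NCMS000", "VMIP3S0", "AQ0CS0", "SPS00"]):
    print(line)
```

Enumerating every tag listed in the tagset description and passing the list through `to_map_lines` would give an exhaustive mapping. Unknown prefixes fall back to `X` so that gaps are visible rather than silently dropped.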

sujitpal, Oct 25 '17 20:10

I also needed the Crabbe + Candito FR POS tags in order to convert the tags produced by the pretrained OpenNLP French POSTagger to Universal POS tags. I found a mapping contributed by @duhaime on this SO page, https://stackoverflow.com/questions/27513185/simplifying-the-french-pos-tag-set-with-nltk, and converted it to .map format.
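The conversion from a Python dictionary to .map format could look roughly like the sketch below. It assumes a .map file is plain text with one `tag<TAB>universal` pair per line. `SAMPLE_FR_MAPPING` is a small illustrative sample, not the mapping from the SO page, and `write_map`, `read_map`, and `convert` are hypothetical helper names.

```python
import io

# Illustrative sample only; the real mapping has many more entries.
SAMPLE_FR_MAPPING = {
    "NC": "NOUN",   # common noun
    "NPP": "NOUN",  # proper noun
    "V": "VERB",
    "ADJ": "ADJ",
    "P": "ADP",     # preposition
    "PONCT": ".",   # punctuation
}


def write_map(mapping, out):
    """Write the mapping as tab-separated lines, sorted by source tag."""
    for tag in sorted(mapping):
        out.write(f"{tag}\t{mapping[tag]}\n")


def read_map(lines):
    """Parse tab-separated lines back into a dict, skipping blank lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tag, universal = line.split("\t", 1)
        mapping[tag] = universal
    return mapping


def convert(tagged_words, mapping, default="X"):
    """Replace each fine-grained tag with its universal tag."""
    return [(word, mapping.get(tag, default)) for word, tag in tagged_words]


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
write_map(SAMPLE_FR_MAPPING, buffer)
loaded = read_map(buffer.getvalue().splitlines())
print(convert([("chat", "NC"), ("dort", "V")], loaded))
```

Tags that are absent from the mapping come back as `X`, which makes it easy to spot tagger output that the .map file does not cover yet.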

sujitpal, Nov 10 '17 22:11