lemmatizer-pl

Train disambiguation model for Polimorf tagset

Open dzieciou opened this issue 4 years ago • 0 comments

We have a Polimorf-compatible dictionary, but we do not have training data for it. The original training data uses the NKJP tagset.

However, http://clip.ipipan.waw.pl/NationalCorpusOfPolish provides the NKJP corpus reanalyzed with the Polimorf tagset. That could serve as one training set. Another option is to convert the PolEval training dataset to the Polimorf tagset.
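
For the second option, a rough sketch of what the conversion step might look like, assuming the training data can be loaded as sentences of (form, lemma, tag) triples and that a tag-to-tag mapping is prepared separately. The `convert_tags` helper and the `SRC_TAG_*` / `DST_TAG_*` values are hypothetical placeholders, not real NKJP or Polimorf tags, and the actual mapping may not be one-to-one.

```python
from collections import Counter
from typing import Dict, List, Tuple

# One token: (surface form, lemma, morphosyntactic tag).
Token = Tuple[str, str, str]
Sentence = List[Token]


def convert_tags(
    sentences: List[Sentence], tag_map: Dict[str, str]
) -> Tuple[List[Sentence], Counter]:
    """Rewrite the tag of every token using tag_map.

    Sentences containing a tag that is missing from tag_map are dropped,
    and the missing tags are counted so the mapping can be extended.
    """
    converted: List[Sentence] = []
    unmapped: Counter = Counter()
    for sentence in sentences:
        new_sentence: Sentence = []
        complete = True
        for form, lemma, tag in sentence:
            new_tag = tag_map.get(tag)
            if new_tag is None:
                unmapped[tag] += 1
                complete = False
            else:
                new_sentence.append((form, lemma, new_tag))
        if complete:
            converted.append(new_sentence)
    return converted, unmapped


# Placeholder mapping: these are NOT real tags, only stand-ins.
example_map = {"SRC_TAG_A": "DST_TAG_A", "SRC_TAG_B": "DST_TAG_B"}
example_data = [
    [("w1", "l1", "SRC_TAG_A"), ("w2", "l2", "SRC_TAG_B")],
    [("w3", "l3", "SRC_TAG_C")],  # tag with no mapping yet
]
converted, unmapped = convert_tags(example_data, example_map)
```

Dropping incomplete sentences keeps the converted set consistent for training, and the `unmapped` counter shows which source tags still need a mapping.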

dzieciou, Sep 19 '19 05:09