(Enhancement) Add "hasCapitalizationInfo" option

Open dlutz2 opened this issue 9 years ago • 0 comments

jitar seems to be tagset-agnostic except for one line in HMMTagger, which assumes that every tag is prefixed with the capitalization info added by the FrequenciesCollector:

String tag = d_numberTags.get(tagNumber).substring(2);

An option controlling whether the first 2 characters are trimmed would allow an arbitrary tagset to be used, and would also support models created in earlier jitar versions.
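A minimal sketch of what such an option could look like, not jitar's actual code. The class `TagLookup`, the method `tagFor`, and the constructor flag are hypothetical names used only for illustration; only the `substring(2)` call and the `hasCapitalizationInfo` name come from this issue. Like the original line, it assumes every prefixed tag is longer than 2 characters.

```java
import java.util.List;

// Hypothetical stand-in for the tag lookup done in HMMTagger.
final class TagLookup {
    // Length of the capitalization prefix that the issue says is trimmed.
    private static final int CAPITALIZATION_PREFIX_LENGTH = 2;

    private final List<String> numberTags;
    private final boolean hasCapitalizationInfo;

    TagLookup(List<String> numberTags, boolean hasCapitalizationInfo) {
        this.numberTags = numberTags;
        this.hasCapitalizationInfo = hasCapitalizationInfo;
    }

    // Returns the tag for a tag number, trimming the prefix only when
    // the model is declared to carry capitalization info.
    String tagFor(int tagNumber) {
        String raw = numberTags.get(tagNumber);
        return hasCapitalizationInfo
                ? raw.substring(CAPITALIZATION_PREFIX_LENGTH)
                : raw;
    }
}
```

With the flag set to true this behaves like the current line; with it set to false the tags are returned untouched, which is what an arbitrary tagset or an older model would need.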

dlutz2, Jul 27 '16 17:07