crfsuite
Question: Is "n-best" tagging possible with CRFSuite?
The Wapiti CRF toolkit has a neat feature called N-best Viterbi output, which returns the n best label sequences for an input sequence. Is there similar functionality in crfsuite?
Thanks for your hints!
CRFSuite does not support n-best output. The decoder algorithm is Viterbi, which does not appear too difficult to extend to n-best (especially for short sequences).
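To illustrate the idea, here is a minimal pure-Python sketch of an n-best extension of Viterbi. It is not CRFSuite code: the function name `nbest_viterbi` and the `emissions`/`transitions` score tables are hypothetical, and the scores are assumed to be log-domain values that add along a path. Instead of one best hypothesis per label and position, it keeps up to n.

```python
import heapq


def nbest_viterbi(emissions, transitions, n):
    """Return the n highest-scoring label sequences as (score, path) pairs.

    emissions[t][y]   -- score of label y at position t (assumed log domain)
    transitions[a][b] -- score of moving from label a to label b
    """
    num_labels = len(emissions[0])
    # beams[y] holds up to n best (score, path) hypotheses ending in label y.
    beams = [[(emissions[0][y], (y,))] for y in range(num_labels)]
    for t in range(1, len(emissions)):
        new_beams = []
        for y in range(num_labels):
            # Extend every surviving hypothesis with label y at position t.
            candidates = (
                (score + transitions[prev][y] + emissions[t][y], path + (y,))
                for prev in range(num_labels)
                for score, path in beams[prev]
            )
            new_beams.append(heapq.nlargest(n, candidates))
        beams = new_beams
    # Merge the per-label beams and keep the overall n best.
    return heapq.nlargest(n, (hyp for beam in beams for hyp in beam))


# Toy example: 3 positions, 2 labels (0 and 1), made-up scores.
emissions = [[0.0, -1.0], [-0.5, -0.2], [-1.0, 0.0]]
transitions = [[-0.1, -2.0], [-2.0, -0.1]]
top3 = nbest_viterbi(emissions, transitions, 3)
for score, path in top3:
    print(path, round(score, 2))
```

Keeping n hypotheses per label and position costs roughly n times more than plain Viterbi, which is why this is cheapest for short sequences.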
Did you manage to get meaningful n-best outputs with Wapiti on your data? I looked at it a while ago and found that on my data (NER) the n-best outputs do not always make sense.
How about looking at the marginal probabilities for all possible labels in a given position and picking the best n values? That functionality exists in the Python wrapper as pycrfsuite.Tagger.marginal(), so I presume it also exists in CRFSuite itself.
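@ZmeiGorynych Unfortunately, marginals are not enough to compute the n-best sequence taggings: per-position marginals do not tell you how likely a combination of labels is as a whole sequence.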
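A small made-up example of why: the sketch below assumes a hypothetical joint distribution over two-token label sequences (the `joint` table and label names are invented for illustration, not taken from any model) and compares the true ranking of sequences with the ranking obtained by multiplying per-position marginals.

```python
from itertools import product

# Hypothetical joint probabilities of complete two-token label sequences.
joint = {
    ("A", "A"): 0.40,
    ("B", "B"): 0.35,
    ("A", "B"): 0.25,
    ("B", "A"): 0.00,
}
labels = ("A", "B")


def marginal(label, position):
    """Probability that `position` carries `label`, summed over all sequences."""
    return sum(p for seq, p in joint.items() if seq[position] == label)


# True n-best list: sequences ranked by their joint probability.
true_ranking = sorted(joint, key=joint.get, reverse=True)

# Marginal-based ranking: sequences ranked by the product of their marginals.
marginal_ranking = sorted(
    product(labels, repeat=2),
    key=lambda seq: marginal(seq[0], 0) * marginal(seq[1], 1),
    reverse=True,
)

print("true 2-best:    ", true_ranking[:2])
print("marginal 2-best:", marginal_ranking[:2])
```

Here the marginal-based ranking puts ("A", "B") first, although it is only the third most probable sequence, and it misses ("B", "B") from the top two entirely.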