mindmeld
Add sequence probability support for PyTorch CRF model
For active learning, there are a couple of strategies that we currently use, and they rely on the CRFsuite model's marginal probabilities. Studies have shown that returning sequence-level probabilities instead of token-level marginal probabilities works much better, and this is something that can be implemented in a future release. To support it, the decoding function has to change: in addition to the best transition score (t, j) and the corresponding backward link for each transition, we'll have to store the top-n transition scores and the corresponding n backward links, and then trace all n paths, which yields the n-best sequences and their probabilities.
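The bookkeeping described above can be sketched as follows. This is a minimal illustration in plain Python, not MindMeld's or any CRF library's actual API: the function names `n_best_viterbi` and `log_partition` are hypothetical, the model is assumed to expose per-token emission scores and a tag-to-tag transition matrix in log space, and start/end transition scores are left out for brevity. Each sequence score is normalized by the partition function from the forward algorithm to obtain a sequence-level probability.

```python
import heapq
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emissions, transitions):
    """Forward algorithm: log of the total score over all tag sequences."""
    num_tags = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            emissions[t][j]
            + log_sum_exp([alpha[i] + transitions[i][j] for i in range(num_tags)])
            for j in range(num_tags)
        ]
    return log_sum_exp(alpha)


def n_best_viterbi(emissions, transitions, n):
    """Return up to n (tag_sequence, probability) pairs, best first.

    Assumed inputs (hypothetical layout): emissions[t][j] is the score of
    tag j at position t, transitions[i][j] the score of moving from tag i
    to tag j. Start/end transitions are omitted.
    """
    num_tags = len(emissions[0])
    # For every tag j, keep up to n entries (score, previous_tag, previous_rank).
    # The (previous_tag, previous_rank) pair is the backward link.
    beams = [[(emissions[0][j], None, None)] for j in range(num_tags)]
    history = [beams]
    for t in range(1, len(emissions)):
        new_beams = []
        for j in range(num_tags):
            candidates = [
                (score + transitions[i][j] + emissions[t][j], i, rank)
                for i in range(num_tags)
                for rank, (score, _, _) in enumerate(beams[i])
            ]
            # Store the top-n scores instead of only the single best one.
            new_beams.append(heapq.nlargest(n, candidates, key=lambda c: c[0]))
        beams = new_beams
        history.append(beams)

    # Pick the n best final entries across all tags.
    finals = [
        (score, j, rank)
        for j in range(num_tags)
        for rank, (score, _, _) in enumerate(beams[j])
    ]
    finals = heapq.nlargest(n, finals, key=lambda c: c[0])

    log_z = log_partition(emissions, transitions)
    results = []
    for score, tag, rank in finals:
        # Trace the backward links from the last position to the first.
        path = []
        for t in range(len(emissions) - 1, -1, -1):
            path.append(tag)
            _, tag, rank = history[t][tag][rank]
        results.append((path[::-1], math.exp(score - log_z)))
    return results


# Toy example: 3 tokens, 2 tags, made-up scores.
emissions = [[1.0, 0.2], [0.3, 1.5], [0.9, 0.3]]
transitions = [[0.5, -0.2], [-0.4, 0.8]]
for tags, prob in n_best_viterbi(emissions, transitions, 2):
    print(tags, round(prob, 4))
```

Keeping n entries per tag at every position is enough to recover the exact n-best sequences, at roughly n times the cost of standard Viterbi decoding. An active learning strategy could then use, for example, the top sequence's probability or the gap between the two best sequences as its uncertainty signal.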
Issues and resources with the previous CRFSuite package:
https://github.com/chokkan/crfsuite/issues/84
https://github.com/TeamHG-Memex/eli5/blob/417358f0ece0587fa18e9926756b7a9842776b72/eli5/sklearn_crfsuite/explain_weights.py#L68