Is it possible to enforce constraints on the transition matrix of the CRF?

Open lkqnaruto opened this issue 2 years ago • 9 comments

Hi

I wonder whether it is possible to enforce constraints on the transition matrix of the CRF. For example, under the BIO scheme, the transition O -> I should never occur in practice, so is it possible to enforce such a constraint on the transition matrix?

Thanks!

lkqnaruto avatar Dec 08 '21 06:12 lkqnaruto
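
To make the constraint in the question concrete, here is a minimal sketch of which transitions the BIO scheme rules out. It is plain Python and does not touch medaCy or any CRF library; the helper names and the `DRUG`/`DOSE` entity types are made up for illustration.

```python
def is_allowed(prev_tag: str, next_tag: str) -> bool:
    """Return True if prev_tag -> next_tag is valid under the BIO scheme."""
    if not next_tag.startswith("I-"):
        # "O" and any "B-*" tag may follow any tag.
        return True
    entity = next_tag[2:]
    # "I-X" may only continue an entity of the same type X.
    return prev_tag in (f"B-{entity}", f"I-{entity}")


def forbidden_transitions(tags):
    """List every (from_index, to_index) pair that BIO rules out."""
    return [
        (i, j)
        for i, prev_tag in enumerate(tags)
        for j, next_tag in enumerate(tags)
        if not is_allowed(prev_tag, next_tag)
    ]


# Hypothetical tag set, used only for illustration.
tags = ["O", "B-DRUG", "I-DRUG", "B-DOSE", "I-DOSE"]
forbidden = forbidden_transitions(tags)
```

Here `forbidden` contains, among others, the pair for O -> I-DRUG. A sequence that starts with an I- tag is also invalid under BIO, but that is a start constraint rather than a transition constraint, so this sketch does not cover it.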

Hello, are you referring to the CRF learner itself, or to either of the CRF layers in the BiLSTM and BERT models?

If the former, we are using sklearn's CRF implementation (see here), so it depends on what functionality that implementation supports.

swfarnsworth avatar Dec 08 '21 18:12 swfarnsworth

I'm referring to the CRF layer in the BiLSTM and BERT models.

lkqnaruto avatar Dec 08 '21 18:12 lkqnaruto

They both use the CRF implemented in pytorch-crf. Does it appear to natively support what you are trying to do?

swfarnsworth avatar Dec 08 '21 18:12 swfarnsworth

I don't think it supports what I want to do. In the pytorch-crf package, the author initializes the transition matrix without any constraints (I think).

lkqnaruto avatar Dec 08 '21 19:12 lkqnaruto

In pytorch-crf, does the row index of the CRF transition matrix represent the current state and the column index the next state, or is it the other way around? I'm quite confused about that.

lkqnaruto avatar Dec 08 '21 19:12 lkqnaruto
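
On the indexing question: as far as I can tell from the pytorch-crf source, the score of a tag sequence adds `transitions[tags[i - 1], tags[i]]` at each step, which would make the row the previous tag and the column the next tag. That is worth checking against whichever version is installed. The pure-Python sketch below only illustrates that convention on a toy matrix; it is not the library's code, and the function name and numbers are made up.

```python
def sequence_score(emissions, transitions, tags):
    """Score one tag sequence.

    transitions[i][j] is the score of moving from tag i to tag j,
    so the row is the previous tag and the column is the current tag.
    """
    score = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1]][tags[t]]
        score += emissions[t][tags[t]]
    return score


# Toy example with two tags: 0 = "O", 1 = "I".
emissions = [[0.0, 0.0], [0.0, 0.0]]
transitions = [
    [0.0, -5.0],  # from O: to O, to I
    [1.0, 2.0],   # from I: to O, to I
]
o_then_i = sequence_score(emissions, transitions, [0, 1])  # reads transitions[0][1]
i_then_o = sequence_score(emissions, transitions, [1, 0])  # reads transitions[1][0]
```

Under this convention, penalizing O -> I means lowering the entry in row O, column I. The toy leaves out separate start and end scores, which pytorch-crf keeps in `start_transitions` and `end_transitions`.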

I am not sure. If you create or obtain a torch-compatible CRF implementation that does what you need, I may be able to discuss with you how to use it in medaCy, or at least point out where in the code base things would need to be changed.

swfarnsworth avatar Dec 08 '21 19:12 swfarnsworth

Thank you. But the CRF layer in medaCy places no constraint on the transition matrix, right? If I want to enforce such a constraint, where should I modify the code?

lkqnaruto avatar Dec 08 '21 19:12 lkqnaruto

None of the CRFs used in medaCy are implemented within medaCy, so the only changes one would make to medaCy code would be replacing the CRF implementations imported from its dependencies with one that does what is wanted.

In other words, you would probably have to modify the pytorch-crf code.

If you are able to get that far, please let me know and we can discuss how to swap that alternative CRF in for those used in medaCy.

swfarnsworth avatar Dec 08 '21 19:12 swfarnsworth

Yeah, I'm currently trying to use medaCy for an NER task on my dataset, but the results are not very good, and I saw some cases like O -> I in the predictions. So I want to use a CRF with some constraints to further improve performance. I think I'm going to modify the pytorch-crf code, but I'm not quite sure how to do it, and I'm confused about the indexing. I hope you can help; thank you in advance.

lkqnaruto avatar Dec 08 '21 20:12 lkqnaruto
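
One common way to get the effect described above is to overwrite the scores of invalid transitions with a large negative value before decoding, so that Viterbi never picks them. The sketch below shows the idea in plain Python under the assumption that `transitions[i][j]` scores moving from tag i to tag j. It is not medaCy or pytorch-crf code; all names, the tag set, and the numbers are hypothetical.

```python
NEG_INF = -1e4  # large negative score that effectively forbids a transition


def is_allowed(prev_tag, next_tag):
    """BIO rule: I-X may only follow B-X or I-X."""
    if not next_tag.startswith("I-"):
        return True
    entity = next_tag[2:]
    return prev_tag in ("B-" + entity, "I-" + entity)


def constrain(transitions, tags):
    """Return a copy of transitions with BIO-invalid entries set to NEG_INF."""
    return [
        [score if is_allowed(tags[i], tags[j]) else NEG_INF
         for j, score in enumerate(row)]
        for i, row in enumerate(transitions)
    ]


def viterbi(emissions, transitions):
    """Return the best path of tag indices for one sequence."""
    num_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emission in emissions[1:]:
        step_scores, step_back = [], []
        for j in range(num_tags):
            # Best previous tag i for landing on tag j at this step.
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            step_back.append(best_i)
            step_scores.append(score[best_i] + transitions[best_i][j] + emission[j])
        score = step_scores
        backpointers.append(step_back)
    # Follow the backpointers from the best final tag.
    path = [max(range(num_tags), key=lambda j: score[j])]
    for step_back in reversed(backpointers):
        path.append(step_back[path[-1]])
    return path[::-1]


# Hypothetical tag set and scores, chosen so the unconstrained path contains O -> I.
tags = ["O", "B-DRUG", "I-DRUG"]
transitions = [[0.0, 0.0, 0.0] for _ in tags]
emissions = [[2.0, 0.0, 0.0], [0.0, 0.5, 1.0], [0.0, 0.0, 1.0]]

unconstrained = [tags[i] for i in viterbi(emissions, transitions)]
constrained = [tags[i] for i in viterbi(emissions, constrain(transitions, tags))]
```

On this toy input the unconstrained path goes O -> I-DRUG, while the constrained one goes through B-DRUG first. With pytorch-crf, the analogous step would presumably be to overwrite the forbidden entries of the CRF module's `transitions` parameter (and the I- entries of `start_transitions`) with a large negative value. If that is done during training, the entries would need to be re-applied after each optimizer step or otherwise kept from being updated. That is an assumption about how one could approach it, not a documented feature of the library.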