BERT-NER

Which performs better: CRF loss or softmax loss?

Open OYE93 opened this issue 5 years ago • 12 comments

Hello, thanks for your work. I have a question about loss functions: is there any difference in performance between crf_loss and softmax_loss? Thanks.
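For readers unfamiliar with the two losses, here is a minimal sketch in plain Python of how they differ. It is not the repository's TensorFlow code: the function names, the toy emission scores, and the `learned` transition matrix are all hypothetical, and the CRF here has no start or end scores. Softmax loss scores each token independently, while CRF loss scores the whole tag sequence and adds tag-to-tag transition scores.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def softmax_loss(emissions, tags):
    """Sum of per-token cross-entropy; each token is scored independently."""
    return sum(logsumexp(scores) - scores[tag] for scores, tag in zip(emissions, tags))

def crf_loss(emissions, transitions, tags):
    """Negative log-likelihood of the whole tag sequence under a linear-chain CRF."""
    # Score of the gold path: emission scores plus transition scores.
    gold = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        gold += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    # Forward algorithm: alpha[j] is the log-sum of scores of all paths ending in tag j.
    alpha = list(emissions[0])
    num_tags = len(alpha)
    for t in range(1, len(emissions)):
        alpha = [
            logsumexp([alpha[i] + transitions[i][j] for i in range(num_tags)]) + emissions[t][j]
            for j in range(num_tags)
        ]
    return logsumexp(alpha) - gold

# Toy example: 3 tokens, 3 tags (all values are made up for illustration).
emissions = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.5, 0.2, 1.0]]
tags = [0, 1, 2]
zero_transitions = [[0.0] * 3 for _ in range(3)]
learned = [[0.5, 0.2, -2.0], [0.0, 0.5, 0.3], [0.1, 0.0, 0.5]]  # hypothetical values

print(softmax_loss(emissions, tags))
print(crf_loss(emissions, zero_transitions, tags))  # matches softmax loss up to rounding
print(crf_loss(emissions, learned, tags))
```

With all transition scores set to zero, the CRF loss reduces to the softmax loss, so any difference in practice comes from the learned transitions and from decoding the sequence jointly.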

OYE93 avatar Jul 25 '19 03:07 OYE93

I ran the code with both crf_loss and softmax_loss, using the default parameters for each. The results are:

**--crf=False**: [image: Selection_041]

**--crf=True**: [image: Selection_040]

OYE93 avatar Aug 06 '19 02:08 OYE93

Hey, I have a question about the dataset. The first row is the word and the fourth row is the label; what do the second and third rows mean? Also, in the output label_test.txt, the second and third rows are the same. Do they have a different meaning?

zwd13122889 avatar Oct 21 '19 04:10 zwd13122889

Hi, I think you mean columns. I guess the 2nd column is the part-of-speech (POS) tag and the 3rd column is word segmentation. In label_test.txt, the 3rd column should be the predicted tags, so the 2nd and 3rd columns are not necessarily the same. You can compare the 2nd and 3rd columns to evaluate the predictions.
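The column comparison described above could be sketched as follows. This assumes each non-blank line of label_test.txt is whitespace-separated with the gold tag in the 2nd column and the predicted tag in the 3rd; the function name `token_accuracy` and the sample lines are hypothetical.

```python
def token_accuracy(lines):
    """Fraction of tokens whose predicted tag (3rd column) equals the gold tag (2nd column)."""
    total = correct = 0
    for line in lines:
        parts = line.split()
        if len(parts) < 3:  # skip blank sentence separators and malformed lines
            continue
        total += 1
        correct += parts[1] == parts[2]
    return correct / total if total else 0.0

# Made-up sample in the assumed "word gold predicted" layout.
sample = [
    "John B-PER B-PER",
    "lives O O",
    "in O O",
    "Paris B-LOC O",
    "",
]
print(token_accuracy(sample))  # 3 of 4 tokens match: 0.75
```

To run it on a real file, pass an open file handle, e.g. `token_accuracy(open("label_test.txt"))`. Note that this is token-level accuracy, which is a rougher measure than the entity-level precision, recall, and F1 usually reported for NER.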

OYE93 avatar Oct 21 '19 05:10 OYE93

Thank you, I got it. Another question: if my dataset doesn't have the 2nd column (POS) and the 3rd column (word segmentation), can this model still run? I ask because I don't see any handling of the 2nd and 3rd columns in BERT_NER.py.

zwd13122889 avatar Oct 21 '19 05:10 zwd13122889

Of course. Only the 1st and 4th columns are necessary. Just convert your data to the 2-column format, and then you can use the code for training and testing.
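A minimal sketch of that conversion, assuming space-separated lines with the word in the first field and the label in the last, and blank lines between sentences. The function name `to_two_columns` and the sample lines are hypothetical, and whether BERT_NER.py needs any further formatting is not confirmed here.

```python
def to_two_columns(lines):
    """Keep the first (word) and last (label) field of each line; preserve blank lines."""
    out = []
    for line in lines:
        parts = line.split()
        if not parts:
            out.append("")  # a blank line marks a sentence boundary
        else:
            out.append(f"{parts[0]} {parts[-1]}")
    return out

# Made-up sample in a 4-column "word POS chunk label" layout.
sample = ["EU NNP B-NP B-ORG", "rejects VBZ B-VP O", "", "Peter NNP B-NP B-PER"]
print("\n".join(to_two_columns(sample)))
```

Because it keeps only the first and last fields, the same function leaves data that is already in 2-column form unchanged apart from whitespace normalization.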

OYE93 avatar Oct 21 '19 05:10 OYE93

Thank you very much!!!

zwd13122889 avatar Oct 21 '19 05:10 zwd13122889

:)

OYE93 avatar Oct 21 '19 05:10 OYE93

Excuse me, I have another question. Where does label_test.txt come from? Is it written by hand or generated by the program?

zwd13122889 avatar Oct 21 '19 12:10 zwd13122889

[image: WeChat screenshot 20191021215821] Did this program run successfully?

zwd13122889 avatar Oct 21 '19 14:10 zwd13122889

label_test.txt is generated by the program, and the run seems successful. Now you can run this on your own dataset.

OYE93 avatar Oct 22 '19 07:10 OYE93

OK, I ran it on my own data, but I have a problem, shown in the picture: [image: WeChat screenshot 20191030151556] The left is the author's data; the right is mine.

zwd13122889 avatar Oct 30 '19 07:10 zwd13122889

I just found out that crf=False does not work: the CRF layer is always used.

I raised an issue:

https://github.com/kyzhouhzau/BERT-NER/issues/88

gungor2 avatar Sep 06 '20 20:09 gungor2