Reproducing the CoNLL2003 Results

Open EmanuelaBoros opened this issue 3 years ago • 8 comments

I was not able to reproduce the results reported in the ACL paper for CoNLL 2003. Would it be possible to share the reproduction script for this dataset as well? Thanks.

EmanuelaBoros avatar Mar 24 '21 11:03 EmanuelaBoros

My guess is that the performance is computed and reported on the dev set instead of the test set. Any updates on the script? Thanks!

EmanuelaBoros avatar Apr 30 '21 11:04 EmanuelaBoros

We achieved a 96.5+ F1 score on the dev set and 93.30 F1 on the test set. Please use our released data files link for CoNLL2003. Many thanks!

xiaoya-li avatar Jul 27 '21 13:07 xiaoya-li

We achieved a 96.5+ F1 score on the dev set and 93.30 F1 on the test set. Please use our released data files link for CoNLL2003. Many thanks!

I could not reach the best F1 score using the CoNLL2003 dataset you released. Instead, the best result I got is a 91.71 F1 score. I think the reason is an incorrect hyperparameter setting. Because of limited computing power, I cannot reach the best result in the paper. So could you please share the reproduction script for CoNLL2003 as well? Thanks very much.

lin-whale avatar Jul 30 '21 02:07 lin-whale

Hello @xiaoya-li, my results with your released data files for CoNLL2003 are similar to those of @Lilin-whale. Any updates on the script?

EmanuelaBoros avatar Aug 02 '21 14:08 EmanuelaBoros

I found that the dataset is different from the original CoNLL data. Did you do extra preprocessing? I found some modifications, such as lowercasing of some letters.

shizhediao avatar Sep 20 '21 05:09 shizhediao

Please use ./scripts/mrc_ner/reproduce/conll03.sh for reproducing our experimental results. MRC-NER format datasets for CoNLL03 are available at link. Thanks.

xiaoya-li avatar Sep 22 '21 17:09 xiaoya-li

Hi @xiaoya-li, I found an issue similar to the one @shizhediao reported. The conll2003 dataset in the provided link is different from the originally released version. The differences are not in the formatting, but in the total number of tags and in lowercased letters. May I ask which version of conll2003 you used?
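
A minimal sketch of how such differences could be quantified, assuming both versions are first exported to CoNLL-style text (one "token ... tag" line per token, blank line between sentences) and that sentences and tokens line up position by position. The helper names parse_conll, tag_counts and case_only_diffs are hypothetical and not part of the repository, and the two toy strings are invented for illustration, not taken from either dataset.

```python
from collections import Counter


def parse_conll(text):
    """Parse CoNLL-style text into a list of sentences of (token, tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and -DOCSTART- markers both end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # First column is the token, last column is the NER tag.
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def tag_counts(sentences):
    """Count every non-O tag across all sentences."""
    return Counter(tag for sent in sentences for _, tag in sent if tag != "O")


def case_only_diffs(original, released):
    """Return token pairs that differ only in letter case (assumes aligned data)."""
    diffs = []
    for sent_a, sent_b in zip(original, released):
        for (tok_a, _), (tok_b, _) in zip(sent_a, sent_b):
            if tok_a != tok_b and tok_a.lower() == tok_b.lower():
                diffs.append((tok_a, tok_b))
    return diffs


# Toy data, invented for illustration only.
ORIGINAL = "EU B-ORG\nrejects O\nGerman B-MISC\ncall O\n\nPeter B-PER\nBlackburn I-PER\n"
RELEASED = "EU B-ORG\nrejects O\ngerman O\ncall O\n\nPeter B-PER\nBlackburn I-PER\n"

original = parse_conll(ORIGINAL)
released = parse_conll(RELEASED)

# Tags present in the original but missing from the released version.
missing_tags = tag_counts(original) - tag_counts(released)
lowercased = case_only_diffs(original, released)

print("Tags missing in released version:", dict(missing_tags))
print("Tokens differing only by case:", lowercased)
```

If the two versions do not have the same number of sentences or tokens, the positional comparison in case_only_diffs is no longer meaningful and the data would need to be aligned first.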

xuuuluuu avatar Feb 11 '22 08:02 xuuuluuu

@EmanuelaBoros I have run into the same problem. Did you solve it later? I found that the conll03 dataset has some problems.

Senwang98 avatar Feb 22 '22 02:02 Senwang98