EQG-RACE
Question about reproducing the result
Thank you for your excellent work and for sharing the code.
I tried to retrain your model, but I can't reproduce the results reported in the paper, perhaps because my experimental settings are not appropriate. My results are below; there is a noticeable gap from the results of the unified model (without BERT / ELMo).
scores:
Bleu_1: 0.33566
Bleu_2: 0.18284
Bleu_3: 0.11764
Bleu_4: 0.07973
However, I notice that my results are close to those of the unified model (without key sentence tagging) in the paper's ablation experiments. I am not sure whether the code's default settings correspond to that ablation configuration, or whether my own experimental settings cause the performance degradation.
Could you please provide the pre-trained checkpoint of the unified model (without BERT / ELMo)? Thank you again!
Same problem. I tried many times, but the BLEU-4 result is still in the range 0.70~0.80.
Hi, how did you calculate your BLEU scores? I used nltk.translate.bleu_score.sentence_bleu with SmoothingFunction().method3, and it reached about 10 without BERT, the same as sacrebleu.sentence_bleu(). When I choose SmoothingFunction().method1 instead, the result matches yours.
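For anyone comparing the two settings, here is a minimal sketch of how the smoothing methods diverge, using NLTK's sentence_bleu; the reference/hypothesis pair is invented for illustration, and the paper's own evaluation script may aggregate at the corpus level instead.

```python
# Minimal sketch: compare sentence-level BLEU-4 under two NLTK smoothing methods.
# The reference/hypothesis pair below is made up for illustration only.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["what", "is", "the", "main", "idea", "of", "the", "passage"]
hypothesis = ["what", "is", "the", "idea", "of", "this", "passage"]

smooth = SmoothingFunction()
for name, fn in [("method1", smooth.method1), ("method3", smooth.method3)]:
    # weights=(0.25, 0.25, 0.25, 0.25) gives the standard BLEU-4 score
    score = sentence_bleu([reference], hypothesis,
                          weights=(0.25, 0.25, 0.25, 0.25),
                          smoothing_function=fn)
    print(f"BLEU-4 with {name}: {score:.5f}")
```

method1 replaces zero n-gram counts with a tiny epsilon, while method3 (NIST geometric smoothing) assigns 1/2^k to the k-th zero count, which is far less punishing, so it typically yields noticeably higher sentence-level scores. This choice alone can account for a gap of several BLEU-4 points between otherwise identical outputs.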