paraphraseGen Evaluation


Open gissemari opened this issue 5 years ago • 1 comment

Hi! Could you give us more details on how you prepare the test outputs before computing the metrics? I'm using MultEval and I'm getting the results below for Quora 50k (METEOR is slightly different and TER differs by ~13 points):

n=5       BLEU (s_sel/s_opt/p)  METEOR (s_sel/s_opt/p)  TER (s_sel/s_opt/p)
baseline  17.6 (0.4/0.1/-)      19.6 (0.2/0.0/-)        74.4 (0.4/0.3/-)

Apr 10 '19 14:04 gissemari

Hello,

I got results similar to yours. Have you figured it out? I also ran their code on the MS COCO dataset, and the results are terrible. Have you tried running the code on MS COCO?

Jan 17 '21 19:01 jackyuanjie1990
