paraphraseGen
Evaluation
Hi!
Could you give us more details on how you prepare the test outputs for computing the metrics? I'm using MULTEVAL and I'm getting these results for Quora 50k (METEOR is slightly different, and TER differs by ~13 units):
n=5       BLEU (s_sel/s_opt/p)   METEOR (s_sel/s_opt/p)   TER (s_sel/s_opt/p)
baseline  17.6 (0.4/0.1/-)       19.6 (0.2/0.0/-)         74.4 (0.4/0.3/-)
Hello,
I got results similar to yours. Have you figured it out? I also ran their code on the MS COCO dataset, and the results were terrible. Have you tried running the code on MS COCO?