
Problem when evaluating with the BLEU metric

Open yangjingyi opened this issue 3 years ago • 0 comments

Hi,

In the function EvalStrs(pred_strs, golds) in utils.py, I am not sure if the use of bleu_score(candidate, references) is correct. I checked the torchtext documentation: the inputs to bleu_score should be an iterable of candidate translations and an iterable of iterables of reference translations. But in the current code, the inputs look like [['a', 'b'], ['c', 'd']] and [['e', 'f'], ['g', 'h']], i.e., the references are missing one level of nesting. When I test with two identical inputs, the BLEU score is 0, which suggests something is wrong. Are the inputs to bleu_score correct during evaluation, or is there a problem with my understanding?
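To illustrate the nesting issue concretely, here is a minimal, self-contained corpus-level BLEU sketch (a simplified re-implementation, not torchtext's actual code) that uses the input shape the torchtext docs describe: each candidate is a list of tokens, and each candidate is paired with a *list* of reference token lists. With the correct nesting, identical candidate and reference give a score of 1.0; the flattened shape in the current code drops that extra list level.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidate_corpus, references_corpus, max_n=4):
    """Corpus-level BLEU with uniform weights and brevity penalty.

    candidate_corpus:  iterable of candidates, each a list of tokens.
    references_corpus: iterable of *lists of* reference token lists,
                       one inner list per candidate (the nesting that
                       torchtext's bleu_score expects).
    """
    clipped, total = Counter(), Counter()
    cand_len = ref_len = 0
    for cand, refs in zip(candidate_corpus, references_corpus):
        cand_len += len(cand)
        # Effective reference length: the reference closest in length.
        ref_len += min((len(r) for r in refs),
                       key=lambda L: (abs(L - len(cand)), L))
        for n in range(1, max_n + 1):
            cand_counts = Counter(ngrams(cand, n))
            # Clip each n-gram count by its max count over all references.
            max_ref = Counter()
            for ref in refs:
                for g, c in Counter(ngrams(ref, n)).items():
                    max_ref[g] = max(max_ref[g], c)
            for g, c in cand_counts.items():
                clipped[n] += min(c, max_ref[g])
                total[n] += c
    if any(clipped[n] == 0 for n in range(1, max_n + 1)):
        return 0.0  # some n-gram order has zero matches
    log_prec = sum(math.log(clipped[n] / total[n])
                   for n in range(1, max_n + 1)) / max_n
    bp = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return bp * math.exp(log_prec)

# Correct nesting: one list of references per candidate.
cands = [['a', 'b', 'c', 'd', 'e']]
refs = [[['a', 'b', 'c', 'd', 'e']]]   # note the extra level
print(simple_bleu(cands, refs))        # 1.0 for identical inputs
```

With the flattened shape from the current code, e.g. refs = [['a', 'b', 'c', 'd', 'e']], each single token would be treated as an entire reference sentence, so higher-order n-grams never match and the score collapses to 0, which matches what I observe.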

Thanks.

yangjingyi · Oct 08 '21 02:10