DSTC10-MOD Why BLEU is greater than 1?

Open LiHui1116 opened this issue 3 years ago • 1 comments

According to Tables 4 and 6 of your paper, Towards Expressive Communication with Internet Memes: A New Multimodal Conversation Dataset and Benchmark, the reported BLEU scores are greater than 1. This contradicts the definition of BLEU, whose values must lie between 0 and 1. I also checked the file task1_score.py and did not find any scaling factor applied in the BLEU calculation. I look forward to your reply.

Jan 03 '22 10:01 LiHui1116

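I noticed that the authors implemented the BLEU score incorrectly in task1_score.py. Specifically, they compute corpus-level BLEU (corpus_bleu from NLTK) on each single (reference, hypothesis) pair and then average the results over the corpus (dividing by the number of samples). That is not a valid corpus BLEU: corpus-level BLEU pools the n-gram match counts and the lengths over all sentences before computing the precisions and the brevity penalty, so an average of per-sentence scores generally gives a different number.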

Btw, I think the BLEU values reported in the paper are multiplied by 100 so that they read as percentages, which would explain scores greater than 1.
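To illustrate both points, here is a minimal sketch in plain Python. It is not the code from task1_score.py and not NLTK's implementation: it is a simplified single-reference BLEU without smoothing, and the helper names (`bleu_stats`, `bleu_from_stats`, `corpus_bleu_pooled`, `mean_sentence_bleu`) and the two toy sentence pairs are made up for this example. It compares pooling the statistics over the corpus with averaging per-pair scores, and shows that the score only exceeds 1 once it is scaled by 100.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_stats(reference, hypothesis, max_n=4):
    """Clipped n-gram matches and hypothesis n-gram totals for orders 1..max_n."""
    matches, totals = [], []
    for n in range(1, max_n + 1):
        ref = ngram_counts(reference, n)
        hyp = ngram_counts(hypothesis, n)
        matches.append(sum(min(count, ref[gram]) for gram, count in hyp.items()))
        totals.append(sum(hyp.values()))
    return matches, totals, len(reference), len(hypothesis)


def bleu_from_stats(matches, totals, ref_len, hyp_len):
    """Unsmoothed BLEU in [0, 1]: brevity penalty times geometric mean precision."""
    if min(matches) == 0:  # no smoothing: any zero precision gives BLEU 0
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / len(matches)
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def corpus_bleu_pooled(pairs, max_n=4):
    """Corpus BLEU: sum the statistics over all pairs, then compute one score."""
    matches, totals = [0] * max_n, [0] * max_n
    ref_len = hyp_len = 0
    for reference, hypothesis in pairs:
        m, t, r, h = bleu_stats(reference, hypothesis, max_n)
        matches = [a + b for a, b in zip(matches, m)]
        totals = [a + b for a, b in zip(totals, t)]
        ref_len += r
        hyp_len += h
    return bleu_from_stats(matches, totals, ref_len, hyp_len)


def mean_sentence_bleu(pairs, max_n=4):
    """The pattern described above: score each pair alone, then average."""
    scores = [bleu_from_stats(*bleu_stats(ref, hyp, max_n)) for ref, hyp in pairs]
    return sum(scores) / len(scores)


# Toy (reference, hypothesis) pairs, invented for illustration only.
pairs = [
    ("the cat sat on the mat".split(), "the cat sat on the mat".split()),
    ("there is a dog in the garden today".split(), "a dog is in the garden".split()),
]

pooled = corpus_bleu_pooled(pairs)
averaged = mean_sentence_bleu(pairs)

print(f"pooled corpus BLEU:     {pooled:.4f}")
print(f"mean of per-pair BLEU:  {averaged:.4f}")
print(f"pooled BLEU as percent: {100 * pooled:.2f}")
```

Both raw scores stay within [0, 1] but differ from each other on this toy data; only the value multiplied by 100 is greater than 1.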

Apr 26 '22 11:04 Tuan-Lee-23
