torcheval
fix: correct reference length calculation
Summary
This PR fixes the way the brevity penalty (specifically the effective reference corpus length) is calculated in BLEU.
Previously, `len_reference` was calculated as `min([len(ref) for ref in references_tokenized])`. However, this is incorrect: according to the original BLEU paper, we need the "best match length" (the reference length closest to the candidate length), not the minimum reference length. For more information, see the Wikipedia section on the brevity penalty and the NLTK implementation; a sketch of the rule is shown below.
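For illustration, here is a minimal sketch of the "best match length" rule and the resulting brevity penalty, following the NLTK convention of breaking ties toward the shorter reference. The function names are hypothetical and this is not the actual torcheval code.

```python
import math

def closest_ref_length(references_tokenized, len_candidate):
    # "Best match length": the reference length closest to the candidate
    # length; ties are broken toward the shorter reference (NLTK behaviour).
    return min(
        (len(ref) for ref in references_tokenized),
        key=lambda ref_len: (abs(ref_len - len_candidate), ref_len),
    )

def brevity_penalty(len_reference, len_candidate):
    # BP = 1 if the candidate is longer than the effective reference length,
    # otherwise exp(1 - r / c), as in Papineni et al. (2002).
    if len_candidate > len_reference:
        return 1.0
    return math.exp(1.0 - len_reference / len_candidate)
```

For example, with reference lengths 5 and 9 and a candidate of length 8, the old rule takes 5 (so BP = 1), while the best match length is 9, giving BP = exp(1 - 9/8) ≈ 0.88.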
Test plan
I added another unit test to `test_bleu.py` and compared the results of the calculations to those of the `nltk.translate.bleu_score.corpus_bleu` function to make sure the implementation is correct, as sketched below.
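As a rough illustration of that cross-check (not the actual test in `test_bleu.py`, and assuming the functional API `torcheval.metrics.functional.bleu_score` with made-up sentences), the comparison could look like:

```python
import math

from nltk.translate.bleu_score import corpus_bleu
from torcheval.metrics.functional import bleu_score

candidates = ["the cat is on the mat today"]
references = [["the cat is on the mat", "there is a big cat on the mat"]]

# NLTK expects pre-tokenized input: a list of reference lists and a list of
# hypotheses, each given as token lists.
nltk_score = corpus_bleu(
    [[ref.split() for ref in refs] for refs in references],
    [cand.split() for cand in candidates],
)

# torcheval's functional BLEU is assumed here to take the candidate strings
# and their references directly and to return a tensor.
torcheval_score = float(bleu_score(candidates, references))

assert math.isclose(nltk_score, torcheval_score, rel_tol=1e-4)
```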
@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hi, is there a way for me to see why the test is failing so that I can fix the problem?
Hi @yuxqiu, thanks for this contribution! It seems some files unrelated to BLEU have been reformatted in a way that causes our linter to error; do you mind undoing those changes?
@JKSenthil I've finished undoing all those changes.
@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hi @yuxqiu, thanks for reverting! We have identified the linter issue as being on our end; we'll land a fix first and then rerun these tests :)
@JKSenthil has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.