nlg-eval Fix calculation error when ref is empty

Open voidful opened this issue 5 years ago • 3 comments

This pull request fixes an error that occurs when any of the references is empty.

When the number of references per hypothesis is inconsistent, I pad the shorter lists with an empty string, which causes an error:

from nlgeval import NLGEval

n = NLGEval()  # loads the metric models
scores = n.compute_metrics(
    ref_list=[
        [
            "this is one reference sentence for sentence1",
            "",  # empty string used as padding; this triggers the error
        ],
        [
            "this is one more reference sentence for sentence1",
            "this is the second reference sentence for sentence2",
        ],
    ],
    hyp_list=[
        "this is the model generated sentence1 which seems good enough",
        "this is sentence2 which has been generated by your model",
    ],
)
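
For context, the padding step described above might look like the following minimal sketch. The helper name `pad_references` and the sample strings are hypothetical, and the layout (one row per reference set, one entry per hypothesis) is assumed from the call above rather than taken from the library's code.

```python
from itertools import zip_longest


def pad_references(refs_per_hyp, pad=""):
    """Turn per-hypothesis reference lists into rows with one entry per hypothesis.

    Hypotheses with fewer references get `pad` in the missing positions.
    """
    # zip_longest transposes the lists and fills the gaps with `pad`.
    return [list(row) for row in zip_longest(*refs_per_hyp, fillvalue=pad)]


refs_per_hyp = [
    ["ref A for sentence1", "ref B for sentence1"],  # two references
    ["ref A for sentence2"],                          # only one reference
]
ref_list = pad_references(refs_per_hyp)
print(ref_list)
```

The second row ends with an empty string, which is the kind of input that triggers the error reported here.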

Jan 05 '20 08:01 voidful

All CLA requirements met.

Jan 05 '20 08:01 msftclas

Thanks for pointing this out. The references are the targets that the generated hypotheses should match. A target could legitimately be an empty string, so I think we should fix what is actually causing the error instead of silently ignoring empty strings. Ignoring them could mean that a hypothesis is compared with a target that it wasn't meant to be compared with. For example:

References:

"Sentence 1"
"" (this one would get filtered out)
"Sentence 3"

Hypotheses:

"Sentence 1"
"Sentence 2"
"Sentence 3"
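
The misalignment in this example can be reproduced with a short sketch in plain Python. It does not call nlg-eval; the variable names are illustrative only.

```python
references = ["Sentence 1", "", "Sentence 3"]
hypotheses = ["Sentence 1", "Sentence 2", "Sentence 3"]

# Dropping empty references shortens the list, so every later pair shifts by one.
filtered = [r for r in references if r]
misaligned = list(zip(filtered, hypotheses))

# Keeping the empty reference preserves the intended pairing.
aligned = list(zip(references, hypotheses))

print(misaligned)
print(aligned)
```

After filtering, "Sentence 3" ends up paired with the hypothesis "Sentence 2", and the last hypothesis has no reference at all.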

Jan 05 '20 21:01 juharris

I agree that we should correct what is actually causing the error. To state the problem more generally, it occurs when one of the refs is empty or the hyp is empty:

ref=["this is a test",""],
hyp="this is a good test"

ref=["this is a good test"],
hyp=""

The vector-based metrics (Skip-thought / glove_metrics) raise an error because they receive empty input to encode, so the following commit will try to correct that.
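
One way to make an embedding-based metric tolerate empty input is sketched below. This is not the code from the commit and not nlg-eval's API: `TOY_VECTORS`, `average_embedding`, and `safe_cosine` are hypothetical stand-ins, and returning a similarity of 0.0 for an empty sentence is an assumed design choice.

```python
import math

# Toy word vectors standing in for a real embedding table (hypothetical values).
TOY_VECTORS = {
    "this": [1.0, 0.0],
    "is": [0.0, 1.0],
    "a": [1.0, 1.0],
    "test": [2.0, 0.0],
    "good": [0.0, 2.0],
}
DIM = 2


def average_embedding(sentence):
    """Average the vectors of known tokens; return a zero vector if there are none."""
    vectors = [TOY_VECTORS[t] for t in sentence.split() if t in TOY_VECTORS]
    if not vectors:
        # Empty sentence (or no known tokens): avoid dividing by zero.
        return [0.0] * DIM
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def safe_cosine(u, v):
    """Cosine similarity that returns 0.0 when either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


hyp = average_embedding("this is a good test")
print(safe_cosine(average_embedding(""), hyp))                # empty ref
print(safe_cosine(average_embedding("this is a test"), hyp))  # non-empty ref
```

Scoring an empty sentence as 0.0 keeps every reference paired with its hypothesis. Whether 0.0 is the right score for that case is a separate decision.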

Jan 06 '20 15:01 voidful
