Fluency outcome different from prompt instructions

Open jealyvda opened this issue 1 year ago • 0 comments

According to the prompt instructions, fluency is the only dimension rated on a 1-3 scale; the others are rated 1-5. However, the output in the summeval.json file shows fluency consistently rated on a 1-5 scale, which does not follow the prompt instructions. Furthermore, even if fluency were rated 1-3 as instructed, averaging the dimensions into an overall score on a 1-5 scale would be misleading, since a perfect fluency score of 3 would still pull the overall score below 5.
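To make the averaging problem concrete, here is a minimal sketch. It assumes four dimensions combined by a plain mean; the names of the non-fluency dimensions and the helper functions `overall` and `rescale` are hypothetical and not taken from the repository. The linear rescaling at the end is only one possible workaround, shown for comparison.

```python
# Assumed scale bounds per dimension. Only the fluency range (1-3) and the
# 1-5 range of the other dimensions come from the prompt instructions;
# the other dimension names are illustrative assumptions.
SCALES = {
    "coherence": (1, 5),
    "consistency": (1, 5),
    "fluency": (1, 3),
    "relevance": (1, 5),
}


def overall(scores):
    """Plain mean of the per-dimension scores."""
    return sum(scores.values()) / len(scores)


def rescale(score, lo, hi, new_lo=1, new_hi=5):
    """Linearly map a score from [lo, hi] onto [new_lo, new_hi]."""
    return new_lo + (score - lo) * (new_hi - new_lo) / (hi - lo)


# A summary that gets the top mark on every dimension.
best = {name: hi for name, (lo, hi) in SCALES.items()}
print(overall(best))  # 4.5, even though every dimension is at its maximum

# Mapping each dimension onto 1-5 before averaging removes the ceiling.
normalized = {name: rescale(s, *SCALES[name]) for name, s in best.items()}
print(overall(normalized))  # 5.0
```

With fluency capped at 3, the best reachable mean is (5 + 5 + 3 + 5) / 4 = 4.5, so an overall score read on a 1-5 scale can never reach 5.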

My suggestion is to change the fluency scale to 1-5 and update the prompt accordingly.

jealyvda, Nov 19 '24 09:11