opencompass [Feature] Support pass@1 evaluation for multi predictions in MathEvaluator

Open DELEnomore opened this issue 3 months ago • 0 comments

Describe the feature

When a Hugging Face model is run with num_return_sequences set greater than 1, each entry in the output column "predictions" becomes a list instead of a string. As a result, MathEvaluator always returns an accuracy of 0, regardless of whether any of the predictions is correct. It would be beneficial if the score function could accept list-type inputs and compute pass@1 over the multiple predictions, similar to the approach described in the DeepSeek-R1 technical report.
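To make the request concrete, here is a minimal sketch of the scoring logic I have in mind. It is not the existing MathEvaluator code: the function name `pass_at_1`, the `is_equiv` parameter, and the returned dictionary keys are hypothetical, and the default comparison is a plain string match standing in for a real math-equivalence check. It assumes pass@1 for one problem is the fraction of its sampled predictions that are correct, averaged over all problems.

```python
from typing import Callable, Dict, List, Sequence, Union

# A prediction is either a single string or a list of sampled strings.
Prediction = Union[str, Sequence[str]]


def _default_is_equiv(pred: str, ref: str) -> bool:
    # Placeholder (assumption): exact match after stripping whitespace.
    # A real evaluator would plug in its own math-equivalence check here.
    return pred.strip() == ref.strip()


def pass_at_1(
    predictions: Sequence[Prediction],
    references: Sequence[str],
    is_equiv: Callable[[str, str], bool] = _default_is_equiv,
) -> Dict[str, object]:
    """Average per-problem correctness over one or more sampled predictions."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")

    per_sample: List[float] = []
    for pred, ref in zip(predictions, references):
        # Normalize: a bare string is treated as a list with one candidate.
        candidates = [pred] if isinstance(pred, str) else list(pred)
        if not candidates:
            per_sample.append(0.0)
            continue
        correct = sum(1 for c in candidates if is_equiv(c, ref))
        per_sample.append(correct / len(candidates))

    # Report accuracy as a percentage; an empty input yields 0.0.
    accuracy = 100.0 * sum(per_sample) / len(per_sample) if per_sample else 0.0
    return {"accuracy": accuracy, "details": per_sample}
```

With this shape, string inputs behave as before (each problem scores 0 or 1), while list inputs contribute a fractional score per problem.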

Will you implement it?

[x] I would like to implement this feature and create a PR!

Aug 28 '25 08:08 DELEnomore
