opencompass



[Feature] Why do HumanEval results differ across batch_size values when evaluating a model with vLLM?

Open noforit opened this issue 1 year ago • 0 comments

Describe the feature

With batch_size set to 128, 64, and 16, the P@1 (pass@1) scores for deepseek 1.3B are 31.71, 30.49, and 29.27, respectively. Why do the scores differ?
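To put the gap in concrete terms, here is a rough sketch that converts the reported scores back into solved-problem counts. It assumes the standard 164-problem HumanEval set and one generated sample per problem, so that pass@1 is simply solved / total; neither assumption is confirmed by the run configuration, and `solved_count` is a hypothetical helper, not an OpenCompass function.

```python
# Assumption: the standard HumanEval set with 164 problems and
# one sample per problem, so pass@1 = solved / total.
NUM_PROBLEMS = 164

# pass@1 scores (percent) reported in this issue, keyed by batch_size.
reported = {128: 31.71, 64: 30.49, 16: 29.27}


def solved_count(pass_at_1_percent, num_problems=NUM_PROBLEMS):
    """Convert a pass@1 percentage back to a whole number of solved problems."""
    return round(pass_at_1_percent / 100 * num_problems)


counts = {bs: solved_count(score) for bs, score in reported.items()}

for bs, n in counts.items():
    # Recompute the percentage to check it matches the reported value.
    print(f"batch_size={bs}: {n}/{NUM_PROBLEMS} solved = {100 * n / NUM_PROBLEMS:.2f}%")
```

Under these assumptions the three runs correspond to 52, 50, and 48 solved problems, so each step in batch_size changes the outcome of about two problems.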

Will you implement it?

[ ] I would like to implement this feature and create a PR!

Apr 26 '24 08:04 noforit