[Bug] HumanEval score is only 1.22 when testing deepseek-v2-lite-chat, whether via the official website or a local deployment; all other models score normally

Open shuowoshishui opened this issue 9 months ago • 4 comments

I'm evaluating with the officially supported tasks/models/datasets.

Using this configuration

The problem that occurs


No response

Mar 10 '25 10:03 shuowoshishui

All other models score normally, both when tested online and when I test them offline.

Mar 10 '25 10:03 shuowoshishui

Thank you for the report. We will investigate this issue.

Mar 10 '25 10:03 tonysy

By the way, when I run inference with evalscope + opencompass, the results are also fairly normal.

Mar 10 '25 10:03 shuowoshishui

I'm seeing this problem too. Please fix it.

Apr 29 '25 05:04 qianweijiujiu