opencompass
[Feature] Support LiveCodeBench
LiveCodeBench
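Advantages of this dataset:
- HumanEval and MBPP problems are too basic; this dataset is harder.
- The problems come from recent coding competitions, so data contamination is much less of a concern.
- Beyond code generation, it also covers output prediction, code repair, and code execution, so it measures a model's coding ability more comprehensively.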
Would you like to implement this feature yourself?
- [ ] I would like to implement this feature myself and contribute the code to OpenCompass!
We already support APPS and TACO. We will also soon release LeetcodeBench (2023), which we built ourselves.
Cool!
Looking forward to your work!