opencompass [Feature] Support BigCodeBench

Open terryyz opened this issue 5 months ago • 2 comments

Describe the feature

BigCode (Hugging Face and ServiceNow Research) has released BigCodeBench, a new large-scale code generation benchmark that exercises diverse function calls and complex instructions across 1,140 expert-annotated tasks. It has been officially used by DeepSeek and CodeGeeX4. BigCodeBench is considered a better alternative to HumanEval and other function-level code generation benchmarks (see here).

Preprint
GitHub
Leaderboard

Will you implement it?

[ ] I would like to implement this feature and create a PR!

Aug 26 '24 18:08 terryyz
