Request for Generation Parameters and Benchmark Setup Details

Open ilyasoulk opened this issue 1 year ago • 2 comments

Hello,

I am trying to reproduce the benchmark results reported in the Qwen2.5-Coder technical report. However, I couldn’t find detailed information about the generation parameters (e.g., temperature, top-k, top-p, number of beams) or the benchmark setup, specifically for HumanEval on the 7B model.

Could you please provide more details about the configurations and settings used during the evaluation?

Thank you for your help!

Nov 15 '24 14:11 ilyasoulk

For most evaluations, we use greedy decoding. You can find all the evaluation details in our evaluation scripts.

https://github.com/QwenLM/Qwen2.5-Coder/tree/main/qwencoder-eval/instruct
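
For readers unfamiliar with the term, here is a minimal sketch of what greedy decoding means: at every step the single highest-scoring token is chosen, so temperature, top-k and top-p play no role and the output is deterministic. This is an illustration only and is not taken from the evaluation scripts. `toy_logits` is a hypothetical stand-in for a model, and `GREEDY_KWARGS` assumes a Hugging Face `generate`-style interface, which may differ from what the scripts actually use.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: List[int],
    eos_id: int,
    max_new_tokens: int = 16,
) -> List[int]:
    """Generate tokens by always picking the highest-scoring next token."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)
        # Greedy step: argmax over the vocabulary, no sampling involved.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    # Return only the newly generated tokens.
    return ids[len(prompt_ids):]


def toy_logits(ids: List[int]) -> List[float]:
    """Hypothetical model stand-in: favors (last token + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if t == target else 0.0 for t in range(vocab_size)]


# Assumption: with a Hugging Face-style `generate` call, greedy decoding
# corresponds to disabling sampling and using a single beam.
GREEDY_KWARGS = {"do_sample": False, "num_beams": 1}

if __name__ == "__main__":
    print(greedy_decode(toy_logits, prompt_ids=[0], eos_id=4))
```

Because no randomness is involved, repeated runs on the same prompt give the same completion. The exact prompts, stop conditions and length limits used for each benchmark still have to be read from the scripts themselves.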

Nov 17 '24 11:11 huybery

Thank you! I'll check that out

Nov 19 '24 18:11 ilyasoulk