Binyuan Hui
🤔 I think we can track each person's contributions under this PR, and we can also discuss special contributors here.
HumanevalPack: https://github.com/OpenDevin/OpenDevin/pull/1908
GAIA: https://github.com/OpenDevin/OpenDevin/pull/1911
SWE-bench: https://github.com/OpenDevin/OpenDevin/pull/1468 (No doubt it will get more points!)
Please refer to: https://github.com/QwenLM/CodeQwen1.5/tree/main/evaluation/eval_plus
Some details can be found in the technical report: https://arxiv.org/abs/2409.12186
@emangamer We're currently aiming for rapid prototyping (and won't consider using a complex framework for now), so feel free to discuss future architectural options with us on Slack.
Please give us some time to put together a best practice for you all.
Please pull the latest version of the model and provide a full prompt for us to reproduce if you still have problems.
👍 Will take a closer look; we may need to resolve the conflict.