Add more benchmarks

Open markokocic opened this issue 4 months ago • 7 comments

Add benchmarks for DeepSeek V3.1 and Qwen3-Coder

markokocic avatar Aug 31 '25 11:08 markokocic

Can I take this task?

AkshaySyal avatar Sep 02 '25 00:09 AkshaySyal

I have not seen GPT-5 on the leaderboard either.

jakob1379 avatar Sep 02 '25 07:09 jakob1379

@jakob1379 👉 https://github.com/Aider-AI/aider/pull/4475

butaca avatar Sep 02 '25 15:09 butaca

GPT-5 is there now, but not gpt-5-mini

markokocic avatar Sep 03 '25 20:09 markokocic

Would love to see the gpt oss models as well

mattharrison avatar Sep 09 '25 05:09 mattharrison

Other new DeepSeek models are missing now too:

  • DeepSeek-3.1-Terminus
  • DeepSeek-3.2

as well as the latest Claude:

  • Claude Sonnet 4.5

I know that model vendors report Aider Polyglot benchmark results themselves, but since those results are missing from the official benchmarks, it's not clear whether users can trust them.

markokocic avatar Sep 30 '25 12:09 markokocic

GLM-4.6 and Claude Sonnet 4.5 benchmarks?

Other new DeepSeek models are missing now too:

* DeepSeek-3.1-Terminus

* DeepSeek-3.2

as well as the latest Claude:

* Claude Sonnet 4.5

I know that model vendors report Aider Polyglot benchmark results themselves, but since those results are missing from the official benchmarks, it's not clear whether users can trust them.

deepseek benchies were merged

Kreijstal avatar Oct 04 '25 09:10 Kreijstal